OpenBLAS, a high-performance open-source BLAS/LAPACK implementation, released a new version on Sunday with more CPU optimizations and expanded CPU coverage.
OpenBLAS 0.3.21 supports more processors, especially on the Arm side. There’s also support for more compilers, build system improvements, and more. Highlights of the OpenBLAS 0.3.21 release include:
– OpenBLAS can now be built with the Intel IFX, Fujitsu FCC, and Cray C/Fortran compilers.
– Initial support for Zhaoxin/Centaur KH40000 processors.
– The OpenBLAS CMake build system now supports cross-compiling for individual Intel and AMD x86_64 targets. The CMake targets now exposed range from Prescott to Sapphire Rapids on the Intel side and from Barcelona to Zen on the AMD side.
– Various IBM POWER fixes, including a number of Power10 fixes. The POWER build of OpenBLAS now also enables compilation of the BFLOAT16 kernels by default.
– Fixed OpenBLAS RISC-V processor auto-detection logic.
– An SBGEMM kernel for Arm Neoverse N2 has been added.
– Support for 64-bit Arm systems running Microsoft Windows.
– Initial support for the Apple M1 processor on Linux.
– Initial support for the Phytium FT2000 processor.
– Initial support for Arm Cortex A510/A710/X1/X2 processors.
– Fixes for compiling OpenBLAS on various x86_64 CPU targets under different conditions.
– Initial support for Loongson 2K1000 processor.
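For readers unfamiliar with the SBGEMM and BFLOAT16 items above: SBGEMM is the GEMM variant that takes bfloat16 inputs and produces single-precision output. The snippet below is only an illustrative sketch of that idea in plain Python, not OpenBLAS code. The function names are hypothetical. It converts to bfloat16 by simple truncation (real hardware and libraries typically round to nearest), and it accumulates in Python floats, which are double precision rather than float32.

```python
import struct


def to_bfloat16_bits(x: float) -> int:
    """Convert x to float32 and keep the upper 16 bits (truncation, an assumption)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def from_bfloat16_bits(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a float by zero-filling the low bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


def sbgemm_reference(a_bits, b_bits, m, n, k):
    """Hypothetical reference: C = A * B with bfloat16 inputs, float accumulation.

    a_bits is an m-by-k matrix and b_bits a k-by-n matrix, both stored
    row-major as flat lists of bfloat16 bit patterns.
    """
    c = [0.0] * (m * n)
    for i in range(m):
        for j in range(n):
            acc = 0.0  # Python float (double); a real kernel accumulates in float32
            for p in range(k):
                acc += (from_bfloat16_bits(a_bits[i * k + p])
                        * from_bfloat16_bits(b_bits[p * n + j]))
            c[i * n + j] = acc
    return c


if __name__ == "__main__":
    a = [to_bfloat16_bits(v) for v in (1.0, 2.0, 3.0, 4.0)]
    identity = [to_bfloat16_bits(v) for v in (1.0, 0.0, 0.0, 1.0)]
    print(sbgemm_reference(a, identity, 2, 2, 2))
```

The point of the format is that bfloat16 keeps the float32 exponent range while giving up mantissa bits, so a value such as 1.001 collapses to 1.0 on conversion while the products are still summed at higher precision.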
Full details and downloads are available from OpenBLAS on GitHub.