OpenMathLib / OpenBLAS
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
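For orientation, BLAS routines are conventionally grouped into three levels: Level 1 (vector-vector), Level 2 (matrix-vector) and Level 3 (matrix-matrix). The README's CPU list uses the same "Level-2" and "Level-3" terms. The following is a naive reference sketch in C of one operation per level, assuming row-major storage and omitting the scaling and transpose arguments of the real interfaces. The names `ref_daxpy`, `ref_dgemv` and `ref_dgemm` are hypothetical and not part of OpenBLAS, whose optimized kernels are far more elaborate.

```c
#include <assert.h>
#include <stddef.h>

/* Level 1 (vector-vector): y := alpha*x + y, the operation BLAS calls AXPY.
   Hypothetical reference helper, not an OpenBLAS function. */
void ref_daxpy(size_t n, double alpha, const double *x, double *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* Level 2 (matrix-vector): y := A*x for a row-major m-by-n matrix A.
   The real GEMV also takes alpha, beta and a transpose flag (omitted here). */
void ref_dgemv(size_t m, size_t n, const double *a, const double *x, double *y)
{
    assert(m == 0 || n == 0 || (a != NULL && x != NULL));
    assert(m == 0 || y != NULL);
    for (size_t i = 0; i < m; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < n; j++)
            sum += a[i * n + j] * x[j];
        y[i] = sum;
    }
}

/* Level 3 (matrix-matrix): C := A*B with A m-by-k, B k-by-n, C m-by-n,
   all row-major. The real GEMM also takes alpha, beta and transpose flags. */
void ref_dgemm(size_t m, size_t n, size_t k,
               const double *a, const double *b, double *c)
{
    assert(m == 0 || n == 0 || c != NULL);
    assert(m == 0 || n == 0 || k == 0 || (a != NULL && b != NULL));
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

The work per element of data grows from Level 1 to Level 3, which is why tuned Level-3 kernels tend to benefit most from CPU-specific optimization.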
Repository Overview (README excerpt)
OpenBLAS
Introduction
OpenBLAS is an optimized BLAS (Basic Linear Algebra Subprograms) library based on GotoBLAS2 1.13 BSD version.
For more information about OpenBLAS, please see:
• The documentation at openmathlib.org/OpenBLAS/docs/
• The home page at openmathlib.org/OpenBLAS/
For a general introduction to the BLAS routines, please refer to the extensive documentation of their reference implementation hosted at netlib. On that site you will likewise find documentation for the reference implementation of the higher-level library LAPACK - the **L**inear **A**lgebra **Pack**age that comes included with OpenBLAS. If you are looking for a general primer or refresher on Linear Algebra, the set of six 20-minute lecture videos by Prof. Gilbert Strang, available on MIT OpenCourseWare or YouTube, may be helpful.
Binary Packages
We provide official binary packages for the following platforms:
• Windows x86/x86_64
• Windows arm64 (woa)
You can download them from file hosting on sourceforge.net or from the Releases section of the GitHub project page. OpenBLAS is also packaged for many package managers - see the installation section of the docs for details.
Installation from Source
Obtain the source code from https://github.com/OpenMathLib/OpenBLAS/. Note that the default branch is (a branch is still present, but far out of date). Build-time parameters can be chosen in , see there for a short description of each option. Most options can also be given directly on the command line as parameters to your or invocation.
Dependencies
Building OpenBLAS requires the following to be installed:
• GNU Make or CMake
• A C compiler, e.g. GCC or Clang
• A Fortran compiler (optional, for LAPACK)
In general, using a recent version of the compiler is strongly recommended. If a Fortran compiler is not available, it is possible to compile an older version of the included LAPACK that has been machine-translated to C.
Normal compile
Simply invoking (or on BSD) will detect the CPU automatically. To set a specific target CPU, use , e.g. . The full target list is in the file , other build options are documented in Makefile.rule and can either be set there (typically by removing the comment character from the respective line), or used on the command line.
Note that when you run after building, you need to repeat all command line options you provided to in the build step, as some settings like the supported maximum number of threads are automatically derived from the build host by default, which might not be what you want.
For building with , the usual conventions apply, i.e. create a build directory either underneath the toplevel OpenBLAS source directory or separate from it, and invoke there with the path to the source tree and any build options you plan to set.
For more details, see the Building from source section in the docs.
Cross compile
Set and to point to the cross toolchains, and if you use , also set to your host C compiler. The target must be specified explicitly when cross compiling.
Examples:
• On a Linux system, cross-compiling to an older MIPS64 router board:
• or to a Windows x64 host:
You can find instructions for other cases both in the "Supported Systems" section below and in the Building from source docs. The scripts included with the sources (which contain the build scripts for the "continuous integration" (CI) build tests automatically run on every proposed change to the sources) may also provide additional hints.
When compiling for a more modern CPU target of the same architecture, e.g. on a host, option can be used to suppress the automatic invocation of the tests at the end of the build.
Debug version
A debug version can be built using .
Compile with MASS support on Power CPU (optional)
The IBM MASS library consists of a set of mathematical functions for C, C++, and Fortran applications that are tuned for optimum performance on POWER architectures.
OpenBLAS with MASS requires a 64-bit, little-endian OS on POWER. The library can be installed as shown:
• On Ubuntu:
• On RHEL/CentOS:
After installing the MASS library, compile OpenBLAS with . For example, to compile on Power8 with MASS support: .
Install to a specific directory (optional)
Use when invoking , for example (along with all options you added on the command line in the preceding build step). The default installation directory is .
Supported CPUs and Operating Systems
Please read for older CPU models already supported by the 2010 GotoBLAS.
Additional supported CPUs
x86/x86-64
• **Intel Xeon 56xx (Westmere)**: Uses GotoBLAS2 Nehalem codes.
• **Intel Sandy Bridge**: Optimized Level-3 and Level-2 BLAS with AVX on x86-64.
• **Intel Haswell**: Optimized Level-3 and Level-2 BLAS with AVX2 and FMA on x86-64.
• **Intel Skylake-X**: Optimized Level-3 and Level-2 BLAS with AVX512 and FMA on x86-64.
• **Intel Cooper Lake**: as Skylake-X with improved BFLOAT16 support.
• **AMD Bobcat**: Uses GotoBLAS2 Barcelona codes.
• **AMD Bulldozer**: x86-64 ?GEMM FMA4 kernels. (Thanks to Werner Saar)
• **AMD PILEDRIVER**: Uses Bulldozer codes with some optimizations.
• **AMD STEAMROLLER**: Uses Bulldozer codes with some optimizations.
• **AMD ZEN**: Uses Haswell codes with some optimizations for Zen 2/3 (use SkylakeX for Zen4)
MIPS32
• **MIPS 1004K**: uses P5600 codes
• **MIPS 24K**: uses P5600 codes
MIPS64
• **ICT Loongson 3A**: Optimized Level-3 BLAS and parts of Level-1 and Level-2.
• **ICT Loongson 3B**: Experimental
ARM
• **ARMv6**: Optimized BLAS for vfpv2 and vfpv3-d16 (e.g. BCM2835, Cortex M0+)
• **ARMv7**: Optimized BLAS for vfpv3-d32 (e.g. Cortex A8, A9 and A15)
ARM64
• **ARMv8**: Basic ARMV8 with small caches, optimized Level-3 and Level-2 BLAS
• **Cortex-A53**: same as ARMV8 (different cpu specifications)
• **Cortex-A55**: same as ARMV8 (different cpu specifications)
• **Cortex A57**: Optimized Level-3 and Level-2 functions
• **Cortex A72**: same as A57 ( different cpu spe…