1. Problem statement
17/12/21 11:11:56 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
17/12/21 11:11:56 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
These warning messages often appear when you use MLlib in Apache Spark. They mean that no native BLAS implementation is correctly installed or configured for your Apache Spark, so a pure Java implementation is used instead, which can hurt performance. See  for more information.
The official Spark documentation explains the warning message:
MLlib uses the linear algebra package Breeze, which depends on netlib-java for optimised numerical processing. If native libraries are not available at runtime, you will see a warning message and a pure JVM implementation will be used instead.
Due to licensing issues with runtime proprietary binaries, we do not include netlib-java’s native proxies by default.
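One way to see which implementation was actually picked up is to ask netlib-java directly from a Spark shell. The following is a minimal sketch, not an official procedure. It assumes that `spark-shell` is on the `PATH` and that your Spark build bundles the `com.github.fommil.netlib` classes named in the warning. The function name `print_blas_impl` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the class name of the BLAS implementation
# that netlib-java selected inside a Spark shell session.
print_blas_impl() {
  # spark-shell reads Scala statements from standard input.
  # Keep only the line that our println produced.
  echo 'println("BLAS: " + com.github.fommil.netlib.BLAS.getInstance().getClass.getName)' \
    | spark-shell 2>/dev/null \
    | grep '^BLAS: '
}

# Usage (needs a working Spark installation):
# print_blas_impl
```

A class name ending in `NativeSystemBLAS` indicates the system library is in use, while `F2jBLAS` indicates the pure JVM fallback.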
As stated in the official Spark documentation,
To configure netlib-java / Breeze to use system optimised binaries, include com.github.fommil.netlib:all:1.1.2 (or build Spark with -Pnetlib-lgpl) as a dependency of your project
there are two kinds of solutions:
- rebuild Apache Spark
- configure your project
The first one is almost impossible in some scenarios, such as Amazon EMR. This post focuses on the second solution instead.
Most of the linear algebra functions in Spark MLlib are based on Breeze, a numerical processing library for Scala, while some are based directly on the low-level library netlib-java, which Breeze also uses. In addition, Spark MLlib has some non-BLAS in-house implementations.
In netlib-java, the implementations of BLAS/LAPACK are provided by
- “F2J to ensure full portability on the JVM”
- “self-contained native builds using the reference Fortran from netlib.org”
- “delegating builds that use machine optimised system libraries”
The relation is illustrated in the figure. In this post, we configure and use the system-provided BLAS (in green).
Make sure a native BLAS/LAPACK implementation, such as ATLAS, Intel MKL, or OpenBLAS, is installed. OpenBLAS generally performs very well among the free implementations. If you work on macOS, its vecLib contains Apple’s highly tuned implementation of BLAS/LAPACK.
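A quick way to check what is already installed on a Linux machine is to list the libraries known to the dynamic linker. This is only a sketch: the name patterns are my own guess at common BLAS/LAPACK file names, the helper `filter_blas_libs` is hypothetical, and `ldconfig` is not available on macOS.

```shell
#!/bin/bash
# Hypothetical helper: keep only lines that mention a common
# BLAS/LAPACK library name (the pattern list is an assumption).
filter_blas_libs() {
  grep -iE 'openblas|atlas|mkl_|libblas|liblapack' || true
}

# List the shared libraries in the dynamic linker cache (Linux only).
if command -v ldconfig >/dev/null 2>&1; then
  ldconfig -p | filter_blas_libs
fi
```

If nothing is printed, no matching library is registered with the dynamic linker.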
As suggested in the official Spark documentation, include com.github.fommil.netlib:all:1.1.2 in your project to use system optimized binaries.
Add your generated fat JAR to spark-defaults.conf. Do not use --jars when you spark-submit your jobs; it does not work (I would also like to know why). For pyspark jobs, you only need the same configuration in order to use native BLAS.
NOTE: changing spark-defaults.conf frequently is not convenient. Instead, you can prepare two JARs, one for your project and one for netlib-java.
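The text does not name the exact properties to set. One common way to put a JAR on the class path of both the driver and the executors is the pair `spark.driver.extraClassPath` and `spark.executor.extraClassPath`. Treat the sketch below as an assumption about what is meant here, not as the confirmed configuration. The helper name `add_netlib_classpath` and the JAR path are placeholders.

```shell
#!/bin/bash
# Hypothetical helper: append class-path entries for a netlib-java JAR
# to a Spark defaults file. Both arguments are placeholders.
add_netlib_classpath() {
  local conf_file="$1"   # e.g. "$SPARK_HOME/conf/spark-defaults.conf"
  local jar_path="$2"    # e.g. /opt/jars/netlib-all.jar (made-up path)
  {
    echo "spark.driver.extraClassPath    ${jar_path}"
    echo "spark.executor.extraClassPath  ${jar_path}"
  } >> "$conf_file"
}

# Usage:
# add_netlib_classpath "$SPARK_HOME/conf/spark-defaults.conf" /opt/jars/netlib-all.jar
```

If either property is already defined in the file, edit the existing line instead of appending a second one. The JAR must exist at the same path on every node.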
3.1 Amazon Linux
As some people have reported, the BLAS/LAPACK installed on Amazon Linux does not perform well. We can install OpenBLAS instead. Here is a bash script to install OpenBLAS on Amazon Linux:
#!/bin/bash
set -e

# Build OpenBLAS from source
sudo yum install -y git
git clone https://github.com/xianyi/OpenBlas.git
cd OpenBlas/
make clean
make -j

# Install into /usr/lib64/OpenBLAS
sudo mkdir -p /usr/lib64/OpenBLAS
sudo chmod o+w,g+w /usr/lib64/OpenBLAS/
make PREFIX=/usr/lib64/OpenBLAS install

# Stop the dynamic linker from picking up ATLAS
sudo rm -f /etc/ld.so.conf.d/atlas-x86_64.conf
sudo ldconfig

# Point the system BLAS/LAPACK names at OpenBLAS
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/libblas.so
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/libblas.so.3
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/libblas.so.3.5
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/libblas.so.3.5.0
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/liblapack.so
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/liblapack.so.3
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/liblapack.so.3.5
sudo ln -sf /usr/lib64/OpenBLAS/lib/libopenblas.so /usr/lib64/liblapack.so.3.5.0
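After running the install script, it may be worth confirming that the BLAS/LAPACK names really resolve to OpenBLAS. The sketch below assumes a Linux system with GNU `readlink`. The helper name `show_blas_links` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report where each BLAS/LAPACK link in a
# directory finally points after all symlinks are resolved.
show_blas_links() {
  local dir="${1:-/usr/lib64}"
  local link
  for link in "$dir"/libblas.so* "$dir"/liblapack.so*; do
    # Skip glob patterns that matched nothing.
    [ -e "$link" ] || continue
    echo "$link -> $(readlink -f "$link")"
  done
}

show_blas_links /usr/lib64
```

Every printed line should end in the `libopenblas` file under `/usr/lib64/OpenBLAS/lib`. A line that still points to ATLAS or the reference BLAS means that link was not replaced.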
3.2 The multi-thread issue
As described in the issue SPARK-21305, a BLAS with multi-threading support can make performance worse because its threads compete with the Spark executors. Therefore, it is better to disable multi-threading.
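Native BLAS libraries usually read their thread count from environment variables, so one way to disable multi-threading is to set those variables to 1 on the driver and on every executor. This is a minimal sketch. `OPENBLAS_NUM_THREADS`, `MKL_NUM_THREADS`, and `OMP_NUM_THREADS` are the variables I assume are relevant for OpenBLAS, Intel MKL, and OpenMP-based builds. The helper name `blas_single_thread_conf` is made up. It relies on Spark's `spark.executorEnv.[EnvironmentVariableName]` property to pass the variables to executors.

```shell
#!/bin/bash
# Hypothetical helper: emit spark-defaults.conf lines that pin native
# BLAS libraries to a single thread inside every executor.
blas_single_thread_conf() {
  local var
  for var in OPENBLAS_NUM_THREADS MKL_NUM_THREADS OMP_NUM_THREADS; do
    echo "spark.executorEnv.${var}  1"
  done
}

# For the driver (or a local run), export the variables directly.
export OPENBLAS_NUM_THREADS=1
export MKL_NUM_THREADS=1
export OMP_NUM_THREADS=1

# Print the lines to add to spark-defaults.conf.
blas_single_thread_conf
```

Only the variable that matches your installed BLAS has any effect, so the other two are harmless.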
 is a bit out of date but is still well worth reading. It gives many details about the implementations in Spark and experimental results using different BLAS implementations.
This post originated from reading . I found lots of related posts, but they are either incomplete or out of date. Thus, I decided to record everything I read while solving the problem. All comments are welcome.