Intel MKL

Intel Math Kernel Library (Intel MKL) is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math. The routines in MKL are hand-optimized specifically for Intel processors.

Documentation

The reference manual for Intel MKL is available on Intel's website.

It includes:

Benchmarks

These benchmarks are offered to help you make informed decisions about which routines to use in your applications, including performance for each major function domain in Intel® oneAPI Math Kernel Library (oneMKL) by processor family. Some benchmark charts only include absolute performance measurements for specific problem sizes. Others compare previous versions, popular alternate open-source libraries, and other functions for oneMKL [2].

[Figures: oneMKL benchmark charts]

Why is Intel MKL faster?

Intel MKL is optimized for maximum speed. Its optimization is resource-limited: the routines are tuned until they exhaust one or more resources of the system [3].

Compilation

Compile with intel/2020

#Environment setup
module purge
module load intel/2020
module load intel/2020.mkl
source /cvmfs/sw.el7/intel/2020/mkl/bin/mklvars.sh intel64
    
icc -mkl <source_file.c> -o <output_binary_name>

./<output_binary_name> #Execute binary

Compile with intel/mvapich2/2.3.3

#Environment setup
module purge
module load intel/2020
module load intel/2020.mkl
module load intel/mvapich2/2.3.3
source /cvmfs/sw.el7/intel/2020/mkl/bin/mklvars.sh intel64
    
mpicc -mkl <source_file.c> -o <output_binary_name>

./<output_binary_name> #Execute binary

Compile with gcc-8.1

#Environment setup
module purge
module load gcc-8.1
module load intel/2020.mkl
source /cvmfs/sw.el7/intel/2020/mkl/bin/mklvars.sh intel64

#Program compile
gcc <source_file.c> -o <output_binary_name> -I${MKLROOT}/include -L${MKLROOT}/lib/intel64 -Wl,--no-as-needed -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm -ldl

#Execute binary
./<output_binary_name> 

Performance Test

To test performance, we run an example that computes C = alpha*A*B + C, where alpha is a scalar and A, B, and C are square n x n matrices.
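For reference, the calculation above can be written as a plain triple loop. This is only a minimal sketch for checking results on small inputs, not the benchmark code used on this page: the function name gemm_naive is hypothetical, and the row-major layout in flat arrays is an assumption.

```c
#include <stddef.h>

/* Naive reference for C = alpha*A*B + C.
 * A, B and C are square n x n matrices stored row-major in flat arrays
 * (an assumption of this sketch). This is NOT an MKL routine. */
void gemm_naive(size_t n, double alpha,
                const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            /* Scale the A element once, then sweep a row of B and C. */
            double a_ik = alpha * A[i * n + k];
            for (size_t j = 0; j < n; j++) {
                C[i * n + j] += a_ik * B[k * n + j];
            }
        }
    }
}
```

With MKL, the same operation is a single call to cblas_dgemm (declared in mkl.h) with beta = 1.0, for example cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, n, n, n, alpha, A, n, B, n, 1.0, C, n). Comparing its output against the loop above for a small n is a simple way to confirm that a build links and runs correctly.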

WITH MKL

             GCC        MPICC      ICC
n = 2000     0.19 s     0.14 s     0.16 s
n = 20000    51.86 s    50.01 s    49.71 s
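The page does not state how these times were measured. One common approach, assumed here, is to read the wall clock immediately before and after the call. Wall-clock time is preferable to clock() for this purpose because clock() reports CPU time, which is summed over all threads when MKL runs multithreaded. The helper names elapsed_seconds and time_call below are hypothetical, and the sketch relies on the C11 function timespec_get.

```c
#include <time.h>

/* Difference between two timestamps, in seconds. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Hypothetical helper: run work(arg) once and return the wall-clock
 * time it took, in seconds. */
double time_call(void (*work)(void *), void *arg)
{
    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);   /* C11 wall-clock timestamp */
    work(arg);
    timespec_get(&t1, TIME_UTC);
    return elapsed_seconds(t0, t1);
}
```

To time the matrix product, wrap the call in a small function matching void (*)(void *) and pass it to time_call. Repeating the measurement several times and discarding the first run reduces the influence of one-time library initialization.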

WITH MKL AND MPI

             1 Node     2 Nodes    3 Nodes
MVAPICH2
MPICH
Intel MPI

References

[1] https://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_lapack_examples/c_bindings.htm

[2] https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html

[3] intel.cn/content/dam/www/public/apac/xa/en/pdfs/ssg/Intel_Performance_Libraries_Intel_Math_Kernel_Library(MKL).pdf


Revision #6
Created 27 January 2021 10:54:25 by Miguel Viana
Updated 28 January 2021 12:24:34 by Miguel Viana