@CrazyHenry
        
        2018-04-20T06:02:11.000000Z
    hhhhfaiss
- Author: 李英民 | Henry
- E-mail: li_yingmin@outlookdotcom
- Home: https://liyingmin.wixsite.com/henry
 
Quick introduction: About Me
When reposting, please keep the attribution above. Thank you!
Version: MKL 2017.0.098 (2017 Initial Release)
    # you may have to set the LD_LIBRARY_PATH=$MKLROOT/lib/intel64 at runtime.
    # If at runtime you get the error:
    #   Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.
    # You may add set
    #   LD_PRELOAD=$MKLROOT/lib/intel64/libmkl_core.so:$MKLROOT/lib/intel64/libmkl_sequential.so
    # at runtime as well.

    echo 'export LD_LIBRARY_PATH="$MKLROOT/lib/intel64:$LD_LIBRARY_PATH"' >> ~/.bashrc
    echo 'export LD_PRELOAD="$MKLROOT/lib/intel64/libmkl_core.so:$MKLROOT/lib/intel64/libmkl_sequential.so:$LD_PRELOAD"' >> ~/.bashrc
    source ~/.bashrc
Unpack the archive, run bash install.sh, and choose option 3 for a user-level install. Afterwards it still seems to need an install by root, though, if there is no license file?

MKLROOT here is /home/liyingmin/intel/compilers_and_libraries/linux/mkl:

    echo 'export LD_LIBRARY_PATH="/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64:$LD_LIBRARY_PATH"' >> ~/.bashrc
    echo 'export LD_PRELOAD="/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_core.so:/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_sequential.so:$LD_PRELOAD"' >> ~/.bashrc
    source ~/.bashrc
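Running the two `echo ... >> ~/.bashrc` commands more than once appends the same exports again each time. A minimal sketch of one way to avoid that, assuming a POSIX-like shell with `grep`; the helper names `mkl_env_lines` and `append_once` are hypothetical and are not part of MKL or faiss:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the two export lines for a given MKL root.
mkl_env_lines() {
    local root
    root="${1%/}"   # drop one trailing slash, if present
    # $LD_LIBRARY_PATH / $LD_PRELOAD stay literal so they expand when .bashrc is sourced.
    printf 'export LD_LIBRARY_PATH="%s/lib/intel64:$LD_LIBRARY_PATH"\n' "$root"
    printf 'export LD_PRELOAD="%s/lib/intel64/libmkl_core.so:%s/lib/intel64/libmkl_sequential.so:$LD_PRELOAD"\n' "$root" "$root"
}

# Hypothetical helper: append each stdin line to a file unless that exact line is already in it.
append_once() {
    local rcfile
    local line
    rcfile="$1"
    touch "$rcfile"
    while IFS= read -r line; do
        grep -qxF -- "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
    done
}

# Usage, with the install path from this post (adjust to your own):
#   mkl_env_lines /home/liyingmin/intel/compilers_and_libraries/linux/mkl | append_once ~/.bashrc
#   source ~/.bashrc
```

Re-running the usage line leaves `~/.bashrc` unchanged, because `grep -qxF` matches the whole line literally before anything is appended.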
    make uninstall   # may require switching to root
    make clean
    make
A basic usage example is in demos/demo_ivfpq_indexing; it makes a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.
It really is a lot faster!
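To put a number on "faster", the demo can be timed before and after switching BLAS. A rough sketch, assuming the demo binary has been built at demos/demo_ivfpq_indexing; `time_cmd` is a hypothetical helper with whole-second resolution only, so the shell's built-in `time` is the better tool for fine comparisons:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, discard its stdout,
# and print the elapsed wall-clock time in whole seconds.
time_cmd() {
    local start
    local end
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo "$((end - start))"
}

# Usage:
#   time_cmd ./demos/demo_ivfpq_indexing
# To check that the binary is really linked against MKL:
#   ldd ./demos/demo_ivfpq_indexing | grep -i mkl
```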
First, edit makefile.inc:
Uncomment the MKL section and comment out the OpenBLAS one.
Set the path: MKLROOT = /home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl
    # Copyright (c) 2015-present, Facebook, Inc.
    # All rights reserved.
    #
    # This source code is licensed under the BSD+Patents license found in the
    # LICENSE file in the root directory of this source tree.

    # -*- makefile -*-
    # tested on CentOS 7, Ubuntu 16 and Ubuntu 14, see below to adjust flags to distribution.

    CC=gcc
    CXX=g++

    CFLAGS=-fPIC -m64 -Wall -g -O3 -mavx -msse4 -mpopcnt -fopenmp -Wno-sign-compare -fopenmp
    CXXFLAGS=$(CFLAGS) -std=c++11
    LDFLAGS=-g -fPIC -fopenmp

    # common linux flags
    SHAREDEXT=so
    SHAREDFLAGS=-shared
    FAISSSHAREDFLAGS=-shared

    ##########################################################################
    # Uncomment one of the 4 BLAS/Lapack implementation options
    # below. They are sorted from fastest to slowest (in our
    # experiments).
    ##########################################################################

    #
    # 1. Intel MKL
    #
    # This is the fastest BLAS implementation we tested. Unfortunately it
    # is not open-source and determining the correct linking flags is a
    # nightmare. See
    #
    #   https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
    #
    # The latest tested version is MKL 2017.0.098 (2017 Initial Release) and can
    # be downloaded here:
    #
    #   https://registrationcenter.intel.com/en/forms/?productid=2558&licensetype=2
    #
    # The following settings are working if MKL is installed on its default folder:

    MKLROOT=/home/liyingmin/intel/compilers_and_libraries/linux/mkl/

    BLASLDFLAGS=-Wl,--no-as-needed -L$(MKLROOT)/lib/intel64 -lmkl_intel_ilp64 \
       -lmkl_core -lmkl_gnu_thread -ldl -lpthread

    BLASCFLAGS=-DFINTEGER=long

    # you may have to set the LD_LIBRARY_PATH=/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64 at runtime.
    # If at runtime you get the error:
    #   Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.
    # You may add set
    #   LD_PRELOAD=/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_core.so:/home/liyingmin/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_sequential.so
    # at runtime as well.

    #
    # 2. Openblas
    #
    # The library contains both BLAS and Lapack. About 30% slower than MKL. Please see
    #   https://github.com/facebookresearch/faiss/wiki/Troubleshooting#slow-brute-force-search-with-openblas
    # to fix performance problems with OpenBLAS

    #BLASCFLAGS=-DFINTEGER=int

    # This is for Centos:
    #BLASLDFLAGS?=/usr/lib64/libopenblas.so.0

    # for Ubuntu 16:
    # sudo apt-get install libopenblas-dev python-numpy python-dev
    # BLASLDFLAGS?=/usr/lib/libopenblas.so.0

    # for Ubuntu 14:
    # sudo apt-get install libopenblas-dev liblapack3 python-numpy python-dev
    # BLASLDFLAGS?=/usr/lib/libopenblas.so.0 /usr/lib/lapack/liblapack.so.3.0

    #
    # 3. Atlas
    #
    # Automatically tuned linear algebra package. As the name indicates,
    # it is tuned automatically for a given architecture, and in Linux
    # distributions, the architecture is typically indicated by the
    # directory name, eg. atlas-sse3 = optimized for SSE3 architecture.
    #
    # BLASCFLAGS=-DFINTEGER=int
    # BLASLDFLAGS=/usr/lib64/atlas-sse3/libptf77blas.so.3 /usr/lib64/atlas-sse3/liblapack.so

    #
    # 4. reference implementation
    #
    # This is just a compiled version of the reference BLAS
    # implementation, that is not optimized at all.
    #
    # BLASCFLAGS=-DFINTEGER=int
    # BLASLDFLAGS=/usr/lib64/libblas.so.3 /usr/lib64/liblapack.so.3.2

    ##########################################################################
    # SWIG and Python flags
    ##########################################################################

    # SWIG executable. This should be at least version 3.x
    SWIGEXEC=swig

    # The Python include directories for a given python executable can
    # typically be found with
    #
    #   python -c "import distutils.sysconfig; print distutils.sysconfig.get_python_inc()"
    #   python -c "import numpy ; print numpy.get_include()"
    #
    # or, for Python 3, with
    #
    #   python3 -c "import distutils.sysconfig; print(distutils.sysconfig.get_python_inc())"
    #   python3 -c "import numpy ; print(numpy.get_include())"
    #

    PYTHONCFLAGS=-I/usr/include/python2.7/ -I/usr/lib64/python2.7/site-packages/numpy/core/include/

    ##########################################################################
    # Cuda GPU flags
    ##########################################################################

    # root of the cuda 8 installation
    CUDAROOT=/usr/local/cuda-8.0/

    CUDACFLAGS=-I$(CUDAROOT)/include

    NVCC=$(CUDAROOT)/bin/nvcc

    NVCCFLAGS= $(CUDAFLAGS) \
       -I $(CUDAROOT)/targets/x86_64-linux/include/ \
       -Xcompiler -fPIC \
       -Xcudafe --diag_suppress=unrecognized_attribute \
       -gencode arch=compute_35,code="compute_35" \
       -gencode arch=compute_52,code="compute_52" \
       -gencode arch=compute_60,code="compute_60" \
       --std c++11 -lineinfo \
       -ccbin $(CXX) -DFAISS_USE_FLOAT16

    # BLAS LD flags for nvcc (used to generate an executable)
    # if BLASLDFLAGS contains several flags, each one may
    # need to be prepended with -Xlinker
    BLASLDFLAGSNVCC=-Xlinker $(BLASLDFLAGS)

    # Same, but to generate a .so
    BLASLDFLAGSSONVCC=-Xlinker $(BLASLDFLAGS)
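Since the only machine-specific value in this file is MKLROOT, pointing it at a different install can be scripted rather than edited by hand. A minimal sketch, assuming a `sed` that accepts `-i.bak` (GNU and BSD both do) and a path containing no `|` or `&` characters; `set_mklroot` is a hypothetical helper, not something faiss ships:

```shell
#!/usr/bin/env bash
# Hypothetical helper: point the MKLROOT line of a makefile.inc at a new path.
# Also uncomments the line if it currently starts with "#".
set_mklroot() {
    local file
    local root
    file="$1"
    root="$2"
    # Refuse to continue if the file has no MKLROOT assignment at all.
    grep -q '^#* *MKLROOT=' "$file" || { echo "no MKLROOT line in $file" >&2; return 1; }
    # "|" is the sed delimiter because the path itself contains slashes.
    # A backup of the original file is kept as "$file.bak".
    sed -i.bak "s|^#* *MKLROOT=.*|MKLROOT=${root}|" "$file"
}

# Usage, with the path from this post:
#   set_mklroot makefile.inc /home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl
```

This only touches the MKLROOT line; commenting out the OpenBLAS settings and uncommenting the MKL BLASLDFLAGS/BLASCFLAGS lines still has to be done separately.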
    scp l_mkl_2017.0.098.tgz yingmin.li@yz-gpu023.hogpu.cc:/home/users/yingmin.li/temp

Entering the serial number (3VGW-N6PJ7GCN) is all that is needed; choose option 3 for a user-level install.

MKLROOT here is /home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl:

    echo 'export LD_LIBRARY_PATH="/home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl/lib/intel64:$LD_LIBRARY_PATH"' >> ~/.bashrc
    echo 'export LD_PRELOAD="/home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_core.so:/home/users/yingmin.li/intel/compilers_and_libraries/linux/mkl/lib/intel64/libmkl_sequential.so:$LD_PRELOAD"' >> ~/.bashrc
    source ~/.bashrc
    make clean
    make
Test:
A basic usage example is in demos/demo_ivfpq_indexing; it makes a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.
On the GPU machine it seems to have become slower instead!
I suspect someone else is using the GPU resources:
    nvidia-smi              # check GPU utilization
    watch -n 1 nvidia-smi   # refresh once per second
    free -m                 # memory
    top                     # CPU usage
    htop                    # CPU usage: https://linux.cn/article-3141-1.html
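Rather than reading the full `nvidia-smi` table by eye, its CSV query mode can be filtered for GPUs that are actually busy. A small sketch, assuming `awk` is available; `busy_gpus` is a hypothetical helper, and the default threshold of 10% is an arbitrary choice:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "index, utilization, memory" CSV rows on stdin
# and print the index of every GPU whose utilization is >= the threshold (percent).
busy_gpus() {
    local threshold
    threshold="${1:-10}"
    awk -F', *' -v t="$threshold" '$2 + 0 >= t { print $1 }'
}

# Usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used \
#              --format=csv,noheader,nounits | busy_gpus 10
```

If this prints any index while your own job is not running, someone else is probably using that GPU.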