@BIGBALLON 2017-10-12T09:01:28.000000Z 字数 4400 阅读 4384

NVIDIA Driver & Pytorch installation

The simple setup tutorial for deep learning beginner.

STEP 1: Install NVIDIA Driver

sudo apt-get update
sudo apt-get upgrade

Disable Nouveau

sudo vi /etc/modprobe.d/disable-nouveau.conf
//insert the following lines
blacklist nouveau
options nouveau modeset=0

Update kernel initramfs

sudo update-initramfs -u
sudo reboot

Install NVIDIA driver

hit Ctrl+Alt+F1(tty1) and login, stop lightdm service & install dirver

sudo service lightdm stop
sudo apt-get install nvidia-375
sudo reboot

(optional)To install latest drivers add PPA:

sudo apt-get purge nvidia-*
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-3xx

Then, you can use nvidia-smi to get the GPU info.

STEP 2: Install CUDA 8.0

Download setup file from official website
The latest release version is 9.0, but we use CUDA 8.0 cuda_8.0.61_375.26_linux.run

cd Downloads/
sudo chmod a+x cuda_8.0.61_375.26_linux.run
sudo ./cuda_8.0.61_375.26_linux.run

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(Attention!) Please input no

Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: no
Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
 [ default is /usr/local/cuda-8.0 ]: 
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
 [ default is /home/bg ]:

Set up the environment

sudo vi ~/.bashrc  
//insert the following lines
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64
export CUDA_HOME=/usr/local/cuda-8.0

Reload .bashrc

source ~/.bashrc

(optional)You can also test CUDA Samples

cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery
sudo make
sudo ./deviceQuery

STEP 3: Install Cudnn

Download cuDNN files from official website (need to login, cuDNN v6.0 for CUDA 8.0 in this tutorial)

cd Downloads/
tar -zxvf cudnn-8.0-linux-x64-v6.0.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda-8.0/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-8.0/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda-8.0/lib64/libcudnn*

STEP 4-1: Install PyTorch via virtualenv

virtualenv is a tool to create isolated Python environments.
We recommend you to use it instead of "native" pip.

1) Install pip and virtualenv(python3)

sudo apt-get install python3-pip python3-dev python-virtualenv

2) Create a virtualenv environment:

virtualenv --system-site-packages -p python3 targetDirectory

where targetDirectory specifies the top of the virtualenv tree, our instructions assume that targetDirectory is ~/deeplearning, but you may choose any directory.

3) Activate the virtualenv environment:

source ~/deeplearning/bin/activate

The preceding source command should change your prompt to the following:

(deeplearning) bg@bg-cgi:~$

4) Ensure pip ≥8.1 is installed:

(deeplearning) bg@bg-cgi:~$ easy_install -U pip

5) Run this command to install Pytorch(python3.5):

pip3 install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl 
pip3 install torchvision

Then, just test the Warm Up demo:

(deeplearning) bg@bg-cgi:~$ git clone https://github.com/2017-fall-DL-training-program/PyTorchWarmUp.git
(deeplearning) bg@bg-cgi:~$ cd PyTorchWarmUp/
(deeplearning) bg@bg-cgi:~/PyTorchWarmUp$ python3 CNN_MNIST_pytorch.py

After training(a few mins), you can see the following result:

Train Epoch: 20 [58496/60000 (97%)] Loss: 0.012621
Train Epoch: 20 [58624/60000 (98%)] Loss: 0.011611
Train Epoch: 20 [58752/60000 (98%)] Loss: 0.010518
Train Epoch: 20 [58880/60000 (98%)] Loss: 0.008049
Train Epoch: 20 [59008/60000 (98%)] Loss: 0.042140
Train Epoch: 20 [59136/60000 (99%)] Loss: 0.002592
Train Epoch: 20 [59264/60000 (99%)] Loss: 0.003716
Train Epoch: 20 [59392/60000 (99%)] Loss: 0.006034
Train Epoch: 20 [59520/60000 (99%)] Loss: 0.015948
Train Epoch: 20 [59648/60000 (99%)] Loss: 0.005287
Train Epoch: 20 [59776/60000 (100%)]    Loss: 0.039094
Train Epoch: 20 [44928/60000 (100%)]    Loss: 0.005245
Test set: Average loss: 0.0271, Accuracy: 9908/10000 (99%)

STEP 4-2: Install TensorFlow via virtualenv

1 - 4: same as PyTorch
5) Issue one of the following commands to install TensorFlow(GPU) in the active virtualenv environment:

(deeplearning) bg@bg-cgi:~$ pip install --upgrade tensorflow      # for Python 2.7
(deeplearning) bg@bg-cgi:~$ pip3 install --upgrade tensorflow     # for Python 3.n
(deeplearning) bg@bg-cgi:~$ pip install --upgrade tensorflow-gpu  # for Python 2.7 and GPU
(deeplearning) bg@bg-cgi:~$ pip3 install --upgrade tensorflow-gpu # for Python 3.n and GPU

6) When you are done using Pytorch(TensorFlow), you may deactivate the environment by invoking the deactivate function as follows:

(deeplearning) bg@bg-cgi:~$ deactivate

7) Uninstalling Pytorch(TensorFlow)

To uninstall Pytorch(TensorFlow), simply remove the tree you created. For example:

rm -r ~/deeplearning