Install NVIDIA software to run TensorFlow GPU on Ubuntu
The following steps are based on my experience installing the NVIDIA software required to run TensorFlow v1.8 with GPU support on Ubuntu 16.04.
To run TensorFlow-GPU v1.8, you need to meet the following requirements:
- Ubuntu 16.04, 64-bit
- NVIDIA GPU supporting compute capability 3.0 or higher
- Python 2.7 or 3.5
- pip 10
- NVIDIA driver release 396
- CUDA toolkit 9.0
- cuDNN v7.1.3 Library for Linux, for CUDA 9.0
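The interpreter and platform items in the list above can be checked from Python itself. The following is a minimal sketch, not an official compatibility check: the helper names python_supported and is_64bit_linux are my own, and it only covers the Python version and the 64-bit Linux requirement.

```python
import platform
import sys

# Interpreter versions listed in the requirements above.
SUPPORTED_PYTHONS = {(2, 7), (3, 5)}


def python_supported(version_info):
    """Return True if (major, minor) is one of the listed Python versions."""
    return tuple(version_info[:2]) in SUPPORTED_PYTHONS


def is_64bit_linux(system, machine):
    """Return True for a 64-bit x86 Linux host (hypothetical helper)."""
    return system == "Linux" and machine in ("x86_64", "amd64")


if __name__ == "__main__":
    print("Python {}.{} supported: {}".format(
        sys.version_info[0], sys.version_info[1],
        python_supported(sys.version_info)))
    print("64-bit Linux: {}".format(
        is_64bit_linux(platform.system(), platform.machine())))
```

The GPU, driver, CUDA and cuDNN requirements are not covered here; they are verified in the steps that follow.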
Check that the GPU is listed among your PCI devices. Run the command lspci | grep -i nvidia and verify that the output is similar to the following:
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)
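If you prefer to script this check, the same filtering can be done in Python. This is only a sketch: the function names are my own, and treating "VGA compatible controller" or "3D controller" lines as the GPU is an assumption about how lspci labels the card on your machine.

```python
import subprocess


def nvidia_devices(lspci_output):
    """Return the lspci lines that mention NVIDIA (case-insensitive)."""
    return [line.strip() for line in lspci_output.splitlines()
            if "nvidia" in line.lower()]


def has_nvidia_gpu(lspci_output):
    """Return True if an NVIDIA line looks like a display controller.

    Assumption: the GPU is reported as a VGA or 3D controller.
    """
    return any("VGA compatible controller" in line or "3D controller" in line
               for line in nvidia_devices(lspci_output))


if __name__ == "__main__":
    try:
        output = subprocess.check_output(["lspci"]).decode("utf-8", "replace")
    except (OSError, subprocess.CalledProcessError):
        output = ""  # lspci missing or failed; treated as no devices found
    for line in nvidia_devices(output):
        print(line)
    print("NVIDIA GPU found: {}".format(has_nvidia_gpu(output)))
```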
Look for your GPU here to make sure it has compute capability 3.0 or higher.
Open a terminal and run the following commands:
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
$ sudo apt-get install nvidia-396
Reboot the PC after the installation.
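After the reboot, it may help to confirm which driver is actually loaded. The sketch below asks nvidia-smi for the driver version and extracts the release branch. The helper names are my own, and the comparison against 396 simply mirrors the requirement listed earlier.

```python
import subprocess

REQUIRED_DRIVER_BRANCH = 396  # driver release listed in the requirements


def driver_branch(version_string):
    """Return the driver release branch, e.g. 396 for a '396.x' version."""
    return int(version_string.strip().split(".")[0])


def query_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=driver_version",
             "--format=csv,noheader"])
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.decode("utf-8", "replace").splitlines()
    # One line is printed per GPU; the driver is the same for all of them.
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    version = query_driver_version()
    if version is None:
        print("nvidia-smi not available; is the driver installed?")
    else:
        ok = driver_branch(version) == REQUIRED_DRIVER_BRANCH
        print("Driver {} (branch 396: {})".format(version, ok))
```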
Download the CUDA toolkit from here. Make sure to select deb (local) as the installer type. Run the following commands to install the toolkit:
$ sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
$ sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install cuda
The CUDA binaries and libraries also need to be added to the environment. Add the following exports to the file ~/.bashrc:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Reboot the PC after the installation.
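The ${VAR:+:${VAR}} form in the exports appends a colon and the old value only when the variable is already set and non-empty, which avoids a stray trailing colon. The sketch below mimics that behaviour and checks whether a new shell picked up both CUDA directories. The function names are my own; the paths are the ones used in the exports.

```python
import os

CUDA_BIN = "/usr/local/cuda-9.0/bin"
CUDA_LIB = "/usr/local/cuda-9.0/lib64"


def prepend_dir(directory, existing):
    """Mimic DIR${VAR:+:${VAR}}: add ':' + existing only if it is non-empty."""
    return directory + (":" + existing if existing else "")


def has_entry(path_value, directory):
    """Return True if directory is one of the colon-separated entries."""
    return directory in [entry.rstrip("/") for entry in path_value.split(":")]


def missing_cuda_variables(environ):
    """Return the names of variables that lack their CUDA 9.0 directory."""
    wanted = [("PATH", CUDA_BIN), ("LD_LIBRARY_PATH", CUDA_LIB)]
    return [name for name, directory in wanted
            if not has_entry(environ.get(name, ""), directory)]


if __name__ == "__main__":
    missing = missing_cuda_variables(os.environ)
    print("Missing CUDA entries: {}".format(", ".join(missing) or "none"))
```

Run it from a freshly opened terminal, since ~/.bashrc is only read when a new shell starts.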
Verify the CUDA installation by running the following commands:
$ cd /usr/local/cuda-9.0/samples/5_Simulations/nbody
$ sudo make
$ ./nbody
An animated simulation should start right away.
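On a machine without a display, the nbody sample is not very practical. As an alternative, the toolkit version can be read from nvcc. This is a sketch under the assumption that nvcc --version prints a line containing "release 9.0"; the function names are my own.

```python
import re
import subprocess


def cuda_release(nvcc_output):
    """Return (major, minor) from nvcc's 'release X.Y' text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def query_nvcc():
    """Run 'nvcc --version'; return its output or '' if nvcc is not found."""
    try:
        out = subprocess.check_output(["nvcc", "--version"])
    except (OSError, subprocess.CalledProcessError):
        return ""
    return out.decode("utf-8", "replace")


if __name__ == "__main__":
    release = cuda_release(query_nvcc())
    if release is None:
        print("nvcc not found; check the PATH export in ~/.bashrc")
    else:
        print("CUDA release {}.{} (9.0 expected: {})".format(
            release[0], release[1], release == (9, 0)))
```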
Download the cuDNN v7.1.3 library from here. Look in the archived releases if it's not found in the first link. Note: You will need to sign up for the NVIDIA Developer Program and fill out their survey to download the library.
Run the following commands to set up the library:
$ tar -xzvf cudnn-9.0-linux-x64-v7.1.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
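To confirm that the copied header is the version you meant to install, you can read the version macros from cudnn.h. The sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integers, which is how I remember the v7 header; the function name is my own.

```python
import os
import re

CUDNN_HEADER = "/usr/local/cuda/include/cudnn.h"  # destination used above


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn.h text, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+{}\s+(\d+)".format(name)
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if os.path.exists(CUDNN_HEADER):
        with open(CUDNN_HEADER) as header:
            print("cuDNN version: {}".format(cudnn_version(header.read())))
    else:
        print("{} not found".format(CUDNN_HEADER))
```

For the library in this guide, the expected result is (7, 1, 3).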
Currently, the only dependency is the NVIDIA CUDA Profiling Tools Interface (CUPTI). Install it with the following command:
$ sudo apt-get install cuda-command-line-tools-9-0
Add CUPTI to the library path in the file ~/.bashrc:
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64
Note: rllab has its own environment, and therefore its own Python packages. If you want to install TensorFlow GPU for rllab, make sure to run $ source activate rllab3 before continuing.
Verify that there are no previous installations of TensorFlow packages. Run the following command:
$ pip list | grep tensorflow
If found, remove the regular tensorflow package with the command:
$ pip uninstall tensorflow
If the tensorflow-gpu package is already installed, remove it first so that all Python environment variables are reset when the package is reinstalled:
$ pip uninstall tensorflow-gpu
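The package check can also be done from Python, which is handy when several environments are involved. This sketch parses the text printed by pip list; the function name is my own, and like the grep above it matches any package whose name contains "tensorflow".

```python
import subprocess
import sys


def tensorflow_packages(pip_list_output):
    """Return {name: version} for listed packages containing 'tensorflow'."""
    found = {}
    for line in pip_list_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "tensorflow" in fields[0].lower():
            # Older pip prints "name (version)", newer pip prints columns.
            found[fields[0]] = fields[1].strip("()")
    return found


if __name__ == "__main__":
    try:
        # Use the pip that belongs to the running interpreter/environment.
        out = subprocess.check_output([sys.executable, "-m", "pip", "list"])
        listing = out.decode("utf-8", "replace")
    except (OSError, subprocess.CalledProcessError):
        listing = ""
    packages = tensorflow_packages(listing)
    print(packages if packages else "No TensorFlow packages installed")
```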
Install the tensorflow-gpu package with the command:
$ pip install tensorflow-gpu==1.8
Start Python and enter the following lines of code:
import tensorflow as tf
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))
When log_device_placement is set to True, verbose output shows the device to which each operation is mapped. In this case, all operations should be mapped to the GPU device, so the output should be similar to the following:
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2018-05-09 09:19:45.046329: I tensorflow/core/common_runtime/placer.cc:886] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-05-09 09:19:45.046397: I tensorflow/core/common_runtime/placer.cc:886] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-05-09 09:19:45.046433: I tensorflow/core/common_runtime/placer.cc:886] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
 [49. 64.]]
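With larger graphs, reading the placement log by eye gets tedious. The sketch below scans captured log text for lines in the "name: (Op): device" form shown above and reports any operation that did not land on a GPU. The function names and the regular expression are my own and are based only on the output format in this guide.

```python
import re

# Matches lines such as "MatMul: (MatMul): /job:.../device:GPU:0".
PLACEMENT = re.compile(r"^([^:\s]+): \(([^)]+)\): \S*/device:(\w+):(\d+)$")


def placements(log_text):
    """Return {operation name: device type} parsed from placement lines."""
    result = {}
    for line in log_text.splitlines():
        match = PLACEMENT.match(line.strip())
        if match:
            result[match.group(1)] = match.group(3)
    return result


def ops_off_gpu(log_text):
    """Return the sorted names of operations not placed on a GPU."""
    return sorted(name for name, device in placements(log_text).items()
                  if device != "GPU")


if __name__ == "__main__":
    sample = (
        "MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0\n"
        "b: (Const): /job:localhost/replica:0/task:0/device:GPU:0\n"
        "a: (Const): /job:localhost/replica:0/task:0/device:GPU:0\n"
    )
    print("Operations not on GPU: {}".format(ops_off_gpu(sample) or "none"))
```

An empty result for a log that contains placement lines means every logged operation ran on the GPU.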