Install NVIDIA software to run TensorFlow GPU on Ubuntu

The following steps are based on my experience installing the software NVIDIA provides to run TensorFlow v1.8 with GPU support on Ubuntu 16.04.

System Requirements

To run TensorFlow-GPU v1.8, your system must meet the following requirements (a quick way to check the Python and pip versions is shown after the list).

  • Ubuntu 16.04, 64-bit
  • NVIDIA GPU supporting compute capability 3.0 or higher
  • Python 2.7 or 3.5
  • pip 10
  • NVIDIA driver release 396
  • CUDA toolkit 9.0
  • cuDNN v7.1.3 Library for Linux, for CUDA 9.0
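
You can check which Python and pip versions are active in your current environment with:

$ python --version
$ pip --version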

Verify your GPU

Check that the GPU is listed in your PCI devices. Run the command lspci | grep -i nvidia, and verify the output is similar to the following:

01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)

Look up your GPU in NVIDIA's CUDA GPUs list (https://developer.nvidia.com/cuda-gpus) to make sure it has compute capability 3.0 or higher.
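
If lspci shows only a numeric device ID (as in the example above), on most Ubuntu systems you can refresh the PCI ID database so the card's marketing name is resolved:

$ sudo update-pciids
$ lspci | grep -i nvidia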

Install NVIDIA Drivers

Open a terminal and run the following commands:

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
$ sudo apt-get install nvidia-396

Reboot PC after the installation.
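
Once the system is back up, you can confirm the driver is loaded by running nvidia-smi; it should list your GPU and report a 396.xx driver version:

$ nvidia-smi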

Install CUDA toolkit 9.0

Download the CUDA toolkit 9.0 from the CUDA toolkit archive (https://developer.nvidia.com/cuda-90-download-archive). Make sure to select deb (local) as the installer type. Run the following commands to install the toolkit:

$ sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
$ sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install cuda
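
If the installation succeeded, the toolkit files should now be under /usr/local/cuda-9.0:

$ ls /usr/local/cuda-9.0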

The CUDA binaries and libraries also need to be set up in the environment. Add the following exports to ~/.bashrc:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Reboot PC after the installation.
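
After the reboot (or after running source ~/.bashrc), you can confirm the compiler is on your PATH and reports release 9.0:

$ nvcc --version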

Test the CUDA installation

Verify the CUDA installation by running the following commands:

$ cd /usr/local/cuda-9.0/samples/5_Simulations/nbody
$ sudo make
$ ./nbody

An animated n-body simulation should start right away.
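
If you are on a headless machine where the animation cannot be displayed, the deviceQuery sample is an alternative check that prints the detected GPU and its compute capability:

$ cd /usr/local/cuda-9.0/samples/1_Utilities/deviceQuery
$ sudo make
$ ./deviceQuery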

Install cuDNN v7.1.3 Library

Download the cuDNN v7.1.3 library from the NVIDIA cuDNN page (https://developer.nvidia.com/cudnn). Look in the archived releases (https://developer.nvidia.com/rdp/cudnn-archive) if it's not listed on the main page. Note: You will need to sign up for the NVIDIA Developer Program and fill out their survey to download the library.

Run the following commands to set up the library:

$ tar -xzvf cudnn-9.0-linux-x64-v7.1.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
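
To confirm the files were copied correctly, you can read the version macros directly from the installed header; for v7.1.3 they should report major 7, minor 1, patchlevel 3:

$ grep CUDNN_MAJOR -A 2 /usr/local/cuda/include/cudnn.h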

Install TensorFlow dependencies

Currently, the only extra dependency is the NVIDIA CUDA Profiling Tools Interface (CUPTI). Install it with the following command:

$ sudo apt-get install cuda-command-line-tools-9-0

Add CUPTI to the library path in file ~/.bashrc:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64
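
Reload ~/.bashrc so the new path takes effect in the current shell, and optionally verify that the CUPTI directory now appears in the library path:

$ source ~/.bashrc
$ echo $LD_LIBRARY_PATH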

Install TensorFlow GPU

Note: rllab has its own environment, and therefore its own Python packages. If you want to install TensorFlow GPU for rllab, make sure to run $ source activate rllab3 before continuing.

Verify that there are no previous installations of TensorFlow packages. Run the following command:

$ pip list | grep tensorflow

If found, remove the regular tensorflow package with the command:

$ pip uninstall tensorflow

If the tensorflow-gpu package happens to be installed already, remove it as well so that it is set up cleanly when reinstalled in the next step:

$ pip uninstall tensorflow-gpu

Install the tensorflow-gpu package with the command:

$ pip install tensorflow-gpu==1.8
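
As a quick sanity check before the full test below, you can print the installed TensorFlow version and the name of the first visible GPU; tf.test.gpu_device_name() returns an empty string if no GPU is detected:

$ python -c "import tensorflow as tf; print(tf.__version__)"
$ python -c "import tensorflow as tf; print(tf.test.gpu_device_name())"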

Test TensorFlow GPU

Start a Python interpreter and enter the following lines of code:

import tensorflow as tf
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))

When log_device_placement is set to True, TensorFlow produces a verbose log indicating which device each operation is mapped to. In this case, all operations should be mapped to the GPU device, so the output should be similar to the following:

MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2018-05-09 09:19:45.046329: I tensorflow/core/common_runtime/placer.cc:886] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-05-09 09:19:45.046397: I tensorflow/core/common_runtime/placer.cc:886] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-05-09 09:19:45.046433: I tensorflow/core/common_runtime/placer.cc:886] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
 [49. 64.]]