TensorFlow with GPU

Requirements

Kubuntu 16.04 LTS

Setup (guide)

Just follow:

this guide (Ubuntu 16.04 64-bit, CUDA 9.1, cuDNN 7.1.2, python3) or
this newer one (Ubuntu 16.04 64-bit, CUDA 9.2, cuDNN 7.1.4, python3)
The walkthrough at the bottom of this page is for CUDA 10.1, cuDNN 7.6.1, python3

Setup (some details)

Check device

~$ lspci | grep NVIDIA
81:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
81:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)

Check driver version:

~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  387.26  Thu Nov  2 21:20:16 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)
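
When this check has to be repeated on several machines, the version number can be pulled out of that file programmatically. The following is a minimal sketch, assuming the file keeps the "Kernel Module  <version>" layout shown above; the function name parse_nvidia_driver_version is made up for this example.

```python
import re


def parse_nvidia_driver_version(text):
    """Return the driver version found in /proc/driver/nvidia/version text, or None."""
    # Assumption: the version follows the words "Kernel Module", as in the output above.
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


# Sample text copied from the output above.
sample = (
    "NVRM version: NVIDIA UNIX x86_64 Kernel Module  387.26  Thu Nov  2 21:20:16 PDT 2017\n"
    "GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)\n"
)
print(parse_nvidia_driver_version(sample))
```

On a real machine the text would come from open("/proc/driver/nvidia/version").read(); the file only exists while the NVIDIA kernel module is loaded.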

Install CUDA 9.2 with patch(es):

# https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=deblocal:
~$ sudo dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
~$ sudo apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
~$ sudo apt-get update
~$ sudo apt-get install cuda
# INSTALL THE PATCH(ES)

A reboot may be needed. If CUDA 9.2 was installed over another version, the NVIDIA tools will throw errors about mismatched driver versions. To check, try:

~$ nvidia-smi

Good looking output:

Wed Jun 13 15:55:44 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 750 Ti  Off  | 00000000:01:00.0  On |                  N/A |
| 33%   36C    P8     1W /  46W |    229MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+                                                                             
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1305      G   /usr/lib/xorg/Xorg                           136MiB |
|    0      3587      G   /usr/bin/krunner                               1MiB |
|    0      3590      G   /usr/bin/plasmashell                          67MiB |
|    0      3693      G   /usr/bin/plasma-discover                      20MiB |
+-----------------------------------------------------------------------------+
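
The table above is meant for humans. For scripts, nvidia-smi also offers a CSV query mode (--query-gpu with --format=csv,noheader). The sketch below is one possible way to read it from Python; the helper names parse_gpu_csv and query_gpus are invented for this example, and the sample line in the usage note is composed by hand from the values in the table above, not captured from a real run.

```python
import csv
import io
import subprocess

# nvidia-smi query asking for one CSV line per GPU, without a header row.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_csv(text):
    """Turn 'name, driver_version, memory.total' CSV lines into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) == 3:  # skip blank or malformed lines
            name, driver, memory = (field.strip() for field in fields)
            rows.append({"name": name, "driver_version": driver, "memory_total": memory})
    return rows


def query_gpus():
    """Run nvidia-smi and parse its output; return [] if the tool is missing or fails."""
    try:
        result = subprocess.run(
            QUERY,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_csv(result.stdout)
```

Usage: parse_gpu_csv("GeForce GTX 750 Ti, 396.26, 2000 MiB\n") gives one dict with those three values. An empty list from query_gpus() suggests the driver problem described above.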

Check out post installation docs:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions:
# Export paths
~$ export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
~$ export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
~$ export LD_LIBRARY_PATH=/usr/local/cuda-9.2/extras/CUPTI/lib64/${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
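
The ${VAR:+:${VAR}} form in the exports above appends ":" plus the old value only when the variable is already set and non-empty, which avoids a stray trailing colon. A small sketch of the same rule in Python, with prepend_path being a hypothetical helper name used only here:

```python
import os


def prepend_path(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}}: add ':' + current only if current is set and non-empty."""
    if current:
        return new_dir + ":" + current
    return new_dir


# Build the value that the first export line above would produce for this process.
cuda_bin = "/usr/local/cuda-9.2/bin"
print(prepend_path(cuda_bin, os.environ.get("PATH")))
```

As with export in a shell, a value built this way only affects the current process and its children; putting the export lines in a shell startup file is what makes them persistent.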

Install TensorFlow (build from source for CUDA 9.2):

link 1 (preferable guide): http://www.python36.com/install-tensorflow141-gpu/
link 2: https://www.tensorflow.org/install/install_sources

[Optional] Install TensorFlow (prebuilt for CUDA 9.0?):

# docs: 
# - https://www.tensorflow.org/install/install_linux
# some instructions:
# - install cuDNN
~$ sudo apt-get install python3-pip # if it is not already installed
~$ sudo pip3 install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.7.0-cp35-cp35m-linux_x86_64.whl

Testing setup

Supported card: GeForce GTX 750 Ti (list of supported graphics cards):

~$ python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, World!')  
>>> sess = tf.Session()
2018-04-26 18:14:05.427668: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-26 18:14:05.428033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: GeForce GTX 750 Ti major: 5 minor: 0 memoryClockRate(GHz): 1.1105
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.53GiB
2018-04-26 18:14:05.428061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-26 18:14:05.927106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-26 18:14:05.927149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-26 18:14:05.927163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-26 18:14:05.927313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1289 MB memory) -> physical GPU (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:01:00.0, compute capability: 5.0)
>>> print(sess.run(hello))
b'Hello, World!'

Unsupported card: GeForce GT 610 (the GPU is ignored and the session runs on the CPU):

~$ python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, World!')                                                                                                                                                                            
>>> sess = tf.Session()                                                                                                                                                                                                 
2018-04-26 13:00:19.050625: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA                                           
2018-04-26 13:00:19.181581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:                                                                                                             
name: GeForce GT 610 major: 2 minor: 1 memoryClockRate(GHz): 1.62                                                                                                                                                                
pciBusID: 0000:81:00.0                                                                                                                                                                                                           
totalMemory: 956.50MiB freeMemory: 631.69MiB                                                                                                                                                                                              
2018-04-26 13:00:19.181648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1394] Ignoring visible gpu device (device: 0, name: GeForce GT 610, pci bus id: 0000:81:00.0, compute capability: 2.1) with Cuda compute capability 2.1. The minimum required Cuda capability is 3.5.                                                                                                                                                                                                              
2018-04-26 13:00:19.181669: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:                                                                                       
2018-04-26 13:00:19.181683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0                                                                                                                                                       
2018-04-26 13:00:19.181695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N                                                                                                                                                       
>>> print(sess.run(hello))                                                                                                                                                                                                                             
b'Hello, World!'
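
The difference between the two sessions comes down to the "major: N minor: M" pair in the "Found device" block, compared against the minimum printed in the "Ignoring visible gpu device" message. A rough sketch of that comparison follows; the 3.5 threshold is taken from the log above and depends on how the TensorFlow binary was built, and the function names are made up for this example.

```python
import re

# Assumption: threshold copied from the log message above; other builds may differ.
MIN_CAPABILITY = (3, 5)


def parse_capability(line):
    """Extract (major, minor) from a 'name: ... major: N minor: M ...' log line."""
    match = re.search(r"major:\s*(\d+)\s+minor:\s*(\d+)", line)
    return (int(match.group(1)), int(match.group(2))) if match else None


def is_usable(capability, minimum=MIN_CAPABILITY):
    """True if the (major, minor) pair meets the minimum; tuples compare element-wise."""
    return capability is not None and capability >= minimum


# Property lines copied from the two sessions above.
for line in (
    "name: GeForce GTX 750 Ti major: 5 minor: 0 memoryClockRate(GHz): 1.1105",
    "name: GeForce GT 610 major: 2 minor: 1 memoryClockRate(GHz): 1.62",
):
    print(line.split(" major:")[0], "->", is_usable(parse_capability(line)))
```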

As a quick fix, cuDNN 7.0.5 had to be installed instead of the latest version:

https://stackoverflow.com/questions/49960132/cudnn-library-compatibility-error-after-loading-model-weights

Print TensorFlow version

>>> print(tf.__version__)
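
Besides the version, it can be handy to list which GPU devices TensorFlow actually sees without reading through the session log. The sketch below uses device_lib.list_local_devices() from tensorflow.python.client, an internal module path that may change between releases; the wrapper name gpu_device_names is made up for this example, and it returns None when TensorFlow is not installed.

```python
def gpu_device_names():
    """Return names of GPU devices visible to TensorFlow, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    print("TensorFlow version:", tf.__version__)
    # Each entry has a device_type ("CPU", "GPU", ...) and a name.
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


print(gpu_device_names())
```

An empty list on a machine with an NVIDIA card points at one of the issues on this page (unsupported compute capability, driver mismatch, or a CPU-only TensorFlow build).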

Problems

[SOLVED] AttributeError: '_NamespacePath' object has no attribute 'sort'

# Notes:
 Probably appeared after some packages were updated (python3?).
# How to reproduce:
1:
~$ python3
>>> import tensorflow
2:
~$ virtualenv --system-site-packages -p python3
# Solution:
~$ sudo pip3 install setuptools --upgrade

Walkthrough for CUDA 10.1 (20190602)

Install CUDA

This guide links to the CUDA toolkit.
- Installing that toolkit (CUDA Toolkit 10.1 update 1, May 2019) also updated the system driver to 418.67
- Reboot afterwards

Install cuDNN

An NVIDIA account is required for the download. Downloaded cuDNN v7.6.1 (June 24, 2019) for CUDA 10.1.

Option 1: installing tensorflow from source

Basically, follow this guide. Some key notes:

Install Bazel version 0.25.2 (newer versions will not work)
To build, read this link:

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.14
./configure
 
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
# 4-5 hours later
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip3 install /tmp/tensorflow_pkg/tensorflow-[Tab]
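
Since the build takes hours, it is worth confirming the Bazel version before starting. The sketch below parses the "Build label:" line printed by bazel version and compares it to the 0.25.2 ceiling noted above; the ceiling is this page's observation for r1.14, and the function names are made up for this example.

```python
import re

# Assumption: ceiling taken from the note above (newer Bazel did not work for r1.14).
MAX_BAZEL = (0, 25, 2)


def parse_build_label(text):
    """Extract (major, minor, patch) from the 'Build label: X.Y.Z' line of `bazel version`."""
    match = re.search(r"Build label:\s*(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def bazel_ok(version, maximum=MAX_BAZEL):
    """True if the parsed version is known and not newer than the ceiling."""
    return version is not None and version <= maximum


print(bazel_ok(parse_build_label("Build label: 0.25.2\n")))
print(bazel_ok(parse_build_label("Build label: 0.26.0\n")))
```

The text to parse would typically come from running bazel version in the shell and capturing its output.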

Testing:

~$ python3
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, World!')                                                                                                                                                                            
>>> sess = tf.Session()

Option 2: using docker

Follow this guide. Key notes:

The TensorFlow Docker image requires the NVIDIA Docker image, which requires nvidia-docker2 (apt install nvidia-docker2), which in turn requires docker-ce (apt install docker-ce):

- https://github.com/NVIDIA/nvidia-docker
- https://docs.docker.com/install/linux/docker-ce/ubuntu/

Test run:

# Test 1: GPU support inside container:
sudo docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi
# Test 2: Test all together
sudo docker pull tensorflow/tensorflow:latest-gpu-py3-jupyter
sudo docker run --runtime=nvidia -it --rm tensorflow/tensorflow:latest-gpu-py3-jupyter python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
# Test 3: Run a local script (and include a local dir) in container:
https://www.tensorflow.org/install/docker
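
For Test 3, the linked page mounts a host directory into the container with -v and sets it as the working directory with -w. A hedged sketch of assembling such a command from Python is shown below; the helper name docker_run_args, the /tmp mount point, and the example paths are assumptions for illustration, and the --runtime=nvidia flag and image tag are the ones used in the tests above.

```python
import os


def docker_run_args(script, host_dir,
                    image="tensorflow/tensorflow:latest-gpu-py3-jupyter",
                    workdir="/tmp"):
    """Build an argv list that runs `script` from `host_dir` inside the container."""
    mount = "{}:{}".format(os.path.abspath(host_dir), workdir)
    return [
        "docker", "run", "--runtime=nvidia", "--rm",
        "-v", mount,    # mount the host directory into the container
        "-w", workdir,  # run from the mounted directory
        image, "python", script,
    ]


# Hypothetical paths, for illustration only.
print(" ".join(docker_run_args("./script.py", "/home/user/project")))
```

The list can be handed to subprocess.run(), prefixed with "sudo" if Docker needs root as in the tests above.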
