Difference between revisions of "Tensorflow with gpu"

Latest revision as of 17:49, 7 January 2020

OS

Kubuntu 16.04 LTS

Setup (guide)

Just follow:

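The walkthrough at the bottom of this page covers CUDA 10.1, cuDNN 7.6.1, python3
This guide (Ubuntu 16.04 64-bit, CUDA 9.2, cuDNN 7.1.4, python3): http://www.python36.com/how-to-install-tensorflow-gpu-with-cuda-9-2-for-python-on-ubuntu/
This guide (Ubuntu 16.04 64-bit, CUDA 9.1, cuDNN 7.1.2, python3): http://www.python36.com/install-tensorflow141-gpu/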

Setup (some details)

Check device

~$ lspci | grep NVIDIA
81:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
81:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)

Check driver version:

~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  387.26  Thu Nov  2 21:20:16 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)
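
To do the same check from a script, a minimal Python sketch follows. The function name parse_nvrm_version is made up for illustration, and the sketch assumes the NVRM line has the same layout as in the output above.

```python
import re


def parse_nvrm_version(text):
    """Return the kernel module version from the NVRM line, or None if absent."""
    match = re.search(r"NVRM version:.*?Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


# Sample text copied from the output above; on a real machine, read
# /proc/driver/nvidia/version instead.
sample = (
    "NVRM version: NVIDIA UNIX x86_64 Kernel Module  387.26  Thu Nov  2 21:20:16 PDT 2017\n"
    "GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)\n"
)
print(parse_nvrm_version(sample))  # prints 387.26
```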

Install cuda 9.2 with patch(es):

# https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=deblocal:
~$ sudo dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
~$ sudo apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
~$ sudo apt-get update
~$ sudo apt-get install cuda
# INSTALL THE PATCH(ES)

A reboot may be needed: if CUDA 9.2 was installed over another version, the NVIDIA tools will throw errors about mismatched driver versions. To check, try:

~$ nvidia-smi

Output from a working setup:

Wed Jun 13 15:55:44 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 750 Ti  Off  | 00000000:01:00.0  On |                  N/A |
| 33%   36C    P8     1W /  46W |    229MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+                                                                             
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1305      G   /usr/lib/xorg/Xorg                           136MiB |
|    0      3587      G   /usr/bin/krunner                               1MiB |
|    0      3590      G   /usr/bin/plasmashell                          67MiB |
|    0      3693      G   /usr/bin/plasma-discover                      20MiB |
+-----------------------------------------------------------------------------+
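
A rough Python sketch for pulling the driver version and memory figures out of this output, e.g. to log them before a training run. The function name summarize_nvidia_smi is hypothetical, only the first GPU's memory line is read, and the table layout is assumed to match the output above.

```python
import re


def summarize_nvidia_smi(text):
    """Extract the driver version and the first GPU's memory usage (MiB)."""
    driver = re.search(r"Driver Version:\s*(\d+(?:\.\d+)+)", text)
    memory = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", text)
    return {
        "driver": driver.group(1) if driver else None,
        "used_mib": int(memory.group(1)) if memory else None,
        "total_mib": int(memory.group(2)) if memory else None,
    }


# Two lines copied from the nvidia-smi output above.
sample = (
    "| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |\n"
    "| 33%   36C    P8     1W /  46W |    229MiB /  2000MiB |      0%      Default |\n"
)
print(summarize_nvidia_smi(sample))
```

On a real machine the text would come from running nvidia-smi, for example via subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout.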

Follow the post-installation actions from the docs:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions
# Export paths
~$ export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
~$ export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
~$ export LD_LIBRARY_PATH=/usr/local/cuda-9.2/extras/CUPTI/lib64/${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
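
The ${VAR:+:${VAR}} part appends ':' plus the old value only when the variable is already set and non-empty, so no stray ':' is left behind. A small Python sketch of that logic (prepend_path is an illustrative name, not part of any tool):

```python
import os


def prepend_path(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}}: add ':' + current only if current is non-empty."""
    if current:
        return new_dir + ":" + current
    return new_dir


cuda_bin = "/usr/local/cuda-9.2/bin"
# With PATH unset or empty, the result is just the CUDA directory.
print(prepend_path(cuda_bin, os.environ.get("PATH", "")))
```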

Install TensorFlow (build from sources for cuda 9.2):

link 1 (preferable guide): http://www.python36.com/install-tensorflow141-gpu/
link 2: https://www.tensorflow.org/install/install_sources

[Optional] Install TensorFlow (prebuilt for cuda 9.0?):

# docs: 
# - https://www.tensorflow.org/install/install_linux
# some instructions:
# - install cuDNN
~$ sudo apt-get install python3-pip # if it is not already installed
~$ sudo pip3 install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.7.0-cp35-cp35m-linux_x86_64.whl

Testing setup

Supported card GeForce GTX 750 Ti (list of supported graphics cards: https://developer.nvidia.com/cuda-gpus):

~$ python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, World!')  
>>> sess = tf.Session()
2018-04-26 18:14:05.427668: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-26 18:14:05.428033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: GeForce GTX 750 Ti major: 5 minor: 0 memoryClockRate(GHz): 1.1105
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.53GiB
2018-04-26 18:14:05.428061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-26 18:14:05.927106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-26 18:14:05.927149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-26 18:14:05.927163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-26 18:14:05.927313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1289 MB memory) -> physical GPU (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:01:00.0, compute capability: 5.0)
>>> print(sess.run(hello))
b'Hello, World!'

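Unsupported card GeForce GT 610 (compute capability 2.1 is below the required minimum of 3.5, so TensorFlow ignores the GPU and the session runs on the CPU):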

~$ python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, World!')                                                                                                                                                                            
>>> sess = tf.Session()                                                                                                                                                                                                 
2018-04-26 13:00:19.050625: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA                                           
2018-04-26 13:00:19.181581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:                                                                                                             
name: GeForce GT 610 major: 2 minor: 1 memoryClockRate(GHz): 1.62                                                                                                                                                                
pciBusID: 0000:81:00.0                                                                                                                                                                                                           
totalMemory: 956.50MiB freeMemory: 631.69MiB                                                                                                                                                                                              
2018-04-26 13:00:19.181648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1394] Ignoring visible gpu device (device: 0, name: GeForce GT 610, pci bus id: 0000:81:00.0, compute capability: 2.1) with Cuda compute capability 2.1. The minimum required Cuda capability is 3.5.                                                                                                                                                                                                              
2018-04-26 13:00:19.181669: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:                                                                                       
2018-04-26 13:00:19.181683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0                                                                                                                                                       
2018-04-26 13:00:19.181695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N                                                                                                                                                       
>>> print(sess.run(hello))                                                                                                                                                                                                                             
b'Hello, World!'
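
The difference between the two sessions comes down to the compute capability reported in the log. A minimal Python sketch of that comparison, assuming the 3.5 minimum quoted in the log message above (other TensorFlow builds may use a different minimum); the function names are illustrative only:

```python
import re

# Minimum taken from the "minimum required Cuda capability is 3.5" log line;
# this is an assumption for other TensorFlow builds.
MIN_CAPABILITY = (3, 5)


def compute_capability(log_text):
    """Return (major, minor) parsed from a TensorFlow device log line, or None."""
    match = re.search(r"compute capability: (\d+)\.(\d+)", log_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def is_supported(capability, minimum=MIN_CAPABILITY):
    """Tuples compare element by element, so (5, 0) >= (3, 5) is True."""
    return capability >= minimum


gtx_750_ti = "name: GeForce GTX 750 Ti, pci bus id: 0000:01:00.0, compute capability: 5.0)"
gt_610 = "name: GeForce GT 610, pci bus id: 0000:81:00.0, compute capability: 2.1)"
print(is_supported(compute_capability(gtx_750_ti)))  # prints True
print(is_supported(compute_capability(gt_610)))      # prints False
```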

As a quick fix, cuDNN 7.0.5 had to be installed instead of the latest version:

https://stackoverflow.com/questions/49960132/cudnn-library-compatibility-error-after-loading-model-weights

Print the TensorFlow version:

>>> print(tf.__version__)

Problems

[SOLVED] AttributeError: '_NamespacePath' object has no attribute 'sort'

# Notes:
# Probably started after updating some packages (python3?).
# How to reproduce:
1:
~$ python3
>>> import tensorflow
2:
~$ virtualenv --system-site-packages -p python3
# Solution:
~$ sudo pip3 install setuptools --upgrade

Walkthrough for CUDA 10.1 (20190602)

Install CUDA

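The TensorFlow GPU guide (https://www.tensorflow.org/install/gpu) has a link to the CUDA toolkit archive (https://developer.nvidia.com/cuda-toolkit-archive).
- That toolkit (CUDA Toolkit 10.1 update1 (May 2019)) also updated the system driver to 418.67
- Reboot afterwards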

Install cuDNN

An NVIDIA account is required to download cuDNN. Downloaded cuDNN v7.6.1 (June 24, 2019), for CUDA 10.1: https://developer.nvidia.com/rdp/cudnn-download#a-collapse761-101

Option 1: installing tensorflow from source

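Basically, follow this guide (https://www.tensorflow.org/install/source). Some key notes:

Install bazel (https://www.tensorflow.org/install/source#install_bazel): version 0.25.2 (newer versions will not work)
To build, see the download section of that guide (https://www.tensorflow.org/install/source#download_the_tensorflow_source_code):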

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.14
./configure
 
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
# 4-5 hours later
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip3 install /tmp/tensorflow_pkg/tensorflow-[Tab]
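
The [Tab] above stands for shell tab completion of the wheel file name. In a script, the file can be located with a glob instead. This is a sketch only: find_wheels is a made-up helper, and the tensorflow-*.whl pattern is an assumption about how build_pip_package names its output.

```python
import glob
import os


def find_wheels(pkg_dir):
    """Return the paths of tensorflow wheels in pkg_dir, sorted by name."""
    return sorted(glob.glob(os.path.join(pkg_dir, "tensorflow-*.whl")))


# Prints an empty list if the build has not produced a wheel yet.
print(find_wheels("/tmp/tensorflow_pkg"))
```

The resulting path can then be passed to pip3 install in place of the tab-completed name.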

Testing:

~$ python3
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, World!')                                                                                                                                                                            
>>> sess = tf.Session()

Option 2: using docker

Follow this guide (https://www.tensorflow.org/install/docker). Key notes:

The TensorFlow Docker image requires the NVIDIA Docker image, which requires nvidia-docker2 (apt install nvidia-docker2), which in turn requires Docker CE (apt install docker-ce):

- https://github.com/NVIDIA/nvidia-docker
- https://docs.docker.com/install/linux/docker-ce/ubuntu/

Test run:

# Test 1: GPU support inside container:
sudo docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi
# Test 2: Test all together
sudo docker pull tensorflow/tensorflow:latest-gpu-py3-jupyter
sudo docker run --runtime=nvidia -it --rm tensorflow/tensorflow:latest-gpu-py3-jupyter python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
# Test 3: Run a local script (and include a local dir) in a container:
https://www.tensorflow.org/install/docker

Setup walkthrough for CUDA 10.2 (Dec 2019)

Install CUDA

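The TensorFlow GPU guide (https://www.tensorflow.org/install/gpu) has a link to the CUDA toolkit archive (https://developer.nvidia.com/cuda-toolkit-archive).
- The toolkit (CUDA Toolkit 10.2) also updated the system driver to 440.33.01
- A reboot is required afterwards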

Docker

Instructions

https://www.tensorflow.org/install/docker

Quote:

Docker is the easiest way to enable TensorFlow GPU support on Linux since only the NVIDIA® GPU driver is required on the host machine (the NVIDIA® CUDA® Toolkit does not need to be installed).

Docker images

Where to browse: https://hub.docker.com/r/tensorflow/tensorflow/:

TF version	Python major version	GPU support	NAME:TAG for Docker command
1.15	3	yes	tensorflow/tensorflow:1.15.0-gpu-py3
2.0.0+	3	yes	tensorflow/tensorflow:latest-gpu-py3
2.0.0+	2	yes	tensorflow/tensorflow:latest-gpu
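
For scripts that pick an image automatically, the table can be kept as a small lookup. This Python sketch uses only the tags listed above; the IMAGES name and image_for helper are illustrative.

```python
# (TF version, Python major version) -> NAME:TAG, copied from the table above.
IMAGES = {
    ("1.15", 3): "tensorflow/tensorflow:1.15.0-gpu-py3",
    ("2.0.0+", 3): "tensorflow/tensorflow:latest-gpu-py3",
    ("2.0.0+", 2): "tensorflow/tensorflow:latest-gpu",
}


def image_for(tf_version, python_major):
    """Return the Docker image tag for the given combination, or None if not listed."""
    return IMAGES.get((tf_version, python_major))


print(image_for("1.15", 3))  # prints tensorflow/tensorflow:1.15.0-gpu-py3
```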

nvidia-docker

Somehow it was already installed.

Check NVIDIA docker version

~$ nvidia-docker version

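According to the TensorFlow Docker guide, Docker 19.03+ uses the nvidia-container-toolkit package and the --gpus all flag, Docker versions older than 19.03 need nvidia-docker2 and the --runtime=nvidia flag, and nvidia-docker v1 uses the nvidia-docker wrapper command instead of either flag.
It is not immediately clear how nvidia-container-runtime fits in. nvidia-docker v1 and v2 should have already registered it.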
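
A hedged Python sketch for choosing the GPU flag from the Docker version. It assumes that Docker 19.03 and newer take --gpus all while older versions with nvidia-docker2 take --runtime=nvidia (the two forms used in the test commands on this page). The helper names and the sample version strings are illustrative.

```python
import re


def parse_docker_version(text):
    """Extract (major, minor) from text such as the output of 'docker --version'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def gpu_flags(docker_version):
    """Return the docker run arguments that enable GPU access (assumed mapping)."""
    if docker_version >= (19, 3):
        return ["--gpus", "all"]
    return ["--runtime=nvidia"]


# Sample strings in the usual 'docker --version' format, for illustration only.
print(gpu_flags(parse_docker_version("Docker version 19.03.5, build 633a0ea838")))
print(gpu_flags(parse_docker_version("Docker version 18.09.7, build 2d0083d")))
```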

Notes

A local directory can be bind-mounted, so that files updated locally are updated in the Docker container as well:

# This bind-mounts the directory 'target', located in $(pwd) (the directory the command is run from),
# to /app in the Docker container

~$ docker run \
   -it \
   --rm \
   --name devtest \
   -p 0.0.0.0:6006:6006 \
   --mount type=bind,source="$(pwd)"/target,target=/app \
   --gpus all \
   tensorflow/tensorflow:latest-gpu-py3 \
   bash
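
The same command can be assembled as an argument list in Python, which avoids shell quoting problems when the host path contains spaces. A sketch only: docker_run_args is a hypothetical helper, its defaults mirror the command above, and /home/user is a placeholder path.

```python
import shlex


def docker_run_args(image, host_dir, container_dir="/app", port=6006, name="devtest"):
    """Build the docker run argument list for a bind-mounted GPU container."""
    return [
        "docker", "run", "-it", "--rm",
        "--name", name,
        "-p", "0.0.0.0:{0}:{0}".format(port),
        "--mount", "type=bind,source={},target={}".format(host_dir, container_dir),
        "--gpus", "all",
        image,
        "bash",
    ]


args = docker_run_args("tensorflow/tensorflow:latest-gpu-py3", "/home/user/target")
# Print a shell-safe version of the command; pass args to subprocess.run to execute it.
print(" ".join(shlex.quote(a) for a in args))
```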

How to run tensorboard from the container:

# from https://briancaffey.github.io/2017/11/20/using-tensorflow-and-tensor-board-with-docker.html
# From the running container's command line (since it was run with 'bash' in the step above).
# set a correct --logdir
root@e9efee9e3fd3:/# tensorboard --bind_all --logdir=/app/log.txt  # remove --bind_all for TF 1.15
# Then open a browser:
http://localhost:6006

Tensorflow and OpenCV building notes

Build 1

TF 1.15.0
CUDA 10.0 and Toolkit and stuff
OpenCV 3.4.9

TF 1.15.0

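Will build with Bazel 0.25.2 (installed from the deb archive: https://github.com/bazelbuild/bazel/releases/tag/0.25.2)
TF - downloaded as tensorflow-1.15.0.tar.gz (https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)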

1. Unpack
2. cd tensorflow-1.15.0
3. ./configure
4.

Build 2

TF 1.13.1
CUDA 10.0 and Toolkit and stuff
OpenCV 3.4.9

TF 1.13.1

Will build with Bazel 0.21.0 (installed from the deb archive: https://github.com/bazelbuild/bazel/releases/tag/0.21.0)
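
Since each TF release on this page was built with one specific Bazel version, the pairs can be recorded in a small Python lookup and checked before starting a multi-hour build. The pairs below are only the ones noted on this page; the names are illustrative.

```python
# TF version -> Bazel version, as used in the build notes on this page.
BAZEL_FOR_TF = {
    "1.13.1": "0.21.0",
    "1.14": "0.25.2",
    "1.15.0": "0.25.2",
}


def bazel_matches(tf_version, installed_bazel):
    """Return True if installed_bazel is the version noted for tf_version."""
    return BAZEL_FOR_TF.get(tf_version) == installed_bazel


print(bazel_matches("1.15.0", "0.25.2"))  # prints True
print(bazel_matches("1.13.1", "0.25.2"))  # prints False
```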

