Difference between revisions of "Quad stereo tensorflow eclipse"
From ElphelWiki
Latest revision as of 15:19, 27 December 2019
ImageJ plugin
- Install Eclipse
- Clone and Import imagej-elphel
NOTE: if the project is updated/pulled outside Eclipse, a manual refresh might be needed
- TF version is pulled from pom.xml
- Trained TF model for EO sensors is auto-downloaded - trained_model_v1.0.zip
- Get some image samples, provide paths
- Before running the plugin (Eyesis_Correction), copy imagej options to /home/user/.imagejs/Eyesis_Correction.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>last updated Thu Sep 08 14:09:47 MDT 2042</comment>
  <entry key="ADVANCED_MODE">True</entry>
  <entry key="DCT_MODE">True</entry>
  <entry key="MODE_3D">False</entry>
  <entry key="GPU_MODE">True</entry>
  <entry key="LWIR_MODE">True</entry>
</properties>
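The options file above is in the standard java.util.Properties XML format, so it can be sanity-checked before launching the plugin. The following is a minimal sketch, not the plugin's own loading code: the class name EyesisOptionsCheck and its helper methods are hypothetical, and the default path simply expands the /home/user/.imagejs/Eyesis_Correction.xml location mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper (not part of imagej-elphel): parses a Properties XML
// document and reports which boolean options are switched on.
final class EyesisOptionsCheck {

    /** Parses a java.util.Properties XML document held in a string. */
    static Properties parse(String xml) {
        Properties props = new Properties();
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            props.loadFromXML(in);
        } catch (IOException e) {
            // Malformed XML is reported as an IOException subclass; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns true if the key is present and equals "true" (case-insensitive). */
    static boolean isEnabled(Properties props, String key) {
        return Boolean.parseBoolean(props.getProperty(key, "false"));
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location, following the path given in the text.
        Path path = Paths.get(args.length > 0
                ? args[0]
                : System.getProperty("user.home") + "/.imagejs/Eyesis_Correction.xml");
        String xml = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        Properties props = parse(xml);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " = " + isEnabled(props, key));
        }
    }
}
```

Running it with the path to the copied file as the only argument prints each option and whether it parses as enabled, which catches a malformed or misplaced file early.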
- Updated pom.xml to TF 1.15 (the Maven package exists)
- Install all three cuDNN packages: runtime, dev and docs. The docs package was used to verify the installation by building mnistCUDNN:
https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
- The TF 1.15 Maven package appears to have been built against the CUDA 10.0 libraries, so it fails to load them when only CUDA 10.2 is installed:
2019-12-27 13:05:15.754656: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory
2019-12-27 13:05:15.754756: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory
2019-12-27 13:05:15.754860: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory
2019-12-27 13:05:15.754970: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory
2019-12-27 13:05:15.755075: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory
2019-12-27 13:05:15.755178: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory
2019-12-27 13:05:15.762197: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-27 13:05:15.762227: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
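When a log like the one above is long, it can help to pull out just the library names TensorFlow failed to load, since the version suffix (here .so.10.0) shows which CUDA release the package expects. The sketch below assumes only the "Could not load dynamic library '...'" message format visible in that log; the class name MissingCudaLibs is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: extracts the names of shared libraries that the
// TensorFlow dso_loader reported as missing.
final class MissingCudaLibs {

    // Matches the quoted library name in the dso_loader warning.
    private static final Pattern MISSING =
            Pattern.compile("Could not load dynamic library '([^']+)'");

    /** Returns the missing library names in the order they appear in the log. */
    static List<String> missingLibraries(String log) {
        List<String> libs = new ArrayList<>();
        Matcher m = MISSING.matcher(log);
        while (m.find()) {
            libs.add(m.group(1));
        }
        return libs;
    }

    public static void main(String[] args) throws IOException {
        // Reads the TensorFlow log from standard input.
        String log;
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8))) {
            log = in.lines().collect(Collectors.joining("\n"));
        }
        for (String lib : missingLibraries(log)) {
            System.out.println("missing: " + lib);
        }
    }
}
```

Piping the plugin's console output into this program lists each missing library on its own line.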
- The TF 1.15 package with CUDA 10.0 requires a minimum GPU compute capability of 6.0; the GeForce GTX 750 Ti has 5.0, so the GPU is ignored:
2019-12-27 14:22:17.475717: I tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /home/oleg/GIT/imagej-elphel/target/classes/trained_model
2019-12-27 14:22:17.477009: I tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-12-27 14:22:17.503393: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3392030000 Hz
2019-12-27 14:22:17.504196: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f610dba1a20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2019-12-27 14:22:17.504235: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2019-12-27 14:22:17.505378: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-12-27 14:22:17.517647: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-27 14:22:17.518168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 750 Ti major: 5 minor: 0 memoryClockRate(GHz): 1.1105
pciBusID: 0000:01:00.0
2019-12-27 14:22:17.518385: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-27 14:22:17.519624: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-27 14:22:17.675574: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-12-27 14:22:17.716621: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-12-27 14:22:18.160070: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-12-27 14:22:18.439862: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-12-27 14:22:18.443378: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-27 14:22:18.443483: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-27 14:22:18.444034: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-27 14:22:18.444510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Ignoring visible gpu device (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:01:00.0, compute capability: 5.0) with Cuda compute capability 5.0. The minimum required Cuda capability is 6.0.
2019-12-27 14:22:18.498425: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-27 14:22:18.498455: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2019-12-27 14:22:18.498463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2019-12-27 14:22:18.504855: I tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2019-12-27 14:22:18.528948: I tensorflow/cc/saved_model/loader.cc:151] Running initialization op on SavedModel bundle at path: /home/oleg/GIT/imagej-elphel/target/classes/trained_model
2019-12-27 14:22:18.581034: I tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 1105321 microseconds.
- So TF 1.15 + CUDA 10.0 might work with a GeForce GTX 1080 Ti (compute capability 6.1)
The TF Test button in the Eyesis_Correction plugin worked with CUDA 10.0, even though nvidia-smi showed CUDA 10.1. The version reported by nvidia-smi is probably not relevant to the libraries TF actually loads.
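
The compute capability comparison behind these notes can be written down explicitly. This is a minimal sketch under stated assumptions: it relies only on the "major: N minor: N" fields of the "Found device" log entry, the 6.0 minimum is the value reported in the log for this particular TF 1.15 package, and the class ComputeCapabilityCheck is hypothetical rather than part of TensorFlow or imagej-elphel.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the device compute capability from a TensorFlow
// log and compares it with a required minimum.
final class ComputeCapabilityCheck {

    // Matches the "major: 5 minor: 0" fields of the "Found device" entry.
    private static final Pattern CAPABILITY =
            Pattern.compile("major:\\s*(\\d+)\\s+minor:\\s*(\\d+)");

    /** Returns {major, minor} of the first device in the log, or null if none is found. */
    static int[] parseCapability(String log) {
        Matcher m = CAPABILITY.matcher(log);
        if (!m.find()) {
            return null;
        }
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
    }

    /** True if major.minor is at least reqMajor.reqMinor. */
    static boolean meetsMinimum(int major, int minor, int reqMajor, int reqMinor) {
        return major > reqMajor || (major == reqMajor && minor >= reqMinor);
    }

    public static void main(String[] args) {
        String line = "name: GeForce GTX 750 Ti major: 5 minor: 0 memoryClockRate(GHz): 1.1105";
        int[] cc = parseCapability(line);
        // Minimum of 6.0 taken from the log message for this TF 1.15 package.
        System.out.println("GTX 750 Ti usable: " + meetsMinimum(cc[0], cc[1], 6, 0));
        // 6.1 is the GTX 1080 Ti capability quoted in the notes.
        System.out.println("GTX 1080 Ti usable: " + meetsMinimum(6, 1, 6, 0));
    }
}
```

Comparing major and minor as separate integers avoids the pitfalls of treating the capability as a decimal number.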