Closed
Description
Is there an existing issue for this?
- I have searched the existing issues
Bug description
The container images other than the one tagged "latest" (which is the oldest version of the image) do not contain libcudnn.so.8, so I cannot use GPU acceleration with them.
Judging by the image history, these images do not appear to have been built with this Dockerfile, and their base images are not "nvidia/cuda:11.X.Y-cudnn8".
Operating System
operating system: Linux
DeepLabCut version
dlc version: 2.2.1.1, 2.2.0.6
DeepLabCut mode
single animal
Device type
gpu: NVIDIA GeForce RTX 3080
Steps To Reproduce
docker run --rm -it --gpus all deeplabcut/deeplabcut:2.2.1.1-gui-cuda11.7.0-runtime-ubuntu20.04 python3 -c 'from tensorflow.python.client import device_lib; device_lib.list_local_devices()'
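A narrower check that isolates the missing library without importing TensorFlow could look like the sketch below. It only asks the dynamic loader whether each shared object can be opened, using the library names that appear in the log output. The helper name `can_load` is hypothetical, and the script is assumed to be run with python3 inside the container in question.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        # ctypes.CDLL calls dlopen() under the hood, the same mechanism
        # TensorFlow's dso_loader uses to find the CUDA libraries.
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Library names taken from the TensorFlow log in this report.
    for name in ("libcudart.so.11.0", "libcublas.so.11", "libcudnn.so.8"):
        status = "loadable" if can_load(name) else "MISSING"
        print(f"{name}: {status}")
```

If libcudnn.so.8 is reported as missing while the other libraries load, that would be consistent with the image having been built on a CUDA base image without cuDNN.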
Relevant log output
2023-02-25 13:41:14.467894: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2023-02-25 13:41:15.138019: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-25 13:41:15.139964: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2023-02-25 13:41:15.156153: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-02-25 13:41:15.156207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.71GHz coreCount: 70 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 849.46GiB/s
2023-02-25 13:41:15.156236: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2023-02-25 13:41:15.159535: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2023-02-25 13:41:15.159597: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2023-02-25 13:41:15.177416: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2023-02-25 13:41:15.177711: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2023-02-25 13:41:15.178288: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2023-02-25 13:41:15.178931: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2023-02-25 13:41:15.179111: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-02-25 13:41:15.179138: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2023-02-25 13:41:15.280587: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-02-25 13:41:15.280631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2023-02-25 13:41:15.280648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
Anything else?
No response
Code of Conduct
- I agree to follow this project's Code of Conduct