Docker container fails to start with --gpus all option on WSL #7

TomPan-1901 · 2023-09-20T23:36:18Z

I am developing a multi-agent reinforcement learning environment based on your framework, and I want to deploy your Docker image on WSL. However, when I enable the --gpus all option, the container fails to start with the following message:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/<layer hash>/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.

I found an issue on nvidia-docker that helped me solve the problem: NVIDIA/nvidia-container-toolkit#289. It explains that WSL provides its own CUDA runtime libraries, which are injected into the container when the container is created, so the image must not already contain files at those paths (hence the "file exists" mount error above).

After starting the container in privileged mode without GPU access, running the command below inside it, and saving the result as a new image, the hmp container can use the GPU normally:

rm -rf /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-*.so.1 /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
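
For anyone who wants to script this step, here is a minimal sketch that wraps the same removal in a shell function. The function name remove_conflicting_libs and its optional root-directory argument are hypothetical, added only so the cleanup can be tried against a scratch directory before running it in a real container; the file patterns are the ones from the command above.

```shell
#!/bin/sh
# Sketch only: remove the driver libraries that conflict with the ones
# the NVIDIA runtime mounts on WSL. The function name and the ROOT
# argument are assumptions for illustration, not part of the project.
remove_conflicting_libs() {
    # ROOT defaults to "" so the paths resolve to /usr/lib/... in a container.
    root="${1:-}"
    libdir="$root/usr/lib/x86_64-linux-gnu"

    # Same three patterns as the manual rm command.
    for f in "$libdir"/libcuda.so.1 \
             "$libdir"/libnvidia-*.so.1 \
             "$libdir"/libnvcuvid.so.1; do
        # Skip unmatched glob patterns; also catch dangling symlinks.
        if [ -e "$f" ] || [ -L "$f" ]; then
            echo "removing $f"
            rm -rf -- "$f"
        fi
    done
}
```

Inside the container the function would be called with no argument. Saving the cleaned container as a new image can then be done from the host, for example with docker commit, using whatever container and image names apply to your setup.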

I hope you can add this solution to your documentation for future reference. This would make it easier for me and other users who encounter the same problem. Thank you!


binary-husky · 2023-09-21T02:03:34Z

thank you~

binary-husky added the documentation label Sep 21, 2023
