
Missing drivers

  • By AI researcher unhappy with NVIDIA software
  • on 12/18/2023

This image should be preconfigured to run NVIDIA GPU Cloud (NGC) containers such as the PyTorch one; however, it fails at launch on AWS (on a p3.2xlarge instance).

After connecting over SSH, I see this error message:
```
Installing drivers ...
modprobe: FATAL: Module nvidia not found in directory /lib/modules/6.2.0-1011-aws
```
And sure enough, running containers such as PyTorch (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) does not work:

```
~$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.11-py3
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.
```
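
For anyone hitting the same error, here is a minimal sketch of a check for whether an NVIDIA kernel module file exists for a given kernel, which is what the `modprobe` message above is complaining about. The function name `nvidia_module_present` and its two parameters are hypothetical and not part of any NVIDIA or AWS tooling. It assumes the module file is named `nvidia.ko` (optionally compressed, e.g. `nvidia.ko.zst`) somewhere under the kernel's module directory, and it only looks for the file; it does not install or load anything.

```shell
#!/bin/sh
# Hypothetical helper: report whether an nvidia.ko file exists for a kernel.
#   $1 = kernel release (assumed default: output of `uname -r`)
#   $2 = modules root   (assumed default: /lib/modules)
nvidia_module_present() {
    kernel="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"

    # Search the kernel's module tree for nvidia.ko or a compressed variant.
    # Errors (e.g. the directory does not exist) are silenced and treated as "missing".
    found=$(find "$root/$kernel" -name 'nvidia.ko*' 2>/dev/null | head -n 1)

    if [ -n "$found" ]; then
        echo "present: $found"
        return 0
    fi

    echo "missing: no nvidia.ko under $root/$kernel"
    return 1
}
```

Called with no arguments, `nvidia_module_present` checks the running kernel. On the instance described here it would presumably report the module as missing under `/lib/modules/6.2.0-1011-aws`, consistent with the `modprobe` error.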

