Proxmox GPU Passthrough for Docker using LXC to host WebODM with ClusterODM
References:
- NVIDIA Drivers, NVIDIA Driver Search
- NVIDIA Container Toolkit Docs
- Ansible Playbook that runs the recommendations below
- GPU Passthrough on Proxmox
- ClusterODM Setup Guide
- Using Specific GPU with NodeODM
- ClusterODM Project
Remove Old NVIDIA Drivers
- List existing NVIDIA or CUDA packages:
apt list --installed | egrep -i "nvidia|cuda" | cut -d/ -f1
- If drivers are listed, uninstall the current NVIDIA runfile driver:
sudo ./NVIDIA-Linux-*.run --uninstall
- Re-check installed packages:
apt list --installed | egrep -i "nvidia|cuda" | cut -d/ -f1
- If any packages remain, remove them:
apt list --installed | egrep -i "nvidia|cuda" | cut -d/ -f1 | xargs apt remove -y
Setting Up GPU Passthrough on Proxmox Server
- Install required packages:
apt install pve-headers dkms pciutils
- Edit /etc/default/grub and update:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
(On AMD CPUs, use amd_iommu=on instead of intel_iommu=on.)
- Update grub:
update-grub2
- Blacklist default GPU drivers:
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf
- Add to /etc/modules:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
- Update initramfs:
update-initramfs -u -k all
- Reboot the Proxmox server.
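After the reboot, it is worth confirming that IOMMU came up and the vfio modules loaded before installing the driver. A minimal sketch (the exact dmesg wording varies by kernel and CPU vendor):

```shell
# Look for IOMMU/DMAR initialization messages and loaded vfio modules.
# Both checks fall back to a hint string so the script never exits non-zero.
iommu_msgs=$(dmesg 2>/dev/null | grep -iE 'iommu|dmar' || true)
vfio_mods=$(lsmod 2>/dev/null | grep '^vfio' || true)
echo "${iommu_msgs:-no IOMMU messages found (check BIOS settings and kernel cmdline)}"
echo "${vfio_mods:-vfio modules not loaded (check /etc/modules and initramfs)}"
```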
- Download the NVIDIA driver (see NVIDIA Drivers in the references). This document uses the official NVIDIA runfile installer because the distro-packaged driver can break with apt updates.
Example: NVIDIA-Linux-x86_64-570.133.07.run
- Set the installer as executable:
chmod +x NVIDIA-Linux-*.run
- Run the installer:
./NVIDIA-Linux-*.run
- Reboot the Proxmox server again.
- Check installation:
nvidia-smi
- Check NVIDIA device IDs:
ls -al /dev/nvidia*
Example output:
crw-rw-rw- 1 root root 195, 0 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 /dev/nvidiactl
crw-rw-rw- 1 root root 509, 0 /dev/nvidia-uvm
crw-rw-rw- 1 root root 509, 1 /dev/nvidia-uvm-tools
Note the character-device major numbers (the first number of each pair, e.g. 195 and 509 here); majors such as 235 or 255 can also appear, and each major you see needs an allow rule in the LXC config.
- Edit the LXC config file at /etc/pve/lxc/<ID>.conf and add:
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 235:* rwm
lxc.cgroup2.devices.allow: c 255:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.mount.entry: /dev/nvidia0 /dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl /dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset /dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm /dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools /dev/nvidia-uvm-tools none bind,optional,create=file
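The allow lines can also be generated from the device listing instead of typed by hand. A sketch, run here against canned `ls -al /dev/nvidia*` output matching the example above (on a real host, feed it `ls -l /dev/nvidia*` directly):

```shell
# Extract each unique character-device major number (field 5, comma stripped)
# and print a matching cgroup2 allow rule.
sample='crw-rw-rw- 1 root root 195, 0 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 /dev/nvidiactl
crw-rw-rw- 1 root root 509, 0 /dev/nvidia-uvm
crw-rw-rw- 1 root root 509, 1 /dev/nvidia-uvm-tools'

allow_lines=$(printf '%s\n' "$sample" \
  | awk '$1 ~ /^c/ {gsub(",", "", $5); print $5}' \
  | sort -un \
  | sed 's/.*/lxc.cgroup2.devices.allow: c &:* rwm/')
printf '%s\n' "$allow_lines"
```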
- Check nvidia-smi again:
nvidia-smi
Set Up LXC Container for Docker
- Install tools:
apt install pciutils
- Install the same NVIDIA driver inside the container, skipping the kernel modules (the host already loads them):
./NVIDIA-Linux-*.run --no-kernel-modules
- Install APT prerequisites:
apt update
apt install -y apt-transport-https ca-certificates curl gnupg lsb-release
- Add NVIDIA APT repo:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
- Update package list:
apt update
- Install NVIDIA Container Toolkit:
apt install -y nvidia-container-toolkit
- Configure Docker to use NVIDIA runtime:
sudo nvidia-ctk runtime configure --runtime=docker
- Restart Docker:
sudo systemctl restart docker
- Verify GPU inside container:
nvidia-smi
Example output:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro P600                    On  |   00000000:01:00.0 Off |                  N/A |
|  0%   37C    P8             N/A /  N/A  |      3MiB /  2048MiB   |      0%      Default |
+-----------------------------------------+------------------------+----------------------+
| Processes:                                                                              |
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
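Beyond running nvidia-smi in the LXC container itself, the toolkit setup can be exercised end-to-end by asking Docker to run nvidia-smi inside a CUDA base image. A sketch; the image tag is an example and should match the installed driver's CUDA version:

```shell
# Run nvidia-smi inside a throwaway CUDA container; fall back to a hint if
# Docker or the GPU runtime is not available so the check never hard-fails.
if command -v docker >/dev/null 2>&1; then
  gpu_check=$(docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu24.04 nvidia-smi 2>&1 \
    || echo "GPU not visible to Docker (re-check the nvidia-ctk runtime configure step)")
else
  gpu_check="docker not installed"
fi
echo "$gpu_check"
```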
Reference: NVIDIA Container Toolkit Installation Guide
WebODM + ClusterODM (Docker Setup)
Run ClusterODM:
docker run -d --rm -ti -p 3000:3000 -p 10000:10000 -p 8080:8080 opendronemap/clusterodm
Run on the worker node:
docker run -d --gpus all --restart always -p 3001:3000 opendronemap/nodeodm:gpu
(Note: --gpus and --restart must come before the image name, or Docker passes them to the container as arguments.)
Start WebODM (on the WebODM host):
./webodm.sh start --default-nodes 0 --detached --port 80
Connect the node in the ClusterODM admin web UI:
Go to: http://10.0.1.131:10000
Add Node: 10.0.1.131:3001
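A quick way to confirm the worker is reachable before adding it in the UI (IP and port taken from the example above; NodeODM serves a small JSON status at /info):

```shell
# Probe the NodeODM /info endpoint; time out quickly and print a hint on failure.
node_info=$(curl -s --max-time 5 http://10.0.1.131:3001/info 2>/dev/null \
  || echo "NodeODM not reachable at 10.0.1.131:3001")
echo "$node_info"
```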
Configure Immich for GPU
(Adapted from: Immich Docs)
- Download the latest hwaccel.ml.yml file and place it in the same folder as docker-compose.yml.
- In docker-compose.yml, under immich-machine-learning, uncomment the extends section and change cpu to the appropriate backend.
- Also in immich-machine-learning, add one of [armnn, cuda, rocm, openvino, rknn] to the image tag.
- Redeploy the immich-machine-learning container with the updated settings.
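For the CUDA case, the edits described above end up looking roughly like this in docker-compose.yml (a sketch based on the stock Immich compose file; the service, file, and image names are assumptions taken from the Immich docs):

```yaml
  immich-machine-learning:
    # backend tag appended to the image (cuda here; armnn/rocm/openvino/rknn also exist)
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends:
      file: hwaccel.ml.yml
      service: cuda   # was: cpu
```

After saving, redeploy with: docker compose up -d immich-machine-learning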