Fix: Docker GPU Not Found
Resolve the "could not select device driver" or "nvidia-container-cli" errors when running Docker containers with GPU access
Error Message
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]
or
nvidia-container-cli: initialization error
or
torch.cuda.is_available() returns False
Root Cause
This error occurs when Docker cannot access the host GPU. Common causes, each of which can be checked with the commands after this list:
- NVIDIA Container Toolkit not installed - Required for GPU passthrough
- Docker daemon not configured - Missing nvidia runtime configuration
- Missing --gpus flag - Container launched without GPU access
- Driver not loaded - NVIDIA kernel module not active
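Before reinstalling anything, it helps to confirm which of these applies. A minimal set of checks, assuming a Debian/Ubuntu host with the default Docker packaging:
# Is the NVIDIA kernel module loaded?
lsmod | grep -i nvidia
# Is the NVIDIA Container Toolkit package installed?
dpkg -l | grep nvidia-container-toolkit
# Does the Docker daemon list an nvidia runtime?
docker info | grep -i runtimes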
Solution
Step 1: Verify NVIDIA Driver
nvidia-smi
This should list your GPU model and driver version. If it does not, install the NVIDIA driver first.
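If nvidia-smi is missing or reports an error, fix the driver before touching anything Docker-related. As an illustrative sketch for Ubuntu (package names and tooling differ by distribution and GPU generation):
# Ubuntu example: install the recommended driver, then reboot so the kernel module loads
sudo ubuntu-drivers autoinstall
sudo reboot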
Step 2: Install NVIDIA Container Toolkit
# Add repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Install toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
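You can confirm the toolkit installed correctly before changing the Docker configuration; both CLIs ship with the toolkit packages and should print a version:
nvidia-ctk --version
nvidia-container-cli --version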
Step 3: Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
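The nvidia-ctk command registers the nvidia runtime in /etc/docker/daemon.json. A quick sanity check after the restart:
# The nvidia runtime should now appear in both places
cat /etc/docker/daemon.json
docker info | grep -i runtimes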
Step 4: Test GPU Access
docker run --rm --gpus all nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi
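The GPU list printed inside the container should match what the host reports. One way to compare, using the same CUDA base image as above:
# Host view
nvidia-smi --query-gpu=name --format=csv,noheader
# Container view
docker run --rm --gpus all nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi --query-gpu=name --format=csv,noheader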
Step 5: Run Your Container
Always use the --gpus all flag:
docker run --gpus all your-image python -c "import torch; print(torch.cuda.is_available())"
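To expose only a specific GPU, pass a device selector instead of all (your-image is a placeholder, as above):
# Limit the container to GPU 0 and confirm PyTorch sees it
docker run --rm --gpus "device=0" your-image python -c "import torch; print(torch.cuda.get_device_name(0))"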
Quick Checklist
- nvidia-smi works on host
- nvidia-container-toolkit installed
- Docker daemon restarted after toolkit install
- Using --gpus all or --gpus "device=0"
Generate GPU-Ready Dockerfile
Configuration: Local GPU or CPU environment
Recommended for 2025: native Blackwell (10.0) support, official cu128 prebuilt wheels
# syntax=docker/dockerfile:1
# ^ Required for BuildKit cache mounts and advanced features

# Generated by DockerFit (https://tools.eastondev.com/docker)
# PYTORCH 2.9.1 + CUDA 12.8 | Python 3.11
# Multi-stage build for optimized image size

# ==============================================================================
# Stage 1: Builder - Install dependencies and compile
# ==============================================================================
FROM nvidia/cuda:12.8.0-cudnn-devel-ubuntu24.04 AS builder

# Build arguments
ARG DEBIAN_FRONTEND=noninteractive

# Environment variables
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV TORCH_CUDA_ARCH_LIST="8.0;8.6;8.9;9.0;10.0"

# Install Python 3.11 from deadsnakes PPA (Ubuntu 24.04)
RUN apt-get update && apt-get install -y --no-install-recommends \
    software-properties-common \
    && add-apt-repository -y ppa:deadsnakes/ppa \
    && apt-get update && apt-get install -y --no-install-recommends \
    python3.11 \
    python3.11-venv \
    python3.11-dev \
    build-essential \
    git \
    && rm -rf /var/lib/apt/lists/*

# Create virtual environment
ENV VIRTUAL_ENV=/opt/venv
RUN python3.11 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# Upgrade pip
RUN pip install --no-cache-dir --upgrade pip setuptools wheel

# Install PyTorch with BuildKit cache
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/cu128

# Install project dependencies
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# ==============================================================================
# Stage 2: Runtime - Minimal production image
# ==============================================================================
FROM nvidia/cuda:12.8.0-cudnn-runtime-ubuntu24.04 AS runtime

# Labels
LABEL maintainer="Generated by DockerFit"
LABEL version="2.9.1"
LABEL description="PYTORCH 2.9.1 + CUDA 12.8"

# Environment variables
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

# Install Python 3.11 runtime from deadsnakes PPA (Ubuntu 24.04)
RUN apt-get update && apt-get install -y --no-install-recommends \
    software-properties-common \
    && add-apt-repository -y ppa:deadsnakes/ppa \
    && apt-get update && apt-get install -y --no-install-recommends \
    python3.11 \
    libgomp1 \
    && apt-get remove -y software-properties-common \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user for security
ARG USERNAME=appuser
ARG USER_UID=1000
ARG USER_GID=$USER_UID
RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME

# Copy virtual environment from builder
COPY --from=builder --chown=$USERNAME:$USERNAME /opt/venv /opt/venv
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# Set working directory
WORKDIR /app

# Copy application code
COPY --chown=$USERNAME:$USERNAME . .

# Switch to non-root user
USER $USERNAME

# Expose port
EXPOSE 8000

# Default command
CMD ["python", "main.py"]
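A typical way to build and run the generated image (the tag my-gpu-app is illustrative):
# BuildKit is required for the cache mounts used above
DOCKER_BUILDKIT=1 docker build -t my-gpu-app .
# The Dockerfile alone does not grant GPU access; --gpus is still required at run time
docker run --rm --gpus all -p 8000:8000 my-gpu-app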