GPU support in containers in MLOps - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is the main purpose of GPU support in containers?
GPU support lets applications running inside a container use the host's GPU hardware for faster processing. This is especially useful for machine learning and data processing workloads.
beginner
Which NVIDIA tool helps containers access GPU resources easily?
NVIDIA provides the NVIDIA Container Toolkit, which lets containers use NVIDIA GPUs by exposing the host's GPU driver libraries and device files to the container at run time.
intermediate
How do you run a Docker container with GPU support?
Use the --gpus flag: docker run --gpus all <image>. This tells Docker to give the container access to all GPUs on the host machine, provided the NVIDIA driver and the NVIDIA Container Toolkit are installed there.
beginner
Why can't containers use GPUs by default without special setup?
Containers are isolated and do not have direct access to hardware like GPUs. Special drivers and runtimes are needed to bridge this gap and allow GPU usage inside containers.
intermediate
Name one common environment variable that controls which GPUs an application inside a container can use.
CUDA_VISIBLE_DEVICES restricts which GPUs a CUDA application can see and use. The NVIDIA container runtime has a related variable, NVIDIA_VISIBLE_DEVICES, which selects the GPUs exposed to the container itself.
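A minimal Python sketch of how an application inside a container might read this variable before starting GPU work. The function name visible_cuda_devices is hypothetical, and the entries are kept as strings because the variable can hold either indices or GPU UUIDs.

```python
import os


def visible_cuda_devices(environ=None):
    """Return the entries of CUDA_VISIBLE_DEVICES, or None if it is unset.

    None means the variable is absent, so no restriction was requested
    through it. An empty list means it is set but empty, which hides
    every GPU from CUDA applications.
    """
    env = os.environ if environ is None else environ
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # Split on commas and drop empty fragments such as a trailing comma.
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    print(visible_cuda_devices())
```

The variable only filters what the CUDA runtime reports; it does not by itself grant the container access to a GPU.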
Which command flag enables GPU support when running a Docker container?
A. --use-gpu
B. --enable-gpu
C. --gpus all
D. --gpu-access
What does the NVIDIA Container Toolkit provide?
A. A way for containers to access NVIDIA GPUs
B. A new GPU hardware
C. A container orchestration tool
D. A GPU monitoring dashboard
Why do containers need special setup to use GPUs?
A. Because containers only run on CPUs
B. Because containers isolate hardware access by default
C. Because GPUs are not compatible with containers
D. Because GPUs require internet access
Which environment variable controls GPU visibility inside a container?
A. GPU_ACCESS_LEVEL
B. NVIDIA_GPU_FLAG
C. CONTAINER_GPU
D. CUDA_VISIBLE_DEVICES
What is a common use case for GPU support in containers?
A. Machine learning model training
B. Simple text editing
C. Web page hosting
D. File storage
Explain how GPU support works in containers and why it is important.
Think about how containers normally isolate hardware and what tools help bridge that gap.
Describe the steps to run a container with GPU support using Docker.
Consider what setup is needed on the host and the command to start the container.

      Practice

      1. What is the main purpose of enabling GPU support in containers?
      easy
      A. To reduce the container's memory usage
      B. To increase the container's disk space
      C. To enable network access inside the container
      D. To allow containers to use the host's GPU for faster computing

      Solution

      1. Step 1: Understand GPU role in containers

        GPUs speed up computing tasks by handling parallel processing efficiently.
      2. Step 2: Identify GPU support purpose

        Enabling GPU support allows containers to access the host's GPU hardware for faster computation.
      3. Final Answer:

        To allow containers to use the host's GPU for faster computing -> Option D
      4. Quick Check:

        GPU support = faster computing [OK]
      Hint: GPU support means using host GPU inside container [OK]
      Common Mistakes:
      • Confusing GPU support with disk or memory changes
      • Thinking GPU enables network access
      • Assuming GPU support reduces container size
      2. Which Docker command flag is used to enable GPU support when running a container?
      easy
      A. --gpus
      B. --enable-gpu
      C. --gpu-access
      D. --use-gpu

      Solution

      1. Step 1: Recall Docker GPU flag syntax

        The official Docker flag to enable GPU support is --gpus.
      2. Step 2: Verify other options

        The other options, --enable-gpu, --gpu-access, and --use-gpu, are not valid docker run flags.
      3. Final Answer:

        --gpus -> Option A
      4. Quick Check:

        Docker GPU flag = --gpus [OK]
      Hint: Docker GPU flag is exactly --gpus [OK]
      Common Mistakes:
      • Using incorrect flag names like --enable-gpu
      • Confusing GPU flag with network or volume flags
      • Omitting the flag entirely
      3. What will be the output of the command docker run --gpus all nvidia/cuda:11.0-base nvidia-smi if the host has a compatible NVIDIA GPU and drivers installed?
      medium
      A. Displays the NVIDIA GPU status and driver information
      B. Shows an error: 'nvidia-smi command not found'
      C. Runs the container but shows no GPU information
      D. Fails with 'GPU not accessible' error

      Solution

      1. Step 1: Understand the command purpose

        The command runs a container with full GPU access and executes nvidia-smi to show GPU info.
      2. Step 2: Check host requirements

        If the host has compatible NVIDIA GPU and drivers, nvidia-smi runs successfully inside the container.
      3. Final Answer:

        Displays the NVIDIA GPU status and driver information -> Option A
      4. Quick Check:

        Host GPU + drivers + --gpus = nvidia-smi output [OK]
      Hint: If host GPU ready, nvidia-smi shows GPU info inside container [OK]
      Common Mistakes:
      • Assuming nvidia-smi is missing inside official CUDA image
      • Ignoring host driver requirements
      • Expecting GPU info without --gpus flag
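For scripted checks, the command from question 3 can be assembled as an argument list, sketched here in Python. The helper name docker_gpu_command is hypothetical; the image tag and nvidia-smi come from the question. The sketch only builds and prints the command, so it runs on machines without Docker or a GPU.

```python
import shlex


def docker_gpu_command(image, command, gpus="all"):
    """Build the argv list for: docker run --gpus <gpus> <image> <command...>."""
    return ["docker", "run", "--gpus", gpus, image, *command]


argv = docker_gpu_command("nvidia/cuda:11.0-base", ["nvidia-smi"])

# Show the command as it would be typed in a shell.
print(shlex.join(argv))

# To actually run it on a GPU host (requires Docker, the NVIDIA driver,
# and the NVIDIA Container Toolkit), one option is:
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing a list to subprocess.run avoids shell quoting issues, which matters once the --gpus value contains commas or quotes.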
      4. You run docker run --gpus all nvidia/cuda:11.0-base nvidia-smi but get the error: 'docker: Error response from daemon: could not select device driver'. What is the most likely cause?
      medium
      A. The container command syntax is incorrect
      B. The Docker image does not support GPUs
      C. The NVIDIA Container Toolkit is not installed on the host
      D. The host has no internet connection

      Solution

      1. Step 1: Analyze the error message

        The error indicates Docker cannot find a GPU device driver to assign to the container.
      2. Step 2: Identify missing component

        This usually happens when the NVIDIA Container Toolkit (the nvidia-container-toolkit package, which superseded nvidia-docker2) is not installed or configured on the host.
      3. Final Answer:

        The NVIDIA Container Toolkit is not installed on the host -> Option C
      4. Quick Check:

        Missing NVIDIA toolkit = device driver error [OK]
      Hint: Device driver error means NVIDIA Container Toolkit missing [OK]
      Common Mistakes:
      • Blaming Docker image for GPU support
      • Assuming syntax error causes this message
      • Thinking internet is required for this error
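The diagnosis in question 4 can be sketched as a small Python helper that maps the error text to a hint. The function names are hypothetical. The binary name nvidia-container-runtime-hook is an assumption about what the toolkit installs on the host, so treat the PATH check as a rough heuristic, not a definitive test.

```python
import shutil

# Substring of the Docker error quoted in the question.
DRIVER_ERROR = "could not select device driver"


def explain_docker_gpu_error(stderr_text):
    """Return a hint if stderr contains the device driver error, else None."""
    if DRIVER_ERROR in stderr_text:
        return ("Docker found no GPU device driver: install or configure "
                "the NVIDIA Container Toolkit on the host.")
    return None


def toolkit_binary_on_path(binary="nvidia-container-runtime-hook"):
    """Heuristic: report whether the given toolkit binary is on PATH.

    The default binary name is an assumption; adjust it to match the
    toolkit version installed on the host.
    """
    return shutil.which(binary) is not None


if __name__ == "__main__":
    sample = "docker: Error response from daemon: could not select device driver"
    print(explain_docker_gpu_error(sample))
    print("toolkit binary found:", toolkit_binary_on_path())
```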
      5. You want to run a container with access to only GPUs 0 and 1 on a host with 4 GPUs. Which Docker run command correctly limits GPU access?
      hard
      A. docker run --gpus 2 nvidia/cuda:11.0-base nvidia-smi
      B. docker run --gpus 'device=0,1' nvidia/cuda:11.0-base nvidia-smi
      C. docker run --gpus all nvidia/cuda:11.0-base nvidia-smi
      D. docker run --gpus 'count=2' nvidia/cuda:11.0-base nvidia-smi

      Solution

      1. Step 1: Understand GPU selection syntax

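        To limit a container to GPUs 0 and 1, pass their device IDs with the device= option, as in --gpus 'device=0,1'. Docker parses the --gpus value as comma-separated fields, so a multi-device list typically needs an inner pair of double quotes: --gpus '"device=0,1"'.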
      2. Step 2: Evaluate options

        Option A (--gpus 2) requests any two GPUs but does not pin the container to GPUs 0 and 1. Option D (--gpus 'count=2') is likewise a count request, so it has the same limitation. Option C (--gpus all) exposes all four GPUs.
      3. Final Answer:

        docker run --gpus 'device=0,1' nvidia/cuda:11.0-base nvidia-smi -> Option B
      4. Quick Check:

        Specify GPUs by device IDs with --gpus 'device=...' [OK]
      Hint: Use --gpus 'device=0,1' to pick specific GPUs [OK]
      Common Mistakes:
      • Using --gpus 2 without device IDs
      • Using a count such as count=2, which does not select specific GPUs
      • Assuming --gpus all limits GPUs
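The selection choices in question 5 can be sketched as a Python helper that produces the value for --gpus. The name gpus_value is hypothetical. The inner double quotes around the device list assume Docker splits the --gpus value on commas, so a multi-device list has to be quoted to stay in one field.

```python
def gpus_value(device_ids=None, count=None):
    """Return the string to pass after --gpus.

    device_ids: specific GPU IDs to expose, e.g. [0, 1]
    count: a number of GPUs to request without choosing which ones
    With neither argument, all GPUs are requested.
    """
    if device_ids is not None and count is not None:
        raise ValueError("choose either device_ids or count, not both")
    if device_ids is not None:
        ids = ",".join(str(i) for i in device_ids)
        # Inner double quotes keep the comma-separated IDs in one field.
        return f'"device={ids}"'
    if count is not None:
        return str(count)
    return "all"


if __name__ == "__main__":
    argv = ["docker", "run", "--gpus", gpus_value(device_ids=[0, 1]),
            "nvidia/cuda:11.0-base", "nvidia-smi"]
    print(argv)
```

When the command is typed in a shell instead, the same value is written with outer single quotes: --gpus '"device=0,1"'.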