Support option --device=nvidia.com/gpu=all for NixOS #20597

Open
aaron-nall/minikube
#1
@xiaoxiangmoe

Description

What Happened?

minikube start --memory=max --driver=docker --container-runtime=docker --gpus=all --force --addons=nvidia-device-plugin

😄  minikube v1.34.0 on Nixos 25.05
    ▪ MINIKUBE_WANTUPDATENOTIFICATION=false
❗  minikube skips various validations when --force is supplied; this may lead to unexpected behavior
✨  Using the docker driver based on user configuration

🧯  The requested memory allocation of 15875MiB does not leave room for system overhead (total system memory: 15875MiB). You may face stability issues.
πŸ’‘  Suggestion: Start minikube with less memory allocated: 'minikube start --memory=3900mb'

📌  Using Docker driver with root privileges
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🚜  Pulling base image v0.0.45 ...
❗  minikube was unable to download gcr.io/k8s-minikube/kicbase:v0.0.45, but successfully downloaded docker.io/kicbase/stable:v0.0.45 as a fallback image
🔥  Creating docker container (CPUs=2, Memory=15875MB) ...
🤦  StartHost failed, but will try again: creating host: create: creating: create kic node: create container: docker run -d -t --privileged --security-opt seccomp=unconfined --tmpfs /tmp --tmpfs /run -v /run/current-system/kernel-modules/lib/modules:/lib/modules:ro --hostname minikube --name minikube --label created_by.minikube.sigs.k8s.io=true --label name.minikube.sigs.k8s.io=minikube --label role.minikube.sigs.k8s.io= --label mode.minikube.sigs.k8s.io=minikube --network minikube --ip 192.168.49.2 --gpus all --env NVIDIA_DRIVER_CAPABILITIES=all --volume minikube:/var --security-opt apparmor=unconfined --memory=15875mb -e container=docker --expose 8443 --publish=127.0.0.1::8443 --publish=127.0.0.1::22 --publish=127.0.0.1::2376 --publish=127.0.0.1::5000 --publish=127.0.0.1::32443 docker.io/kicbase/stable:v0.0.45@sha256:81df288595202a317b1a4dc2506ca2e4ed5f22373c19a441b88cfbf4b9867c85: exit status 125
stdout:
b42d8ba6ac445e1fd3de0dc12c79e9f3655b79b46376b6aa6d1d3bc99ddb1b9e

stderr:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]

Run 'docker run --help' for more information

Attach the log file

None

Operating System

Other

Driver

Docker

Other information

Docker on NixOS does not support --gpus=all; the only option that works is --device=nvidia.com/gpu=all.

See also

https://github.com/NixOS/nixpkgs/issues/363505#issuecomment-2665190781

Example:

λ ~/ docker run --rm -it --gpus=all ubuntu:latest nvidia-smi
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]

Run 'docker run --help' for more information
λ ~/ docker run --rm -it --device=nvidia.com/gpu=all ubuntu:latest nvidia-smi
Mon Apr  7 09:50:01 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2070        Off |   00000000:01:00.0 Off |                  N/A |
| 28%   38C    P8             11W /  175W |     157MiB /   8192MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
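The two runs above suggest a simple way to decide which flag to pass on a given host. The sketch below is only a heuristic, not anything minikube or Docker ships: `pick_gpu_flag` is a hypothetical helper that looks for a CDI spec file mentioning `nvidia.com/gpu` in the directories it is given, and falls back to `--gpus=all` when none is found. It assumes the Docker daemon already has CDI support enabled, as it evidently does on the machine in the example.

```shell
# Hypothetical helper (not part of minikube or Docker): choose a GPU flag
# for `docker run` based on whether a CDI spec for NVIDIA GPUs is present.
# Arguments: one or more directories to search for CDI spec files.
pick_gpu_flag() {
  local dir spec
  for dir in "$@"; do
    for spec in "$dir"/*.json "$dir"/*.yaml; do
      # grep -s stays quiet when the glob matched nothing and the
      # literal pattern is passed through as a non-existent file name.
      if grep -qsF 'nvidia.com/gpu' "$spec"; then
        echo "--device=nvidia.com/gpu=all"
        return 0
      fi
    done
  done
  # Assumption: without a CDI spec, the classic flag is the one to try.
  echo "--gpus=all"
}
```

A possible invocation, assuming the CDI specs live in the default directories /etc/cdi and /var/run/cdi:

docker run --rm -it "$(pick_gpu_flag /etc/cdi /var/run/cdi)" ubuntu:latest nvidia-smi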

Question

Can minikube support the following invocation?

minikube start --memory=max --driver=docker --container-runtime=docker --device=nvidia.com/gpu=all --force --addons=nvidia-device-plugin
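In terms of the failing `docker run` command in the log, the request amounts to swapping one argument: `--gpus all` would become `--device nvidia.com/gpu=all`. The sketch below illustrates that switch only; it is not minikube code, and `gpu_run_args` and the mode names `cdi` and `legacy` are made up for illustration.

```shell
# Hypothetical sketch, not minikube code: emit the GPU-related part of the
# `docker run` argument list for a given mode.
#   cdi    -> the CDI device syntax that works on NixOS in this report
#   legacy -> the flag minikube v1.34.0 passes today (see the log)
gpu_run_args() {
  case "$1" in
    cdi)    echo "--device nvidia.com/gpu=all" ;;
    legacy) echo "--gpus all" ;;
    *)      echo "unknown GPU mode: $1" >&2; return 1 ;;
  esac
}
```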
