GPU and NPU Device Plugins Usage Guide#
Overview#
Intel GPU and NPU device plugins expose hardware accelerators to Kubernetes pods without requiring privileged containers. The plugins register devices with the kubelet via the Kubernetes Device Plugin framework, allowing pods to request accelerators through standard resource limits — gpu.intel.com/xe for GPU and npu.intel.com/accel for NPU.
The Edge Node Infrastructure Blueprint pre-installs these plugins during first boot. This guide covers verification, pod scheduling, and common usage patterns.
Resource Name |
Plugin |
Hardware |
Use Case |
|---|---|---|---|
|
Intel GPU device plugin |
Intel Xe-based integrated/discrete GPU (including SR-IOV VFs) |
Media processing, inference, rendering |
|
Intel NPU device plugin |
Intel NPU 2000/3000/4000 series |
Low-power AI inference |
Prerequisites#
Edge Node Infrastructure Blueprint image deployed with Kubernetes (K3s) host type
Intel GPU and/or NPU hardware present
K3s running with device plugins installed
Step 1: Verify Plugin Installation#
After first boot, confirm that the device plugin pods are running:
# Check device plugin pods
kubectl get pods -n intel-device-plugins
Check all pods across namespaces:
sudo kubectl get pods -A
Expected healthy output includes the running Intel and Node Feature Discovery components:
intel-device-plugins intel-gpu-plugin-xxxxx 1/1 Running
intel-device-plugins intel-npu-plugin-xxxxx 1/1 Running
node-feature-discovery nfd-master-xxxxx 1/1 Running
node-feature-discovery nfd-worker-xxxxx 1/1 Running
Step 2: Verify Allocatable Resources#
Confirm that the node advertises GPU and NPU resources:
kubectl describe node | grep -A 20 "Allocatable:"
Expected output (example with SR-IOV VFs enabled):
Allocatable:
cpu: xx
ephemeral-storage: xxx
gpu.intel.com/monitoring: x
gpu.intel.com/xe: x
memory: xxx
npu.intel.com/accel: x
pods: xxx
The gpu.intel.com/xe count reflects the number of SR-IOV Virtual Functions available for allocation.
You can also verify the node labels applied by NFD:
kubectl get nodes --show-labels | tr ',' '\n' | grep intel
Step 3: Run a Pod with GPU Access#
Create a pod that requests one Intel GPU device:
apiVersion: v1
kind: Pod
metadata:
name: gpu-test
namespace: default
spec:
restartPolicy: Never
containers:
- name: gpu-check
image: ubuntu:24.04
command: ["sh", "-c", "ls -la /dev/dri/ && echo 'GPU accessible'"]
resources:
limits:
gpu.intel.com/xe: "1"
Apply and check output:
kubectl apply -f gpu-test.yaml
kubectl wait pod/gpu-test --for=jsonpath='{.status.phase}'=Succeeded --timeout=60s
kubectl logs gpu-test
Expected output:
crw-rw---- 1 root render 226, 129 Jun 17 10:00 renderD129
GPU accessible
Step 4: Run a Pod with NPU Access#
Create a pod that requests the Intel NPU device:
apiVersion: v1
kind: Pod
metadata:
name: npu-test
namespace: default
spec:
restartPolicy: Never
containers:
- name: npu-check
image: ubuntu:24.04
command: ["sh", "-c", "ls -la /dev/accel/ && echo 'NPU accessible'"]
resources:
limits:
npu.intel.com/accel: "1"
Apply and check output:
kubectl apply -f npu-test.yaml
kubectl wait pod/npu-test --for=jsonpath='{.status.phase}'=Succeeded --timeout=60s
kubectl logs npu-test
Expected output:
crw-rw---- 1 root render 261, 0 Jun 17 10:00 accel0
NPU accessible
Step 5: Run a Pod with Both GPU and NPU#
Request both accelerators in a single pod for combined inference workloads:
apiVersion: v1
kind: Pod
metadata:
name: gpu-npu-test
namespace: default
spec:
restartPolicy: Never
containers:
- name: accel-check
image: ubuntu:24.04
command: ["sh", "-c", "ls /dev/dri/ && ls /dev/accel/ && echo 'Both accelerators accessible'"]
resources:
limits:
gpu.intel.com/xe: "1"
npu.intel.com/accel: "1"
Using GPU/NPU in Deployments#
For production workloads, use a Deployment to manage GPU or NPU pods:
apiVersion: apps/v1
kind: Deployment
metadata:
name: inference-app
namespace: default
spec:
replicas: 2
selector:
matchLabels:
app: inference
template:
metadata:
labels:
app: inference
spec:
containers:
- name: inference
image: ubuntu:24.04
command: ["sh", "-c", "echo 'GPU device available:' && ls /dev/dri/ && sleep infinity"]
resources:
limits:
gpu.intel.com/xe: "1"
The scheduler only places pods on nodes that have available GPU or NPU resources. If the node has 7 SR-IOV VFs, up to 7 pods can each receive one GPU device.
Troubleshooting#
No GPU/NPU Resources on the Node#
Check that the device plugin pods are running:
kubectl get pods -n intel-device-plugins kubectl logs -n intel-device-plugins -l app=intel-gpu-plugin
Verify the hardware is detected by the host:
ls /dev/dri/ # GPU devices ls /dev/accel/ # NPU devices
Verify NFD labels are applied:
kubectl get nodes --show-labels | tr ',' '\n' | grep 'gpu.intel.com' kubectl get nodes --show-labels | tr ',' '\n' | grep 'npu.intel.com'
Pod Stuck in Pending State#
If a pod requesting GPU/NPU is stuck in Pending:
kubectl describe pod <pod-name>
Common causes:
Insufficient resources: All GPU/NPU devices are already allocated to other pods
No matching node: The node does not have the requested hardware
Plugin not running: The device plugin pod crashed or was not installed
GPU Plugin Shows 0 Devices#
If SR-IOV VFs are expected but not detected:
cat /sys/bus/pci/devices/0000:00:02.0/sriov_numvfs
ls /dev/dri/renderD*
If VFs are not created, the SR-IOV service may not have run. Check:
sudo systemctl status intel-sriov-vf.service
sudo journalctl -u intel-sriov-vf.service --no-pager
NPU Plugin Shows 0 Devices#
Verify the NPU driver is loaded:
ls /sys/bus/pci/drivers/intel_vpu/
lsmod | grep intel_vpu
If not loaded:
sudo modprobe intel_vpu
For kernel 6.17+, verify firmware:
ls /lib/firmware/intel/vpu/
dmesg | grep -i vpu
References#
Container Device Interface Guide — for Docker/Podman usage without Kubernetes