Deploy applications
Overview
Kubernetes manages and schedules GPU resources in much the same way as CPU resources. Depending on the GPU configuration selected for the Worker Group, declare the corresponding GPU resources for your application on Kubernetes.
Note:
You can specify GPU limits without specifying requests, as Kubernetes uses limits as the default request value.
You can specify both GPU limits and requests, but these two values must be equal.
You cannot specify GPU requests without specifying limits.
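For example, a container consuming one GPU can declare the limit alone (the request then defaults to the limit), or declare both with equal values:
resources:
  limits:
    nvidia.com/gpu: 1
# or, equivalently, with an explicit (equal) request:
resources:
  requests:
    nvidia.com/gpu: 1
  limits:
    nvidia.com/gpu: 1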
Check the GPU configuration using the following command:
kubectl get node -o json | jq '.items[].metadata.labels'
Example: a worker using an NVIDIA A30 card shows configuration strategy all-balanced and status success in its labels.

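Assuming GPU Feature Discovery and the MIG manager are running on the cluster, the relevant labels look similar to this illustrative sketch (not exact output):
"nvidia.com/gpu.product": "NVIDIA-A30",
"nvidia.com/mig.config": "all-balanced",
"nvidia.com/mig.config.state": "success"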
Check the GPU instance configuration on the worker itself (SSH into the worker, then run the command):
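A sketch of such a check, assuming the NVIDIA driver is installed on the worker (the exact command may vary):
nvidia-smi mig -lgi   # list the configured MIG GPU instances
nvidia-smi -L         # list GPUs and their MIG devices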
Example of deploying an application using GPU:
With MIG sharing mode and the single strategy
GPU resources are declared as follows:
#Syntax:
nvidia.com/gpu: <number>
#Example:
nvidia.com/gpu: 1
*(With the single strategy, the GPU card is divided into equal instances.)
Example deployment using the single GPU strategy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-gpu-app
spec:
  replicas: 1
  selector:
    matchLabels:
      component: gpu-app
  template:
    metadata:
      labels:
        component: gpu-app
    spec:
      containers:
      - name: gpu-container
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
        resources:
          limits:
            nvidia.com/gpu: 1
        image: nvidia/samples:dcgmproftester-2.0.10-cuda11.0-ubuntu18.04
        command: ["/bin/sh", "-c"]
        args:
        - while true; do /usr/bin/dcgmproftester11 --no-dcgm-validation -t 1004 -d 300; sleep 30; done
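After applying the manifest, you can confirm that the pod was scheduled and the GPU resource is allocated (the manifest filename below is hypothetical):
kubectl apply -f example-gpu-app.yaml
kubectl get pods -l component=gpu-app
kubectl describe node <node-name> | grep -A 10 "Allocated resources"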
With MIG sharing mode and the mixed strategy
GPU resources are declared as follows:
#Syntax:
nvidia.com/<type>: <number>
#Example:
nvidia.com/mig-1g.6gb: 2
*(With the mixed strategy, a GPU card can be split into different instance types, so you must specify the instance type when declaring resources.)
Example deployment using the mixed GPU strategy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-gpu-app
spec:
  replicas: 1
  selector:
    matchLabels:
      component: gpu-app
  template:
    metadata:
      labels:
        component: gpu-app
    spec:
      containers:
      - name: gpu-container
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
        resources:
          limits:
            nvidia.com/mig-1g.6gb: 1
        image: nvidia/samples:dcgmproftester-2.0.10-cuda11.0-ubuntu18.04
        command: ["/bin/sh", "-c"]
        args:
        - while true; do /usr/bin/dcgmproftester11 --no-dcgm-validation -t 1004 -d 300; sleep 30; done
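To see which MIG instance types a node actually exposes (and therefore which nvidia.com/mig-* resource names you can request), inspect the node's allocatable resources, for example:
kubectl get node <node-name> -o json | jq '.status.allocatable'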
With the none strategy
GPU resources are declared as follows:
#Syntax:
nvidia.com/gpu: 1
*(With the none strategy, the pod will use all the resources of a single GPU card.)
Example deployment using the none strategy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-gpu-app
spec:
  replicas: 1
  selector:
    matchLabels:
      component: gpu-app
  template:
    metadata:
      labels:
        component: gpu-app
    spec:
      containers:
      - name: gpu-container
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
        resources:
          limits:
            nvidia.com/gpu: 1
        image: nvidia/samples:dcgmproftester-2.0.10-cuda11.0-ubuntu18.04
        command: ["/bin/sh", "-c"]
        args:
        - while true; do /usr/bin/dcgmproftester11 --no-dcgm-validation -t 1004 -d 300; sleep 30; done
With MPS sharing mode
GPU resources are declared as follows:
#Syntax:
nvidia.com/gpu:
#Example:
nvidia.com/gpu: 1
Note: The maximum number of nvidia.com/gpu resources a pod can request is 1.
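A minimal sketch of the container's resources stanza under MPS, mirroring the none strategy but capped at one GPU per pod:
resources:
  limits:
    nvidia.com/gpu: 1   # with MPS, a pod may request at most 1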