Deploy applications

Overview

Kubernetes schedules and manages GPU resources in much the same way as CPU resources. Declare GPU resources for your application according to the GPU configuration selected for the Worker Group.

Note:

  • You can specify GPU limits without specifying requests, as Kubernetes uses limits as the default request value.

  • You can specify both GPU limits and requests, but these two values must be equal.

  • You cannot specify GPU requests without specifying limits.

  • Check the GPU configuration using the following command:

kubectl get node -o json | jq '.items[].metadata.labels'

Example: a worker using an NVIDIA A30 card, with configuration strategy all-balanced and status success.
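For that example, the relevant node labels might look like the following. This is an illustrative sketch; the exact label names and values depend on the GPU Operator version and the applied configuration:

```json
{
  "nvidia.com/gpu.product": "NVIDIA-A30",
  "nvidia.com/mig.config": "all-balanced",
  "nvidia.com/mig.config.state": "success"
}
```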

Check the GPU Instance configuration on the worker (SSH into the worker and run the command there):
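The original document does not show the command itself; on NVIDIA GPUs with MIG enabled, a typical way to list the configured GPU Instances is:

```shell
# List the GPU Instances configured on the card (run on the worker node)
nvidia-smi mig -lgi
```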

Example of deploying an application using GPU:

With MIG sharing mode and the single strategy

GPU resources are declared as follows:

nvidia.com/gpu: <number of GPU instances>

# Example:
nvidia.com/gpu: 1

(With the single strategy, the GPU card is divided into MIG instances of equal size, all exposed as nvidia.com/gpu.)

Example deployment using the single GPU strategy
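A minimal Deployment sketch for the single strategy. The name and container image here are placeholders, not part of the original document:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-single-demo          # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-single-demo
  template:
    metadata:
      labels:
        app: gpu-single-demo
    spec:
      containers:
        - name: cuda
          image: nvidia/cuda:12.2.0-base-ubuntu22.04   # example image
          command: ["sleep", "infinity"]
          resources:
            limits:
              nvidia.com/gpu: 1  # one MIG instance under the single strategy
```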

With MIG sharing mode and the mixed strategy

GPU resources are declared as follows:

Example deployment using the mixed GPU strategy
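Under the mixed strategy, each MIG profile is exposed as its own resource name rather than the generic nvidia.com/gpu. A sketch of the resources section, assuming an A30 card; the actual profile names depend on the card and the configured profiles:

```yaml
resources:
  limits:
    nvidia.com/mig-1g.6gb: 1   # example A30 MIG profile; adjust to the profiles configured on your worker
```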

With the none strategy (MIG disabled)

GPU resources are declared as follows:

Example deployment using the none strategy
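With the none strategy, MIG is not used and each whole physical GPU is exposed as a single nvidia.com/gpu resource. A sketch of the resources section:

```yaml
resources:
  limits:
    nvidia.com/gpu: 1   # one whole physical GPU when MIG is disabled
```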

With MPS sharing mode

GPU resources are declared as follows:

Note: A pod can request at most 1 nvidia.com/gpu resource in this mode.
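Under MPS sharing, multiple pods time-share the same physical GPU, but each pod still declares the resource the same way, capped at 1 per the note above. A sketch of the resources section:

```yaml
resources:
  limits:
    nvidia.com/gpu: 1   # maximum allowed per pod under MPS sharing
```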
