Cluster auto-scale using GPU custom metrics
Kubernetes supports automatic scaling based on custom metrics, such as GPU metrics, by integrating with Prometheus. This article describes how to configure autoscaling for GPU-based applications running on the FPT Kubernetes Engine platform.
Requirements:
A Kubernetes cluster with GPUs attached
A GPU-based application in the Running state
Step-by-Step Guide
Step 1: Install the kube-prometheus-stack and prometheus-adapter packages
Use the FPT App Catalog service
In the FPT App Catalog service, create an App Catalog, then select Connect Cluster to connect it to the GPU cluster.
In the App Catalogs menu, select the fptcloud-catalogs repository, search for prometheus, then select Install for the kube-prometheus-stack package and enter the Release name and Namespace to deploy the package.
Using the Helm chart:
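Alternatively, the stack can be installed with Helm directly. A minimal sketch, assuming the release is named kube-prometheus-stack and deployed into a prometheus namespace (adjust both to your environment):

```bash
# Add the prometheus-community repository, which hosts kube-prometheus-stack
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install kube-prometheus-stack into its own namespace
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace prometheus --create-namespace
```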
After deploying the kube-prometheus-stack package, we continue with the prometheus-adapter, but we must override the chart values so that the adapter points to the Prometheus service created by kube-prometheus-stack. For example, if kube-prometheus-stack was deployed into the prometheus namespace, the values to set are:
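For reference, a minimal values override for the prometheus-adapter chart might look like the following. The service name kube-prometheus-stack-prometheus and the prometheus namespace are assumptions based on the chart's default naming and should be adjusted to your deployment:

```yaml
# values.yaml for prometheus-adapter
prometheus:
  # In-cluster URL of the Prometheus service created by kube-prometheus-stack
  url: http://kube-prometheus-stack-prometheus.prometheus.svc
  port: 9090
```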
Next, we check the status of the two packages
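For example, assuming both releases were deployed into the prometheus namespace, the status can be checked as follows (the namespace is illustrative):

```bash
# Verify that the kube-prometheus-stack and prometheus-adapter pods are Running
kubectl get pods -n prometheus

# Confirm that the custom metrics API is now served by the adapter
kubectl get apiservices v1beta1.custom.metrics.k8s.io
```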
Step 2: Configure Horizontal Pod Autoscaler for the GPU application
Horizontal Pod Autoscaler (HPA) automatically scales Pods to meet the conditions specified in its configuration. After the prometheus-adapter is configured as in the previous step, it exposes the DCGM custom metrics that the HPA can use to monitor the GPU workload.
Example of an HPA manifest file
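A sketch of such a manifest, assuming the adapter exposes the DCGM GPU-utilization metric as DCGM_FI_DEV_GPU_UTIL and that the target Deployment is named gpu-app (both names are illustrative and must match your setup):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gpu-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gpu-app                    # illustrative workload name
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Pods
    pods:
      metric:
        name: DCGM_FI_DEV_GPU_UTIL   # DCGM GPU-utilization metric
      target:
        type: AverageValue
        averageValue: "80"           # scale out above 80% average GPU utilization
```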
Refer to NVIDIA’s documentation for DCGM metrics at the following link.
Then check the newly created HPA:
kubectl get hpa -A