Cluster auto-scaling using GPU custom metrics
Kubernetes supports automatic scaling based on custom metrics, such as GPU metrics, by integrating with Prometheus. This article describes how to configure auto-scaling for GPU-based applications running on the FPT Kubernetes Engine platform.
Requirements:
A Kubernetes cluster with GPUs attached
A GPU-based application in the Running state
Step by Step
Step 1: Install the kube-prometheus-stack and prometheus-adapter packages
Use the FPT App Catalog service
In the FPT App Catalog service, create an App Catalog, then select Connect Cluster to connect it to the GPU cluster.
In the App Catalogs menu, select Repositories as fptcloud-catalogs, search for prometheus, then install the kube-prometheus-stack package, entering the Release name and Namespace to deploy the package.
Using the Helm chart:
helm repo add xplat-fke https://registry.fke.fptcloud.com/chartrepo/xplat-fke && helm repo update

helm install --wait --generate-name \
  -n prometheus --create-namespace \
  xplat-fke/kube-prometheus-stack
prometheus_service=$(kubectl get svc -n prometheus -l app=kube-prometheus-stack-prometheus -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')
helm install --wait --generate-name \
  -n prometheus --create-namespace \
  xplat-fke/prometheus-adapter \
  --set prometheus.url=http://${prometheus_service}.prometheus.svc.cluster.local

After deploying the kube-prometheus-stack package, we continue with the prometheus-adapter, but we need to change the package values to point to the Prometheus service created by kube-prometheus-stack. The URL follows the pattern:

http://<prometheus-service-name>.<namespace>.svc.cluster.local

For example, with the namespace of kube-prometheus-stack set to prometheus, the value to fill in is:

http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local

Next, we check the status of the two packages.
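The prometheus.url value can be composed from the service name and namespace. A minimal sketch, assuming the default service name created by kube-prometheus-stack in the prometheus namespace:

```shell
# Assumed values: kube-prometheus-stack was installed into the "prometheus"
# namespace and created a service named "prometheus-kube-prometheus-prometheus".
prometheus_namespace="prometheus"
prometheus_service="prometheus-kube-prometheus-prometheus"

# Compose the in-cluster URL to pass as the prometheus.url value
prometheus_url="http://${prometheus_service}.${prometheus_namespace}.svc.cluster.local"
echo "$prometheus_url"
```

If your release or namespace differs, substitute the names reported by kubectl get svc in the namespace you deployed to.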
Step 2: Configure Horizontal Pod Autoscaler for the GPU application
Horizontal Pod Autoscaler (HPA) automatically scales Pods to meet the conditions specified in the configuration. In the previous step, the prometheus-adapter was configured; it exposes the DCGM custom metrics used to monitor the GPU workload.
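Before creating the HPA, it is worth confirming that the adapter is actually serving DCGM metrics. On a live cluster you would query the custom metrics API with kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"; the sketch below checks a hypothetical, abbreviated response offline for the metric the HPA will consume (the response content is an assumption for illustration):

```shell
# On a live cluster, capture the response with:
#   response=$(kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1")
# Here a hypothetical abbreviated response stands in for it.
response='{"resources":[{"name":"pods/DCGM_FI_PROF_GR_ENGINE_ACTIVE"},{"name":"pods/DCGM_FI_DEV_GPU_UTIL"}]}'

# Check that the DCGM metric used by the HPA is exposed
echo "$response" | grep -o 'DCGM_FI_PROF_GR_ENGINE_ACTIVE'
```

If grep prints nothing, the adapter is not seeing the DCGM exporter's metrics, and the HPA below will have no data to act on.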
Example of an HPA manifest file
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-gpu-app
spec:
  maxReplicas: 3 # Update this accordingly
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-gpu-app # Name of the Deployment to autoscale
  metrics:
  - type: Pods # Scale Pods based on a GPU metric
    pods:
      metric:
        name: DCGM_FI_PROF_GR_ENGINE_ACTIVE # Set the DCGM metric accordingly
      target:
        type: AverageValue
        averageValue: 0.8

Refer to NVIDIA's documentation for the available DCGM metrics.
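To see how the averageValue target drives scaling: the HPA computes desiredReplicas = ceil(currentReplicas × currentAverage / target). A small sketch with hypothetical values (2 replicas averaging 0.9 GPU engine activity against the 0.8 target above):

```shell
# HPA scaling rule: desiredReplicas = ceil(currentReplicas * currentAverage / target)
# Hypothetical values: 2 replicas averaging 0.9, target 0.8
awk 'BEGIN {
  current_replicas = 2
  current_average  = 0.9
  target           = 0.8
  desired = current_replicas * current_average / target   # 2.25
  if (desired > int(desired)) desired = int(desired) + 1  # HPA rounds up
  print desired
}'
```

With these values the HPA would scale the Deployment to 3 replicas, capped by maxReplicas.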
Then check the newly created HPA:
kubectl get hpa -A