Cluster auto-scale using KEDA & Prometheus
Requirements
A Kubernetes cluster with GPU nodes attached
A GPU application running on the cluster
The kube-prometheus-stack and prometheus-adapter packages installed from the FPT App Catalog service, as described in this documentation

Step by Step
Step 1: Install KEDA
Using the FPT App Catalog
Select the FPT Cloud App Catalog service, then search for KEDA in the fptcloud-catalogs repository.
Using the Helm chart
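As an alternative to the App Catalog, KEDA can be installed from its official Helm chart. A minimal sketch, assuming the public kedacore chart repository and a dedicated keda namespace:

```shell
# Add the official KEDA chart repository and install KEDA
# into its own namespace (created if it does not exist).
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```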
Check if the KEDA pods are running normally
kubectl -n keda get pod
Step 2: Check if Prometheus has GPU metrics
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq -r . | grep DCGM
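If the DCGM metrics are listed, you can also read a single metric value through the custom metrics API. A sketch, assuming prometheus-adapter exposes the NVIDIA DCGM exporter metric DCGM_FI_PROF_GR_ENGINE_ACTIVE for pods in the default namespace (the exact metric name and namespace depend on your adapter configuration):

```shell
# Query one custom metric for all pods in the default namespace;
# the path is quoted because it contains a wildcard.
kubectl get --raw \
  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/DCGM_FI_PROF_GR_ENGINE_ACTIVE" | jq .
```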
Step 3: Create a ScaledObject to specify autoscaling for the application
Manifest
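The original manifest is not reproduced on this page; a minimal sketch of such a ScaledObject, assuming a deployment named gpu-test in the default namespace (the minReplicaCount and maxReplicaCount values here are illustrative assumptions), might look like:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: gpu-test-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: gpu-test            # the deployment to scale
  minReplicaCount: 0          # assumed; allows scale-to-zero
  maxReplicaCount: 5          # assumed upper bound
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
        query: avg(DCGM_FI_PROF_GR_ENGINE_ACTIVE)
        threshold: "0.8"
```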
name: the name of the GPU deployment; in the example it is gpu-test
serverAddress: the endpoint of the Prometheus server; in the example it is http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
query: the PromQL query whose value drives autoscaling; in the example it finds the average value of DCGM_FI_PROF_GR_ENGINE_ACTIVE
threshold: the threshold value that triggers autoscaling; in the example it is 0.8
As shown in the example above, whenever the average value of DCGM_FI_PROF_GR_ENGINE_ACTIVE exceeds 0.8, ScaledObject will scale the pods of the Deployment named gpu-test.
After the ScaledObject is created, KEDA takes over scaling: while the queried metric stays below the threshold, the deployment is automatically scaled down to 0 replicas, which confirms the configuration is working.
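To verify, you can check the resources KEDA manages and watch the replica count change. A sketch, assuming the deployment name gpu-test from the example:

```shell
# KEDA creates an HPA behind each ScaledObject; both should be listed.
kubectl get scaledobject
kubectl get hpa
# Watch the deployment's replica count react to GPU load.
kubectl get deployment gpu-test -w
```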