Cluster auto-scale using KEDA & Prometheus

Requirements

  • Kubernetes cluster with attached GPU

  • The GPU application is in a running state

  • The kube-prometheus-stack and prometheus-adapter packages installed from the FPT App Catalog service, as described in that service's documentation.

Step by Step

Step 1: Install KEDA

Using the FPT App Catalog

Select the FPT Cloud App Catalog service, then search for KEDA in the fptcloud-catalogs repository and install it.

Using the Helm chart
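
Alternatively, KEDA can be installed with Helm. A minimal sketch using the official kedacore chart repository (repository URL and chart name as published by the upstream KEDA project):

```shell
# Add the official KEDA chart repository and refresh the local index
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# Install KEDA into its own namespace
helm install keda kedacore/keda --namespace keda --create-namespace
```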

Check that the KEDA pods are running normally:

kubectl -n keda get pod

Step 2: Check that the GPU (DCGM) metrics are available

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq -r . | grep DCGM
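
If the command above returns no DCGM entries, you can query Prometheus directly to confirm that the DCGM exporter metrics are being scraped. A sketch, assuming the same Prometheus service address used later in this guide (run the port-forward in one terminal and the query in another):

```shell
# Forward the Prometheus service locally (namespace and service name as used in this guide)
kubectl -n prometheus port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090

# In another terminal, query the GPU engine-activity metric exported by DCGM
curl -s 'http://localhost:9090/api/v1/query?query=DCGM_FI_PROF_GR_ENGINE_ACTIVE' | jq '.data.result'
```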

Step 3: Create a ScaledObject to specify autoscaling for the application

The ScaledObject manifest uses the following key fields:

  • name: the name of the GPU Deployment to scale; in the example it is gpu-test

  • serverAddress: the endpoint of the Prometheus server; in the example it is http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090

  • query: the PromQL query that returns the value on which autoscaling is based; in the example it computes the average of the DCGM_FI_PROF_GR_ENGINE_ACTIVE metric

  • threshold: the value that triggers autoscaling; in the example it is 0.8

In other words, whenever the average value of DCGM_FI_PROF_GR_ENGINE_ACTIVE exceeds 0.8, the ScaledObject scales out the pods of the Deployment named gpu-test.
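
A minimal ScaledObject sketch that matches the fields described above (the Deployment name, server address, query, and threshold come from this example; the namespace, maxReplicaCount, and the object's own name are assumptions to adjust for your cluster):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: gpu-test-scaledobject
  namespace: default          # assumed namespace of the gpu-test Deployment
spec:
  scaleTargetRef:
    name: gpu-test            # the GPU Deployment to scale
  minReplicaCount: 0          # allow scale-to-zero when the GPU is idle
  maxReplicaCount: 5          # assumed upper bound; adjust to your capacity
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
        query: avg(DCGM_FI_PROF_GR_ENGINE_ACTIVE)
        threshold: "0.8"      # scale out when average engine activity exceeds 0.8
```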

After the ScaledObject is created, the Deployment automatically scales down to 0 replicas while the GPU is idle, which indicates that the configuration is working.
