Cluster Manual-Scaling

Manual scaling lets users adjust the amount of system resources as needed. To increase or decrease the number of Metal Cloud Servers on the portal, follow these steps:

Step 1: In the menu, select AI Infrastructure > Managed GPU Cluster. The system will display the Managed GPU Cluster Management page. Select the cluster you want to scale.

Step 2: Click on the cluster you want to scale, then select Node Pools > Edit Workers.

Step 3: Update the Number of Servers to match your usage needs, then click the Save button.

Note:

The manual scaling process takes a few minutes. The Cluster status changes to Processing until the new worker successfully joins the cluster. The cluster continues to operate normally while new servers are being added.
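Because the cluster stays in Processing for a few minutes, automation that follows a scale operation usually needs to wait for that state to clear. The sketch below shows one way to do this with a simple polling loop. It is only an illustration: `get_status` is a hypothetical callable that you would supply yourself (this page does not describe an API for reading the status), and "Running" is a placeholder name for whatever status the portal shows once scaling completes.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 900.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the cluster leaves the Processing state.

    get_status is a hypothetical, caller-supplied function that returns
    the current Cluster status as a string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status != "Processing":
            # Any status other than Processing ends the wait.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still Processing after {timeout_s} s")
        sleep(interval_s)


# Demo with a fake status source; "Running" is a placeholder status name.
fake_statuses = iter(["Processing", "Processing", "Running"])
final_status = wait_until_ready(lambda: next(fake_statuses), sleep=lambda _s: None)
print(final_status)
```

Passing the `sleep` function as a parameter keeps the loop easy to test without real delays.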

Labels and Taints are two important mechanisms that help manage and distribute workloads efficiently in systems with multiple Worker Groups, making it easy to group workers by purpose, performance, or geographic region. Managed GPU Cluster allows users to add, edit, or delete labels/taints directly on the Unify Portal.
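To make the effect of labels and taints concrete, here is a small, simplified model of how a Kubernetes-style scheduler decides whether a pod may land on a Worker Group. It is a sketch rather than the platform's implementation: the class names and the `can_schedule` helper are hypothetical, tolerations are matched by exact equality only, and the soft `PreferNoSchedule` effect is ignored.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # e.g. "NoSchedule"


@dataclass
class WorkerGroup:
    name: str
    labels: dict[str, str] = field(default_factory=dict)
    taints: list[Taint] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    node_selector: dict[str, str] = field(default_factory=dict)
    tolerations: list[Taint] = field(default_factory=list)


def can_schedule(pod: Pod, group: WorkerGroup) -> bool:
    """Simplified check: labels must match and hard taints must be tolerated."""
    # Every label the pod selects on must be present with the same value.
    labels_match = all(
        group.labels.get(key) == value for key, value in pod.node_selector.items()
    )
    # Every hard taint on the group must have an identical toleration on the pod.
    taints_tolerated = all(
        taint in pod.tolerations
        for taint in group.taints
        if taint.effect in ("NoSchedule", "NoExecute")
    )
    return labels_match and taints_tolerated


# Hypothetical Worker Groups and pods, for illustration only.
gpu_taint = Taint("dedicated", "gpu", "NoSchedule")
gpu_group = WorkerGroup("gpu-workers", labels={"workload": "training"}, taints=[gpu_taint])
base_group = WorkerGroup("base", labels={"workload": "general"})

trainer = Pod("trainer", node_selector={"workload": "training"}, tolerations=[gpu_taint])
web = Pod("web")

for pod in (trainer, web):
    for group in (gpu_group, base_group):
        print(f"{pod.name} -> {group.name}: {can_schedule(pod, group)}")
```

In this model the label steers the training pod toward the GPU group, while the taint keeps pods without a matching toleration off it.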

Step 1: In the menu, select AI Infrastructure > Managed GPU Cluster. The system will display the Managed GPU Cluster Management page. Select the cluster whose Labels/Taints you want to edit.

Step 2: Select Node Pools > Edit Workers.

Step 3: Enter the Labels and Taints you want to add to the Worker Group, then click the Save button.

Note:

Editing Labels and Taints takes a few minutes, during which the Cluster status changes to Processing. Users cannot edit the Cluster until the process is complete.

When users change the base Worker Group, system components (coredns, metrics-server, CNI controller, etc.) are redeployed on the Worker nodes belonging to the new base Worker Group. This is useful when users want to move the base Worker Group to a larger or smaller Worker node flavor. In that case, users create a new Worker Group with the desired Worker node configuration, make the new Worker Group the base, and delete the old base Worker Group.

Step 1: In the menu, select AI Infrastructure > Managed GPU Cluster. The system will display the Managed GPU Cluster Management page. Select the cluster for which you want to change the Worker Group configuration.

Step 2: Select Node Pools > Edit Workers.

Step 3: Select the Worker Group you want to change and click the Save button.

Notes:

While the base Worker Group is being changed, users cannot edit the Cluster until the process is complete.
When the parameters of a Worker Group are changed, the system first creates new Worker nodes with the desired configuration. Once the new Worker nodes are successfully created, the Worker nodes with the old configuration are removed from the system, and their pods are moved to the new Worker nodes.
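The create-first, remove-later ordering described in the note can be pictured with a toy simulation. This is a sketch under stated assumptions, not the platform's actual behavior: the `replace_workers` function, the node and pod names, and the round-robin placement of pods are all hypothetical, and the simulation assumes pods are moved off an old node just before that node is removed.

```python
from itertools import cycle


def replace_workers(
    old_nodes: dict[str, list[str]], new_names: list[str]
) -> tuple[dict[str, list[str]], list[str]]:
    """Simulate replacing old Worker nodes with new ones.

    old_nodes maps each old node name to the pods running on it.
    Returns the pod placement on the new nodes and an ordered event log.
    """
    if not new_names:
        raise ValueError("at least one new Worker node is required")

    events: list[str] = []

    # 1. New nodes are created before anything is removed.
    new_nodes: dict[str, list[str]] = {name: [] for name in new_names}
    for name in new_names:
        events.append(f"created {name}")

    # 2. Pods move off each old node (round-robin is an assumption),
    #    and only then is that old node removed.
    targets = cycle(new_names)
    for old_name, pods in old_nodes.items():
        for pod in pods:
            target = next(targets)
            new_nodes[target].append(pod)
            events.append(f"moved {pod}: {old_name} -> {target}")
        events.append(f"removed {old_name}")

    return new_nodes, events


placement, log = replace_workers(
    {"old-1": ["pod-a", "pod-b"], "old-2": ["pod-c"]},
    ["new-1", "new-2"],
)
for line in log:
    print(line)
```

Because capacity is added before it is taken away, every pod has somewhere to go at each step of the replacement.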

