Cluster configuration

The Managed GPU Cluster product is built on native Kubernetes and integrates additional cloud provider components into the cluster, including the FPT Cloud Controller Manager. This component manages worker nodes in the cluster and LoadBalancer-type services. Users can expose their applications to the internet in several ways so that their customers can access the applications and services: creating an Ingress for the service, creating a NodePort service and attaching a floating IP to the worker node, or using a LoadBalancer service.

FPT Cloud supports users in creating LoadBalancer services with the following annotation options in the service configuration:

| Key | Value | Default | Purpose |
| --- | --- | --- | --- |
| service.beta.kubernetes.io/fpt-load-balancer-internal | "true"/"false" | "false" | If you do not want to expose the service to the internet, set the value to "true". |
| loadbalancer.fptcloud.com/keep-floatingip | "true"/"false" | "false" | To keep the LoadBalancer service's floating IP within the VPC after the service is deleted, set the value to "true". |
| loadbalancer.fptcloud.com/proxy-protocol | "true"/"false" | "false" | To have the LoadBalancer use the PROXY protocol, set the value to "true". Note: the PROXY protocol can only be used with Layer 4 LoadBalancers. |
| loadbalancer.fptcloud.com/enable-health-monitor | "true"/"false" | "true" | To disable the health monitor for the LoadBalancer pool, set the value to "false". |
| service.beta.kubernetes.io/fpt-load-balancer-type | LBv1: basic / advanced / standard / premium; LBv2: Basic-1 / Basic-2 / Standard / Advanced / Premium | LBv1: "basic"; LBv2: "Basic-1" | Configure the LoadBalancer flavor to handle the corresponding load of the application behind the LoadBalancer pool backend. |
| loadbalancer.fptcloud.com/enable-ingress-hostname | "true"/"false" | "false" | To enable an ingress hostname for the LoadBalancer service, set the value to "true". |
| loadbalancer.fptcloud.com/load-balancer-version | "v1"/"v2" | "v1" | To use LBv2 for the LoadBalancer service, set the value to "v2". If this annotation is not set, an LBv1 is created by default. |
| loadbalancer.fptcloud.com/x-forwarded-for | "true"/"false" | "false" | To forward the X-Forwarded-For request header to the LoadBalancer pool backend when using a Layer 7 LoadBalancer, set the value to "true". Note: the PROXY protocol and x-forwarded-for cannot be used at the same time. |
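For reference, the following is a minimal sketch of a Service manifest combining several of these annotations. The service name, selector, and ports are illustrative placeholders, not values required by FPT Cloud:

```yaml
# Hypothetical example: an internal LBv2 Layer 4 LoadBalancer with the
# PROXY protocol enabled. Replace the name, selector, and ports with
# values from your own workload.
apiVersion: v1
kind: Service
metadata:
  name: demo-app
  annotations:
    service.beta.kubernetes.io/fpt-load-balancer-internal: "true"
    loadbalancer.fptcloud.com/load-balancer-version: "v2"
    service.beta.kubernetes.io/fpt-load-balancer-type: "Basic-1"
    loadbalancer.fptcloud.com/proxy-protocol: "true"
spec:
  type: LoadBalancer
  selector:
    app: demo-app
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
```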

Additionally, Managed GPU Cluster supports the following configurations:

• Create a LoadBalancer service with a specific floating IP attached to the Load Balancer (see the sketch after the note below).


Note: The public IP must be allocated to the VPC and be in the Inactive state. You can verify this under Networking -> Floating IPs.
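A minimal sketch of this configuration, assuming the FPT Cloud Controller Manager honors the standard Kubernetes spec.loadBalancerIP field for selecting the pre-allocated floating IP (the address and service details below are placeholders):

```yaml
# Hedged sketch: 203.0.113.10 stands in for a floating IP that is
# already allocated to the VPC and in the Inactive state.
apiVersion: v1
kind: Service
metadata:
  name: demo-app-public
spec:
  type: LoadBalancer
  loadBalancerIP: 203.0.113.10
  selector:
    app: demo-app
  ports:
    - port: 443
      targetPort: 8443
      protocol: TCP
```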

• Restrict access to the Load Balancer by configuring _"loadBalancerSourceRanges"_ in the _"spec"_ section of the service configuration, for example:

  • 14.233.234.0/24

  • 10.250.0.0/24

Note: The "loadBalancerSourceRanges" configuration contains an array of public IP ranges allowed to access the Load Balancer. By default, M-FKE creates a Load Balancer service type with the source IP range configured as 0.0.0.0/0.

Deploying DeepSeek-R1 with Ollama and Open-WebUI

Ollama is an open-source tool that lets you run, manage, and customize large language models (LLMs) on personal computers or servers, and it supports a variety of models such as Llama, DeepSeek, Mistral, etc. Open-WebUI is an open-source web interface designed specifically for interacting with Ollama, providing a user-friendly experience that makes it easy to manage and use LLM models.

This document guides you through the steps to deploy the DeepSeek-R1 model on the FPT Managed GPU Cluster using Ollama and Open-WebUI, so that users can get started simply and easily.

Step 1: Clone the Open-WebUI source code, which includes the deployment manifests and scripts

git clone https://github.com/open-webui/open-webui
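The next step applies manifests relative to the repository root, so change into the cloned directory first:

cd open-webui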

Step 2: Apply the manifests to deploy Ollama and Open-WebUI. The ./kubernetes/manifest directory contains all the files needed for the deployment: the namespace, the Ollama StatefulSet and service, and the Open-WebUI deployment and service.

kubectl apply -f ./kubernetes/manifest
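Before moving on, you can verify that the pods are up and forward a local port to the Open-WebUI service. The namespace, service name, and container port below are assumptions based on the upstream manifests; adjust them to match the resources created in ./kubernetes/manifest:

kubectl get pods -n open-webui

kubectl port-forward -n open-webui svc/open-webui 52433:8080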

Step 3: Access Open-WebUI in your browser at the forwarded port, for example http://localhost:52433. On first launch, Open-WebUI asks you to create an account by providing a name, email, and password.

Step 4: After installation is complete, select the model to use. For example, here we will install the DeepSeek-R1 model, version 1.5b.
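If the model does not appear in the selector, it can also be pulled directly inside the Ollama pod. The pod name below assumes a StatefulSet named ollama in the open-webui namespace, so adjust it as needed:

kubectl exec -n open-webui ollama-0 -- ollama pull deepseek-r1:1.5b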

Step 5: After the model has been loaded and is running, users can interact with it simply and intuitively through the interface.
