Cluster configuration
The Managed GPU Cluster product is built on native Kubernetes and integrates additional cloud provider components, including the FPT Cloud Controller Manager. This component manages the worker nodes in the cluster and LoadBalancer-type services. Users can expose their applications to the internet in several ways so that their customers can reach them: creating an Ingress for the service, creating a NodePort service and attaching a floating IP to the worker node, or using a LoadBalancer service.
FPT Cloud supports creating LoadBalancer services with the following annotation options in the service configuration:
| Key | Value | Default | Purpose |
| --- | --- | --- | --- |
| `service.beta.kubernetes.io/fpt-load-balancer-internal` | `"true"`/`"false"` | `"false"` | Set to `"true"` if you do not want to expose the service to the internet. |
| `loadbalancer.fptcloud.com/keep-floatingip` | `"true"`/`"false"` | `"false"` | Set to `"true"` to keep the LoadBalancer service's floating IP within the VPC after the service is deleted. |
| `loadbalancer.fptcloud.com/proxy-protocol` | `"true"`/`"false"` | `"false"` | Set to `"true"` to make the LoadBalancer use the PROXY protocol. Note: the PROXY protocol is only used with Layer 4 LoadBalancers. |
| `loadbalancer.fptcloud.com/enable-health-monitor` | `"true"`/`"false"` | `"true"` | Set to `"false"` to disable the health monitor for the LoadBalancer pool. |
| `service.beta.kubernetes.io/fpt-load-balancer-type` | LBv1: `basic`/`advanced`/`standard`/`premium`; LBv2: `Basic-1`/`Basic-2`/`Standard`/`Advanced`/`Premium` | LBv1: `"basic"`; LBv2: `"Basic-1"` | Configures the LoadBalancer flavor to match the load of the application behind the LoadBalancer pool backend. |
| `loadbalancer.fptcloud.com/enable-ingress-hostname` | `"true"`/`"false"` | `"false"` | Set to `"true"` to enable an ingress hostname for the LoadBalancer service. |
| `loadbalancer.fptcloud.com/load-balancer-version` | `"v1"`/`"v2"` | `"v1"` | Set to `"v2"` to use LBv2 for the LoadBalancer service. If this annotation is not configured, LBv1 is created by default. |
| `loadbalancer.fptcloud.com/x-forwarded-for` | `"true"`/`"false"` | `"false"` | Set to `"true"` to forward the X-Forwarded-For request header to the LoadBalancer pool backend when using a Layer 7 LoadBalancer. Note: the PROXY protocol and x-forwarded-for cannot be used at the same time. |
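As an illustration, here is a minimal sketch of a LoadBalancer service that combines several of these annotations. The service name, namespace, selector, and ports are placeholders for the example:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-app              # placeholder name
  namespace: default
  annotations:
    # Create an LBv2 load balancer with the Standard flavor
    loadbalancer.fptcloud.com/load-balancer-version: "v2"
    service.beta.kubernetes.io/fpt-load-balancer-type: "Standard"
    # Keep the floating IP in the VPC after this service is deleted
    loadbalancer.fptcloud.com/keep-floatingip: "true"
spec:
  type: LoadBalancer
  selector:
    app: demo-app
  ports:
    - port: 80
      targetPort: 8080
```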
Additionally, Managed GPU Cluster supports the following configurations:
Create a LoadBalancer service with a specific floating IP attached to the Load Balancer. Note: the public IP must already be allocated to the VPC and be in the Inactive state; you can verify this under Networking -> Floating IPs.
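A minimal sketch of this configuration, assuming the cloud controller honors the standard `spec.loadBalancerIP` field; the IP address below is a placeholder:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-app
spec:
  type: LoadBalancer
  # Placeholder: an Inactive floating IP already allocated to the VPC
  loadBalancerIP: 203.0.113.10
  selector:
    app: demo-app
  ports:
    - port: 80
      targetPort: 8080
```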
Restrict access to the Load Balancer by configuring `loadBalancerSourceRanges` in the `spec` section of the service configuration, for example 14.233.234.0/24 and 10.250.0.0/24. Note: `loadBalancerSourceRanges` contains an array of public IP ranges that are allowed to access the Load Balancer. By default, M-FKE creates a LoadBalancer service with the source IP range set to 0.0.0.0/0.
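A sketch of the `spec` section with the two example ranges above:

```yaml
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:
    - 14.233.234.0/24
    - 10.250.0.0/24
```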
Deploy DeepSeek-R1 with Ollama and Open-WebUI
Ollama is an open-source tool for running, managing, and customizing large language models (LLMs) on personal computers or servers, and it supports models such as Llama, DeepSeek, Mistral, etc. Open-WebUI is an open-source web interface designed to interact with Ollama, providing a user-friendly experience that makes it easy to manage and use LLM models.
This document guides you through deploying the DeepSeek-R1 model on the FPT Managed GPU Cluster using Ollama and Open-WebUI.
Step 1: Clone the Open-WebUI source code, which includes the deployment scripts
git clone https://github.com/open-webui/open-webui
Step 2: Run the scripts to deploy ollama and open-webui. The ./kubernetes/manifest directory contains all the files needed for deployment, such as the namespace, the ollama StatefulSet and service, and the open-webui deployment and service.
kubectl apply -f ./kubernetes/manifest
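Step 3 assumes the open-webui service has been forwarded to a local port. A hedged example; the service name, namespace, and container port below are assumptions based on the upstream manifests and may differ in your deployment:

```bash
# Forward local port 52433 to the open-webui service
kubectl port-forward svc/open-webui 52433:8080 -n open-webui
```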
Step 3: Access open-webui in your browser at the forwarded port, for example http://localhost:52433. When installing and using Open-WebUI for the first time, you will need to configure the following information: name, email, and password.
Step 4: After the installation is complete, select the model to use. For example, here we will install the DeepSeek-R1 model, version **1.5b**.
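The model can also be pulled directly inside the ollama pod instead of through the interface. A sketch; the pod name and namespace are assumptions based on the StatefulSet deployed above:

```bash
# Pull the 1.5b variant of DeepSeek-R1 inside the ollama pod
kubectl exec -it ollama-0 -n open-webui -- ollama pull deepseek-r1:1.5b
```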
Step 5: After the model has been loaded and is running, users can interact with it simply and intuitively through the interface.