
# Serving DeepSeek-R1

Ollama is an open-source tool for running, managing, and customizing large language models (LLMs) on personal computers or servers. It supports models such as Llama, DeepSeek, and Mistral. Open-WebUI is an open-source web interface designed to interact with Ollama, providing a user-friendly way to manage and use LLMs.

This document walks through the steps to deploy the DeepSeek-R1 model on the FPT Managed GPU Cluster using Ollama and Open-WebUI, so that users can work with the model through a simple web interface.

**Step 1**: Clone the Open-WebUI source code and deployment scripts, then change into the `kubernetes` directory

```
git clone https://github.com/open-webui/open-webui
cd open-webui/kubernetes
```

**Step 2**: Apply the manifests to deploy Ollama and Open-WebUI. The directory contains all the files needed for deployment, such as the **namespace**, **ollama StatefulSet**, **ollama service**, **open-webui deployment**, and **open-webui service**.

```
# Run from open-webui/kubernetes, the directory entered in Step 1
kubectl apply -f ./manifest
```
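Before moving on, it can help to confirm that the pods have started. The following is a minimal sketch, assuming the manifests create a namespace named `open-webui`. That name is not stated in this guide, so check it with `kubectl get namespaces` and override `NAMESPACE` if yours differs.

```shell
# Assumption: the manifests create a namespace called "open-webui".
NAMESPACE="${NAMESPACE:-open-webui}"

if kubectl get namespace "$NAMESPACE" >/dev/null 2>&1; then
  # List the pods, then wait up to 5 minutes for all of them to become Ready.
  kubectl get pods -n "$NAMESPACE"
  kubectl wait --for=condition=Ready pod --all -n "$NAMESPACE" --timeout=300s
else
  echo "Namespace '$NAMESPACE' not found; list namespaces with: kubectl get namespaces"
fi
```

If the wait times out, `kubectl describe pod <pod-name> -n <namespace>` usually shows why a pod is still pending, for example an image that is still being pulled.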

**Step 3**: Open Open-WebUI in a browser at the forwarded port, for example: [*http://localhost:52433*](http://localhost:52433/). The first time they use Open-WebUI, users need to set up the following information: name, email, and password.
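This step relies on a local port being forwarded to the Open-WebUI service. One way to set that up is sketched below. The namespace `open-webui`, the Service name `open-webui-service`, and the Service port `8080` are assumptions that this guide does not confirm, so verify them with `kubectl get services -A` before relying on them.

```shell
# Assumptions (not stated in this guide): namespace, Service name, and Service port.
NAMESPACE="${NAMESPACE:-open-webui}"
SERVICE="${SERVICE:-open-webui-service}"
SERVICE_PORT="${SERVICE_PORT:-8080}"
LOCAL_PORT="${LOCAL_PORT:-52433}"   # local port used in the example URL in Step 3
URL="http://localhost:${LOCAL_PORT}"

if kubectl get service "$SERVICE" -n "$NAMESPACE" >/dev/null 2>&1; then
  echo "Forwarding ${URL} to ${SERVICE}:${SERVICE_PORT} (press Ctrl+C to stop)"
  # port-forward runs in the foreground until interrupted.
  kubectl port-forward -n "$NAMESPACE" "service/${SERVICE}" "${LOCAL_PORT}:${SERVICE_PORT}"
else
  echo "Service '$SERVICE' not found; list Services with: kubectl get services -n $NAMESPACE"
fi
```

Keep the command running in its own terminal while using the browser; closing it ends the forwarding.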

**Step 4**: After setup is complete, users select the model to use. This example installs the DeepSeek-R1 model, version **1.5b**.

![](/files/1d2678b255596f6f567c0f0fd458c3209cbdebe6)
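As an alternative to the web interface, the model can also be pulled with the Ollama CLI inside the Ollama pod. This is a sketch under assumptions: the pod name `ollama-0` (the first replica of a StatefulSet named `ollama`) and the namespace `open-webui` are guesses, so confirm them with `kubectl get pods -A`. The model tag `deepseek-r1:1.5b` is the 1.5b variant from the Ollama model library.

```shell
# Assumptions: namespace and pod name; adjust to match your cluster.
NAMESPACE="${NAMESPACE:-open-webui}"
POD="${POD:-ollama-0}"
MODEL="deepseek-r1:1.5b"

if kubectl get pod "$POD" -n "$NAMESPACE" >/dev/null 2>&1; then
  # Download the model into the pod, then list the models Ollama has locally.
  kubectl exec -n "$NAMESPACE" "$POD" -- ollama pull "$MODEL"
  kubectl exec -n "$NAMESPACE" "$POD" -- ollama list
else
  echo "Pod '$POD' not found in namespace '$NAMESPACE'; check names with: kubectl get pods -A"
fi
```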

**Step 5**: Once the model has loaded and is running, users can chat with it directly through the web interface.

![](/files/577a5492d035c0eddcfa941b0d0ad70a3051fe5e)

![](/files/71416f1cda92a5580bbbc7a5380431b9eb40579c)
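The model can also be queried without the web interface, through Ollama's HTTP API. The sketch below assumes the Ollama service has been made reachable at `http://localhost:11434` (Ollama's default port), for example with a separate `kubectl port-forward` to the Ollama Service, whose name this guide does not state. The prompt text is only an illustration.

```shell
# Assumption: Ollama is reachable locally on its default port 11434.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
MODEL="deepseek-r1:1.5b"
# Keep the prompt free of double quotes and backslashes: no JSON escaping is done here.
PROMPT="Explain what a Kubernetes StatefulSet is in one sentence."

# Build the request body for the /api/generate endpoint (non-streaming).
PAYLOAD=$(printf '{"model": "%s", "prompt": "%s", "stream": false}' "$MODEL" "$PROMPT")
echo "$PAYLOAD"

# Send the request only if the API answers; /api/tags lists the local models.
if curl -s --max-time 2 -o /dev/null "$OLLAMA_URL/api/tags"; then
  curl -s "$OLLAMA_URL/api/generate" -d "$PAYLOAD"
else
  echo "Ollama is not reachable at $OLLAMA_URL; forward the Ollama service port first."
fi
```

With `"stream": false`, the reply arrives as a single JSON object whose `response` field holds the generated text.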

