# Deployment (LoRA Inference)

### **How to deploy a fine-tuned LoRA model?** <a href="#contentify_0" id="contentify_0"></a>

**User story:**\
As a user, I want to deploy my fine-tuned LoRA model so that I can use it immediately via API without managing infrastructure.

**Steps**

1. **Go to the Deployment page** from the navigation bar.
   * Or click **View deployment** from the success pop-up after fine-tuning.

<figure><img src="/files/RhewcNUAh2EV7OKalInA" alt=""><figcaption></figcaption></figure>

2. **Click Deploy** next to the LoRA model you want to deploy.
   * Status will change to **Deploying**.
3. Once deployment is successful, the status will show **Deployed**.

***

### **How to manage deployed models?**

On the **Deployment** page, you can:

* **Get API Key** – Retrieve the key to call your model.
* **View API request** – Open a pop-up with sample JSON response.
* **Try in Playground** – Test the model directly in the UI.
* **Undeploy** – Stop the deployed model (confirmation required).

**Status badges**

* **Deploying** – Model is being deployed.
* **Deployed** – Model is ready for inference.
* **Stopped** – Model is undeployed.
* **Failed** – Deployment failed.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://ai-docs.fptcloud.com/fpt-ai-inference/fpt-ai-inference/tutorials/deployment-lora-inference.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.