Tutorial
How to create a container
Using GUI
Notice: Each tenant can have a maximum of 10 containers. If you have reached this limit, please delete an unused container before creating a new one.
Select GPU Container in the side menu and click the “Create New Container” button.
Give your container a name using the Container Name field.
Select a GPU Instance (we currently support NVIDIA H100 and H200 GPUs).
Template: You can either use a built-in template or bring your own image. We highly recommend using built-in templates for faster deployment.
a. Built-in templates: Click “Change Template” and choose a template.
b. Custom template: Bring your own image by using the “Custom Template” feature.
Access container
a. Ports
This feature significantly enhances the flexibility of your containerized applications, allowing a single container to serve diverse functionalities on different ports.
Both HTTP and TCP ports are supported, with a maximum of 10 ports per type for each container.
b. SSH
Add SSH keys to enable remote access to your container. Each container supports a maximum of 10 SSH keys. These keys will be injected into the container at runtime, allowing you to SSH into the container using any of the provided keys.
Notice: Currently (GPU Container v1.1.2), only the Ubuntu template includes SSH configuration out of the box. To connect via SSH in other templates, please install openssh-server first.
To add an SSH key, please follow the instructions:
Ensure you have an SSH key pair generated on your local machine. If you haven’t done this, you can generate one using this command on your local terminal:
ssh-keygen -t ed25519 -C "your_email@example.com"
To retrieve your public SSH key, run this command:
cat ~/.ssh/id_ed25519.pub
This will output something similar to this:
ssh-ed25519 AAAAC4NzaC1lZDI1JTE5AAAAIGP+L8hnjIcBqUb8NRrDiC32FuJBvRA0m8jLShzgq6BQ your_email@example.com
Copy and paste the output into the SSH Public Keys field when you create the container.
Advanced Settings (Optional)
a. Persistent Disk: specify the amount of storage you need to store training weights, models, and other data. Read more about Storage here.
b. Environment Variables: key-value pairs injected into the container at runtime.
c. Startup Command: command and arguments to run when the container starts (see the example below).
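For example, to start a simple HTTP file server when the container boots (a hypothetical workload used purely for illustration), the command and its arguments correspond to the cmds and args keys used by the YAML import format described later:
cmds: ["python3"]
args: ["-m", "http.server", "8000"]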
Click “Create New Container” to create and start your container.
If your balance is not enough to create a new container (less than the cost of running the container for 1 hour), please follow these instructions to add credit to your account.
Importing YAML file
For quick deployment, or when you already have a configuration file prepared, use this feature to create a container rather than configuring it through the user interface.
Step 1: Open Import Configuration modal
Navigate to GPU Container from the side menu.
Click Import Configuration located on the top right of the container list page.
Step 2: Provide configuration file in YAML format
You can import the configuration in two ways:
Paste YAML directly into the YAML editor.
Upload a YAML file by clicking the Upload file button. Currently, GPU Container supports YAML files only.
A sample YAML template can be downloaded by clicking Download template.
The configuration file supports the following fields:
| Field | Data type | Sample data | Description |
| --- | --- | --- | --- |
| name | string | my-container | Name of your container. Must be unique per tenant. |
| instance_type | string | GPU-H100-1 | Vietnam site supports 1xH100 -> 8xH100; Japan site supports 1xH200 -> 8xH200. |
| image_setting | | | Since a container can only have one image, provide either template_name or image_url + image_tag. |
| template_name | string | Jupyter Notebook | Built-in template name. Provide this if you want to use a built-in template provided by FPT. Enter an exact name from this list: Jupyter Notebook, Code Server, vllm-openai, vllm-openai-v0.10.1, ollama, ollama-openwebui, Ubuntu 24.04, Tensorflow 2.19.0, Nvidia Cuda 12.9.1, NVIDIA Pytorch 25.03. |
| image_url | string | registry/myimage:latest | (Optional) Custom image URL. Leave blank if using a built-in template. |
| image_tag | string | v1.0 | (Optional) Tag for the custom image. |
| image_user | string | admin | (Optional) Username for a private image registry. |
| image_password | string | password123 | (Optional) Password for a private image registry. |
| access_container | | | |
| tcp_ports | list[int] | [22, 33] | TCP ports exposed by the container. |
| http_ports | list[int] | [8888, 6006] | HTTP ports exposed by the container. |
| ssh_keys | | | Provide each SSH key as a name-key pair. A maximum of 10 keys is allowed. |
| name | string | key01 | Name of the SSH key. |
| key | string | "ssh-rsa AAAAB3..." | SSH public key. |
| advanced_settings | | | |
| persistent_disk | | | |
| mount_capacity | int (GB) | 20 | Amount of persistent storage to attach. |
| mount_path | string | /workspace | Path where the persistent disk will be mounted inside the container. |
| environment_variables | | | |
| key | string | USERNAME | Environment variable name, injected at runtime. |
| value | string | admin | Environment variable value. |
| startup_commands | | | |
| cmds | list[string] | | Startup commands (optional). |
| args | list[string] | | Startup command arguments (optional). |
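For reference, a minimal configuration assembled from the fields above might look like the sketch below. It is illustrative only: the exact nesting (for example, how ssh_keys and environment_variables entries are listed) may differ from the official schema, so use the Download template button for the authoritative format.
name: my-container
instance_type: GPU-H100-1
image_setting:
  template_name: Jupyter Notebook   # or image_url + image_tag for a custom image
access_container:
  tcp_ports: [22]
  http_ports: [8888]
  ssh_keys:
    - name: key01
      key: "ssh-ed25519 AAAAC3... your_email@example.com"
advanced_settings:
  persistent_disk:
    mount_capacity: 20        # GB
    mount_path: /workspace
  environment_variables:
    - key: USERNAME
      value: admin
  startup_commands:
    cmds: ["bash"]            # hypothetical startup command
    args: ["-c", "nvidia-smi"]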
Step 3: Review Configuration
Notice: The “Review” button will only be enabled when all validations within the YAML editor have passed.
Click Review to continue. On this screen, you can:
Verify container configuration including template, GPU, CPU, RAM, and disk allocation.
Check pricing summary to view the estimated hourly cost.
Step 4: Create Container
Once confirmed, click Create Container to start deployment. The system will automatically create and launch your container based on the provided configuration file.
Export Container Configuration
For later reuse, the Export Configuration feature allows you to save a container’s configuration and download it as a YAML file.
From the List Containers screen, click the Action (3-dot) menu and select Export Configuration.
Alternatively, open the Container Details page and click Export Configuration.
A YAML file will be automatically downloaded with the name format: container-name.yaml
How to connect to a container
You can connect to your GPU Container using a few different methods, depending on your specific needs, preferences, and the template used to create the container.
HTTP Service
Connecting to a container using HTTP is convenient, quick, and secure via HTTPS. To connect using the HTTP Service:
Step 1: Once the container is running, navigate to the Container Details page.
Step 2: Find the Access container section and open the HTTP Endpoint.
Step 3: Follow the guide that matches your template.
| Template | Pre-condition | Next steps |
| --- | --- | --- |
| Jupyter, Code Server | None | Open the endpoint in your browser. Use the Username and Password fields in the Environment Variables section of the Container Details page to access your container. |
| Ollama WebUI | None | Open the endpoint in your browser. Create a new Open WebUI account or use your existing account. Select a model to pull and test the model. |
| Ollama | None | Test your container using Postman (**). |
| vLLM | Hugging Face Token (*). Before creating a new container, you must fill in your Hugging Face Token in the Environment Variables section. | Test your container using Postman (**). |
(*) Hugging Face Token: A Hugging Face Token in the Environment Variables section is required when using the Ollama template. If you do not have a Hugging Face Token yet, please follow this guide: User access tokens.
(**) Testing your container using Postman: Append /v1/models to your endpoint, then provide your API_TOKEN in the Authorization header. If you're using the vLLM template, also include HUGGING_FACE_HUB_TOKEN in the request parameters to test your container.
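The same check can also be done from a terminal. A minimal sketch, assuming the placeholder below is your container's HTTP endpoint and that API_TOKEN is sent as a Bearer token in the Authorization header:
# List the models served by your container (replace the placeholder endpoint and token)
curl -H "Authorization: Bearer $API_TOKEN" https://<your-http-endpoint>/v1/models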
TCP Ports
To access your instance via a public endpoint, you will need to add TCP ports to the container configuration. When your container is created, you will receive a public domain and an external public port mapping to access your service. The external public port is randomly selected from the range 30000-40000.
The format will be DOMAIN:EXTERNAL_PORT -> INTERNAL_PORT. For example:
tcp-endpoint-stg.serverless.fptcloud.com:34771 → :22
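To quickly check that the mapped port is reachable, you could, for example, run the following from your local terminal (the domain and port are the sample values above; substitute your own mapping):
# -v prints the result, -z closes the connection without sending data
nc -vz tcp-endpoint-stg.serverless.fptcloud.com 34771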
SSH Terminal
To get the SSH command for your container, navigate to the Container details page. Copy the command listed under SSH command.
It should look something like this:
ssh <username>@tcp-endpoint-stg.serverless.fptcloud.com -p 34771 -i ~/.ssh/id_ed25519
Run the copied command in your local terminal to connect to your container.
How to manage container
Start Container
Open the List Containers.
Find the GPU Container you want to start and click the 3-dot icon.
Select “Start” action.
Edit Container
From the List Containers, select the container you want to edit and open the Container Details screen.
Click the Edit icon of the section you want to modify. You can now edit the 'Access container' section (including Ports and SSH) and the 'Advanced settings' section (including Persistent Disk, Environment Variables, and Startup Commands).
Confirm by clicking “Save”.
Stop Container
Warning: You will be charged for idle GPU containers even if they are stopped. If you don’t need to retain your container, you should terminate it completely.
Open the List Containers.
Find the GPU Container you want to stop and click the 3-dot icon.
Select “Stop” action.
Confirm by clicking “Confirm”.
Delete Container
Danger: Deleting a container permanently deletes all data in temporary storage and persistent storage. Be sure you’ve saved any data you want to access again.
Open the List Containers.
Find the GPU Container you want to delete and click the 3-dot icon.
Select “Delete” action.
Confirm by entering “delete” in the text field and clicking “Confirm”.
How to monitor container
GPU Container provides container logs and metrics to help you monitor and troubleshoot your workloads. To view them, open the Container Details screen and select the Logs or Monitoring tab. This makes it easy to diagnose issues or monitor your container’s activity.
Container Logs
Container logs include all application logs. Note that logs are only kept for 14 days, and timestamps are shown in the UTC timezone.
Download: Download logs from the last 14 days of your container.
Search: Enter a keyword to search within the log content.
Time Filter: Filter logs by specific time ranges.
Refresh: Interval at which the container logs are automatically updated.
Metric Monitoring
Monitoring metrics are collected to track the performance, availability, and resource usage of containerized services, helping detect issues and optimize operations. Note that metric data is retained for 14 days.
There are 4 metric groups:
Utilization metrics: Monitor CPU, memory, and GPU usage to assess system performance and resource efficiency.
Disk metrics: Track disk read/write speed and latency to detect storage issues or bottlenecks.
Network metrics: Measure network traffic, latency, and errors to identify connectivity problems and ensure service reliability.
Temperature and Power metrics: Monitor hardware temperature and power consumption to prevent overheating and maintain hardware health.
Time Filter: Filter metrics by specific time ranges.
Refresh: Interval at which the container metrics are automatically updated.
Templates
Templates are used to launch images as containers and define the required container disk size, volume, volume paths, and ports needed. You can also define environment variables and startup commands within the template.
Built-in Templates
These templates are created and maintained by FPT AI Factory. We now offer 6 built-in templates:
Jupyter Notebook
Intended Use: This template provides Jupyter Notebook for remote development, freeing AI/Data Scientists from local hardware limitations.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| USERNAME | string | admin | Username to access Jupyter Notebook |
| PASSWORD | string | Generated by system | Password to access Jupyter Notebook |
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8000 |
| TCP | 22 |
Ollama WebUI
Intended Use: This template supports running various large language model (LLM) programs, including Ollama and OpenAI-compatible APIs, making it easy for users to customize based on their workflow.
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8080 |
| TCP | 22 |
Ollama
Intended Use: This template enables high-throughput inference using GPU resources with a state-of-the-art engine.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| API_TOKEN | string | Generated by system | Auto-authenticate with external services |
Port:
vLLM
Intended Use: This vLLM container image is built and maintained by AI Factory. This template enables high-throughput model inference using GPU resources with a state-of-the-art engine.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Description |
| --- | --- | --- |
| HUGGING_FACE_HUB_TOKEN | string | Your Hugging Face User Access Token |
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8000 |
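Once the container is running, you can exercise the OpenAI-compatible API served by vLLM through your HTTP endpoint. A minimal sketch, assuming placeholder values for the endpoint and the served model name, and that your token is sent as a Bearer token as described in the HTTP Service section:
# Send a simple chat completion request (replace the placeholders)
curl https://<your-http-endpoint>/v1/chat/completions \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "<served-model-name>", "messages": [{"role": "user", "content": "Hello"}]}'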
Code Server
Intended Use: This template offers cloud-based VS Code with GPU to train, test, and debug AI models remotely with full IDE capabilities.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| PUID | int | 0 | User ID |
| PGID | int | 0 | Group ID |
| TZ | string | Etc/UTC | Your timezone |
| PROXY_DOMAIN | string | code-server.my.domain | The domain to be proxied for subdomain proxying |
| DEFAULT_WORKSPACE | string | / | Default folder opened when accessing code-server |
| PASSWORD | string | Generated by system | Password to access code-server |
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8443 |
| TCP | 22 |
Ubuntu
Intended Use: This is a minimal Ubuntu CLI environment with several useful additions to improve your user experience. While the root account is available as usual, we have created a normal system user for your convenience.
Development tools pre-installed: SSH access.
Ports:
| Protocol | Port |
| --- | --- |
| TCP | 22 |
Custom Templates
You can use your own Docker image by clicking “Custom Template” and overriding it with your own image:tag. If your image is from a private Docker repository, make sure to provide your username and password for authentication.
Storage
Persistent Disk
GPU Container provides High-Performance Storage (HPS) that remains available for the duration of a container’s life. It functions similarly to a hard disk, allowing you to store data that needs to be retained even if the container is stopped.
Key characteristics:
Available until the container is deleted permanently.
Prevent data loss by storing data, models, or files that need to be preserved across container restarts or reconfigurations.
Temporary Disk
Temporary disk (NVMe) provides temporary storage for a container. Any data stored on the temporary disk will be lost when the container is stopped or deleted, so make sure to back up important data before shutting down your container.
Storage type comparison
| | Temporary Disk | Persistent Disk |
| --- | --- | --- |
| Data persistence | Lost on stop/restart | Retained until container deletion |
| Lifecycle | Tied directly to the container’s active session | Tied to the container’s lease period |
| Performance | Fastest (locally attached) | Reliable, generally slower than temporary disk |
| Capacity | Fixed according to the selected GPU instance | Selectable at creation |
| Cost | FREE | Refer to ai.fptcloud.com/pricing |
| Best for | Temporary session data, cache | Persistent application data, models, datasets |