Tutorial
How to create a container
Using GUI
Notice: Each tenant can have a maximum of 10 containers. If you have reached this limit, please delete an unused container before creating a new one.
Select GPU Container in the side menu and click the “Create New Container” button.
Give your container a name using the Container Name field.
Select a GPU Instance (we currently support NVIDIA H100 and H200 GPUs).
Template: You can either use a built-in template or bring your own image. We highly recommend using built-in templates for faster deployment.
a. Built-in templates: Click “Change Template” and choose a template.
b. Custom template: Bring your own image by using the “Custom Template” feature.
Access container
a. Ports
This feature significantly enhances the flexibility of your containerized applications, allowing a single container to serve diverse functionalities on different ports.
Both HTTP and TCP ports are supported, with a maximum of 10 ports per type for each container.
b. SSH
Add SSH keys to enable remote access to your container. Each container supports a maximum of 10 SSH keys. These keys will be injected into the container at runtime, allowing you to SSH into the container using any of the provided keys.
Notice: Currently (GPU Container v1.1.2), only the Ubuntu template includes SSH configuration out of the box. To connect via SSH in other templates, please install openssh-server first.
To add an SSH key, please follow the instructions:
Ensure you have an SSH key pair generated on your local machine. If you haven’t done this, you can generate one using this command on your local terminal:
ssh-keygen -t ed25519 -C "your_email@example.com"
To retrieve your public SSH key, run this command:
cat ~/.ssh/id_ed25519.pub
This will output something similar to this:
ssh-ed25519 AAAAC4NzaC1lZDI1JTE5AAAAIGP+L8hnjIcBqUb8NRrDiC32FuJBvRA0m8jLShzgq6BQ your_email@example.com
Copy and paste the output into the SSH Public Keys field when you create the container.
Advanced Settings (Optional)
a. Persistent Disk: specify the amount of storage you need to store training weights, models, and other data. Read more about Storage here.
b. Environment Variables: key-value pairs injected into the container at runtime.
c. Startup Command: command and arguments to run when the container starts (see the example below).
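For example, to start a simple HTTP file server when the container boots (a hypothetical workload used purely for illustration), the command and its arguments correspond to the cmds and args keys used by the YAML import format described later:
cmds: ["python3"]
args: ["-m", "http.server", "8000"]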
Click “Create New Container” to create and start your container.
If your balance is not enough to create a new container (less than the cost of running the container for 1 hour), please follow these instructions to add credit to your account.
Importing YAML file
For quick deployment, or when you already have a configuration file prepared, use this feature to create a container rather than configuring it through the user interface.
Step 1: Open Import Configuration modal
Navigate to GPU Container from the side menu.
Click Import Configuration located on the top right of the container list page.
Step 2: Provide configuration file in YAML format
You can import the configuration in two ways:
Paste YAML directly into the YAML editor.
Upload a YAML file by clicking the Upload file button. Currently, GPU Container supports YAML files only.
A sample YAML template can be downloaded by clicking Download template.
The configuration file supports the following fields:
| Field | Data type | Sample data | Description |
| --- | --- | --- | --- |
| name | string | my-container | Name of your container. Must be unique per tenant. |
| instance_type | string | GPU-H100-1 | Vietnam site supports 1xH100 -> 8xH100; Japan site supports 1xH200 -> 8xH200. |
| image_setting | | | Since a container can only have one image, provide either template_name or image_url + image_tag. |
| template_name | string | Jupyter Notebook | Built-in template name. Provide this if you want to use a built-in template provided by FPT. Enter an exact name from this list: Jupyter Notebook, Code Server, vllm-openai, vllm-openai-v0.10.1, ollama, ollama-openwebui, Ubuntu 24.04, Tensorflow 2.19.0, Nvidia Cuda 12.9.1, NVIDIA Pytorch 25.03. |
| image_url | string | registry/myimage:latest | (Optional) Custom image URL. Leave blank if using a built-in template. |
| image_tag | string | v1.0 | (Optional) Tag for the custom image. |
| image_user | string | admin | (Optional) Username for a private image registry. |
| image_password | string | password123 | (Optional) Password for a private image registry. |
| access_container | | | |
| tcp_ports | list[int] | [22, 33] | TCP ports exposed by the container. |
| http_ports | list[int] | [8888, 6006] | HTTP ports exposed by the container. |
| ssh_keys | | | Provide each SSH key as a name-key pair. A maximum of 10 keys is allowed. |
| name | string | key01 | Name of the SSH key. |
| key | string | "ssh-rsa AAAAB3..." | SSH public key. |
| advanced_settings | | | |
| persistent_disk | | | |
| mount_capacity | int (GB) | 20 | Amount of persistent storage to attach. |
| mount_path | string | /workspace | Path where the persistent disk will be mounted inside the container. |
| environment_variables | | | |
| key | string | USERNAME | Environment variable name, injected at runtime. |
| value | string | admin | Environment variable value. |
| startup_commands | | | |
| cmds | list[string] | | Startup commands (optional). |
| args | list[string] | | Startup command arguments (optional). |
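For reference, a minimal configuration assembled from the fields above might look like the sketch below. It is illustrative only: the exact nesting (for example, how ssh_keys and environment_variables entries are listed) may differ from the official schema, so use the Download template button for the authoritative format.
name: my-container
instance_type: GPU-H100-1
image_setting:
  template_name: Jupyter Notebook   # or image_url + image_tag for a custom image
access_container:
  tcp_ports: [22]
  http_ports: [8888]
  ssh_keys:
    - name: key01
      key: "ssh-ed25519 AAAAC3... your_email@example.com"
advanced_settings:
  persistent_disk:
    mount_capacity: 20        # GB
    mount_path: /workspace
  environment_variables:
    - key: USERNAME
      value: admin
  startup_commands:
    cmds: ["bash"]            # hypothetical startup command
    args: ["-c", "nvidia-smi"]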
Step 3: Review Configuration
Notice: The “Review” button will only be enabled when all validations within the YAML editor have passed.
Click Review to continue. On this screen, you can:
Verify container configuration including template, GPU, CPU, RAM, and disk allocation.
Check pricing summary to view the estimated hourly cost.
Step 4: Create Container
Once confirmed, click Create Container to start deployment. The system will automatically create and launch your container based on the provided configuration file.
Export Container Configuration
For later reuse, the Export Configuration feature allows you to save a container’s configuration and download it as a YAML file.
From the List Containers screen, click the Action (3-dot) menu and select Export Configuration.
Alternatively, open the Container Details page and click Export Configuration.
A YAML file will be automatically downloaded with the name format: container-name.yaml
How to connect to a container
You can connect to your GPU Container using a few different methods, depending on your specific needs, preferences, and the template used to create the container.
HTTP Service
Connecting to a container using HTTP is convenient, quick, and secure via HTTPS. To connect using the HTTP Service:
Step 1: Once the container is running, navigate to the Container Details page.
Step 2: Find the Access container section and open the HTTP Endpoint.
Step 3: Follow the guide that matches your template.
| Template | Pre-condition | Next steps |
| --- | --- | --- |
| Jupyter, Code Server | None | Open the endpoint in your browser. Use the Username and Password fields in the Environment Variables section of the Container Details page to access your container. |
| Ollama WebUI | None | Open the endpoint in your browser. Create a new Open WebUI account or use your existing account. Select a model to pull and test the model. |
| Ollama | None | Test your container using Postman (**). |
| vLLM | Hugging Face Token (*). Before creating a new container, you must fill in your Hugging Face Token in the Environment Variables section. | Test your container using Postman (**). |
(*) Hugging Face Token: A Hugging Face Token in the Environment Variables section is required when using the Ollama template. If you do not have a Hugging Face Token yet, please follow this guide: User access tokens.
(**) Testing your container using Postman: Append /v1/models to your endpoint, then provide your API_TOKEN in the Authorization header. If you're using the vLLM template, also include HUGGING_FACE_HUB_TOKEN in the request parameters to test your container.
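The same check can also be done from a terminal. A minimal sketch, assuming the placeholder below is your container's HTTP endpoint and that API_TOKEN is sent as a Bearer token in the Authorization header:
# List the models served by your container (replace the placeholder endpoint and token)
curl -H "Authorization: Bearer $API_TOKEN" https://<your-http-endpoint>/v1/models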
TCP Ports
To access your instance via a public endpoint, you will need to add TCP ports to the container configuration. When your container is created, you will receive a public domain and an external public port mapping to access your service. The external public port is randomly selected from the range 30000-40000.
The format will be DOMAIN:EXTERNAL_PORT -> INTERNAL_PORT. For example:
tcp-endpoint-stg.serverless.fptcloud.com:34771 → :22
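To quickly check that the mapped port is reachable, you could, for example, run the following from your local terminal (the domain and port are the sample values above; substitute your own mapping):
# -v prints the result, -z closes the connection without sending data
nc -vz tcp-endpoint-stg.serverless.fptcloud.com 34771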
SSH Terminal
To get the SSH command for your container, navigate to the Container details page. Copy the command listed under SSH command.
It should look something like this:
ssh <username>@tcp-endpoint-stg.serverless.fptcloud.com -p 34771 -i ~/.ssh/id_ed25519
Run the copied command in your local terminal to connect to your container.
How to manage container
Start Container
Open the List Containers.
Find the GPU Container you want to start and click the 3-dot icon.
Select “Start” action.
Edit Container
From the List Containers, select the container you want to edit and open the Container Details screen.
Click the Edit icon of the section you want to modify. You can now edit the 'Access container' section (including Ports and SSH) and the 'Advanced settings' section (including Persistent Disk, Environment Variables, and Startup Commands).
Confirm by clicking “Save”.
Stop Container
Warning: You will be charged for idle GPU containers even if they are stopped. If you don’t need to retain your container, you should terminate it completely.
Open the List Containers.
Find the GPU Container you want to stop and click the 3-dot icon.
Select “Stop” action.
Confirm by clicking “Confirm”.
Delete Container
Danger: Deleting a container permanently deletes all data in temporary storage and persistent storage. Be sure you’ve saved any data you want to access again.
Open the List Containers.
Find the GPU Container you want to delete and click the 3-dot icon.
Select “Delete” action.
Confirm by entering “delete” in the text field and clicking “Confirm”.
How to monitor container
GPU Container provides container logs and metrics to help you monitor and troubleshoot your workloads. To view them, open the Container Details screen and select the Logs or Monitoring tab. This makes it easy to diagnose issues or monitor your container’s activity.
Container Logs
Container logs include all application logs. Note that logs are only kept for 14 days, and timestamps are shown in the UTC timezone.
Download: Download logs from the last 14 days of your container.
Search: Enter a keyword to search within the log content.
Time Filter: Filter logs by specific time ranges.
Refresh: Interval at which the container logs are automatically updated.
Metric Monitoring
Monitoring metrics are collected to track the performance, availability, and resource usage of containerized services, helping detect issues and optimize operations. Note that metric data is retained for 14 days.
There are 4 metric groups:
Utilization metrics: Monitor CPU, memory, and GPU usage to assess system performance and resource efficiency.
Disk metrics: Track disk read/write speed and latency to detect storage issues or bottlenecks.
Network metrics: Measure network traffic, latency, and errors to identify connectivity problems and ensure service reliability.
Temperature and Power metrics: Monitor hardware temperature and power consumption to prevent overheating and maintain hardware health.
Time Filter: Filter metrics by specific time ranges.
Refresh: Interval at which the container metrics are automatically updated.
Templates
Templates are used to launch images as containers and define the required container disk size, volume, volume paths, and ports needed. You can also define environment variables and startup commands within the template.
Built-in Templates
These templates are created and maintained by FPT AI Factory. We now offer 6 built-in templates:
Jupyter Notebook
Intended Use: This template provides Jupyter Notebook for remote development, freeing AI/Data Scientists from local hardware limitations.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| USERNAME | string | admin | Username to access Jupyter Notebook |
| PASSWORD | string | Generated by system | Password to access Jupyter Notebook |
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8000 |
| TCP | 22 |
Ollama WebUI
Intended Use: This template supports running various large language model (LLM) programs, including Ollama and OpenAI-compatible APIs, making it easy for users to customize based on their workflow.
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8080 |
| TCP | 22 |
Ollama
Intended Use: This template enables high-throughput inference using GPU resources with a state-of-the-art engine.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| API_TOKEN | string | Generated by system | Auto-authenticate with external services |
Port:
vLLM
Intended Use: This vLLM container image is built and maintained by AI Factory. This template enables high-throughput model inference using GPU resources with a state-of-the-art engine.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Description |
| --- | --- | --- |
| HUGGING_FACE_HUB_TOKEN | string | Your Hugging Face User Access Token |
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8000 |
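Once the container is running, you can exercise the OpenAI-compatible API served by vLLM through your HTTP endpoint. A minimal sketch, assuming placeholder values for the endpoint and the served model name, and that your token is sent as a Bearer token as described in the HTTP Service section:
# Send a simple chat completion request (replace the placeholders)
curl https://<your-http-endpoint>/v1/chat/completions \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "<served-model-name>", "messages": [{"role": "user", "content": "Hello"}]}'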
Code Server
Intended Use: This template offers cloud-based VS Code with GPU to train, test, and debug AI models remotely with full IDE capabilities.
Environment Variables
The following environment variables are provided for container customization.
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| PUID | int | 0 | User ID |
| PGID | int | 0 | Group ID |
| TZ | string | Etc/UTC | Your timezone |
| PROXY_DOMAIN | string | code-server.my.domain | The domain to be proxied for subdomain proxying |
| DEFAULT_WORKSPACE | string | / | Default folder opened when accessing code-server |
| PASSWORD | string | Generated by system | Password to access code-server |
Ports:
| Protocol | Port |
| --- | --- |
| HTTP | 8443 |
| TCP | 22 |
Ubuntu
Intended Use: This is a minimal Ubuntu CLI environment with several useful additions to improve your user experience. While the root account is available as usual, we have created a normal system user for your convenience.
Development tools pre-installed: SSH access.
Ports:
| Protocol | Port |
| --- | --- |
| TCP | 22 |
Custom Templates
You can use your own Docker image by clicking “Custom Template” and overriding it with your own image:tag. If your image is from a private Docker repository, make sure to provide your username and password for authentication.
Storage
Persistent Disk
GPU Container provides High-Performance Storage (HPS) that remains available for the duration of a container’s life. It functions similarly to a hard disk, allowing you to store data that needs to be retained even if the container is stopped.
Key characteristics:
Available until the container is deleted permanently.
Prevent data loss by storing data, models, or files that need to be preserved across container restarts or reconfigurations.
Temporary Disk
Temporary disk (NVMe) provides temporary storage for a container. Any data stored on the temporary disk will be lost when the container is stopped or deleted, so make sure to back up important data before shutting down your container.
Storage type comparison
| | Temporary Disk | Persistent Disk |
| --- | --- | --- |
| Data persistence | Lost on stop/restart | Retained until container deletion |
| Lifecycle | Tied directly to the container’s active session | Tied to the container’s lease period |
| Performance | Fastest (locally attached) | Reliable, generally slower than temporary disk |
| Capacity | Fixed according to the selected GPU instance | Selectable at creation |
| Cost | FREE | Refer to ai.fptcloud.com/pricing |
| Best for | Temporary session data, cache | Persistent application data, models, datasets |