# Install GPU drivers

You can install the GPU driver version of your choice on an FPT Kubernetes Engine cluster with integrated GPU support.

#### Step 1: Create a GPU Cluster with Driver Installation set to User-Install

*Create a cluster with Driver Installation set to User-Install*

#### Step 2: Install the software required to use the GPU (driver, toolkit, device plugin, etc.)

**GPU driver version references:**

* **Release Notes**: <https://docs.nvidia.com/datacenter/tesla/index.html> and <https://docs.nvidia.com/datacenter/tesla/drivers/releases.json>
* **Document**: <https://docs.nvidia.com/datacenter/tesla/drivers/index.html>
* **Installer**: <https://download.nvidia.com/XFree86/Linux-x86_64/>

*The following DaemonSet can be used as a reference for installing the driver:*

```
# Copyright 2023 FPT Cloud - PaaS
# worker.fptcloud/type=gpu

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fptcloud-gpu-driver-installer
  namespace: kube-system
  labels:
    k8s-app: gpu-driver
spec:
  selector:
    matchLabels:
      k8s-app: gpu-driver
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: nvidia-driver-installer
        k8s-app: gpu-driver
    spec:
      priorityClassName: system-node-critical
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: worker.fptcloud/type
                operator: In
                values: ["gpu"]
      tolerations:
      - operator: "Exists"
      containers:
        - image: docker.io/alpine:3.13
          name: nvidia-driver-installer
          command:
            - 'nsenter'
            - '-t'
            - '1'
            - '-m'
            - '-u'
            - '-i'
            - '-n'
            - '--'
            - 'bash'
            - '-l'
            - '-c'
            - 'curl -Ls https://raw.githubusercontent.com/fci-xplat/fke-config/main/fptcloud-gpu-driver-installer.sh | bash -s -- -p admin'
          resources:
            requests:
              cpu: 150m
          env:
          - name: NVIDIA_DRIVER_VERSION
            value: "535.54.03"
          - name: NVIDIA_TOOLKIT_INSTALL
            value: "true"
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
            allowPrivilegeEscalation: true
      hostPID: true
      hostNetwork: true
      hostIPC: true
```

The DaemonSet is configured through the following environment variables:

* **NVIDIA\_DRIVER\_VERSION**: The driver version to install.
* **NVIDIA\_TOOLKIT\_INSTALL**: Whether to install the toolkit automatically. Accepts "true" or "false"; the default is "true".
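If a different driver version is needed, one possible approach is to edit a local copy of the manifest before applying it. The sketch below is only an illustration: `set_driver_version` is a hypothetical helper name, not part of FPT's tooling, and it assumes the manifest keeps the `name:` / `value:` layout of the DaemonSet shown above.

```shell
# Hypothetical helper: print a manifest with NVIDIA_DRIVER_VERSION replaced.
# Assumes the "value:" line directly follows the
# "name: NVIDIA_DRIVER_VERSION" line, as in the DaemonSet above.
#   $1 = path to a local copy of the manifest
#   $2 = driver version to install
set_driver_version() {
  # On the matching line, move to the next line (n) and rewrite its value.
  sed "/name: NVIDIA_DRIVER_VERSION/{n;s/value: \".*\"/value: \"$2\"/;}" "$1"
}
```

For example, with the manifest saved locally under an assumed file name `installer.yaml`, `set_driver_version installer.yaml 535.54.03 | kubectl apply -f -` would apply the edited copy. The original file is left unchanged, so the result can be reviewed before it is applied.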

To apply the DaemonSet to the Kubernetes cluster, run the following command:

```
kubectl apply -f https://raw.githubusercontent.com/fci-xplat/fke-config/main/fptcloud-gpu-driver-installer.yaml
```

*Check the status of the DaemonSet's Pods*

`kubectl get pod -n kube-system | grep "gpu-driver"`

```
NAME                                                 READY   STATUS    RESTARTS        AGE
fptcloud-gpu-driver-installer-7tj55                  1/1     Running   0               2d17h
```

The `fptcloud-gpu-driver-installer` DaemonSet schedules a pod on every worker in the Worker Group that carries the label `worker.fptcloud/type: gpu`, and each pod installs the driver and toolkit on its node.
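On clusters with several GPU workers, it can help to list only the installer pods that are not yet running. The following is a minimal sketch; `pods_not_running` is a hypothetical helper name, and it assumes the default `kubectl get pod` column order shown above (NAME, READY, STATUS).

```shell
# Hypothetical helper: read `kubectl get pod` output on stdin and print
# the names of gpu-driver pods whose STATUS column is not "Running".
pods_not_running() {
  awk '/gpu-driver/ && $3 != "Running" { print $1 }'
}
```

A possible invocation is `kubectl get pod -n kube-system | pods_not_running`, which prints nothing when every installer pod is running. A `Running` status only means the pod has started; the pod logs still need to be checked to confirm that the installation itself has completed.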

* *Check the logs of the `fptcloud-gpu-driver-installer-7tj55` pod to see whether the installer has finished.*

`kubectl logs fptcloud-gpu-driver-installer-7tj55 -n kube-system`

* The installation usually takes a few minutes. If it succeeds, you will see log lines like the following:

```
Verifying Nvidia installation... DONE. 
Clean Nvidia installation... DONE.
```
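The log check can also be scripted. The sketch below assumes that the two lines shown above are the success markers; `install_finished` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: read pod log text on stdin and succeed (exit 0)
# only if both completion lines shown above are present.
install_finished() {
  log=$(cat)
  # -F treats the pattern as a fixed string, so the dots match literally.
  printf '%s\n' "$log" | grep -qF 'Verifying Nvidia installation... DONE' &&
    printf '%s\n' "$log" | grep -qF 'Clean Nvidia installation... DONE'
}
```

For example, `kubectl logs fptcloud-gpu-driver-installer-7tj55 -n kube-system | install_finished && echo "installation finished"` prints the message only when both markers appear in the log.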

