Available in VPC
You can use GPU nodes as worker nodes in Ncloud Kubernetes Service.
- GPU servers provided by NAVER Cloud Platform do not support NVIDIA's Multi-Instance GPU (MIG) feature. For stable service operation, you should use the default GPU settings provided.
- GPU nodes can only be used in the KR region.
- Clusters made up exclusively of GPU nodes have usage restrictions.
  - Because the default objects in Ncloud Kubernetes Service require general nodes, you need to set up both a general node pool and a GPU node pool.
- On GPU nodes, you can upgrade the SMI/CUDA driver.
  - You can upgrade the driver using a runfile. For detailed instructions, see Reinstall and upgrade GPU driver/CUDA.
Deploy NVIDIA GPU device plugin
GPU resources can't be used in the Kubernetes cluster until the GPU device plugin is deployed.
Once the GPU node's status changes to Running in the cluster, run the following commands to deploy the NVIDIA device plugin:
1. Install Helm
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
&& chmod 700 get_helm.sh \
&& ./get_helm.sh
2. Add NVIDIA device plugin Helm repository
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin \
&& helm repo update
3. Deploy NVIDIA device plugin
Now, deploy the NVIDIA device plugin in the kube-system namespace using the following command.
helm install --generate-name nvdp/nvidia-device-plugin -n kube-system
The Helm chart is provided by NVIDIA. If you encounter any issues, check NVIDIA's official website to see whether the chart has changed.
Alternatively, for a quick start, you can deploy a simple DaemonSet directly to enable GPU resources.
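Because the chart can change over time, pinning a chart version and using a fixed release name can make the deployment easier to reproduce and upgrade later. The following is a minimal sketch, not NAVER Cloud Platform's or NVIDIA's prescribed method: the function name `install_device_plugin` and the release name `nvdp` are hypothetical, and the chart version you pass is an assumption that you should confirm with `helm search repo nvdp --versions`.

```shell
# Sketch: install (or upgrade) the device plugin with a pinned chart version.
# Assumptions: the "nvdp" repo was added in step 2; the release name "nvdp"
# and this function name are illustrative only.
install_device_plugin() {
  local version="${1:?chart version required}"
  helm upgrade --install nvdp nvdp/nvidia-device-plugin \
    --namespace kube-system \
    --version "$version"
}
```

For example, `install_device_plugin 0.17.0` would request chart version 0.17.0, assuming that version is listed in the repository.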
$ kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
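Before checking node resources, it may help to confirm that the device plugin DaemonSet is running on the GPU nodes. The helper below is a sketch under assumptions: the function name is hypothetical, and it assumes the DaemonSet name contains `nvidia-device-plugin`, which is typical for both the Helm chart and the static manifest but is not guaranteed.

```shell
# Sketch: list device plugin DaemonSets in kube-system with their desired
# and ready pod counts. Matching on the name "nvidia-device-plugin" is an
# assumption; adjust the pattern if your release uses a different name.
list_device_plugin_daemonsets() {
  kubectl get daemonsets -n kube-system --no-headers \
    | awk '$1 ~ /nvidia-device-plugin/ { print $1, "desired=" $2, "ready=" $4 }'
}
```

Run `list_device_plugin_daemonsets` and check that the ready count matches the desired count, which normally equals the number of GPU nodes.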
4. Check GPU resources
You can check the GPU resource availability by viewing the details of the node using the following command.
$ kubectl describe node [node-name]
...
Capacity:
  cpu:                4
  ephemeral-storage:  51341792Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             20474628Ki
  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                3920m
  ephemeral-storage:  47316595429
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             17425156Ki
  nvidia.com/gpu:     1
  pods:               110
...
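To avoid scanning the full node description by eye, you could filter it down to the GPU lines. The following sketch defines a hypothetical helper, `gpu_counts`, that reads `kubectl describe node` output from standard input and prints the `nvidia.com/gpu` value from the Capacity and Allocatable sections. It assumes the indented layout that kubectl normally prints.

```shell
# Sketch: extract nvidia.com/gpu from the Capacity and Allocatable sections
# of `kubectl describe node` output read from standard input.
gpu_counts() {
  awk '
    # Remember which of the two sections we are in.
    /^(Capacity|Allocatable):/ { section = $1; sub(/:$/, "", section); next }
    # Any other unindented line starts a different section.
    /^[^ \t]/ { section = "" }
    # Inside a tracked section, report the GPU count.
    section != "" && $1 == "nvidia.com/gpu:" { print section "=" $2 }
  '
}
```

Usage: `kubectl describe node [node-name] | gpu_counts`. If no lines are printed, the node is not advertising GPU resources, which usually means the device plugin is not yet running on it.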
Use device plugin
Kubernetes provides a device plugin framework that lets pods access specialized hardware such as GPUs. For more information on how to use the plugin, see the official website.
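As an illustration of how a pod consumes a GPU through the device plugin, the sketch below writes a minimal pod manifest that requests one `nvidia.com/gpu` in its resource limits. The function name, file name, pod name, and container image tag are all assumptions chosen for illustration; substitute a CUDA image that is compatible with the driver installed on your GPU nodes.

```shell
# Sketch: write a minimal pod manifest that requests one GPU.
# Assumptions: function name, default file name, pod name, and image tag
# are illustrative and not taken from the original guide.
write_gpu_pod_manifest() {
  cat > "${1:-gpu-pod.yaml}" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  restartPolicy: Never
  containers:
    - name: cuda-container
      image: nvidia/cuda:12.4.1-base-ubuntu22.04  # assumed image tag
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1  # GPUs are requested in the limits section
EOF
}
```

After running `write_gpu_pod_manifest gpu-pod.yaml`, you could apply the file with `kubectl apply -f gpu-pod.yaml` and inspect the result with `kubectl logs gpu-pod`. The pod stays Pending if no node has an allocatable `nvidia.com/gpu` resource.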