GPU Server

    Article Summary

    The latest service changes have not yet been reflected in this content. We will update the content as soon as possible. Please refer to the Korean version for information on the latest updates.

    Available in VPC

    This guide describes how to create and manage a GPU server on the NAVER Cloud Platform console.

    Note
    • Set up redundancy across server zones to ensure uninterrupted service in the event of unexpected server malfunctions or scheduled maintenance. See the Load Balancer overview to set up redundancy.
    • NAVER Cloud Platform provides a high availability (HA) structure to prepare for failures in the physical server, such as memory, CPU, or power supply failures. HA is a policy that prevents hardware failures from spreading to the virtual machine (VM) servers. It supports live migration, which automatically moves the VMs on a host server to another healthy host server when a failure occurs on that host. However, when an error occurs that prevents live migration from being initiated, the VM server is rebooted. If your service runs on a single VM server, set up redundancy across VM servers as described above to reduce the impact of failures caused by such reboots.

    Check server information

    You can view GPU server information in the same way as regular server information. For more information, see Check server information.

    Caution

    For GPU servers, fees are charged even while the server is stopped.

    Create server

    You can create a GPU server in Services > Compute > Server on the console. For more information on how to create a server, see Create server.

    Note
    • For GPU A100, create the server in Services > Compute > Bare Metal Server. For more information on how to create one, see Create GPU A100 server.
    • Corporate members can create up to 5 GPU servers. If you need more GPU servers, or if you are an individual member who needs to create a GPU server, submit an inquiry to Customer support.

    Manage server

    You can manage a GPU server and change its settings in the same way as for a regular server. For more information, see Manage server.

    Note
    • The specifications of a GPU server can only be changed to those of another server of the same type.
    • Once a GPU server is created, it cannot be converted to a regular server by removing the GPU. To switch to a regular server, create a server image and use that image to create a new regular server.
    • You can use a server image created from a regular server to create a GPU server.

    Re-install and upgrade GPU driver and CUDA

    You can re-install the GPU driver or CUDA on a GPU server in the following situations:

    • The OS kernel version has changed (been updated) and is no longer compatible with the current GPU driver: re-install the GPU driver only. A quick way to check this is sketched after the note below.
    • You need to upgrade an older GPU driver (such as version 418.67) currently in use to the latest driver provided on NAVER Cloud Platform.
    • You need to upgrade the driver to a specific version.
    Note
    • If you re-install the driver to a specific version, you may not be able to receive official support for problems that occur during the process.
    • Downgrading the driver to a version lower than the one currently provided on NAVER Cloud Platform is not recommended.
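
    A quick way to check whether the running kernel and the currently loaded driver still match is sketched below. This is a minimal sketch using standard Linux commands; /proc/driver/nvidia/version exists only while the NVIDIA kernel module is loaded, and the exact error text reported by nvidia-smi can vary by driver version.

      # uname -r                           <-- running kernel version
      # cat /proc/driver/nvidia/version    <-- version of the currently loaded NVIDIA kernel module
      # nvidia-smi                         <-- typically fails with a driver/library version mismatch error when the driver and kernel no longer match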

    See the following guides for the OS you are using.

    Re-install GPU driver (Linux)

    To re-install the GPU driver, you can simply run the script for auto-installation.
    If automatic re-installation fails, you can re-install the driver manually.

    Automatic re-installation

    To download and run a script file for automatic re-installation of the GPU driver, do the following:

    1. Enter the wget http://init.ncloud.com/gpu/ncp_gpu_reinstall.sh command to download the script file.
    2. Enter the ./ncp_gpu_reinstall.sh command to delete the existing GPU driver. If the script is not executable, grant execute permission first with chmod +x ncp_gpu_reinstall.sh.
      # ./ncp_gpu_reinstall.sh
      This will delete current NVIDIA driver. Are you sure? [y/n]y
      
      --2022-07-25 14:56:30-- http://init.ncloud.com/gpu/nvidia_driver/nvidia-linux-driver.latest
      Resolving init.ncloud.com (init.ncloud.com)... 169.254.1.5
      Connecting to init.ncloud.com (init.ncloud.com)|169.254.1.5|:80... connected.
      HTTP request sent, awaiting response... 200 OK
      Length: 273219658 (261M) [text/plain]
      Saving to: '/root/nvidia-linux-driver.latest'
      
      nvidia-linux-driver.latest 100%[=================================================>] 260.56M 112MB/s in 2.3s
      
      2022-07-25 14:56:32 (112 MB/s) - '/root/nvidia-linux-driver.latest' saved [273219658/273219658]
      
      Verifying archive integrity... OK
      Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 470.57.02............
      
      The current NVIDIA driver has been deleted.
      Please reboot the server and run this script again to reinstall new NVIDIA driver.
      
    3. Reboot the server.
    4. Re-enter the ./ncp_gpu_reinstall.sh command to re-install the GPU driver.
      # ./ncp_gpu_reinstall.sh
      This will install a new NVIDIA driver version : 470.57.02. Are you sure? [y/n]y
      Verifying archive integrity... OK
      
      (Omitted)
      
      Installation of the kernel module for the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version 470.57.02) is now complete.
      
      New NVIDIA driver installed.
      Check the driver version. (via 'nvidia-smi' command.)
      
      
      Mon Jul 25 14:59:01 2022
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla T4            Off  | 00000000:00:05.0 Off |                   0* |
      | N/A   41C    P0    25W /  70W |      0MiB / 15109MiB |      3%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
      

    Manual re-installation

    If you are unable to run automatic re-installation using the script, you can re-install the GPU driver manually as follows:

    1. Download the driver file of the version you wish to re-install or upgrade the driver to.

      • <example> Default version 470.57.02 provided on NAVER Cloud Platform
      # wget https://kr.download.nvidia.com/tesla/470.57.02/NVIDIA-Linux-x86_64-470.57.02.run
      # chmod +x NVIDIA-Linux-x86_64-470.57.02.run
      
      • <example> Different version: 510.47.03
      # DRIVER_VERSION=510.47.03
      # wget https://kr.download.nvidia.com/tesla/${DRIVER_VERSION}/NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run
      # chmod +x NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run
      
    2. Enter the following command to delete the existing GPU driver.

      # ./NVIDIA-Linux-x86_64-470.57.02.run --uninstall -s
      Verifying archive integrity... OK
      Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 
      470.57.02............................................................................................................................................................
      #
      
    3. Reboot the server.

    4. Enter the following command to install the new GPU driver.

      # ./NVIDIA-Linux-x86_64-470.57.02.run -a --ui=none --no-questions --accept-license
      Verifying archive integrity... OK
      Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 470.57.02............................................................................................................................................................
      
      Welcome to the NVIDIA Software Installer for Unix/Linux
      
      (Omitted)
      
      Installation of the kernel module for the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version 470.57.02) is now complete.
      
    5. Reboot the server.

    6. Enter the nvidia-smi command to check the version of the installed driver and the model and number of recognized GPU cards.

      # nvidia-smi
      Wed Jun 22 19:34:19 2022
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla T4            Off  | 00000000:00:05.0 Off |                  Off |
      | N/A   40C    P0    26W /  70W |      0MiB / 16127MiB |      3%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
      
    Note

    When you run the nvidia-smi command, the following information is output.

    • Driver Version: version of the installed driver
    • CUDA Version: CUDA API version supported by the driver
    • Name: GPU model name
    • Temp: GPU core temperature
    • Perf: GPU performance state
      • Ranges from P0 to P12; a lower number means higher performance
      • Changes dynamically according to GPU temperature and power usage
    • Pwr:Usage/Cap: current GPU power usage / power cap
    • Memory-Usage: GPU memory usage (current usage / total GPU memory)
    • Volatile GPU-Util: GPU utilization rate
    • Uncorr. ECC: number of uncorrectable ECC (Error Correction Code) errors
      • ECC is disabled by default on the GPU VMs provided on NAVER Cloud Platform for maximum performance
    • MIG M.: MIG (Multi-Instance GPU) mode status
      • Not applicable to the P40, T4, and V100 GPUs provided on NAVER Cloud Platform
    • Processes: information about the processes currently using the GPU
      • GPU: index of the GPU on which the process is running
      • GI ID/CI ID: IDs of the GPU instance and compute instance partitioned through MIG (Multi-Instance GPU)
      • PID, Process name: ID and name of the process
      • Type: C (Compute) for CUDA/OpenCL processes, G (Graphics) for DirectX/OpenGL processes
      • GPU Memory Usage: GPU memory used by the process

    Re-install CUDA (Linux)

    CUDA operates properly only when cuDNN is re-installed as well. To re-install both, do the following:

    1. Connect to the CUDA Toolkit download website.

    2. Select the CUDA Runtime installation file for the version you want to install to get the download link.

      • For the installation type, select runfile (local), which does not depend on the OS.
      • <example> Default version CUDA 11.2.2 provided on NAVER Cloud Platform
        # wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
        # chmod +x cuda_11.2.2_460.32.03_linux.run
        
    3. Check where the existing /usr/local/cuda symbolic link points, and delete the directory of the existing version.

      • This deletes the existing CUDA Toolkit and cuDNN.
      # ll /usr/local/cuda
      lrwxrwxrwx 1 root root 21 Jul 4 11:02 /usr/local/cuda -> /usr/local/cuda-11.x/
      # rm -rf /usr/local/cuda-11.x
      
    4. Enter the following command to re-install CUDA Toolkit.

      # ./cuda_11.2.2_460.32.03_linux.run --toolkit --toolkitpath=/usr/local/cuda-11.2 --samples --samplespath=/usr/local/cuda-11.2/samples --silent
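
      If the nvcc check in the next step fails with "command not found", the /usr/local/cuda symbolic link or the PATH entry for the new toolkit may be missing. The following is a minimal sketch for restoring them; whether the silent installer has already recreated the symbolic link can vary, so check before recreating it.

      # ls -l /usr/local/cuda                          <-- check where the symbolic link points
      # ln -sfn /usr/local/cuda-11.2 /usr/local/cuda   <-- recreate the link if it is missing or stale
      # export PATH=/usr/local/cuda/bin:$PATH          <-- make nvcc available in the current shell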
      
    5. Check the version of the re-installed CUDA.

      # nvcc --version
      nvcc: NVIDIA (R) Cuda compiler driver
      Copyright (c) 2005-2021 NVIDIA Corporation
      Built on Sun_Feb_14_21:12:58_PST_2021
      Cuda compilation tools, release 11.2, V11.2.152 <-- CUDA Runtime version
      Build cuda_11.2.r11.2/compiler.29618528_0
      
    6. Connect to the cuDNN download website to get the download link.

    7. Download cuDNN from the link.

      • <example> Default version cuDNN 8.1.1.33 provided on NAVER Cloud Platform
      # wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.1.33/11.2_20210301/cudnn-11.2-linux-x64-v8.1.1.33.tgz
      
    8. cuDNN has no separate installer; installation is complete once the archive is extracted into the directory where CUDA is installed. Run the following commands to install it.

      # cd /root
      # tar -xzvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
      # cp cuda/include/cudnn* /usr/local/cuda/include
      # cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
      # chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
      
    9. Check the version of cuDNN installed.

      • For cuDNN 8.x
      # cat /usr/local/cuda/include/cudnn_version.h | grep -A2 MAJOR
      #define CUDNN_MAJOR 8
      #define CUDNN_MINOR 1
      #define CUDNN_PATCHLEVEL 1
      
      • For cuDNN 7.x
      # cat /usr/local/cuda/include/cudnn.h | grep -A2 MAJOR
      #define CUDNN_MAJOR 7
      #define CUDNN_MINOR 6
      #define CUDNN_PATCHLEVEL 5
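
    Applications load cuDNN at run time, so after copying the files, make sure the dynamic linker can find /usr/local/cuda/lib64. The following is a minimal sketch, assuming no existing ld.so.conf entry already covers that path; the cuda.conf file name is only an example.

      # ldconfig -p | grep libcudnn                                     <-- check whether the linker already finds cuDNN
      # echo "/usr/local/cuda/lib64" > /etc/ld.so.conf.d/cuda.conf      <-- register the library path
      # ldconfig                                                        <-- refresh the linker cache
      # export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH   <-- alternative for the current shell only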
      

    Re-install GPU driver (Windows)

    To re-install the GPU driver, you can simply run the script for auto-installation.
    If automatic re-installation fails, you can re-install the driver manually.

    Automatic re-installation

    To download and run a script file for automatic re-installation of the GPU driver, do the following:

    1. Enter the following command to download the script file.

      Start-BitsTransfer -Source "http://init.ncloud.com/win_gpu/install_gpu.exe" -Destination "c:\install_gpu.exe"
      
    2. Run the install_gpu.exe file.

      • The Nvidia GPU driver install pop-up window appears, and installation takes about 10-15 minutes.
    3. When the Installation complete pop-up appears, reboot the server.

    4. Open the Run dialog and enter devmgmt.msc to open the Device Manager console.

    5. In the Device Manager console, double-click the NVIDIA graphics card under Display Adapters.

    6. On the [Driver] tab of the Properties pop-up, check the driver version.

    7. Open the cmd window, enter cd C:\Program Files\NVIDIA Corporation\NVSMI to move to the driver directory, and then enter nvidia-smi.

      • You can see that the graphics card has been recognized.
      • <example> One Tesla T4 has been recognized
      C:\Users\Administrator>cd C:\Program Files\NVIDIA Corporation\NVSMI
      
      C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi
      Fri Jul 24 13:14:57 2022
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 461.33       Driver Version: 461.33       CUDA Version: 11.2     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla T4            TCC  | 00000000:00:05.0 Off |                  Off |
      | N/A   30C    P8     9W /  70W |      0MiB / 16225MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
      
    Note

    When you run the nvidia-smi command, the following information is output.

    • Driver Version: version of the installed driver
    • CUDA Version: CUDA API version supported by the driver
    • Name: GPU model name
    • Temp: GPU core temperature
    • Perf: GPU performance state
      • Ranges from P0 to P12; a lower number means higher performance
      • Changes dynamically according to GPU temperature and power usage
    • Pwr:Usage/Cap: current GPU power usage / power cap
    • Memory-Usage: GPU memory usage (current usage / total GPU memory)
    • Volatile GPU-Util: GPU utilization rate
    • Uncorr. ECC: number of uncorrectable ECC (Error Correction Code) errors
      • ECC is disabled by default on the GPU VMs provided on NAVER Cloud Platform for maximum performance
    • MIG M.: MIG (Multi-Instance GPU) mode status
      • Not applicable to the P40, T4, and V100 GPUs provided on NAVER Cloud Platform
    • Processes: information about the processes currently using the GPU
      • GPU: index of the GPU on which the process is running
      • GI ID/CI ID: IDs of the GPU instance and compute instance partitioned through MIG (Multi-Instance GPU)
      • PID, Process name: ID and name of the process
      • Type: C (Compute) for CUDA/OpenCL processes, G (Graphics) for DirectX/OpenGL processes
      • GPU Memory Usage: GPU memory used by the process

    Manual re-installation

    If you are unable to run automatic re-installation using the script, you can re-install the GPU driver manually as follows:

    1. Download the driver file of the version you wish to re-install or upgrade to from the GPU driver download website.
    2. Run the downloaded GPU driver EXE file to install the driver.
      • Follow the instructions on the installer pop-up.
      • You must agree to the software's Terms of service to be able to use it.
      • For Installation option, select Express.
    3. Reboot the server.
    4. Open the Run dialog and enter devmgmt.msc to open the Device Manager console.
    5. In the Device Manager console, double-click the NVIDIA graphics card under Display Adapters.
    6. On the [Driver] tab of the Properties pop-up, check the driver version.

    Re-install CUDA (Windows)

    CUDA operates properly only when cuDNN is re-installed as well. To re-install both, do the following:

    1. Connect to the CUDA Toolkit download website.

    2. Set the platform and click the link to download the EXE file.

    3. Run the downloaded CUDA EXE file to install CUDA.

      • Follow the instructions on the installer pop-up.
      • You must agree to the software's Terms of service to be able to use it.
      • For Installation option, select Express.
    4. Log in to the cuDNN download website and download the cuDNN file of the desired version.

      Note

      Only members can download cuDNN. If you don't have an account, sign up and log in.

    5. Unzip the downloaded ZIP file and replace the bin, include, and lib folders in the CUDA 11.2.2 installation path with the folders of the same names from the ZIP file.

    6. Open the cmd window, enter cd C:\Program Files\NVIDIA Corporation\NVSMI to move to the driver directory, and then enter nvidia-smi.

      • You can see that the graphics card has been recognized.
      • <example> One Tesla T4 has been recognized
      C:\Users\Administrator>cd C:\Program Files\NVIDIA Corporation\NVSMI
      
      C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi
      Fri Jul 24 13:14:57 2022
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 461.33       Driver Version: 461.33       CUDA Version: 11.2     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name             TCC/WDDM| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla T4            TCC  | 00000000:00:05.0 Off |                  Off |
      | N/A   30C    P8     9W /  70W |      0MiB / 16225MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
      
    Note

    When you run the nvidia-smi command, the following information is output.

    • Driver Version: version of the installed driver
    • CUDA Version: CUDA API version supported by the driver
    • Name: GPU model name
    • Temp: GPU core temperature
    • Perf: GPU performance state
      • Ranges from P0 to P12; a lower number means higher performance
      • Changes dynamically according to GPU temperature and power usage
    • Pwr:Usage/Cap: current GPU power usage / power cap
    • Memory-Usage: GPU memory usage (current usage / total GPU memory)
    • Volatile GPU-Util: GPU utilization rate
    • Uncorr. ECC: number of uncorrectable ECC (Error Correction Code) errors
      • ECC is disabled by default on the GPU VMs provided on NAVER Cloud Platform for maximum performance
    • MIG M.: MIG (Multi-Instance GPU) mode status
      • Not applicable to the P40, T4, and V100 GPUs provided on NAVER Cloud Platform
    • Processes: information about the processes currently using the GPU
      • GPU: index of the GPU on which the process is running
      • GI ID/CI ID: IDs of the GPU instance and compute instance partitioned through MIG (Multi-Instance GPU)
      • PID, Process name: ID and name of the process
      • Type: C (Compute) for CUDA/OpenCL processes, G (Graphics) for DirectX/OpenGL processes
      • GPU Memory Usage: GPU memory used by the process

    Collection/delivery of diagnostic data through NTK

    You can collect and save the NVIDIA debug logs of a GPU VM through Ncloud Tool Kit (NTK).
    The process of collecting and forwarding debug logs is as follows:

    1. Run NTK
    2. Collect GPU debug logs

    Note

    For more information on NTK, see Ncloud Tool Kit (Linux/Windows).

    1. Run NTK

    To run NTK on the Linux server, do the following:

    1. Enter the cd /usr/local/etc command.
      • This moves you to the directory where NTK is located.
    2. Enter the tar zxvf ntk.tar.gz command.
      • This unzips the NTK archive.
      • If the ntk.tar.gz file does not exist, or if you wish to replace the existing file with the latest version, enter wget -P /usr/local/etc http://init.ncloud.com/server/ntk/linux/xen/ntk.tar.gz to download it.
    3. Enter the /usr/local/etc/ntk/ntk command to run NTK.

    2. Collect GPU debug logs

    The following describes how to collect GPU debug logs in NTK.

    1. On the main screen of NTK, select E EXECUTE - << Run System Apps >>.

    2. Select G GPU DEBUG COLLECTING - FOR LOG COLLECT >>.

    3. Select Yes to run the log collection script.

    4. When a log collection success message and the log file storage path are displayed, check the details and select Ok.

    5. Select whether to transfer the log file to NAVER Cloud's technical support center.

    • If you want to transfer the log file, select Yes. The file transfer starts. Once the file is transferred, a success message and a short URL where you can download the log are displayed.
    • If you don't want to transfer the log file, select No to finish.

    Transfer created log

    The following describes how to transfer created logs to NAVER Cloud's technical support center:

    Note

    If you are unable to transfer the log file to NAVER Cloud's technical support center due to a network issue, attach the log file stored in the VM and send it instead.

    • Log file storage path: /usr/local/etc/ntk/logs/gpu_get_log
    1. On the main screen of NTK, select V VIEW - << View & Upload Logs >>.

    2. Select G - GPU DEBUG FILES.

    3. Check the list of log files created and select the log files to transfer to NAVER Cloud's technical support center.

    4. Select Yes.

      • The file transfer starts. Once the file is transferred, a success message and a short URL where you can download the log are displayed.

    GPU debug log file types

    The following are the GPU log files created through NTK. The command used to create each file is shown in parentheses.

    • date.log (date): outputs the log creation date and time
    • dmesg-xid.log (dmesg | grep -i xid): outputs kernel messages containing Xid events
    • dmesg.log (dmesg): outputs kernel messages
    • free.log (free -m): outputs memory usage in MB
    • last.log (last): outputs login and reboot history
    • ps.log (ps auxf): checks the process status
    • top.log (top -b -n 1): outputs top (run once in batch mode) and system information
    • uptime.log (uptime): outputs the uptime result
    • nvidia-bug-report.log.gz (nvidia-bug-report.sh): output of the nvidia-bug-report.sh script
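
    If you want to review the collected files yourself before (or instead of) transferring them, you can inspect them directly on the server. The following is a brief sketch assuming the storage path mentioned above; the logs may also be grouped into a per-collection subdirectory.

      # cd /usr/local/etc/ntk/logs/gpu_get_log
      # ls -l                              <-- list the collected log files
      # cat dmesg-xid.log                  <-- kernel messages containing Xid events
      # zless nvidia-bug-report.log.gz     <-- view the full NVIDIA bug report (press q to quit)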

    Monitoring GPU resources

    You can use Cloud Insight to monitor the GPU resources. For more information on Cloud Insight, see the Cloud Insight user guide.

    View dashboard

    Select Service Dashboard/Server dashboard in the Services > Management & Governance > Cloud Insight > Dashboard menu to view the default metrics related to servers at a glance.

    • Click the [Change widget data] button to filter the data to be displayed on the widget.
    • The metrics you can check in relation to GPU servers are as follows:
      • Current GPU MEM Usage (GPU/vmem_usage(%)): GPU memory usage in percent
      • Current GPU MEM Usage (GPU/vmem_usage(MiB)): GPU memory usage in MiB
      • Current GPU Usage (GPU/Usage(%)): GPU usage in percent

    For more information on how to view the dashboard, see View Cloud Insight dashboard.

    Add user dashboard

    You can add user dashboards to monitor only the metrics you want.
    Click the [Create dashboard] button to create a new dashboard, and then click the [Add widget] button to set the types of widgets and metrics information to be displayed.

    • To create a widget related to a GPU server, you must select Server as the Product Type when setting the data.
    • If you use a GPU-related metric as the data, you must add as many dimensions (gpu_idx) as the number of GPUs.

    For more information on how to additionally create the dashboard, see Create Cloud Insight dashboard.

