Using the AI Forest CLI

release/20240425

Available in VPC

AI Forest provides a command line interface (CLI). With the AI Forest CLI (hereafter AI CLI), you can run the deep learning solutions and programs of your choice on dynamically assigned GPUs. This guide explains how to use AI CLI by following a usage scenario.

Note

Currently, AI CLI can only be used in Linux environments.
This scenario assumes that the OS is CentOS 7.x.

Preparations

Before using AI CLI, you must create a Data Forest account and a workspace, and configure the cluster environment.

Connect to the NAVER Cloud Platform console.
From the Services > Big Data & Analytics > Data Forest > Account menu, click the [Create account] button.
Create the account that will submit DL apps.
From the Services > Big Data & Analytics > Data Forest > AI Forest > Workspace menu, click the [Create workspace] button to create a workspace.
Configure the environment so that the Data Forest cluster can be accessed.
- Refer to Getting started with Data Forest for how to configure the environment.

Using AI CLI

To use AI CLI, you must connect to a VM server in the VPC environment. In addition, the environment configuration described in Preparations must be complete so that the Data Forest cluster can access the VM server in the VPC environment. The steps below explain how to submit a DL app with AI CLI and check on it.

Step 1. Download AI CLI

Download the AI CLI executable file.

$ wget http://dist.kr.df.naverncp.com/repos/release/df-env/dist/df-aicli
$ chmod +x ./df-aicli

By default, the permission for all files is set to root, which is the DF_USER, and DF_USER_HOME is set to /root.

The commands available in AI CLI are as follows.

AI CLI

Usage:

$ ./df-aicli COMMAND [ARGS]...

Options:

Commands:

app: the command for managing the DL app and submitting jobs
docker-images: available Docker image information

AI CLI app

Usage:

$ ./df-aicli app COMMAND [ARGS]...

Options:

Commands:

kill: Stop a running job.
list: Get a list of running jobs.
status: Get a specific job's details.
submit: Run a DL app.

Step 2. Run Jupyter Notebook

Jupyter Notebook is an open source web application. It provides a development environment in which code can be written and run in various programming languages, including Python. By default it offers an interactive shell, similar to Python IDLE, and lets you manage code together with its execution results as a single document. You can use Jupyter Notebook to write and edit machine learning and deep learning code.
For more information, refer to the official Jupyter Notebook documentation.

You can run Jupyter Notebook instances based on Docker containers in AI Forest.

Run AI CLI Jupyter app

Run the Jupyter Notebook DL app.

Usage:

$ ./df-aicli app submit jupyter [OPTIONS]

Options:

--workspace TEXT: space to save sources for a job [required]
--account TEXT: Data Forest account [required]
--docker-image TEXT: Docker image to use [required]
--input-path TEXT: HDFS path of the input data [required]
--output-path TEXT: HDFS path of the output data [required]

Note

The following are the Docker images available when running Jupyter Notebook.

notebook_tensorflow_2.3.1:20220414
notebook_pytorch_1.7:20220414

Output:
Jupyter Notebook DL app execution information

name: DL app name
id: DL app ID

$ ./df-aicli app submit jupyter --account df-user --workspace ws --docker-image notebook_tensorflow_2.3.1:20220414 --input-path data_in --output-path data_out

dlapp jupyter is submitted
<name>    <id>
--------  ------------------------------
jupyter-ojxh   application_1

After you submit a Jupyter Notebook job, the notebook is ready to use once the DL app's status changes to Running.
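
The app ID printed by the submit command is what the app status and app kill commands take as --app-id. The following is a minimal sketch of pulling that ID out of the submit output with awk. It assumes the output keeps the two-column layout shown above; the variable and function names are hypothetical, and the sample text is copied from the example output rather than produced by a live call.

```shell
# Sample text copied from the submit output shown above.
# In practice it could be captured with:
#   submit_output=$(./df-aicli app submit jupyter ...)
submit_output='dlapp jupyter is submitted
<name>    <id>
--------  ------------------------------
jupyter-ojxh   application_1'

# Assumption: the data row is the only line whose second field
# starts with "application_".
extract_app_id() {
  printf '%s\n' "$1" | awk '$2 ~ /^application_/ { print $2; exit }'
}

app_id=$(extract_app_id "$submit_output")
echo "$app_id"
```

The extracted value can then be passed on, for example as --app-id "$app_id" to ./df-aicli app status.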

Access Jupyter Notebook web UI

The following describes how to access Jupyter Notebook's web UI.

Complete the quick links access settings.
- The procedure is the same as for Data Forest quick links. Create an SSL VPN and connect to the VPC server via SSH tunneling; you can then access the Jupyter Notebook web UI. Refer to the guide below for detailed configuration instructions.
Check the web UI URL and token information in AI CLI.

Web URL: the Notebook quick link URL shown by the df-aicli app status command
Token: found in the container's stderr.txt file

$ ./df-aicli app status --account df-user --app-id application_1643186470613_0577
<account>    <id>                            <name>        <status>
-----------  ------------------------------  ------------  ----------
df-user     application_1643186470613_0577  jupyter-irpw  RUNNING

<quicklink>    <url>
-------------  --------------------------------------------------------------------------------
Shell          http://df-user.jupyter-irpw.worker-0.9000.proxy.kr.df.naverncp.com
Notebook       http://df-user.jupyter-irpw.worker-0.8888.proxy.kr.df.naverncp.com/?token=

@see https://gnode001.kr.df.naverncp.com:9044/node/containerlogs/container_e814_1643186470613_0577_01_000002/df-user/stderr.txt/?start=0 <- check jupyter notebook token

Open the URL in a browser and enter the token.
You can then create and edit machine learning code through the Jupyter Notebook web UI.
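
Both the Notebook URL and the location of the token log appear in the app status output, so they can be picked out with a short script. The sketch below assumes the output format matches the example above, that is, a line starting with Notebook and a line starting with @see. The variable names are hypothetical, and the sample text is copied from the example rather than fetched from a running app.

```shell
# Sample text copied from the status output shown above.
# In practice: status_output=$(./df-aicli app status --account ... --app-id ...)
status_output='<quicklink>    <url>
-------------  --------------------------------------------------------------------------------
Shell          http://df-user.jupyter-irpw.worker-0.9000.proxy.kr.df.naverncp.com
Notebook       http://df-user.jupyter-irpw.worker-0.8888.proxy.kr.df.naverncp.com/?token=

@see https://gnode001.kr.df.naverncp.com:9044/node/containerlogs/container_e814_1643186470613_0577_01_000002/df-user/stderr.txt/?start=0 <- check jupyter notebook token'

# The quick link table has the link name in field 1 and the URL in field 2.
notebook_url=$(printf '%s\n' "$status_output" | awk '$1 == "Notebook" { print $2; exit }')

# The "@see" line points at the container log that contains the token.
token_log_url=$(printf '%s\n' "$status_output" | awk '$1 == "@see" { print $2; exit }')

echo "Notebook URL:  $notebook_url"
echo "Token log URL: $token_log_url"
```

The token itself still has to be read from stderr.txt at the printed log URL and appended to the Notebook URL or entered in the browser.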

Step 3. Run DL app

You can have GPU resources assigned to run the code you wrote in Jupyter Notebook. At the moment, DL apps can only be submitted as single batch jobs. First check which Docker image versions are available, then submit the DL app with the AI CLI single batch app command.

Get AI CLI Docker image list

View the list of Docker image versions for the job to run.

Usage:

$ ./df-aicli docker-images list [OPTIONS]

Output:
Docker image name list

docker image: name of the Docker image that can be run

$ ./df-aicli docker-images list
<docker image>
----------------------
jupyter:1.1
pytorch:2.2
tensorflow:3.3

Note

The following are the Docker images available when running single batch jobs.

notebook_tensorflow_2.3.1:20220414
notebook_pytorch_1.7:20220414
pytorch:v0.2.0
pytorch:v0.3.0
pytorch:v0.3.0-cuda9.0
pytorch:v0.4.0
pytorch:v0.4.1
pytorch:v1.0.0
pytorch:v1.1.0
pytorch:v1.1.0-cuda10
pytorch:v1.2.0-cuda10
pytorch:v1.4
pytorch:v1.7
tensorflow:r1.10
tensorflow:r1.10-py3
tensorflow:r1.11
tensorflow:r1.12
tensorflow:r1.12-py3
tensorflow:r1.14
tensorflow:r1.14-py3
tensorflow:r1.15
tensorflow:r1.15-py3
tensorflow:r1.3
tensorflow:r1.4
tensorflow:r1.4-py3
tensorflow:r1.5-py3
tensorflow:r1.6
tensorflow:r1.6-py3
tensorflow:r1.7
tensorflow:r1.7-py3
tensorflow:r1.8
tensorflow:r2.1
tensorflow:r2.1-py3
tensorflow:r2.3.1-py3
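
Before submitting a job, it can be useful to confirm that the image you plan to pass to --docker-image is actually listed. The following is a minimal sketch that checks an image name against the docker-images list output. It assumes each image is printed on its own line with no trailing whitespace; the function name is hypothetical, and the sample list is copied from the example output above.

```shell
# Sample text copied from the docker-images list output shown above.
# In practice: image_list=$(./df-aicli docker-images list)
image_list='<docker image>
----------------------
jupyter:1.1
pytorch:2.2
tensorflow:3.3'

# Succeeds only if the image name matches a whole line exactly.
# $1: list text, $2: image name
# grep -F: fixed string, -x: whole-line match, -q: no output
image_available() {
  printf '%s\n' "$1" | grep -Fxq -- "$2"
}

if image_available "$image_list" "pytorch:2.2"; then
  echo "pytorch:2.2 is listed"
else
  echo "pytorch:2.2 is not listed"
fi
```

The whole-line match keeps a partial name such as pytorch:2 from being accepted by accident.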

Run AI CLI single batch app

Run a single batch DL app.

Usage:

$ ./df-aicli app submit single [OPTIONS]

Options:

--name-prefix TEXT: name of the job to run [required]
--workspace TEXT: space to save sources for a job [required]
--command TEXT: script or command to run [required]
--account TEXT: Data Forest account [required]
--docker-image TEXT: Docker image to use [required]
--input-path TEXT: HDFS path of the input data [required]
--output-path TEXT: HDFS path of the output data [required]

Output:
Information of the DL app run

name: name of the DL app run
id: ID of the DL app run

$ ./df-aicli app submit single --name-prefix job --account df-user --workspace ws --command "sh run.sh" --docker-image tensorflow:r2.1 --input-path data_in --output-path data_out 

dlapp sb is submitted
<name>    <id>
--------  ------------------------------
sb-dwjw   application_1643186470613_0581

Note

Migrating DL app job result files to HDFS

Create a folder named data_out in the workspace created in the console, or create it from your code.
Write your code so that the DL app's result files are created in data_out.
When running df-aicli app submit single, specify the desired HDFS path with the --output-path option.
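
As an illustration of the note above, here is a sketch of what a run.sh started with --command "sh run.sh" might look like. The script is hypothetical. It assumes the job starts in the workspace root so that data_out can be used as a relative path, and the training step is replaced by a placeholder.

```shell
#!/bin/sh
# Hypothetical run.sh, started by: --command "sh run.sh"
# Assumption: the job starts in the workspace root, so data_out is a relative path.
OUT_DIR="data_out"

# Create the result folder from code in case it was not created in the console.
mkdir -p "$OUT_DIR"

# Placeholder for the real training step (for example, a Python training script).
echo "job finished" > "$OUT_DIR/result.txt"

echo "results written to $OUT_DIR/result.txt"
```

Any file the job writes under data_out is then a candidate for migration to the HDFS path given in --output-path.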

Step 4. View the list of DL apps and their details

You can view the list of DL apps and the details of each one. A DL app appears in the list only while its status is Running.

View AI CLI app list

View the list of DL apps that are running.

Usage:

$ ./df-aicli app list [OPTIONS]

Options:

--account TEXT: Data Forest account [required]

Output:
The list of DL apps that are running

name: DL app name
id: DL app ID

$ ./df-aicli app list --account df-user
<name>        <id>
------------  ------------------------------
ss-qt2u-siww  application_1643186470613_0561
jupyter-sbn6  application_1643186470613_0573
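
To act on every running app, the IDs can be collected from the list output and looped over. The sketch below assumes the two-column layout shown above and only prints the follow-up commands instead of running them. The variable names are hypothetical, and the sample text is copied from the example output.

```shell
# Sample text copied from the app list output shown above.
# In practice: list_output=$(./df-aicli app list --account df-user)
list_output='<name>        <id>
------------  ------------------------------
ss-qt2u-siww  application_1643186470613_0561
jupyter-sbn6  application_1643186470613_0573'

# Keep only rows whose second field looks like an app ID; this skips
# the header and the separator line.
app_ids=$(printf '%s\n' "$list_output" | awk '$2 ~ /^application_/ { print $2 }')

# Dry run: print the status command for each app instead of executing it.
for id in $app_ids; do
  echo "./df-aicli app status --account df-user --app-id $id"
done
```

Removing the echo would run the status command for each ID in turn.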

View AI CLI app details

View details for a running DL app.

Usage:

$ ./df-aicli app status [OPTIONS]

Options:

--account TEXT: Data Forest account [required]
--app-id TEXT: ID of the submitted DL app [required]

Output:
The running DL app's details

account: Data Forest account that ran the DL app
name: DL app name
id: DL app ID
status: DL app status
quicklink: name of the quick link that connects to the DL app
url: URL of the quick link that connects to the DL app

$ ./df-aicli app status --account df-user --app-id application_1643186470613_0575
<account>    <id>                            <name>    <status>
-----------  ------------------------------  --------  ----------
df-user     application_1643186470613_0575  sb-ojxh   RUNNING

<quicklink>    <url>
-------------  -------------------------------------------------------------------
Shell          http://df-user.sb-ojxh.worker-0.9000.proxy.kr.df.naverncp.com
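
A script that waits for a job or decides whether to end it needs the status field from this output. The following is a minimal sketch that reads the status column, assuming the four-column layout shown above. The variable names are hypothetical, and the sample text is copied from the example output.

```shell
# Sample text copied from the status output shown above.
# In practice: status_output=$(./df-aicli app status --account ... --app-id ...)
status_output='<account>    <id>                            <name>    <status>
-----------  ------------------------------  --------  ----------
df-user     application_1643186470613_0575  sb-ojxh   RUNNING'

# The data row is the one whose second field is an app ID; status is field 4.
app_status=$(printf '%s\n' "$status_output" | awk '$2 ~ /^application_/ { print $4; exit }')

case "$app_status" in
  RUNNING) echo "app is running" ;;
  *)       echo "app is not running (status: ${app_status:-unknown})" ;;
esac
```

Only RUNNING is matched explicitly because it is the only status value that appears in this guide; any other value falls through to the default branch.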

Step 5. End DL app

End a DL app. Only DL apps in the Running status can be ended.

End AI CLI app

Usage:

$ ./df-aicli app kill [OPTIONS]

Options:

--account TEXT: Data Forest account [required]
--app-id TEXT: ID of the submitted DL app [required]

Output:

ID of the DL app that has been ended

$ ./df-aicli app kill --account df-user --app-id application_1643186470613_0530
request kill app-id: ['application_1643186470613_0530']
