Available in VPC
This section describes the Data Manager interface. In Data Manager, you can view the list of datasets in the workspace and their details.
- You can refer to a dataset uploaded to the data manager from different projects within the workspace.
- In the data manager interface, you can only view the dataset list and details.
- Use the ML expert Platform SDKs for tasks such as uploading or deleting datasets and creating tags and branches in the data manager.
View data manager list
The list shows the following information about your saved datasets:

- Dataset Title: Name you set when uploading the dataset.
- Creation date and time: Date and time when the dataset was first created.
- Operation: Click [dataset detail] to go to the view details interface.
View data manager details
You can view the details about the dataset you selected. The details are divided into tabs.
Overview
View the metadata of the dataset you selected.
Files and Versions
You can view the list of files in each directory of the selected dataset.
Use data manager SDKs
The data manager SDKs provide a Python-based Huggingface Dataset interface.
To upload/download the dataset through SDKs:
Install SDKs
You can install SDKs using the following commands:
pip install "ncloud-mlx[data-manager]" # double quotes are required
Prerequisites
To use the SDKs, you need an API key and the MLX endpoint. Create an API key, then enter it to complete the prerequisites. You can set the endpoint URL in the MLX_ENDPOINT_URL environment variable.
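One way to provide the endpoint is to set the environment variable before using the SDK; a minimal standard-library sketch (the URL below is a placeholder, not a real endpoint):

```python
import os

# Placeholder endpoint URL; replace with your actual MLX endpoint.
os.environ["MLX_ENDPOINT_URL"] = "https://mlx.example.com"

print(os.environ["MLX_ENDPOINT_URL"])
```

With the variable set, the single-argument form of login shown below is sufficient.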
from mlx.sdk.data import login
login("{ API Key }") # MLXP API Key
login("{ API Key }", "{MLX endpoint}") # Method of pinning the endpoint URL during login, instead of using an environment variable
Read dataset
To use the dataset in the training logic, you must load it using the dataset class. For more information, see the Huggingface Python SDK official documentation.
To load a dataset from the local system:
from mlx.sdk.data import load_dataset
ds = load_dataset(
"{ path of the data in the local system }" # Local data path e.g., "path/to/folder/*"
)
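The local path accepts a glob pattern such as "path/to/folder/*". The sketch below, using only the standard library, shows which files such a pattern matches (the temporary folder and file names exist only for illustration):

```python
import glob
import os
import tempfile

# Create a throwaway folder with a few files to illustrate the pattern.
root = tempfile.mkdtemp()
for name in ("a.csv", "b.csv", "notes.txt"):
    open(os.path.join(root, name), "w").close()

# "path/to/folder/*" matches every file directly inside the folder.
matched = sorted(os.path.basename(p) for p in glob.glob(os.path.join(root, "*")))
print(matched)  # ['a.csv', 'b.csv', 'notes.txt']
```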
To load the dataset managed in the data manager:
from mlx.sdk.data import load_dataset
ds = load_dataset(
"{ workspace name }/{ dataset name }" # Dataset location e.g., "workspaceA/datasetA"
)
Upload dataset
A dataset is uploaded in the same way as with the Huggingface Dataset interface. For more information, see the Huggingface Python SDK official documentation.
The typical methods of uploading are as follows:
push_to_hub
...
ds.push_to_hub(
repo_id="{ workspace name }/{ dataset name }"
)
...
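In every call below, the repo_id is simply "{ workspace name }/{ dataset name }". A tiny helper (hypothetical, not part of the SDK) can keep that convention in one place:

```python
def repo_id_for(workspace: str, dataset: str) -> str:
    # The data manager addresses a dataset as "<workspace>/<dataset>".
    return f"{workspace}/{dataset}"

print(repo_id_for("workspaceA", "datasetA"))  # workspaceA/datasetA

# Usage with push_to_hub (requires a logged-in session):
# ds.push_to_hub(repo_id=repo_id_for("workspaceA", "datasetA"))
```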
upload_file
from huggingface_hub import create_repo, upload_file
path = "{ workspace name }/{ dataset name }" # Location of the dataset to upload
create_repo(repo_id=path, repo_type="dataset")
upload_file(
repo_id=path,
path_or_fileobj="{ local file path }", # Local file path to upload
path_in_repo="path/to/folder/foo.csv", # Path of the remote file in the dataset
repo_type="dataset",
)
upload_folder
from huggingface_hub import create_repo, upload_folder
path = "{ workspace name }/{ dataset name }" # Location of the dataset to upload
create_repo(repo_id=path, repo_type="dataset")
upload_folder(
repo_id=path,
folder_path="{ local directory path }", # Local directory path to upload
path_in_repo="path/to/folder", # Path of the remote folder in the dataset
repo_type="dataset",
)
Download dataset
To download the dataset to the local disk:
from huggingface_hub import snapshot_download
path = "{ workspace name }/{ dataset name }" # Dataset to download
snapshot_download(
repo_id=path,
repo_type="dataset",
local_dir="path/to/folder", # Directory path to download
local_dir_use_symlinks="auto" # Whether to use symlinks to the cache directory (cache_dir)
)
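After snapshot_download returns, you can verify the result by walking the target directory. This standard-library sketch lists files relative to a directory (shown here on a temporary folder, since the real download requires MLX access):

```python
import os
import tempfile

def list_files(local_dir):
    # Collect every file path below local_dir, relative to it.
    found = []
    for base, _dirs, files in os.walk(local_dir):
        for f in files:
            found.append(os.path.relpath(os.path.join(base, f), local_dir))
    return sorted(found)

# Illustration on a temporary folder standing in for the download directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
open(os.path.join(root, "data.csv"), "w").close()
open(os.path.join(root, "sub", "more.csv"), "w").close()

print(list_files(root))
```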
Create tag and branch
Once you create a dataset, a unique commit ID is assigned. Using this commit ID, you can read a specific revision of the dataset or create tags to record additional information.
To create a tag:
from huggingface_hub import create_tag
path = "{ workspace name }/{ dataset name }"
create_tag(
repo_id=path,
repo_type="dataset",
tag="{ name of the tag to be created }",
revision="{ revision }", # Baseline version. The default value is main.
tag_message="{ tag message }"
)
Metadata such as the tag message is immutable and cannot be edited, but you can delete the tag and recreate it. To delete a tag:
from huggingface_hub import delete_tag
path = "{ workspace name }/{ dataset name }"
delete_tag(
repo_id=path,
repo_type="dataset",
tag="{ name of the tag to be deleted }"
)
To create a branch:
from huggingface_hub import create_branch
path = "{ workspace name }/{ dataset name }"
create_branch(
repo_id=path,
repo_type="dataset",
branch="{ name of the branch to be created }",
revision="{ revision }"
)
Delete dataset
Note that once you delete a dataset, it cannot be recovered.
from huggingface_hub import delete_repo
path = "{ workspace name }/{ dataset name }"# Dataset to be deleted
delete_repo(
repo_id=path,
repo_type="dataset"
)
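Because deletion is irreversible, a simple confirmation guard (hypothetical, not part of the SDK) can help avoid deleting the wrong dataset:

```python
def confirm_delete(expected_path: str, typed_path: str) -> bool:
    # Require the caller to retype the exact dataset path before deleting.
    return typed_path == expected_path

path = "workspaceA/datasetA"
print(confirm_delete(path, "workspaceA/datasetA"))  # True
print(confirm_delete(path, "workspaceA/other"))     # False

# Example wiring (only delete when the retyped path matches):
# if confirm_delete(path, input("Retype the dataset path to delete: ")):
#     delete_repo(repo_id=path, repo_type="dataset")
```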