Available in VPC
This guide describes the Data Manager interface. Data Manager allows you to view the list and details of datasets within your Workspace.
- Datasets uploaded to Data Manager can be referenced across different projects within the Workspace.
- In the Data Manager interface, you can only view the dataset list and dataset details.
- To perform tasks such as uploading or deleting datasets, or creating tags and branches, use the ML expert Platform SDK.
View Data Manager list
The list of your datasets includes the following information:

- Dataset Title: Name set when uploading the dataset.
- Creation date: Initial creation date and time.
- Operation: Click [Dataset detail] to go to the details interface.
View Data Manager details
You can view details for the selected dataset. The information is organized into tabs.
Overview
View metadata for the selected dataset.
Files and Versions
You can view the file list for each directory in the selected dataset.
Use Data Manager SDKs
The Data Manager SDK supports the Python-based Hugging Face Datasets interface.
You can upload and download datasets using the SDK as follows:
Install the SDK
Install the SDK by running the following command. The double quotes keep shells such as zsh from treating the square brackets as a glob pattern:
pip install "ncloud-mlx[data-manager]" # double quotes are required
Prerequisites
To use the SDK, create an API Key and log in with it. The SDK also needs the MLX endpoint URL, which you can either set through the MLX_ENDPOINT_URL environment variable or pass as the second argument to login.
from mlx.sdk.data import login

# Option 1: the endpoint URL is taken from the MLX_ENDPOINT_URL environment variable
login("{ API Key }")  # MLXP API Key

# Option 2: specify the endpoint URL at login instead of using the environment variable
login("{ API Key }", "{ MLX endpoint }")
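Hard-coding an API Key in source files makes it easy to leak, so one option is to read it from the environment. The following is a minimal sketch, not part of the SDK: `MLX_API_KEY` is a hypothetical variable name chosen for this example (only `MLX_ENDPOINT_URL` is named in this guide), and `resolve_login_args` and `login_from_env` are illustrative helpers that choose between the two `login` call forms shown above.

```python
import os


def resolve_login_args(endpoint=None, env=None):
    """Return the positional arguments to pass to login().

    MLX_API_KEY is a hypothetical variable name used only in this sketch.
    MLX_ENDPOINT_URL is the environment variable named in this guide.
    """
    env = os.environ if env is None else env
    api_key = env.get("MLX_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("MLX_API_KEY is not set")
    if endpoint:
        # Explicit endpoint: use the two-argument form of login().
        return (api_key, endpoint)
    if not env.get("MLX_ENDPOINT_URL", "").strip():
        raise RuntimeError("Set MLX_ENDPOINT_URL or pass an endpoint URL")
    # Endpoint comes from MLX_ENDPOINT_URL: use the one-argument form.
    return (api_key,)


def login_from_env(endpoint=None):
    """Log in with credentials taken from the environment."""
    # Imported lazily so resolve_login_args() works without the SDK installed.
    from mlx.sdk.data import login

    login(*resolve_login_args(endpoint))
```

Calling `login_from_env()` at the start of a training script fails early with a clear message when either value is missing, instead of failing later on the first upload or download.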
Read datasets
To use a dataset in training code, you must first load it as a dataset object. For details, see the official Hugging Face Python SDK documentation.
To load a local dataset:
from mlx.sdk.data import load_dataset

ds = load_dataset(
    "{ path to local data }"  # Local data path, e.g. "path/to/folder/*"
)
To load a dataset managed in Data Manager:
from mlx.sdk.data import load_dataset

ds = load_dataset(
    "{ Workspace name }/{ dataset name }"  # Dataset location, e.g. "workspaceA/datasetA"
)
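A dataset in Data Manager is addressed by its Workspace name and dataset name joined with a slash. As a minimal sketch, the hypothetical helper `dataset_repo_id` below (not part of the SDK) builds that string and rejects empty parts or parts that contain a slash before `load_dataset` is called. The validation rules are assumptions made for this example, not documented Data Manager naming rules.

```python
def dataset_repo_id(workspace, dataset):
    """Join a Workspace name and a dataset name into '<workspace>/<dataset>'."""
    parts = [workspace.strip(), dataset.strip()]
    for part in parts:
        # Assumed rule for this sketch: each part is non-empty and has no slash.
        if not part or "/" in part:
            raise ValueError(f"invalid name: {part!r}")
    return "/".join(parts)


def load_managed_dataset(workspace, dataset):
    """Load a dataset managed in Data Manager by Workspace and dataset name."""
    # Imported lazily so dataset_repo_id() works without the SDK installed.
    from mlx.sdk.data import load_dataset

    return load_dataset(dataset_repo_id(workspace, dataset))
```

For example, `load_managed_dataset("workspaceA", "datasetA")` would load the dataset at `workspaceA/datasetA`.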
Upload a dataset
You can upload datasets using the same methods as the Hugging Face Dataset interface. For details, see the official Hugging Face Python SDK documentation.
Running create_repo requires Workspace Admin privileges.
Common upload methods are as follows:
push_to_hub
...
ds.push_to_hub(
    repo_id="{ Workspace name }/{ dataset name }"
)
...
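The `push_to_hub` call above can be combined with the local loading step from "Read datasets". The following is a rough sketch under two assumptions: that the object returned by `load_dataset` exposes `push_to_hub` as in the snippet above, and that you have already logged in. `publish_local_dataset` and its `loader` parameter are hypothetical names introduced here so the flow can be exercised without a connection.

```python
def publish_local_dataset(local_path, repo_id, loader=None):
    """Load a local dataset and push it to Data Manager under repo_id.

    loader is a hypothetical hook for this sketch; when omitted, the
    SDK's load_dataset is used.
    """
    if loader is None:
        # Imported lazily so the function can be tested with a stand-in loader.
        from mlx.sdk.data import load_dataset as loader

    ds = loader(local_path)          # e.g. "path/to/folder/*"
    ds.push_to_hub(repo_id=repo_id)  # e.g. "workspaceA/datasetA"
    return ds
```

Returning the loaded dataset lets the caller keep using it for training without loading it a second time.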
upload_file
from huggingface_hub import create_repo, upload_file

path = "{ Workspace name }/{ dataset name }"  # Location of the dataset to upload
create_repo(repo_id=path, repo_type="dataset")
upload_file(
    repo_id=path,
    path_or_fileobj="{ local file path }",  # Path to the local file to upload
    path_in_repo="path/to/folder/foo.csv",  # Remote file path in the dataset
    repo_type="dataset",
)
upload_folder
from huggingface_hub import create_repo, upload_folder

path = "{ Workspace name }/{ dataset name }"  # Location of the dataset to upload
create_repo(repo_id=path, repo_type="dataset")
upload_folder(
    repo_id=path,
    folder_path="{ local directory path }",  # Path to the local directory to upload
    path_in_repo="path/to/folder",  # Remote directory path in the dataset
    repo_type="dataset",
)
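When the upload snippets above run a second time, `create_repo` is called for a dataset that already exists. In `huggingface_hub`, `create_repo` accepts `exist_ok=True` to skip the resulting error; whether Data Manager honors this flag in the same way is an assumption of the sketch below. `upload_directory` and its two optional callable parameters are hypothetical names introduced for illustration.

```python
import os


def upload_directory(repo_id, local_dir, remote_dir,
                     create_repo=None, upload_folder=None):
    """Create the dataset if needed, then upload a local directory to it."""
    if not os.path.isdir(local_dir):
        raise NotADirectoryError(local_dir)
    if create_repo is None or upload_folder is None:
        # Imported lazily so stand-in callables can be used for testing.
        import huggingface_hub
        create_repo = create_repo or huggingface_hub.create_repo
        upload_folder = upload_folder or huggingface_hub.upload_folder

    # exist_ok=True: do not fail if the dataset was created by an earlier run.
    create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    return upload_folder(
        repo_id=repo_id,
        folder_path=local_dir,    # Local directory to upload
        path_in_repo=remote_dir,  # Remote directory path in the dataset
        repo_type="dataset",
    )
```

Checking the local directory before contacting the server avoids creating an empty dataset when the path is mistyped.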
Download a dataset
To download a dataset to a local disk:
from huggingface_hub import snapshot_download

path = "{ Workspace name }/{ dataset name }"  # Location of the dataset to download
snapshot_download(
    repo_id=path,
    repo_type="dataset",
    local_dir="path/to/folder",  # Path to the directory to download to
    local_dir_use_symlinks="auto",  # Whether to use symlinks with cache_dir
)
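`snapshot_download` returns the path of the downloaded folder, and it also accepts a `revision` argument (a branch name, tag name, or commit ID). The following sketch uses both to download a given revision and list what arrived. `download_dataset` and `list_downloaded_files` are hypothetical helpers, and skipping hidden directories is a choice made for this example so that any download metadata folders are left out of the listing.

```python
import os


def list_downloaded_files(root):
    """Return sorted file paths under root, relative to root, using '/'."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories (for example, download metadata folders).
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def download_dataset(repo_id, local_dir, revision=None):
    """Download a dataset (optionally at a revision) and list its files."""
    # Imported lazily so list_downloaded_files() works on its own.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",
        local_dir=local_dir,
        revision=revision,  # None means the default branch
    )
    return list_downloaded_files(path)
```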
Create tags and branches
When you create a dataset, it is assigned a unique commit ID. You can use this commit ID to read the dataset at a specific revision, or attach a tag to it to record additional information.
To create a tag:
from huggingface_hub import create_tag

path = "{ Workspace name }/{ dataset name }"
create_tag(
    repo_id=path,
    repo_type="dataset",
    tag="{ tag name to create }",
    revision="{ revision }",  # Base revision. The default is main
    tag_message="{ tag message }",
)
Tag metadata, such as the tag message, is immutable. To change it, delete the tag and create it again. To delete a tag:
from huggingface_hub import delete_tag

path = "{ Workspace name }/{ dataset name }"
delete_tag(
    repo_id=path,
    repo_type="dataset",
    tag="{ tag name to delete }",
)
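Because a tag cannot be edited in place, changing its message takes two calls: delete, then create. The sketch below chains the `delete_tag` and `create_tag` calls shown in this section. `replace_tag` is a hypothetical helper, not an SDK function, and the two steps are not atomic: if the second call fails, the tag no longer exists.

```python
def replace_tag(repo_id, tag, revision, message,
                delete_tag=None, create_tag=None):
    """Recreate a tag so that it carries a new message.

    Pass the commit ID (or branch) as revision. The tag name itself
    cannot be used as the revision, because the tag is deleted first.
    """
    if delete_tag is None or create_tag is None:
        # Imported lazily so stand-in callables can be used for testing.
        import huggingface_hub
        delete_tag = delete_tag or huggingface_hub.delete_tag
        create_tag = create_tag or huggingface_hub.create_tag

    delete_tag(repo_id=repo_id, repo_type="dataset", tag=tag)
    create_tag(
        repo_id=repo_id,
        repo_type="dataset",
        tag=tag,
        revision=revision,
        tag_message=message,
    )
```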
To create a branch:
from huggingface_hub import create_branch

path = "{ Workspace name }/{ dataset name }"
create_branch(
    repo_id=path,
    repo_type="dataset",
    branch="{ branch name to create }",
    revision="{ revision }",
)
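After creating tags and branches, it can be useful to check which commit ID each one points to. `huggingface_hub` provides `list_repo_refs` for this; the sketch below assumes Data Manager supports that call in the same way as the other `huggingface_hub` functions in this guide, which this guide does not confirm. `ref_commits` and `list_revisions` are hypothetical helper names.

```python
def ref_commits(refs):
    """Map branch and tag names to the commit IDs they point to."""
    return {
        "branches": {info.name: info.target_commit for info in refs.branches},
        "tags": {info.name: info.target_commit for info in refs.tags},
    }


def list_revisions(repo_id):
    """Return the branches and tags of a dataset with their commit IDs."""
    # Imported lazily so ref_commits() works without huggingface_hub installed.
    from huggingface_hub import list_repo_refs

    return ref_commits(list_repo_refs(repo_id, repo_type="dataset"))
```

Any name or commit ID in the result can then be passed as the `revision` argument of `create_tag` or `create_branch`.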
Delete a dataset
Use caution when deleting a dataset. This action cannot be undone.
from huggingface_hub import delete_repo

path = "{ Workspace name }/{ dataset name }"  # Dataset to delete
delete_repo(
    repo_id=path,
    repo_type="dataset",
)
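Since deletion cannot be undone, a script may want an explicit confirmation step before calling `delete_repo`. The following is one possible guard, not an SDK feature: the hypothetical `delete_dataset` helper only proceeds when the caller repeats the exact dataset location.

```python
def delete_dataset(repo_id, confirm, delete_repo=None):
    """Delete a dataset only if confirm repeats repo_id exactly."""
    if confirm != repo_id:
        raise ValueError(
            f"confirmation {confirm!r} does not match {repo_id!r}; nothing deleted"
        )
    if delete_repo is None:
        # Imported lazily so a stand-in callable can be used for testing.
        from huggingface_hub import delete_repo

    delete_repo(repo_id=repo_id, repo_type="dataset")
```

In an interactive script, the `confirm` value could come from `input()`, so that a mistyped or copy-pasted location in the code alone is not enough to remove a dataset.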