Available in Classic and VPC
The Dataset creation and management section provides guidelines for configuring datasets and explains how to create, modify, upload, and delete datasets.
Guidelines for configuring datasets
Since AiTEMS provides personalized recommendations based on dataset training, configuring datasets properly is crucial.
Note the following when setting up the dataset.
- Dataset types include user, item, and interaction, and training requires all three dataset types. Descriptions of each dataset type:
- user: Metadata containing user information (age, gender, etc.)
- item: Metadata containing information related to an item (price, release date, category, etc.)
- interaction: Metadata containing records of interactions between a user and an item
- Each dataset must include all required schema fields that correspond to its dataset type.
Dataset type Required fields (Not NULL) user item interaction
- Dataset fields must match the schema fields not only in field name but also in field order and letter case.
- Do not enter duplicate values in the required fields of the user and item datasets.
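The guidelines above can be checked locally before a file is uploaded. The following is a minimal sketch, not an AiTEMS tool. It assumes the dataset file has a header row, that an empty cell counts as NULL, and that the caller supplies the schema and required field names. The function name `validate_dataset` and the field names in the usage note are hypothetical.

```python
import csv
import io


def validate_dataset(csv_text, schema_fields, required_fields, unique_required=True):
    """Return a list of problems found in the contents of a dataset file.

    Assumptions (not confirmed by the guide): the file has a header row,
    and an empty cell is treated as NULL. required_fields must be a
    subset of schema_fields.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]

    # Field names, field order, and letter case must all match the schema.
    if header != list(schema_fields):
        return [f"header {header} does not match schema {list(schema_fields)}"]

    problems = []
    indexes = [header.index(name) for name in required_fields]
    seen = set()
    for row_number, row in enumerate(reader, start=2):
        key = tuple(row[i] if i < len(row) else "" for i in indexes)
        # Required fields must not be NULL (empty, in this sketch).
        if any(value.strip() == "" for value in key):
            problems.append(f"row {row_number}: required field is empty")
            continue
        # Duplicate check applies to user and item datasets only.
        if unique_required:
            if key in seen:
                problems.append(f"row {row_number}: duplicate required value {key}")
            seen.add(key)
    return problems
```

For an interaction dataset, pass `unique_required=False`, since the duplicate rule is stated only for user and item datasets. An empty result list means no problems were found by these particular checks. It does not guarantee that the upload will succeed.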
Create a dataset
Create the datasets you will use for training. Datasets are managed by dataset name. When a dataset is created, it is assigned a unique dataset ID.
- To run training, you must create all dataset types.
- Before creating a dataset, prepare the dataset file you will upload. For details on structuring dataset files, see Guidelines for configuring datasets.
- Only csv or csv.gz files are supported.
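Because only csv and csv.gz files are accepted, a large CSV can be compressed before upload. The sketch below uses only the Python standard library. The helper names `is_supported` and `compress_csv` are hypothetical, and the exact-case suffix check is an assumption, since the guide does not say whether upper-case extensions are accepted.

```python
import gzip
import shutil
from pathlib import Path

# File name endings the guide lists as supported.
SUPPORTED_SUFFIXES = (".csv", ".csv.gz")


def is_supported(path):
    """Return True if the file name ends in .csv or .csv.gz (exact case)."""
    return str(path).endswith(SUPPORTED_SUFFIXES)


def compress_csv(path):
    """Write a gzip-compressed copy of a .csv file next to it and return its path."""
    src = Path(path)
    if src.suffix != ".csv":
        raise ValueError(f"expected a .csv file, got {src.name}")
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

The original file is left in place, so either the plain or the compressed copy can be uploaded.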
To create a dataset:
- In the NAVER Cloud Platform console, navigate to Services > AI Services > AiTEMS.
- Click the Dataset menu.
- Click the [Create dataset] button.
- When the Create dataset interface appears, enter the dataset name and configure the dataset information.
- Dataset name: Used as the identifier for managing the dataset. Enter 5–20 characters.
- Must start with a letter and can include letters, numbers, _, and -.
- Description: Enter a description for the dataset.
- Dataset type: Select the dataset type (user/item/interaction).
- For details on dataset types, see Guidelines for configuring datasets.
- You cannot change the dataset type after the dataset is created.
- Select schema: Select an existing schema, or create a new one.
- If no schema exists, select Create schema to create one.
- When Create schema is selected, fields for configuring the new schema appear.
- Data selection: Choose how to select the dataset file for training.
- Select from Object Storage: Select this option when the dataset file has already been uploaded to the AiTEMS bucket in Object Storage.
- Select from file: Select this option if the dataset file has not been uploaded to the AiTEMS bucket. When a file is selected and uploaded, it is automatically saved to the bucket.
- If you selected Create schema under Select schema, configure the schema information and click the [Add] button.
- Schema name: Enter 3–20 characters. Used as the identifier for managing the schema.
- Schema description: Enter a description for the schema.
- Field name: Enter the field name exactly as it appears in the dataset file.
- Field type: Select the data type of the field (string/float/long/double/int/boolean/null).
- Categorical field: Set to Y when entering data with predefined categories.
- You can drag and drop fields to change their order.
- If you selected Select from Object Storage under Data selection, click the dataset file to upload.
- Only files uploaded to the AiTEMS bucket can be selected.
- If you selected Select from file, drag and drop the file into the Drag files here or click to upload area, or click the area to select a file.
- The selected file is automatically uploaded to the bucket location displayed in Bucket / Path.
- Click the [Create] button.
- In the notification popup, click the [OK] button.
- The dataset is created and added to the dataset list.
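The naming and field-type rules from the steps above can be checked in advance. The following sketch is illustrative only. It assumes "letters" means ASCII letters, and it checks only the length of a schema name because the guide states no character rules for it. All function names are hypothetical.

```python
import re

# Dataset name: 5-20 characters, starts with a letter, then letters,
# numbers, "_" or "-". ASCII letters are an assumption of this sketch.
DATASET_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]{4,19}")

# Field types listed in the Create schema step.
FIELD_TYPES = {"string", "float", "long", "double", "int", "boolean", "null"}


def is_valid_dataset_name(name):
    """Check the dataset name rules described in the guide."""
    return DATASET_NAME.fullmatch(name) is not None


def is_valid_schema_name(name):
    """Check only the stated 3-20 character length of a schema name."""
    return 3 <= len(name) <= 20


def unknown_field_types(fields):
    """Return the names of fields whose type is not in FIELD_TYPES.

    fields is a list of (field_name, field_type) pairs in file order.
    """
    return [name for name, field_type in fields if field_type not in FIELD_TYPES]
```

For example, `is_valid_dataset_name("user")` is false because the name has only four characters, and `unknown_field_types([("AGE", "integer")])` reports `AGE` because `integer` is not one of the listed types.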
Modify a dataset
You can update the dataset description, schema, and dataset file.
You cannot change the dataset type selected during creation.
To edit a dataset:
- In the NAVER Cloud Platform console, navigate to Services > AI Services > AiTEMS.
- Click the Dataset menu.
- In the dataset list, click the dataset you want to edit.
- When the Edit dataset popup appears, apply the changes and click the [Save and upload] button.
- To change the dataset file, click the [Change path and upload] button in the Data Edit popup, select the file you want to upload, and then click the [Change path and upload] button again.
- You can also change the dataset file by clicking the [Upload dataset] button on the Dataset interface (see Upload dataset).
- In the notification popup, click the [OK] button.
- If the dataset file is changed, the status updates to Pending, and then updates again based on the upload result.
Upload dataset
You can replace the dataset file.
- To replace a dataset file, the file must already be stored in the AiTEMS bucket in Object Storage.
- You can also replace the dataset file through Edit dataset.
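Since the replacement file must already be in the AiTEMS bucket, one way to place it there outside the console is through an S3-compatible client. This is a sketch under assumptions: it relies on the third-party `boto3` package, and the endpoint URL, bucket name, key prefix, and credentials are all placeholders that you must supply. The function names `make_client` and `upload_dataset_file` are hypothetical and not part of AiTEMS.

```python
import os


def make_client(access_key, secret_key, endpoint_url):
    """Create an S3-compatible client for the given endpoint.

    boto3 is a third-party package; it is imported lazily so the rest
    of this sketch can be used or tested without it installed.
    """
    import boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def upload_dataset_file(client, local_path, bucket, key_prefix=""):
    """Upload a .csv or .csv.gz file and return the object key used.

    bucket and key_prefix are placeholders; use the bucket and path
    that the console shows for your AiTEMS datasets.
    """
    name = os.path.basename(local_path)
    if not name.endswith((".csv", ".csv.gz")):
        raise ValueError(f"unsupported dataset file: {name}")
    prefix = key_prefix.strip("/")
    key = f"{prefix}/{name}" if prefix else name
    # upload_file(Filename, Bucket, Key) is the standard boto3 S3 client call.
    client.upload_file(local_path, bucket, key)
    return key
```

After the file is in the bucket, it can be selected in the Upload dataset popup as described in the steps that follow.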
To upload a dataset:
- In the NAVER Cloud Platform console, navigate to Services > AI Services > AiTEMS.
- Click the Dataset menu.
- In the dataset list, click the dataset for which you want to upload a dataset file.
- Click the [Upload dataset] button.
- In the Upload dataset popup, select the dataset file you want to upload and click the [Request upload] button.
- In the notification popup, click the [OK] button.
- The status changes to Pending, and then updates again based on the upload result.
Delete a dataset
You cannot delete a dataset that is connected to a service. Before deleting the dataset, delete the relevant service or change the dataset connected to that service.
To delete a dataset:
- In the NAVER Cloud Platform console, navigate to Services > AI Services > AiTEMS.
- Click the Dataset menu.
- In the dataset list, click the dataset you want to delete.
- Click the [Delete] button.
- When the Delete dataset popup appears, enter the dataset name and click the [Delete] button.
- The dataset is deleted and removed from the list.