Utilize tools

    Available in Classic and VPC

    This guide describes how to use the tools provided in the Explorer menu. CLOVA Studio currently provides two tools: batch creation and data expansion.

    Batch creation

    Batch creation is a tool that processes a large number of user-uploaded inputs in a single batch and returns the results.
    The following describes how to use the batch creation tool.

    1. In NAVER Cloud Platform console, click the Services > AI Services > CLOVA Studio menus, in that order.
    2. Click the My Product menu, and then click the [Go to CLOVA Studio] button.
    3. Click the Explorer menu.
    4. Click the Tool tab menu, and then click the [Start] button of Batch creation.
    5. When the batch creation screen appears, select a model engine.
      • If you selected the default model, fill in the prompt template.
        • Fill in the prompt template the same way you write a prompt in the playground.
        • Your prompt template should contain at least three sets of examples, with ### separating consecutive sets.
        • End the prompt template with {text}.
      • If you selected a model that you trained through tuning, see Tuning.
    6. Upload a seed data set.
      • The tool analyzes the patterns in the uploaded data set and expands it into similar data.
      • Only CSV and JSONL extensions are supported for seed data. Seed data must be encoded in UTF-8 format.
      • The seed data set must contain at least 10 rows of data, with no more than 1000 characters per row, including spaces.
      • If you have selected a tuning model in the model engine, the task type in the seed data set must match the task type in the tuning model.
      • If the contents of the data set contain "#" symbols, performance may be degraded.
    7. Click the [Run] button.
      • The task confirmation window appears.
    8. Click the [OK] button to start the task.
      • You will be taken to the [My task] menu where you can view and download your tasks.
      • Click the [Stop] button to stop the task and return to the previous screen.
      • To view and download task results, see Manage tasks.
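    The prompt template rules in step 5 (at least three example sets, ### between sets, {text} at the end) can be checked before you paste a template into the tool. The following is a minimal sketch, not part of CLOVA Studio: the function names, the instruction line, and the "Input:"/"Output:" labels are all hypothetical and only illustrate one possible layout.

```python
SEPARATOR = "###"
PLACEHOLDER = "{text}"


def build_prompt_template(instruction: str, examples: list[tuple[str, str]]) -> str:
    """Join example sets with ### and end the template with {text}."""
    if len(examples) < 3:
        raise ValueError("the template needs at least three example sets")
    # "Input:" and "Output:" are illustrative labels, not required by the tool.
    blocks = [f"Input: {source}\nOutput: {target}" for source, target in examples]
    body = f"\n{SEPARATOR}\n".join(blocks)
    # The last block leaves the input open so each seed row can replace {text}.
    return f"{instruction}\n\n{body}\n{SEPARATOR}\nInput: {PLACEHOLDER}"


def check_prompt_template(template: str) -> list[str]:
    """Return a list of problems; an empty list means the basic rules hold."""
    problems = []
    stripped = template.rstrip()
    if not stripped.endswith(PLACEHOLDER):
        problems.append("template must end with {text}")
    # Every non-empty block before the final one is counted as an example set.
    example_blocks = [b for b in stripped.split(SEPARATOR)[:-1] if b.strip()]
    if len(example_blocks) < 3:
        problems.append("template needs at least three example sets")
    return problems
```

    Calling check_prompt_template on a template you wrote by hand returns an empty list when both rules are met, so it can serve as a quick check before starting a batch task.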
    Caution
    • Only one batch creation task can run at a time per account.
    • A batch creation task takes about 10 seconds to create one piece of data. This time may vary depending on your system environment.
    • Please note that if you cancel a task after it has started, you may be charged depending on the progress of the task.
    Note

    Because batch data creation is based on a seed data set, results can vary significantly depending on the data in the seed data set. To predict the results, try creating and testing different prompts in the playground.

    <example>

    Here's an example of a seed data set and the resulting output.
    [Image: clovastudio-explorer_augbatch_seed01_ko]

    [Image: clovastudio-explorer_augbatch_seed02_ko]

    Data expansion

    The data expansion tool expands a user-uploaded data sample to the number of rows you specify. When you upload a seed data set, the language model analyzes the patterns in it and generates similar data until the requested number of rows is reached.

    The following describes how to use the data expansion tool.

    1. In NAVER Cloud Platform console, click the Services > AI Services > CLOVA Studio menus, in that order.
    2. Click the My Product menu, and then click the [Go to CLOVA Studio] button.
    3. Click the Explorer menu.
    4. On the Tool tab, click the [Start] button of Data expansion.
    5. Select a model engine. This is the default model used to expand the uploaded data.
    6. After selecting a model engine, enter the number of rows of data you want to get.
      • Enter a value between 20 and 50,000 rows (one row is one piece of data).
      • Be sure to enter a value greater than the number of rows in the seed data set you upload.
    7. Upload a seed data set.
      • The tool analyzes the patterns in the uploaded data set and expands it into similar data.
      • Only CSV and JSONL extensions are supported for seed data. Seed data must be encoded in UTF-8 format.
      • The seed data set must contain at least 10 rows of data, with no more than 1000 characters per row, including spaces.
      • If the contents of the data set contain "#" symbols, performance may be degraded.
      • If you have selected HCX as the model engine, the data set should be formatted as "User: (dialog), Assistant: (dialog)".
    8. Click the [Run] button.
      • The task confirmation pop-up window appears.
    9. Click the [OK] button to start the task.
      • You will be taken to the [My task] menu where you can view and download your tasks.
      • Click the [Stop] button to stop the task and return to the previous screen.
      • To view and download task results, see [Manage tasks].
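    The seed data and row-count rules in steps 6 and 7 can be checked locally before uploading. The sketch below is a hypothetical helper, not a CLOVA Studio API. It assumes that every non-empty line of the file is one row (no header line, no multi-line CSV fields) and that the 1000-character limit applies to the raw line; the guide does not confirm either assumption.

```python
from pathlib import Path

MIN_SEED_ROWS = 10
MAX_ROW_CHARS = 1000
MIN_TARGET_ROWS = 20
MAX_TARGET_ROWS = 50_000


def check_seed_data(filename: str, raw: bytes, target_rows: int) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for seed file contents and a requested row count."""
    errors: list[str] = []
    warnings: list[str] = []
    suffix = Path(filename).suffix.lower()
    if suffix not in {".csv", ".jsonl"}:
        errors.append(f"unsupported extension: {suffix!r}")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return errors + ["file is not valid UTF-8"], warnings
    # Assumption: one non-empty line equals one row of seed data.
    rows = [line for line in text.splitlines() if line.strip()]
    if len(rows) < MIN_SEED_ROWS:
        errors.append(f"only {len(rows)} rows; at least {MIN_SEED_ROWS} are required")
    for number, row in enumerate(rows, start=1):
        if len(row) > MAX_ROW_CHARS:
            errors.append(f"row {number} has {len(row)} characters (max {MAX_ROW_CHARS})")
        if "#" in row:
            warnings.append(f"row {number} contains '#', which may degrade performance")
    if not MIN_TARGET_ROWS <= target_rows <= MAX_TARGET_ROWS:
        errors.append(f"target rows must be between {MIN_TARGET_ROWS} and {MAX_TARGET_ROWS}")
    if target_rows <= len(rows):
        errors.append("target rows must be greater than the number of seed rows")
    return errors, warnings
```

    For a file on disk, the contents can be passed as Path("seed.jsonl").read_bytes(). Errors correspond to the hard limits listed above, while the "#" check is reported only as a warning because the guide describes it as a performance risk rather than a restriction.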
    Note

    If you upload 10 rows of data and enter 20 as the desired number of rows, you will receive the 10 uploaded rows and 10 newly created rows.

    Caution
    • Only one data expansion task can run at a time per account.
    • A data expansion task takes about 10 seconds to create one piece of data. This time may vary depending on your system environment.
    • If you cancel a task after it has started, you will be charged based on the progress of the task.
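    The note and caution above can be combined into a rough estimate of how many rows a task creates and how long it might run. This is only back-of-the-envelope arithmetic with a hypothetical function name. It assumes that time is spent only on newly created rows and that the 10-second figure holds, which the guide says may vary.

```python
SECONDS_PER_ROW = 10  # figure quoted in the caution; actual time may vary


def estimate_expansion(seed_rows: int, target_rows: int) -> tuple[int, int]:
    """Return (newly created rows, estimated seconds) for a data expansion task."""
    new_rows = target_rows - seed_rows
    if new_rows <= 0:
        raise ValueError("target_rows must be greater than seed_rows")
    # Assumption: uploaded rows are passed through and only new rows cost time.
    return new_rows, new_rows * SECONDS_PER_ROW


new_rows, seconds = estimate_expansion(seed_rows=100, target_rows=1000)
print(f"{new_rows} new rows, roughly {seconds / 3600:.1f} hours")
```

    Under these assumptions, expanding 100 seed rows to 1000 creates 900 new rows and takes roughly 2.5 hours, which is worth knowing because only one data expansion task can run at a time per account.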

    Use cases

    Expand CareCall conversation data set

    Data expansion suits creative tasks that generate new sentences better than response-type tasks with a predetermined answer (completion). For example, tuning training requires a data set with at least a thousand data points, and the data expansion tool saves users from having to create thousands of data points manually.

    The following describes how to expand a CareCall conversation data set.

    1. Prepare a seed data set for data expansion. In this example, 100 dialog turns are written to form the CareCall dialog seed set.
    2. Set the number of rows to 1000, the minimum amount of data required for tuning.
    3. Run the task. The 100 dialog turns are expanded to 1000 and returned as the result.
    4. Validate the data (check for errors) to obtain 1000 rows of data for tuning training.
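    The guide does not spell out what the validation in the last step involves. As one plausible cleanup pass, the sketch below drops empty rows and duplicates from the expanded result and reports how many rows are still missing for tuning. The function names and the choice of checks are assumptions, not the tool's own validation.

```python
MIN_TUNING_ROWS = 1000  # minimum amount of data for tuning quoted in the text


def clean_expanded_rows(rows: list[str]) -> list[str]:
    """Drop empty rows and rows that repeat an earlier one (ignoring whitespace)."""
    seen: set[str] = set()
    kept: list[str] = []
    for row in rows:
        # Collapse runs of whitespace so near-identical rows compare equal.
        key = " ".join(row.split())
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(row.strip())
    return kept


def rows_still_needed(kept: list[str], minimum: int = MIN_TUNING_ROWS) -> int:
    """Return how many more rows are required to reach the tuning minimum."""
    return max(0, minimum - len(kept))
```

    If rows_still_needed returns a value above zero after cleaning, the data set would need more rows, for example from another expansion run with a higher row count, before it reaches the tuning minimum.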

    Performance testing with batch creation

    When you test the performance of a tuned model engine, you normally enter inputs (text) one at a time and get a single output (completion) for each. With the batch creation tool, you can submit multiple inputs at once and get all the results together.

    The following steps test the performance of a tuned engine with the batch creation tool after training on the CareCall conversation data set.

    1. Train a tuning model on the 1000 rows of data produced by data expansion.
    2. In batch creation, select the trained tuning model as the model engine.
    3. Prepare a seed data set filled with only input (text) values for performance testing.
    4. Upload the seed data set and run batch creation.
      • An output (completion) is generated for each input (text) and returned as the result.
    5. Check the performance of the tuning model by running a validation test to see if it produces the desired results.
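    Step 3 asks for a seed data set that contains only input (text) values. A minimal sketch of building such a file as CSV follows. The single "text" header is an assumption made for illustration; check the column layout your tuning data actually uses, because the task type of the seed data must match the tuning model.

```python
import csv
import io

MIN_SEED_ROWS = 10  # batch creation requires at least 10 rows of seed data


def build_input_only_csv(inputs: list[str]) -> str:
    """Return CSV text with one input per row and no completion column."""
    if len(inputs) < MIN_SEED_ROWS:
        raise ValueError(f"need at least {MIN_SEED_ROWS} inputs, got {len(inputs)}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text"])  # hypothetical header name
    for item in inputs:
        # csv.writer quotes values that contain commas, quotes, or newlines.
        writer.writerow([item])
    return buffer.getvalue()
```

    The returned string can be saved with open("test_inputs.csv", "w", encoding="utf-8"), which also satisfies the UTF-8 requirement for seed data.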

    Expand your data with the batch creation tool

    The batch creation tool is also suitable for tasks that generate different outputs (completions) from repeated inputs (text). You can expand your data by generating multiple outputs (completions) from a small number of inputs (text).

    The following describes how to expand your data with the batch creation tool.
    Follow the steps below to create contextual Christmas phrases.

    1. On the batch creation screen, fill in a prompt template for creating contextual Christmas phrases.
    2. Provide 5 contexts (Input_text) to build the seed data set, and copy and paste each context 20 times to create a total of 100 rows of seed data.
    3. Check the results.
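    The copy-and-paste step above (5 contexts, 20 copies each, 100 rows in total) can also be scripted. The sketch below is illustrative only: the sample contexts are made up, and the "text" key used for the JSONL output is an assumed field name rather than a documented one.

```python
import json


def repeat_contexts(contexts: list[str], copies: int = 20) -> list[str]:
    """Repeat each context `copies` times, keeping identical contexts adjacent."""
    return [context for context in contexts for _ in range(copies)]


def to_jsonl(rows: list[str]) -> str:
    """Serialize rows as JSONL, one object per line, keeping non-ASCII text as-is."""
    return "\n".join(json.dumps({"text": row}, ensure_ascii=False) for row in rows)


# Made-up contexts for illustration; replace them with your own five.
contexts = [
    "to a coworker",
    "to a grandparent",
    "to a customer",
    "to a close friend",
    "to a teacher",
]
seed_rows = repeat_contexts(contexts)
print(len(seed_rows))  # 100
```

    Because each repeated input is sent to the model separately, the 20 copies of a context can produce 20 different phrases, which is what makes this approach useful for expanding data.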
