CLOVA Studio concept

    Available in Classic and VPC

    Before walking through a complete CLOVA Studio scenario, it helps to understand a few core concepts. This page explains the main ones.

    Note

    To understand these concepts more easily, see the CLOVA Studio glossary.

    Prompt and result value

    A prompt is the text you enter to have CLOVA Studio perform a task. The HyperCLOVA language model generates result values based on the prompt entered in CLOVA Studio. Because the model works on probabilities, the same prompt may produce different result values.
    <example> If "monkey's ass is red" is entered as the prompt, the result value "red is an apple, and an apple is delicious" is generated with a high probability.

    Token

    A token is a word piece obtained by splitting words so that natural language can be processed. Korean words are mostly split into 1 to 2 tokens at the morpheme level. However, the HyperCLOVA language model splits text according to what it learned in training, so an identical expression is not always composed of the same tokens.
    <example> The Korean expression for "delicious" can be divided into 2 tokens, corresponding to "taste" and "there is".

    Probability-based language model

    A probability-based language model is a language model that predicts the next word based on probabilities. The HyperCLOVA language model used in CLOVA Studio is a probability-based language model that generates result values based on probabilities.
    <example> If you enter a description of nature in the prompt and the first token of the result value is "that", the candidates for the next word include "tree", "flower", and "mountain". Each candidate has a probability, and in principle the HyperCLOVA language model selects the candidate with the highest probability at each step, here "tree" and then "on", and creates the result "on that tree".
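
    The next-word step in the example above can be sketched as follows. This is only an illustration: the candidate words and their probabilities are invented, and the helper names `pick_most_likely` and `sample_token` are hypothetical, not part of CLOVA Studio.

```python
import random

# Hypothetical next-token probabilities after the token "that".
# The numbers are made up for illustration only.
next_token_probs = {"tree": 0.5, "flower": 0.3, "mountain": 0.2}


def pick_most_likely(probs):
    """Return the candidate with the highest probability."""
    return max(probs, key=probs.get)


def sample_token(probs, rng):
    """Draw one candidate at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


print(pick_most_likely(next_token_probs))
print(sample_token(next_token_probs, random.Random()))
```

    Picking the most likely candidate always gives the same word, while weighted sampling is why the same prompt can lead to different result values.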

    Parameters

    Parameters are values that you set in Playground to create phrases, and can be set in the Playground's left sidebar. Parameter items are as follows:

    Engine

    The engine is the language model used to create phrases in CLOVA Studio. CLOVA Studio provides the Korean engines LK-B, LK-C, and LK-D2, the English engine LE-C, and the HyperCLOVA X engine HCX-002.

    In the general mode of the Playground, a larger LK engine model generally performs better but may respond more slowly. The HCX engine available in the chat mode of the Playground is an enhanced engine optimized for conversational tasks.

    • Korean engine model size
      LK-B < LK-C < LK-D2
    • Korean engine model speed
      LK-D2 < LK-C < LK-B
    • HyperCLOVA X single model
      HCX-002

    Top K

    Top K limits token selection to the K tokens with the highest probabilities in the probability distribution of the tokens predicted by the language model. One of those K tokens is then selected. We recommend setting top K to 0 unless you have a special case.
    <example> With top K=5, one token is selected from the 5 tokens with the highest probabilities. The token with the highest probability is the most likely to be selected, but a token with a lower probability may sometimes be selected.
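
    A minimal sketch of top K filtering, using an invented probability table. It assumes that top K=0 means "no limit", which is a common convention but is not stated in this document; the function names are hypothetical.

```python
import random


def top_k_candidates(probs, k):
    """Keep the k most probable tokens and renormalize.

    Assumption: k == 0 is treated as 'no limit'.
    """
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if k > 0:
        ranked = ranked[:k]
    total = sum(prob for _, prob in ranked)
    return {token: prob / total for token, prob in ranked}


def sample_top_k(probs, k, rng):
    """Select one token among the top-k candidates, weighted by probability."""
    candidates = top_k_candidates(probs, k)
    tokens = list(candidates)
    weights = [candidates[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up probabilities for illustration.
probs = {"tree": 0.4, "flower": 0.3, "mountain": 0.2, "river": 0.1}
print(top_k_candidates(probs, 2))
print(sample_top_k(probs, 2, random.Random()))
```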

    Top P

    Top P limits token selection by cumulative probability: tokens are arranged in descending order of selection probability, and tokens that fall outside the configured cumulative probability are removed from the candidates. We recommend setting top P to 0.8 ~ 1 unless you have a special case.
    <example> With top P=0.8, only the tokens within the top 80% of cumulative probability are kept as candidates.
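
    The cumulative cut-off can be sketched as below. The probabilities are invented, the function name is hypothetical, and whether the token that crosses the threshold is kept is an assumption of this sketch (here it is kept).

```python
def top_p_candidates(probs, p):
    """Keep the most probable tokens until their cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # Assumption: the token that crosses the threshold is kept.
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


# Made-up probabilities for illustration.
probs = {"tree": 0.5, "flower": 0.25, "mountain": 0.15, "river": 0.1}
print(top_p_candidates(probs, 0.8))
```

    With these numbers, "tree" and "flower" cover 0.75, so "mountain" is also needed to reach 0.8 and "river" is dropped.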

    Maximum tokens

    Maximum tokens is the maximum number of tokens to use when generating the result value. The higher the value, the longer the result value can be. The language models available in the general mode allow up to 2048 tokens, counting the prompt and the result value together. The HyperCLOVA X language model provided in chat mode allows up to 4096 tokens.
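
    Because the limit covers the prompt and the result value together, the room left for the result shrinks as the prompt grows. A small sketch of that arithmetic, using the limits stated above (the function name and the prompt sizes are hypothetical):

```python
# Limits from this page, counting prompt and result tokens together.
CONTEXT_LIMITS = {"general": 2048, "chat": 4096}


def max_result_tokens(prompt_tokens, mode="general"):
    """Return how many tokens remain for the result value."""
    return max(0, CONTEXT_LIMITS[mode] - prompt_tokens)


print(max_result_tokens(1500))          # general mode
print(max_result_tokens(1500, "chat"))  # chat mode
```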

    Temperature

    Temperature controls the diversity of sentences by reweighting the probability distribution. If the temperature is set low, the ranking of the candidate tokens does not change, but tokens with higher probabilities become even more probable and tokens with lower probabilities become even less probable. Because the highest-ranking token is very likely to be selected, a typical result value is created. If the temperature is set high, the differences in probability between tokens become smaller, so more varied sentences can be created, but sentences that do not fully follow the rules may appear and sentence quality may be lower. Therefore, we recommend fixing the top P value and adjusting the temperature as needed.

    • If the temperature value is low
      [Figure: clovastudio-info_temperaturelow_ko]

    • If the temperature value is high
      [Figure: clovastudio-info_temperaturehigh_ko]
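
    The effect of temperature can be illustrated with the common technique of dividing token scores by the temperature before applying softmax. This document does not say how HyperCLOVA applies temperature internally, so treat this as a sketch of the general idea with made-up scores.

```python
import math


def apply_temperature(scores, temperature):
    """Softmax over scores divided by the temperature (must be > 0)."""
    scaled = [score / temperature for score in scores]
    peak = max(scaled)  # Subtract the max for numerical stability.
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


scores = [2.0, 1.0, 0.0]  # Hypothetical scores for three candidate tokens.
low = apply_temperature(scores, 0.5)
high = apply_temperature(scores, 2.0)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

    In both cases the ranking of the three candidates stays the same; the low temperature concentrates probability on the top candidate, and the high temperature spreads it more evenly.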

    Repetition penalty

    Repetition penalty applies a score reduction to tokens that have already been generated, so that CLOVA Studio is less likely to repeat the same content when creating phrases. The higher the repetition penalty, the lower the probability that the same result value will be created repeatedly.
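
    One widely used way to implement such a penalty is to lower the score of every token that has already appeared. The sketch below follows that general scheme; it is not confirmed to be the formula CLOVA Studio uses, and the scores and function name are hypothetical.

```python
def penalize_repeats(scores, generated, penalty):
    """Return a copy of scores with already-generated tokens made less likely.

    Positive scores are divided by the penalty and non-positive scores are
    multiplied by it, so a penalty above 1 always lowers the score.
    """
    adjusted = dict(scores)
    for token in set(generated):
        if token in adjusted:
            score = adjusted[token]
            adjusted[token] = score / penalty if score > 0 else score * penalty
    return adjusted


scores = {"apple": 2.0, "red": 1.0, "blue": -1.0}  # Made-up token scores.
print(penalize_repeats(scores, ["apple", "blue"], 2.0))
```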

    Stop sequences

    Stop sequences are text strings that stop the creation of a result. You can register multiple stop sequences. When CLOVA Studio creates a result that contains one of the stop sequences, only the content up to that point is output.
    <example> If you enter "monkey's ass is red" as the prompt and add the string "apple" to the stop sequences, the result value is output only up to "red is" and stops before "apple".
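
    The truncation behavior can be sketched as a plain string operation. The function name `truncate_at_stop` is hypothetical, and the sketch works on finished text rather than on a token stream.

```python
def truncate_at_stop(text, stop_sequences):
    """Return the text up to, but not including, the earliest stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # Ignore empty strings, which would match at position 0.
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


result = "red is an apple, and an apple is delicious"
print(repr(truncate_at_stop(result, ["apple"])))
```

    In the English rendering of the example, the text before "apple" also contains the article "an", so the printed value ends just before the first "apple".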

    Seed

    The seed is a value that controls the consistency of the result. When the seed value is the same, you can obtain identical results even when the probability-based language model is run multiple times.
    However, identical results are not fully guaranteed, and slight variations may occur when other conditions change even slightly.
    If the seed value is set to 0, the result is generated randomly.
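
    The role of the seed can be illustrated with Python's own random number generator. This is an analogy under stated assumptions, not CLOVA Studio's implementation: the probability table is invented, and treating seed 0 as "unseeded" mirrors the rule above.

```python
import random


def make_rng(seed):
    """Seed 0 gives a fresh random generator; any other seed is repeatable."""
    return random.Random() if seed == 0 else random.Random(seed)


def sample_tokens(probs, count, seed):
    """Draw `count` tokens, weighted by probability, using the given seed."""
    rng = make_rng(seed)
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=count)


probs = {"tree": 0.5, "flower": 0.3, "mountain": 0.2}  # Made-up values.
print(sample_tokens(probs, 5, seed=42) == sample_tokens(probs, 5, seed=42))
print(sample_tokens(probs, 5, seed=0))
```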

    Inject start text

    Inject start text is text that is always output before the result value output by CLOVA Studio.
    <example> When creating a conversation between a user and CLOVA, you can distinguish the speakers by entering "User: tell me what the weather is today" in the prompt and setting "CLOVA:" as the inject start text.

    Inject restart text

    Inject restart text is text that is always output after the result value output by CLOVA Studio.
    <example> If "User:" is set as the inject restart text, "User:" is output after the result value for the first prompt, so you do not have to type "User:" when entering the next prompt.
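
    How the two injected texts wrap a result value can be sketched as simple string assembly. The function name is hypothetical and the model's reply is made up; only the "User:" and "CLOVA:" labels come from the examples above.

```python
def compose_output(result, inject_start="", inject_restart=""):
    """Place the inject start text before the result and the restart text after it."""
    return f"{inject_start}{result}{inject_restart}"


prompt = "User: tell me what the weather is today\n"
result = " It is sunny today.\n"  # Made-up result value.
editor_text = prompt + compose_output(result, "CLOVA:", "User:")
print(editor_text)
```

    The printed text ends with "User:", ready for the next prompt to be typed after it.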

    Show probabilities

    Show probabilities is an option that displays the selection probability of each created token. You can also check which other candidate tokens there were.

    Generation type

    Generation type is the method used to generate result values. The generation types are as follows:

    Rolling

    Rolling is a method in which, when you create a result value from a prompt and then create again, the previously created result value is treated as part of the prompt for the next result. Because the result values created after the first prompt are not written by the user, the results may drift from your original intention as you repeat the process.
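
    The rolling behavior amounts to feeding each result back in as part of the next prompt. A minimal sketch, in which `fake_generate` is a stand-in for the language model and simply reports how long its input was:

```python
def rolling_generate(prompt, generate, rounds):
    """Repeatedly generate, appending each result to the text used as the next prompt."""
    text = prompt
    for _ in range(rounds):
        text += generate(text)
    return text


def fake_generate(text):
    """Stand-in for the model: returns the length of the input it received."""
    return f" [{len(text)}]"


print(rolling_generate("abc", fake_generate, 2))
```

    Each round sees a longer input than the one before, which is how later results can move away from the original prompt.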

    One-time

    One-time is a method that shows the result value created from a prompt in a preview instead of writing it to the editor area right away, and lets the user apply the result to the editor area.

    Multiple

    Multiple is a method that creates a specified number of result values for a prompt and lets you select which one to apply.

    Examples

    Examples is a method in which you additionally enter content similar to the desired answer, so that the result value created for a prompt is closer to your intention.

    Tuning

    Tuning is a method of modifying part of the pre-trained model parameters to fit a user's purpose by retraining part of the model on user data. By providing a certain amount of training and validation data, you can train and test a model optimized for the desired task type and data. You can then expose the tuned model as an API and use it for new data and various purposes.

    Task

    A task is the standard unit for performing tuning. For each task, you select one task type, language, and model engine. Training on the user dataset then produces a model optimized for that task type, language, model engine, and dataset.

    Test app

    The test app temporarily provides an API for testing or for checking whether a service is feasible. Its use is limited (period, call volume), and service quality problems may occur if the test app is used in an actual service. In addition, a test app that is used for an actual service is blocked. During the beta period, you can use the test app up to the number of tokens granted.

    Service app

    The service app can access the CLOVA Studio API and is intended for use by actual users. A key is issued once the service app passes the review process. If the service app is used for a purpose different from the one reviewed, it is blocked without prior notice.

