CLOVA Studio concepts


Available in Classic and VPC

Before learning the full workflow of CLOVA Studio, this section introduces several key concepts.

Prompts and outputs

A prompt is the input you provide to perform a task in CLOVA Studio. HyperCLOVA X generates outputs based on the prompt you enter in CLOVA Studio. Because HyperCLOVA X is a probabilistic language model, the output may differ even when you enter the same prompt.
For example, if you enter the prompt “A monkey's bottom is red,” the model will likely generate the output “Apples are red, and apples taste good.”

Tokens

A token is a word fragment used for natural language processing. Korean words typically split into one or two tokens. HyperCLOVA X splits tokens based on its training data, so the same expression may not always be divided into the same tokens.
For example, the Korean expression "delicious" may be split into the two tokens "taste" and "exists/is."
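As an illustration only, subword tokenization can be sketched as a greedy longest-match over a vocabulary. The vocabulary below is invented for demonstration; HyperCLOVA X uses its own learned vocabulary, so real splits will differ:

```python
# Illustrative greedy longest-match subword tokenizer.
# VOCAB is invented; it is not the HyperCLOVA X vocabulary.
VOCAB = {"delicious", "deli", "cious", "taste", "exists",
         "d", "e", "l", "i", "c", "o", "u", "s"}

def tokenize(text: str) -> list[str]:
    """Split text into the longest vocabulary entries, left to right."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):   # try the longest match first
            piece = text[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])          # unknown character: keep as-is
            i += 1
    return tokens

print(tokenize("delicious"))   # the whole word is a single token here
print(tokenize("delicioso"))   # an unseen word falls back to smaller pieces
```

The same surface form can split differently depending on which vocabulary entries exist, which is why the same expression is not always divided into the same tokens.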

Probabilistic language model

A probabilistic language model predicts the next word based on probability. HyperCLOVA X, used in CLOVA Studio, is a probabilistic language model and generates outputs by selecting tokens according to their probabilities.
For example, if you enter a prompt asking for a description of nature and the first token of the output is "that", the model may predict next words such as "tree", "flower", or "mountain", each with its own probability. HyperCLOVA X selects high-probability tokens, such as "tree" and "at/on", to produce the result "on that tree".

clovastudio_concept_languagemodel.png
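The selection step can be sketched as weighted sampling from a next-token distribution. The candidate tokens and probabilities below are invented for illustration; real probabilities come from the model itself:

```python
import random

# Toy next-token distribution for the prefix "that" (invented numbers).
candidates = {"tree": 0.5, "flower": 0.3, "mountain": 0.2}

def sample_next_token(dist: dict[str, float], rng: random.Random) -> str:
    """Draw one token according to its probability."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(42)
draws = [sample_next_token(candidates, rng) for _ in range(1000)]
# High-probability tokens are chosen most often, but not always:
print({t: draws.count(t) for t in candidates})
```

Because the draw is probabilistic, repeating the same prompt can yield different outputs, which matches the behavior described above.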

Parameters

Parameters are values you configure in Playground to generate text. You can set them from the left sidebar of the Playground interface. The parameter items are as follows:

Model

The model is the HyperCLOVA X variant used to generate text in CLOVA Studio. CLOVA Studio provides the following models: HCX-007 for deep understanding and reasoning, HCX-DASH-002 as a lightweight model, and HCX-005 as a multimodal model capable of interpreting and understanding images. You can use these models through Playground and the Chat Completions v3 API.

Thinking

Thinking defines how the model performs reasoning before generating a final response. Through this process, you can view the model's reasoning path and decision logic that lead to the final answer. You can set the Thinking length to short, medium, or long, depending on task complexity and purpose. This process breaks down a complex problem and combines relevant knowledge to find a solution through structured reasoning. Thinking is available only when using the HCX-007 model.

Top P

Top P filters candidate tokens by removing those that fall outside the specified cumulative probability threshold. Unless you need special behavior, we recommend setting Top P to 0.8–1.0.
For example, if Top P = 0.8, only tokens within the top 80% cumulative probability are selected as candidates.
clovastudio_concept_top_p.png
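The filtering step can be sketched as follows; the distribution is invented for illustration:

```python
def top_p_filter(dist: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most probable tokens until their cumulative probability
    reaches top_p; drop the rest and renormalize."""
    kept, cumulative = {}, 0.0
    for token, p in sorted(dist.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

# Invented next-token distribution:
dist = {"tree": 0.5, "flower": 0.3, "mountain": 0.15, "rock": 0.05}
print(top_p_filter(dist, 0.8))   # "mountain" and "rock" are filtered out
```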

Top K

Top K selects one token from among the K tokens with the highest probability in the model's predicted probability distribution. Unless you need special behavior, we recommend setting Top K to 0.
For example, if Top K = 5, the model selects one token from the five most probable candidates. The most probable token is likely to be chosen, but depending on the probability distribution, a lower-ranked token may also be selected.
clovastudio_concept_top_k.png
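Top K selection can be sketched the same way, with an invented distribution:

```python
import random

def top_k_sample(dist: dict[str, float], k: int, rng: random.Random) -> str:
    """Sample one token from the k most probable candidates."""
    ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens, weights = zip(*ranked)
    return rng.choices(tokens, weights=weights, k=1)[0]

# Invented next-token distribution with six candidates:
dist = {"tree": 0.4, "flower": 0.25, "mountain": 0.15, "rock": 0.1,
        "river": 0.06, "cloud": 0.04}
rng = random.Random(0)
# With k = 5, only the five most probable tokens can ever be chosen;
# "cloud" (rank 6) is excluded before sampling.
print([top_k_sample(dist, 5, rng) for _ in range(5)])
```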

Max tokens

Max tokens defines the maximum number of output tokens the model can generate. Higher values produce longer outputs.

Model Allowed range
HCX-007
  • Input + output tokens combined: up to 128,000
  • Input tokens: up to 128,000
  • Requested output tokens (maxCompletionTokens, including tokens generated for reasoning): up to 32,768
HCX-005
  • Input + output tokens combined: up to 128,000
  • Input tokens: up to 128,000
  • Requested output tokens (maxTokens): up to 4,096
HCX-003
  • Input + output tokens combined: up to 8,192
  • Input tokens: up to 7,600
  • Requested output tokens (maxTokens): up to 4,096
HCX-DASH-002
  • Input + output tokens combined: up to 32,000
  • Input tokens: up to 32,000
  • Requested output tokens (maxTokens): up to 4,096
HCX-DASH-001
  • Input + output tokens combined: up to 4,096
  • Input tokens: up to 3,500
  • Requested output tokens (maxTokens): up to 4,096

Max tokens should be set according to the needs of your task. If the value is set significantly higher than the required number of output tokens, the model may generate unnecessarily long outputs, which can lead to unexpected usage charges and longer processing time. Excessive token requests may also result in more frequent request failures due to exceeding the TPM limits defined in the CLOVA Studio usage control policy.
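A simple client-side guard can check a request against the limits in the table above before sending it. The function name and structure are illustrative, not part of the CLOVA Studio SDK:

```python
# Per-model limits from the table above:
# (combined tokens, max input tokens, max requested output tokens).
MODEL_LIMITS = {
    "HCX-007":      (128_000, 128_000, 32_768),
    "HCX-005":      (128_000, 128_000, 4_096),
    "HCX-003":      (8_192,   7_600,   4_096),
    "HCX-DASH-002": (32_000,  32_000,  4_096),
    "HCX-DASH-001": (4_096,   3_500,   4_096),
}

def validate_request(model: str, input_tokens: int, max_output_tokens: int) -> None:
    """Raise ValueError if the request exceeds the model's token limits."""
    combined, max_input, max_output = MODEL_LIMITS[model]
    if input_tokens > max_input:
        raise ValueError(f"{model}: input tokens {input_tokens} exceed {max_input}")
    if max_output_tokens > max_output:
        raise ValueError(f"{model}: requested output tokens {max_output_tokens} exceed {max_output}")
    if input_tokens + max_output_tokens > combined:
        raise ValueError(f"{model}: combined tokens exceed {combined}")

validate_request("HCX-DASH-002", input_tokens=20_000, max_output_tokens=2_048)  # OK
```

Note that for HCX-003 the combined limit (8,192) is smaller than the sum of the individual input and output maximums, so a request near both individual limits can still be rejected.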

Temperature

Temperature adjusts the randomness of generated text by modifying the probability distribution of tokens. When you set Temperature to a low value, the probability gap between tokens increases: high-probability tokens become even more likely, and low-probability tokens become less likely. This increases the chance that the most probable token is selected, resulting in more deterministic and structured outputs. When you set Temperature to a high value, probability differences narrow. This produces more diverse text but may lead to outputs that deviate from expected patterns or reduce overall quality. We recommend fixing the Top P value and adjusting Temperature only when needed.

  • Low Temperature
    clovastudio_concept_temparature-low.png

  • High Temperature
    clovastudio_concept_temparature-high.png
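The effect of Temperature can be sketched as scaling logits before the softmax; the logits below are invented for illustration:

```python
import math

def apply_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert logits to probabilities after dividing by the temperature.
    Low temperature sharpens the distribution; high temperature flattens it."""
    scaled = {t: l / temperature for t, l in logits.items()}
    max_l = max(scaled.values())                       # for numerical stability
    exp = {t: math.exp(l - max_l) for t, l in scaled.items()}
    total = sum(exp.values())
    return {t: v / total for t, v in exp.items()}

logits = {"tree": 2.0, "flower": 1.0, "mountain": 0.5}   # invented logits
print(apply_temperature(logits, 0.5))   # gap between tokens widens
print(apply_temperature(logits, 2.0))   # gap between tokens narrows
```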

Repetition penalty

Repetition penalty reduces the probability of repeatedly generating the same tokens by penalizing tokens that have already appeared. A higher repetition penalty decreases the likelihood of repetitive outputs. We recommend values between 1.0 and 1.1, adjusted in steps of 0.05.
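One common way such a penalty is applied (a sketch, not necessarily the exact formula HyperCLOVA X uses) is to divide the positive logits of already-seen tokens by the penalty value:

```python
def penalize_repeats(logits: dict[str, float], seen: set[str],
                     penalty: float) -> dict[str, float]:
    """Lower the logits of tokens that have already been generated, making
    them less likely to be picked again. Assumed penalty scheme: divide
    positive logits (multiply negative ones) by the penalty."""
    out = {}
    for token, logit in logits.items():
        if token in seen:
            out[token] = logit / penalty if logit > 0 else logit * penalty
        else:
            out[token] = logit
    return out

logits = {"tree": 2.0, "flower": 1.0}            # invented logits
print(penalize_repeats(logits, {"tree"}, 1.1))   # "tree" drops below 2.0
```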

Stop sequences

Stop sequences are strings used to stop output generation. You can register multiple stop sequences. When the language model generates any of the registered stop sequences, it outputs only the text before the stop sequence.
For example, if you enter the prompt “A monkey's bottom is red” and add “apple” as a stop sequence, the model outputs only “is red” and stops before generating “apple”.
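Truncation at a stop sequence can be sketched as follows; the sample text continues the song lyric used in the example:

```python
def truncate_at_stop(text: str, stop_sequences: list[str]) -> str:
    """Return text up to (not including) the earliest stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        pos = text.find(stop)
        if pos != -1:
            cut = min(cut, pos)
    return text[:cut]

print(truncate_at_stop("is red, red things are apples", ["apple"]))
# -> "is red, red things are "
```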

Seed

Seed controls the consistency of generated outputs. When you use the same seed value, the model can produce the same output across multiple runs, even though it is a probabilistic language model.
However, this does not guarantee perfectly identical results because small adjustments to other parameters can still lead to slight variations.
If you set the seed value to 0, the output is generated randomly.
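The idea behind seeding can be sketched with a seeded random generator: the same seed drives the same sequence of draws, so sampling becomes repeatable. The distribution below is invented for illustration:

```python
import random

def sample_with_seed(dist: dict[str, float], seed: int, n: int) -> list[str]:
    """Reproducible sampling: the same seed yields the same draws."""
    rng = random.Random(seed)
    tokens, weights = list(dist), list(dist.values())
    return rng.choices(tokens, weights=weights, k=n)

dist = {"tree": 0.5, "flower": 0.3, "mountain": 0.2}   # invented distribution
# Identical seeds produce identical draws:
print(sample_with_seed(dist, seed=7, n=5) == sample_with_seed(dist, seed=7, n=5))
```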

Tuning

Tuning retrains part of a pre-trained model's parameters on user-provided data to meet specific goals. By providing a training and validation dataset, you can train and test a model that is optimized for your target task or dataset. You can then convert the tuned model into an API and apply it to new data or various real-world use cases.

Task

A task is the basic unit used for tuning. For each task, you select the task type, language, and model. You can then train the model using your dataset to generate a version that is optimized for the selected task type and dataset.

Function calling

Function calling allows the language model to retrieve information from external systems or APIs so it can answer questions it cannot resolve on its own. You can connect to various resources such as APIs, scripts, open-source libraries, databases, and files stored on your local PC or in the cloud to flexibly handle a wide range of requirements. Function calling may appear similar to Skills because both retrieve information from outside the language model. However, they work differently. Skills require you to register APIs directly in the skill trainer and generate the final answer within that environment. Function calling allows the language model to determine what information is needed and call an external API directly, using only the required parameters extracted from your question.
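The dispatch side of function calling can be sketched as mapping a model-emitted call to a local function. The `get_weather` function, the tool-call JSON shape, and the registry below are all hypothetical stand-ins, not CLOVA Studio API objects:

```python
import json

# Local functions the model is allowed to call; get_weather is a
# hypothetical stand-in for a real external API.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> str:
    """Execute a tool call emitted by the model as JSON of the assumed
    shape {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

# A tool call the model might emit after extracting the needed parameter
# ("city") from the user's question:
result = dispatch('{"name": "get_weather", "arguments": {"city": "Seoul"}}')
print(result)   # Sunny in Seoul
```

The key point is that the model chooses which function to call and supplies only the required parameters; your code executes the call and returns the result.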

Structured Outputs

Structured Outputs enable the language model to generate structured data that matches a JSON Schema you define, instead of free-form text. When you create a schema that specifies field names, data types, and valid ranges, the model generates a JSON object that conforms to your schema. This makes the generated data immediately usable for API request bodies, database inputs, system logs, and more, while significantly reducing the resources required for post-processing.
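Consuming such output can be sketched with a minimal schema check. The field names are invented, and the checker below is a toy; production code would use a full JSON Schema validator:

```python
import json

# Expected fields and types (invented example schema).
SCHEMA = {"name": str, "age": int}

def matches_schema(raw: str, schema: dict) -> bool:
    """Check that the model's output is JSON with exactly the expected
    fields and types."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return set(obj) == set(schema) and all(
        isinstance(obj[k], t) for k, t in schema.items()
    )

print(matches_schema('{"name": "Kim", "age": 30}', SCHEMA))   # True
print(matches_schema('free-form text', SCHEMA))               # False
```

Because the model already emits schema-conforming JSON, such checks become a safety net rather than a post-processing step.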

Service apps

A service app is an application designed for deployment or use in production environments. To register a service app, you must complete a separate application and approval process. Under the usage control policy, the maximum number of allowable requests and token usage differs depending on whether an application is registered as a service app. These policies ensure service stability and fair resource usage.