CLOVA Studio Usage Control Policy

Available in Classic and VPC

CLOVA Studio imposes upper limits on the number of API requests and tokens you can use within a specific time period. This is called the usage control policy, and it is necessary to maintain service availability and ensure stability. Within the defined limits, you can use the CLOVA Studio service freely. However, if you exceed the maximum allowed usage, an error message and error code will be returned.
By applying the usage control policy, CLOVA Studio can prevent malicious attacks that attempt to overload the system with excessive API calls and protect the service from abnormal traffic spikes. The policy also prevents any single user from monopolizing resources, so that all users can use the system reliably. The maximum usage available to you varies depending on the model, tool, and purpose of use, so review this guide carefully.

Note

The maximum usage described in this guide refers to the upper limit of requests you may make when using the CLOVA Studio service. It does not guarantee that requests up to that limit will always be processed. Even within the maximum usage, processing delays or failures may occur depending on infrastructure load and traffic conditions.
Maximum usage limits may change in the future. If changes occur, we will notify customers separately.
If you require guaranteed usage or a certain level of throughput, a dedicated plan is available. If you need more information about guaranteed-usage plans and policies, please contact Customer Support.

Application Criteria

The following items serve as the basis for calculating maximum usage limits.

Type	Description
Your account	Maximum usage is calculated based on your main account. If you have sub-accounts, the usage from those sub-accounts is also counted toward the main account's usage.
Models and tools	Maximum usage varies depending on the model and tool you use. When you use a tuned model, your usage is counted against the base model used for tuning. For the router and skill trainer, usage is aggregated across all tasks and versions.
Purpose of use	Maximum usage varies depending on whether you are using the service for an app in production.

Maximum Usage

CLOVA Studio enforces limits on the number of requests per minute (QPM) and the number of tokens processed per minute (TPM). If both QPM and TPM apply, an error code is returned as soon as either limit is exceeded.
QPM and TPM are defined as follows:

Type	Description
QPM (Queries per Minute)	The number of requests you make to a model or tool within one minute.
TPM (Tokens per Minute)	The number of tokens processed within one minute. Processed tokens = Input tokens + Maximum output tokens (`maxTokens` or `maxCompletionTokens`)
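
The TPM formula above can be worked through with a short sketch. The helper names below are illustrative and not part of any CLOVA Studio SDK. The sketch assumes that every request in the minute has the same size and that both QPM and TPM apply.

```python
def processed_tokens(input_tokens: int, max_output_tokens: int) -> int:
    """Tokens counted toward TPM: input tokens plus the requested output ceiling
    (maxTokens or maxCompletionTokens), not the tokens actually generated."""
    return input_tokens + max_output_tokens


def max_requests_per_minute(qpm: int, tpm: int,
                            input_tokens: int, max_output_tokens: int) -> int:
    """Number of identical requests that fit in one minute under both limits."""
    per_request = processed_tokens(input_tokens, max_output_tokens)
    # Whichever limit is reached first decides the effective ceiling.
    return min(qpm, tpm // per_request)


# Illustrative numbers: QPM 60 and TPM 60,000, with 1,500 input tokens
# and maxTokens set to 500 on every request.
cost = processed_tokens(input_tokens=1500, max_output_tokens=500)
fit = max_requests_per_minute(60, 60_000, 1500, 500)
print(cost, fit)  # 2000 30
```

In this example the TPM limit is reached after 30 requests, well before the QPM limit of 60, so lowering `maxTokens` is what would raise the effective request rate.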

Note

The values described are the maximum usage supported as of July 17, 2025, and are subject to change. If usage limits change, an additional notice will be provided.

CLOVA Studio Web & Test Usage

This section explains the maximum QPM and TPM when using CLOVA Studio Playground, the web interface, or the test API key.

Maximum usage by model

The maximum QPM and TPM available for each model are as follows:

Model	QPM	TPM
HCX-007	60	60,000
HCX-005	60	60,000
HCX-DASH-002	90	80,000
HCX-003	200	30,000
HCX-DASH-001	200	30,000
Tuned models	Requests are counted based on the base model used for tuning.

Maximum usage per tool

The following table shows the maximum QPM and TPM available for each Explorer tool.

Type	QPM	TPM	Notes
Summarization	30	30,000	Only input tokens are counted for TPM
Paragraph Split	120	-
Embedding	60	-
Embedding v2	60	40,000	Only input tokens are counted for TPM
Re-ranker	60	45,000
RAG Reasoning	60	30,000

Maximum usage for router and skill trainer

The following table shows the maximum QPM available when using the router or skill trainer.

Type	QPM	Notes
Router	60	Workflows and versions are not distinguished
Skillset Answer Generation	30	Workflows and versions are not distinguished

Service apps

This section describes the maximum QPM and TPM available for each model and tool when using service apps (via a service API key). The maximum usage limits for service apps are calculated and measured separately from web and test usage.

Maximum usage by model

The maximum QPM and TPM available for each model are as follows:

Model	QPM	TPM
HCX-007	180	300,000
HCX-005	300	180,000
HCX-DASH-002	450	240,000
HCX-003	700	150,000
HCX-DASH-001	600	120,000
Tuned models	Requests made to a tuned model are counted as requests to the base model used during tuning.
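
If client code needs these per-model limits, one option is to keep the two model tables as a lookup. The following is a minimal sketch, not an official configuration format. The values are copied from the tables in this guide (as of July 17, 2025, and subject to change), and the function name and the `tuned_base` parameter are hypothetical.

```python
# (QPM, TPM) per base model for web and test usage.
WEB_TEST_LIMITS = {
    "HCX-007": (60, 60_000),
    "HCX-005": (60, 60_000),
    "HCX-DASH-002": (90, 80_000),
    "HCX-003": (200, 30_000),
    "HCX-DASH-001": (200, 30_000),
}

# (QPM, TPM) per base model for service apps (service API key).
SERVICE_LIMITS = {
    "HCX-007": (180, 300_000),
    "HCX-005": (300, 180_000),
    "HCX-DASH-002": (450, 240_000),
    "HCX-003": (700, 150_000),
    "HCX-DASH-001": (600, 120_000),
}


def model_limits(model, *, service_app, tuned_base=None):
    """Return (QPM, TPM) for a model.

    For a tuned model, pass the base model used for tuning as `tuned_base`,
    because usage is counted against the base model.
    """
    base = tuned_base or model
    table = SERVICE_LIMITS if service_app else WEB_TEST_LIMITS
    return table[base]


print(model_limits("HCX-005", service_app=True))
print(model_limits("my-tuned-model", service_app=False, tuned_base="HCX-003"))
```

Because the limits may change, the rate-limit response headers remain the authoritative source at run time.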

Maximum usage per tool

The following table shows the maximum QPM and TPM available for each Explorer tool.

Type	QPM	TPM	Notes
Summarization	60	60,000	Only input tokens are counted for TPM
Paragraph Split	360	-
Embedding	300	-
Embedding v2	540	960,000	Only input tokens are counted for TPM
Re-ranker	120	180,000
RAG Reasoning	180	120,000

Maximum usage for router and skill trainer

The following table shows the maximum QPM available when using the router or skill trainer.

Type	QPM	Notes
Skillset Answer Generation	90	Workflows and versions are not distinguished
Router	180	Workflows and versions are not distinguished

Checking maximum usage

When you call the CLOVA Studio service through the API using cURL, Python, or similar methods, you can check your rate-limit information in the API response headers.

The following usage-control details can be viewed:

Key	Value (Example)	Description	Notes
x-ratelimit-limit-requests	`60`	The maximum QPM (requests per minute) allowed for the API you are subscribed to.
x-ratelimit-limit-tokens	`10000`	The maximum TPM (tokens per minute) allowed for the API you are subscribed to.	Included only when TPM limits apply.
x-ratelimit-remaining-requests	`59`	The number of remaining requests before reaching the maximum QPM for the API you are subscribed to.
x-ratelimit-remaining-tokens	`9462`	The number of remaining tokens before reaching the maximum TPM for the API you are subscribed to.	Included only when TPM limits apply.
x-ratelimit-reset-requests	`23s`	Time remaining until the request-based usage limit for the API you are subscribed to resets.
x-ratelimit-reset-tokens	`23s`	Time remaining until the token-based usage limit for the API you are subscribed to resets.	Included only when TPM limits apply.
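
The headers in the table above can be read into a small structure, as in the sketch below. It takes a plain mapping of header names to values, so it works with the headers object of whichever HTTP client you use. The function names are illustrative. The sketch assumes the reset values always use the `23s` form shown in the table and raises an error for any other format.

```python
import re
from typing import Mapping, Optional


def parse_reset_seconds(value: str) -> float:
    """Convert a reset value such as '23s' to seconds.
    Only the seconds suffix shown in this guide is handled (an assumption)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)s\s*", value)
    if match is None:
        raise ValueError(f"unexpected reset format: {value!r}")
    return float(match.group(1))


def read_rate_limit(headers: Mapping[str, str]) -> dict:
    """Collect the x-ratelimit-* values; absent headers become None."""
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        raw = lowered.get(name)
        return int(raw) if raw is not None else None

    def as_seconds(name: str) -> Optional[float]:
        raw = lowered.get(name)
        return parse_reset_seconds(raw) if raw is not None else None

    return {
        "limit_requests": as_int("x-ratelimit-limit-requests"),
        "limit_tokens": as_int("x-ratelimit-limit-tokens"),
        "remaining_requests": as_int("x-ratelimit-remaining-requests"),
        "remaining_tokens": as_int("x-ratelimit-remaining-tokens"),
        "reset_requests": as_seconds("x-ratelimit-reset-requests"),
        "reset_tokens": as_seconds("x-ratelimit-reset-tokens"),
    }


# A response for an API without TPM limits carries no token headers.
example = {
    "x-ratelimit-limit-requests": "60",
    "x-ratelimit-remaining-requests": "59",
    "x-ratelimit-reset-requests": "23s",
}
info = read_rate_limit(example)
print(info["remaining_requests"], info["reset_requests"], info["limit_tokens"])
```

Token-related entries are `None` when TPM limits do not apply, which lets calling code tell "no TPM limit" apart from "no tokens left".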

Managing Usage Limits

If you exceed the maximum number of requests allowed when using the CLOVA Studio service, an HTTP 429 error code or an error message will be returned. This section explains what you can do to avoid exceeding the usage limits.

Note

CLOVA Studio strives to provide a stable and seamless service, but delays or failures may still occur depending on infrastructure conditions and traffic levels, even if your usage remains within the maximum allowed limits.

How to Manage QPM

Check your QPM allowance in advance and ensure your requests stay within the limit.
Implement your own rate-limiting mechanism to control API calls.
Add a delay (for example, a sleep call) between requests as needed.
When an HTTP 429 error or related error message is returned, handle the exception by waiting for a certain period before retrying.
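
The QPM measures above can be combined in client code. The following is a minimal sketch of one possible approach, a sliding-window limiter plus retry with exponential backoff. It is not an official client. `RateLimited` is a hypothetical exception that your own request function would raise when it receives HTTP 429, and `send` stands for that function.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period`-second window."""

    def __init__(self, max_calls, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls inside the window

    def acquire(self):
        """Block until a call slot is free, then record the call."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))


class RateLimited(Exception):
    """Hypothetical: raised by your request function on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("HTTP 429")
        self.retry_after = retry_after  # seconds to wait, if known


def call_with_retry(send, limiter, max_attempts=5, base_delay=1.0,
                    sleep=time.sleep):
    """Call `send()` under the limiter, backing off and retrying on 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        limiter.acquire()
        try:
            return send()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Prefer a known wait time; otherwise back off exponentially.
            delay = exc.retry_after
            if delay is None:
                delay = base_delay * 2 ** attempt
            sleep(delay)
```

A limiter created with `SlidingWindowLimiter(max_calls=60)` matches a QPM of 60. When a 429 response is received, the value of `x-ratelimit-reset-requests`, if present on that response, could be passed as `retry_after` so that the wait matches the actual reset time.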

How to Manage TPM

Check your TPM allowance in advance and set the number of input tokens and the maximum output tokens only as high as needed.
To check the number of tokens in your input string, use the Token Calculator API in the Explorer menu.
To adjust the maximum number of tokens used to generate the response when calling the API, modify the value of the `maxTokens` or `maxCompletionTokens` field.
To check the token count of your input text in the Playground menu, click the calculator icon at the top of the Playground interface.
To adjust the maximum number of tokens generated in a response within the Playground menu, edit the Max tokens field on the left side of the interface.
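
When calling the API, the output ceiling can also be chosen from the remaining TPM budget. The sketch below is one way to do that, under these assumptions: the input token count comes from the Token Calculator API, the remaining budget comes from the `x-ratelimit-remaining-tokens` header of a recent response, and both input and maximum output tokens count toward TPM for the model in use. The function name is illustrative.

```python
def clamp_max_tokens(desired_max_tokens: int, input_tokens: int,
                     remaining_tokens: int) -> int:
    """Largest output ceiling, up to the desired one, that keeps
    input + maximum output within the remaining TPM budget.
    Returns 0 when the input alone does not fit."""
    available = remaining_tokens - input_tokens
    return max(0, min(desired_max_tokens, available))


# 9,462 tokens remain in the current minute and the prompt uses 9,000,
# so a desired ceiling of 1,024 is reduced to what still fits.
print(clamp_max_tokens(1024, input_tokens=9000, remaining_tokens=9462))  # 462
```

A result of 0 indicates that the request should wait until the token window resets instead of being sent with a smaller ceiling.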