Available in Classic and VPC
CLOVA Studio imposes upper limits on the number of API requests and tokens you can use within a specific time period. This is called the usage control policy, and it is necessary to maintain service availability and ensure stability. Within the defined limits, you can use the CLOVA Studio service freely. However, if you exceed the maximum allowed usage, an error message and error code will be returned.
By applying the usage control policy, CLOVA Studio can prevent malicious attacks that attempt to overload the system by sending excessive API calls, and protect the service from abnormal spikes in traffic. The policy also prevents any single user from monopolizing resources, ensuring an environment where all users can use the system reliably. The maximum usage you can consume varies depending on the model, tool, and purpose. Please make sure to fully review and understand the information provided in this guide.
- The maximum usage described in this guide refers to the upper limit of requests you may make when using the CLOVA Studio service. It does not guarantee that requests up to that limit will always be processed. Even within the maximum usage, processing delays or failures may occur depending on infrastructure load and traffic conditions.
- Maximum usage limits may change in the future. If changes occur, we will notify customers separately.
- If you require guaranteed usage or a certain level of throughput, a dedicated plan is available. If you need more information about guaranteed-usage plans and policies, please contact Customer Support.
Application Criteria
The following items serve as the basis for calculating maximum usage limits.
| Type | Description |
|---|---|
| Your account | Maximum usage is calculated based on your main account. |
| Models and tools | Maximum usage varies depending on the model and tool you use. |
| Purpose of use | Maximum usage varies depending on whether you are using the service for an app in production. |
Maximum Usage
CLOVA Studio enforces limits on the number of requests per minute (QPM) and the number of tokens processed per minute (TPM). If both QPM and TPM apply, an error code is returned as soon as either limit is exceeded.
Descriptions of QPM and TPM are as follows:
| Type | Description |
|---|---|
| QPM (Queries per Minute) | The number of requests you make to a model or tool within one minute. |
| TPM (Tokens per Minute) | The number of tokens processed within one minute. |
The values described are the maximum usage supported as of July 17, 2025, and are subject to change. If usage limits change, an additional notice will be provided.
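The interaction between the two limits can be illustrated with a minimal sketch. The function name `exceeded_limit` and its parameters are hypothetical and not part of the CLOVA Studio API; the sketch only assumes that usage is compared against each limit independently and that a tool without a TPM limit (shown as `-` in the tables) is checked on QPM alone.

```python
from typing import Optional


def exceeded_limit(
    requests_in_window: int,
    tokens_in_window: int,
    qpm: int,
    tpm: Optional[int] = None,
) -> Optional[str]:
    """Return "QPM" or "TPM" if that limit is exceeded, else None.

    Hypothetical helper: how the service measures the one-minute
    window is not specified in this guide.
    """
    if requests_in_window > qpm:
        return "QPM"
    # tpm is None for tools that have no token limit.
    if tpm is not None and tokens_in_window > tpm:
        return "TPM"
    return None


# Only 10 requests were sent, but the token total is over a 60,000 TPM limit.
print(exceeded_limit(10, 61_000, qpm=60, tpm=60_000))
```

In this example the request count is well under the QPM limit, yet the call would still be rejected because the token total alone exceeds the TPM limit.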
CLOVA Studio Web & Test Usage
This section explains the maximum QPM and TPM when using CLOVA Studio Playground, the web interface, or the test API key.
Maximum Usage by model
The maximum QPM and TPM available for each model are as follows:
| Type | QPM | TPM |
|---|---|---|
| HCX-007 | 60 | 60,000 |
| HCX-005 | 60 | 60,000 |
| HCX-DASH-002 | 90 | 80,000 |
| HCX-003 | 200 | 30,000 |
| HCX-DASH-001 | 200 | 30,000 |
| Tuned models | Requests are counted based on the base model used for tuning. | |
Maximum usage per tool
The following table shows the maximum QPM and TPM available for each Explorer tool.
| Type | QPM | TPM | Notes |
|---|---|---|---|
| Summarization | 30 | 30,000 | Only input tokens are counted for TPM |
| Paragraph Split | 120 | - | |
| Embedding | 60 | - | |
| Embedding v2 | 60 | 40,000 | Only input tokens are counted for TPM |
| Re-ranker | 60 | 45,000 | |
| RAG Reasoning | 60 | 30,000 | |
Maximum usage for router and skill trainer
The following table shows the maximum QPM available when using the router or skill trainer.
| Type | QPM | Notes |
|---|---|---|
| Router | 60 | Workflows and versions are not distinguished |
| Skillset Answer Generation | 30 | Workflows and versions are not distinguished |
Service apps
This section describes the maximum QPM and TPM available for each model and tool when using service apps (via a service API key). The maximum usage limits for service apps are calculated and measured separately from web and test usage.
Maximum Usage by model
The maximum QPM and TPM available for each model are as follows:
| Model | QPM | TPM | Notes |
|---|---|---|---|
| HCX-007 | 180 | 300,000 | |
| HCX-005 | 300 | 180,000 | |
| HCX-DASH-002 | 450 | 240,000 | |
| HCX-003 | 700 | 150,000 | |
| HCX-DASH-001 | 600 | 120,000 | |
| Tuned models | Requests made to a tuned model are counted as requests to the base model used during tuning. | | |
Maximum usage per tool
The following table shows the maximum QPM and TPM available for each Explorer tool.
| Type | QPM | TPM | Notes |
|---|---|---|---|
| Summarization | 60 | 60,000 | Only input tokens are counted for TPM |
| Paragraph Split | 360 | - | |
| Embedding | 300 | - | |
| Embedding v2 | 540 | 960,000 | Only input tokens are counted for TPM |
| Re-ranker | 120 | 180,000 | |
| RAG Reasoning | 180 | 120,000 | |
Maximum usage for router and skill trainer
The following table shows the maximum QPM available when using the router or skill trainer.
| Type | QPM | Notes |
|---|---|---|
| Skillset Answer Generation | 90 | Workflows and versions are not distinguished |
| Router | 180 | Workflows and versions are not distinguished |
Checking maximum usage
When you call the CLOVA Studio service through the API using cURL, Python, or similar methods, you can check your rate-limit information in the API response headers.
The following usage-control details can be viewed:
| Key | Value (Example) | Description | Notes |
|---|---|---|---|
| x-ratelimit-limit-requests | 60 | The maximum QPM (requests per minute) allowed for the API you are subscribed to. | |
| x-ratelimit-limit-tokens | 10000 | The maximum TPM (tokens per minute) allowed for the API you are subscribed to. | Included only when TPM limits apply. |
| x-ratelimit-remaining-requests | 59 | The number of remaining requests before reaching the maximum QPM for the API you are subscribed to. | |
| x-ratelimit-remaining-tokens | 9462 | The number of remaining tokens before reaching the maximum TPM for the API you are subscribed to. | Included only when TPM limits apply. |
| x-ratelimit-reset-requests | 23s | Time remaining until the request-based usage limit for the API you are subscribed to resets. | |
| x-ratelimit-reset-tokens | 23s | Time remaining until the token-based usage limit for the API you are subscribed to resets. | Included only when TPM limits apply. |
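As a rough illustration of reading these headers, the sketch below converts them into a small Python structure. The header names come from the table above; the `RateLimitInfo` class and `parse_rate_limit_headers` function are hypothetical names, and the parser assumes reset values always use the `<number>s` form shown in the examples (such as `23s`). Other formats would need extra handling.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit_requests: int
    remaining_requests: int
    reset_requests_s: float
    # Token fields stay None when the API has no TPM limit.
    limit_tokens: Optional[int] = None
    remaining_tokens: Optional[int] = None
    reset_tokens_s: Optional[float] = None


def _seconds(value: str) -> float:
    # Assumption: values look like "23s"; strip the unit and parse.
    return float(value.strip().rstrip("s"))


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}
    info = RateLimitInfo(
        limit_requests=int(h["x-ratelimit-limit-requests"]),
        remaining_requests=int(h["x-ratelimit-remaining-requests"]),
        reset_requests_s=_seconds(h["x-ratelimit-reset-requests"]),
    )
    if "x-ratelimit-limit-tokens" in h:
        info.limit_tokens = int(h["x-ratelimit-limit-tokens"])
        info.remaining_tokens = int(h["x-ratelimit-remaining-tokens"])
        info.reset_tokens_s = _seconds(h["x-ratelimit-reset-tokens"])
    return info


# Example values taken from the table above.
example_headers = {
    "x-ratelimit-limit-requests": "60",
    "x-ratelimit-limit-tokens": "10000",
    "x-ratelimit-remaining-requests": "59",
    "x-ratelimit-remaining-tokens": "9462",
    "x-ratelimit-reset-requests": "23s",
    "x-ratelimit-reset-tokens": "23s",
}
print(parse_rate_limit_headers(example_headers))
```

The `headers` argument can be any mapping of header names to strings, so the headers object returned by most HTTP clients can be passed in directly.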
Managing Usage Limits
If you exceed the maximum number of requests allowed when using the CLOVA Studio service, an HTTP 429 error code or an error message will be returned. This section explains what you can do to avoid exceeding the usage limits.
CLOVA Studio strives to provide a stable and seamless service, but delays or failures may still occur depending on infrastructure conditions and traffic levels, even if your usage remains within the maximum allowed limits.
How to Manage QPM
- Check your QPM allowance in advance and ensure your requests stay within the limit.
- Implement your own rate-limiting mechanism to control API calls.
- Add a delay (for example, a sleep call) between requests as needed.
- When an HTTP 429 error or related error message is returned, handle the exception by waiting for a certain period before retrying.
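The QPM practices above could be combined roughly as follows. This is a minimal sketch, not an official client: `QpmLimiter`, `call_with_retry`, and `RateLimitError` are hypothetical names, the 60-second sliding window is an assumption about how to stay under the limit on the client side, and `send` stands for whatever function in your own code performs the API call and raises `RateLimitError` on HTTP 429.

```python
import time
from collections import deque
from typing import Callable, Deque, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical exception your own client code raises on HTTP 429."""


class QpmLimiter:
    """Client-side limiter allowing at most `qpm` calls per 60 seconds."""

    def __init__(self, qpm: int, clock=time.monotonic, sleep=time.sleep):
        self.qpm = qpm
        self.clock = clock
        self.sleep = sleep
        self.sent: Deque[float] = deque()  # timestamps of recent calls

    def acquire(self) -> None:
        now = self.clock()
        # Forget calls that left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) >= self.qpm:
            # Wait until the oldest call in the window expires.
            self.sleep(60.0 - (now - self.sent[0]))
            now = self.clock()
            self.sent.popleft()
        self.sent.append(now)


def call_with_retry(
    send: Callable[[], T],
    limiter: QpmLimiter,
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep=time.sleep,
) -> T:
    """Call `send`, waiting and retrying with exponential backoff on 429."""
    for attempt in range(max_retries + 1):
        limiter.acquire()
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

A caller would create one limiter per API, for example `QpmLimiter(qpm=60)`, and pass it to `call_with_retry` together with its own request function. Instead of a fixed backoff, the wait time could also be taken from the `x-ratelimit-reset-requests` response header.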
How to Manage TPM
- Check your TPM allowance in advance and set the number of input tokens and the maximum output tokens only as high as needed.
- To check the number of tokens in your input string, use the Token Calculator API in the Explorer menu.
- To adjust the maximum number of tokens used to generate the response when calling the API, modify the value of the maxTokens or maxCompletionTokens field.
- To check the token count of your input text in the Playground menu, click the calculator icon at the top of the Playground interface.

- To adjust the maximum number of tokens generated in a response within the Playground menu, edit the Max tokens field on the left side of the interface.
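To illustrate setting the output cap only as high as needed, the sketch below picks a value for maxTokens (or maxCompletionTokens) that fits the remaining token budget. The function `fit_max_tokens` is a hypothetical helper. It assumes that both input and output tokens count toward TPM, which does not hold for tools where only input tokens are counted, and that the remaining budget is known, for example from the `x-ratelimit-remaining-tokens` header.

```python
def fit_max_tokens(
    input_tokens: int,
    desired_max_tokens: int,
    remaining_tokens: int,
) -> int:
    """Return an output-token cap that keeps the request within budget.

    Returns 0 when the input alone does not fit, signalling that the
    caller should wait for the limit to reset or shorten the input.
    """
    available = remaining_tokens - input_tokens
    if available <= 0:
        return 0
    return min(desired_max_tokens, available)


# With 9,462 tokens remaining and a 9,000-token input, only 462 tokens
# are left for the response, so a requested cap of 1,024 is reduced.
print(fit_max_tokens(9_000, 1_024, 9_462))  # 462
```

The input token count passed to this helper would come from the Token Calculator API or the Playground calculator mentioned above.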
