CLOVA Studio Usage Control Policy

Available in Classic and VPC

CLOVA Studio imposes upper limits on the number of API requests and tokens you can use within a specific time period. This is called the usage control policy, and it is necessary to maintain service availability and ensure stability. Within the defined limits, you can use the CLOVA Studio service freely. However, if you exceed the maximum allowed usage, an error message and error code will be returned.
By applying the usage control policy, CLOVA Studio can block malicious attempts to overload the system with excessive API calls and protect the service from abnormal traffic spikes. The policy also prevents any single user from monopolizing resources, so that all users can rely on the system. The maximum usage available to you varies depending on the model, tool, and purpose. Please review this guide carefully before using the service.

Note
  • The maximum usage described in this guide refers to the upper limit of requests you may make when using the CLOVA Studio service. It does not guarantee that requests up to that limit will always be processed. Even within the maximum usage, processing delays or failures may occur depending on infrastructure load and traffic conditions.
  • Maximum usage limits may change in the future. If changes occur, we will notify customers separately.
  • If you require guaranteed usage or a certain level of throughput, a dedicated plan is available. If you need more information about guaranteed-usage plans and policies, please contact Customer Support.

Application Criteria

The following items serve as the basis for calculating maximum usage limits.

Type Description
Your account Maximum usage is calculated based on your main account.
  • If you have sub-accounts, the usage from those sub-accounts is also counted toward the main account's usage.
Models and tools Maximum usage varies depending on the model and tool you use.
  • When working with a tuned model, your usage is counted against the base model used for tuning.
  • For routers and skill trainer, usage is aggregated across all tasks and versions.
Purpose of use Maximum usage varies depending on whether you are using the service for an app in production.

Maximum Usage

CLOVA Studio enforces limits on the number of requests per minute (QPM) and the number of tokens processed per minute (TPM). If both QPM and TPM apply, an error code is returned as soon as either limit is exceeded.
Descriptions of QPM and TPM are as follows:

Type Description
QPM (Queries per Minute) The number of requests you make to a model or tool within one minute.
TPM (Tokens per Minute) The number of tokens processed within one minute.
  • Processed tokens = Input tokens + Maximum output tokens (maxTokens or maxCompletionTokens)
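The TPM formula above can be illustrated with a short calculation. This is a minimal sketch: the function names and the sample token counts are hypothetical, and only the formula itself comes from this guide. Note that the maximum output value is counted, not the number of tokens actually generated.

```python
def processed_tokens(input_tokens: int, max_output_tokens: int) -> int:
    """Tokens counted toward TPM for one request: input + maximum output."""
    return input_tokens + max_output_tokens


def fits_in_tpm(requests, tpm_limit: int) -> bool:
    """True if the (input_tokens, max_output_tokens) pairs sent within
    one minute stay within tpm_limit."""
    return sum(processed_tokens(i, m) for i, m in requests) <= tpm_limit


# Hypothetical request: 1,200 input tokens with maxTokens set to 4,096.
per_request = processed_tokens(1_200, 4_096)   # 5,296 tokens counted
print(per_request)

# Twelve such requests in one minute count as 63,552 tokens, which exceeds
# a 60,000 TPM limit even if every response turns out to be short.
print(fits_in_tpm([(1_200, 4_096)] * 12, 60_000))
```

Because the maximum output value is part of the count, lowering maxTokens or maxCompletionTokens is often the simplest way to fit more requests into the same TPM allowance.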
Note

The values described are the maximum usage supported as of July 17, 2025, and are subject to change. If usage limits change, an additional notice will be provided.

CLOVA Studio Web & Test Usage

This section explains the maximum QPM and TPM when using CLOVA Studio Playground, the web interface, or the test API key.

Maximum Usage by model

The maximum QPM and TPM available for each model are as follows:

Model QPM TPM
HCX-007 60 60,000
HCX-005 60 60,000
HCX-DASH-002 90 80,000
HCX-003 200 30,000
HCX-DASH-001 200 30,000
Tuned models Requests are counted based on the base model used for tuning.

Maximum usage per tool

The following table shows the maximum QPM and TPM available for each Explorer tool.

Type QPM TPM Notes
Summarization 30 30,000 Only input tokens are counted for TPM
Paragraph Split 120 -
Embedding 60 -
Embedding v2 60 40,000 Only input tokens are counted for TPM
Re-ranker 60 45,000
RAG Reasoning 60 30,000

Maximum usage for router and skill trainer

The following table shows the maximum QPM available when using the router or skill trainer.

Type QPM Notes
Router 60 Workflows and versions are not distinguished
Skillset Answer Generation 30 Workflows and versions are not distinguished

Service apps

This section describes the maximum QPM and TPM available for each model and tool when using service apps (via a service API key). The maximum usage limits for service apps are calculated and measured separately from web and test usage.

Maximum Usage by model

The maximum QPM and TPM available for each model are as follows:

Model QPM TPM Notes
HCX-007 180 300,000
HCX-005 300 180,000
HCX-DASH-002 450 240,000
HCX-003 700 150,000
HCX-DASH-001 600 120,000
Tuned models Requests made to a tuned model are counted as requests to the base model used during tuning.
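To show how tuned-model requests count against the base model, here is a minimal sketch of client-side bookkeeping. The limit values are copied from the service-app table above; the function name, the counter, and the tuned model name "my-tuned-model" are hypothetical and only illustrate the counting rule.

```python
from collections import Counter

# Service-app limits from the table above: model -> (QPM, TPM).
SERVICE_LIMITS = {
    "HCX-007": (180, 300_000),
    "HCX-005": (300, 180_000),
    "HCX-DASH-002": (450, 240_000),
    "HCX-003": (700, 150_000),
    "HCX-DASH-001": (600, 120_000),
}


def usage_bucket(model, tuned_from=None):
    """Name of the model whose limits a request counts against.

    For a tuned model, pass the base model it was tuned from.
    """
    return tuned_from or model


# Count one minute of requests; the tuned model shares HCX-005's allowance.
requests_this_minute = Counter()
for model, base in [("HCX-005", None), ("my-tuned-model", "HCX-005"), ("HCX-007", None)]:
    requests_this_minute[usage_bucket(model, base)] += 1

for name, count in requests_this_minute.items():
    qpm, _tpm = SERVICE_LIMITS[name]
    print(f"{name}: {count} of {qpm} requests used")
```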

Maximum usage per tool

The following table shows the maximum QPM and TPM available for each Explorer tool.

Type QPM TPM Notes
Summarization 60 60,000 Only input tokens are counted for TPM
Paragraph Split 360 -
Embedding 300 -
Embedding v2 540 960,000 Only input tokens are counted for TPM
Re-ranker 120 180,000
RAG Reasoning 180 120,000

Maximum usage for router and skill trainer

The following table shows the maximum QPM available when using the router or skill trainer.

Type QPM Notes
Skillset Answer Generation 90 Workflows and versions are not distinguished
Router 180 Workflows and versions are not distinguished

Checking maximum usage

When you call the CLOVA Studio service through the API using cURL, Python, or similar methods, you can check your rate-limit information in the API response headers.

The following usage-control details can be viewed:

Key Value (Example) Description Notes
x-ratelimit-limit-requests 60 The maximum QPM (requests per minute) allowed for the API you are subscribed to.
x-ratelimit-limit-tokens 10000 The maximum TPM (tokens per minute) allowed for the API you are subscribed to. Included only when TPM limits apply.
x-ratelimit-remaining-requests 59 The number of remaining requests before reaching the maximum QPM for the API you are subscribed to.
x-ratelimit-remaining-tokens 9462 The number of remaining tokens before reaching the maximum TPM for the API you are subscribed to. Included only when TPM limits apply.
x-ratelimit-reset-requests 23s Time remaining until the request-based usage limit for the API you are subscribed to resets.
x-ratelimit-reset-tokens 23s Time remaining until the token-based usage limit for the API you are subscribed to resets. Included only when TPM limits apply.
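The headers in the table above can be read from any HTTP client's response. The following is a minimal sketch that takes the headers as a plain mapping, so it does not depend on a particular client library. The function name and the returned keys are hypothetical. It also assumes reset values are a number of seconds with an "s" suffix, as in the example "23s"; any other format is returned as None rather than guessed.

```python
def parse_rate_limit_headers(headers):
    """Extract rate-limit values from a mapping of response headers."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): v for k, v in headers.items()}

    def as_int(name):
        value = lowered.get(name)
        return int(value) if value is not None else None

    def as_seconds(name):
        value = lowered.get(name)
        if value is None:
            return None
        try:
            # Assumption: values look like "23s" (seconds with an "s" suffix).
            return float(value.rstrip("s"))
        except ValueError:
            return None  # unknown format; do not guess

    return {
        "limit_requests": as_int("x-ratelimit-limit-requests"),
        "limit_tokens": as_int("x-ratelimit-limit-tokens"),
        "remaining_requests": as_int("x-ratelimit-remaining-requests"),
        "remaining_tokens": as_int("x-ratelimit-remaining-tokens"),
        "reset_requests_s": as_seconds("x-ratelimit-reset-requests"),
        "reset_tokens_s": as_seconds("x-ratelimit-reset-tokens"),
    }


# Example values taken from the table above.
info = parse_rate_limit_headers({
    "x-ratelimit-limit-requests": "60",
    "x-ratelimit-remaining-requests": "59",
    "x-ratelimit-reset-requests": "23s",
})
print(info["remaining_requests"], info["reset_requests_s"])
```

Token-related entries are None when the response carries no TPM headers, which matches the note that those headers are included only when TPM limits apply.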

Managing Usage Limits

If you exceed the maximum number of requests allowed when using the CLOVA Studio service, an HTTP 429 error code or an error message will be returned. This section explains what you can do to avoid exceeding the usage limits.

Note

CLOVA Studio strives to provide a stable and seamless service, but delays or failures may still occur depending on infrastructure conditions and traffic levels, even if your usage remains within the maximum allowed limits.

How to Manage QPM

  • Check your QPM allowance in advance and ensure your requests stay within the limit.
  • Implement your own rate-limiting mechanism to control API calls.
  • Add a delay (sleep) between requests as needed.
  • When an HTTP 429 error or related error message is returned, handle the exception by waiting for a certain period before retrying.
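The QPM measures above can be combined in client code. The sketch below is one possible approach, not an official implementation: a sliding-window limiter that delays calls to stay under a chosen QPM, plus a retry helper that waits with exponential backoff after an HTTP 429. The class, the function, and the `send` callable (assumed to return a status code and a body) are all hypothetical names.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self.clock()
            # Drop timestamps that have left the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))


def call_with_retry(send, limiter, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() under the limiter; on HTTP 429, wait and retry.

    `send` is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(max_attempts):
        limiter.acquire()
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return status, body  # still 429 after the last attempt
```

For example, `SlidingWindowLimiter(60)` would match a 60 QPM allowance. Setting the client-side limit slightly below the published maximum leaves room for requests made elsewhere under the same account.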

How to Manage TPM

  • Check your TPM allowance in advance and set the number of input tokens and the maximum output tokens only as high as needed.
  • To check the number of tokens in your input string, use the Token Calculator API in the Explorer menu.
  • To adjust the maximum number of tokens used to generate the response when calling the API, modify the value of the maxTokens or maxCompletionTokens field.
  • To check the token count of your input text in the Playground menu, click the calculator icon at the top of the Playground interface.
  • To adjust the maximum number of tokens generated in a response within the Playground menu, edit the Max tokens field on the left side of the interface.
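As an illustration of setting the maximum output tokens only as high as needed, the sketch below caps the requested value so that input plus maximum output fits the remaining token allowance. The function name is hypothetical. The input token count is assumed to be known already (for example, from the Token Calculator API), and the remaining allowance is assumed to come from the x-ratelimit-remaining-tokens response header.

```python
def cap_max_tokens(input_tokens: int, desired_max_tokens: int, remaining_tpm: int) -> int:
    """Largest maximum-output value that keeps the request within the
    remaining allowance; 0 if the input alone does not fit."""
    room = remaining_tpm - input_tokens
    return max(0, min(desired_max_tokens, room))


# With 9,462 tokens remaining (the example header value in this guide):
print(cap_max_tokens(800, 4_096, 9_462))     # enough room, value unchanged
print(cap_max_tokens(8_000, 4_096, 9_462))   # reduced to what still fits
print(cap_max_tokens(10_000, 512, 9_462))    # input alone exceeds the allowance
```

A result of 0 means the request should wait until the token-based limit resets rather than be sent with a smaller value.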