CLOVA Speech prerequisites


Available in Classic and VPC

Before using CLOVA Speech, check the supported cloud environment, the supported recognition scope, and how the service is priced.

Cloud environment specifications

The cloud environment specifications supported by CLOVA Speech are as follows.

  • Region (Zone): Korea
  • Platform: VPC, Classic
  • Language: Korean, English, Japanese

Note

For details about the VPC environment, refer to the service introduction on the NAVER Cloud Platform portal. For service-specific VPC and Classic support, refer to the Ncloud Usage Environment Guide.
Supported languages may vary depending on the domain.

Supported scope

The recognition targets and output formats supported by CLOVA Speech are as follows.

Supported languages
  • Short-form recognition: Korean, English, Japanese, Chinese
  • Long-form recognition: Korean, English, Korean/English simultaneous recognition, Japanese, Chinese (Traditional/Simplified)
  • Streaming recognition: Korean, English, Japanese
Recognizable time
  • Short-form recognition: Up to 60 sec
  • Long-form recognition: Up to 2 hr (sync), up to 6 hr (Batch, async)
Recognition file size
  • Short-form recognition: Up to 10 MB (Builder, API)
  • Long-form recognition: Up to 2 GB (Builder, API)
Supported file formats
  • Short-form recognition
    • Audio: MP3, AAC, AC3, OGG, FLAC, WAV, M4A
  • Long-form recognition
    • Audio: MP3, AAC, AC3, OGG, FLAC, WAV, M4A
    • Video: AVI, MP4, MOV, WMV, FLV, MKV
  • Streaming recognition
    • Audio: PCM, 16 kHz, 16 bits per sample
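
The limits above can be checked on the client side before a file is submitted. The following is a minimal Python sketch of such a check, not part of the CLOVA Speech API. The function name check_media, the mode keys short, long_sync, and long_async, and the reading of MB and GB as binary units (1024-based) are all assumptions made for illustration. Streaming is left out because it takes a PCM stream rather than a file.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumption: MB and GB are read as binary units; the guide does not say which.
MB = 1024 ** 2
GB = 1024 ** 3

AUDIO = {"mp3", "aac", "ac3", "ogg", "flac", "wav", "m4a"}
VIDEO = {"avi", "mp4", "mov", "wmv", "flv", "mkv"}


@dataclass(frozen=True)
class Limits:
    max_seconds: int
    max_bytes: int
    extensions: frozenset


# Hypothetical mode keys; the values are taken from the supported scope above.
LIMITS = {
    "short": Limits(60, 10 * MB, frozenset(AUDIO)),
    "long_sync": Limits(2 * 3600, 2 * GB, frozenset(AUDIO | VIDEO)),
    "long_async": Limits(6 * 3600, 2 * GB, frozenset(AUDIO | VIDEO)),
}


def check_media(filename, size_bytes, duration_seconds, mode):
    """Return a list of problems; an empty list means the file is within limits."""
    limits = LIMITS[mode]
    problems = []
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in limits.extensions:
        problems.append(f"format '{ext}' is not listed for {mode}")
    if size_bytes > limits.max_bytes:
        problems.append(f"size {size_bytes} B exceeds {limits.max_bytes} B")
    if duration_seconds > limits.max_seconds:
        problems.append(f"duration {duration_seconds} s exceeds {limits.max_seconds} s")
    return problems


if __name__ == "__main__":
    # A 3-hour MP4 is too long for sync long-form recognition but fits async.
    print(check_media("talk.mp4", 1 * GB, 3 * 3600, "long_sync"))
    print(check_media("talk.mp4", 1 * GB, 3 * 3600, "long_async"))
```

The check only looks at the file extension, size, and a duration supplied by the caller. It does not inspect the actual codec or container, so the service may still reject a file that passes.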

Pricing

CLOVA Speech is billed according to recognition duration. For detailed pricing information, refer to Services > AI Services > CLOVA Speech on the NAVER Cloud Platform portal.