CLOVA Speech overview

Available in Classic and VPC

NAVER Cloud Platform's CLOVA Speech provides fast and easy speech recognition powered by CLOVA's NEST (Neural End-to-end Speech Transcriber) technology. You can convert long media files into text and use CLOVA Speech to build voice-based services such as voice memos, video subtitles, and call transcript management.

Features provided by CLOVA Speech

CLOVA Speech offers the following features:

  • Multiple speech recognition options: Choose the appropriate option among Short-form, Long-form, and Streaming recognition.
  • Automatic sentence segmentation and timestamp support: Automatically splits recognized text into sentence-length segments and provides timestamps, which is useful for tasks such as subtitle creation.
  • Timeline shifting: Shifts the timeline of the entire recognized text backward, which is useful when splitting long videos for subtitle generation.
  • Batch processing for large workloads: Create batches to process recognition tasks for multiple media files at once.
  • Web-based Builder and result editor: Includes a Builder to manage recognition tasks and an editor to modify and export recognition results.
  • Keyword Boosting: Lets you specify words that should be given a higher recognition probability.
  • API-based recognition: Send files and receive recognition results using the CLOVA Speech API.
  • Papago Translation integration: When using Streaming recognition, provides multilingual translation through Papago Translation.
    *See supported translation languages.
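
To make the sentence timestamps and timeline shifting features more concrete, the following is a minimal sketch of how timestamped sentences from one part of a split video could be shifted and written out as subtitles. It is not the CLOVA Speech API or its response schema: the Segment class, its field names, and the helper functions are hypothetical, and the sketch assumes that shifting "backward" means adding a positive offset so that timestamps become later.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    # Hypothetical shape for one recognized sentence (times in milliseconds).
    # This is an illustration only, not the CLOVA Speech response schema.
    start_ms: int
    end_ms: int
    text: str


def shift_timeline(segments: List[Segment], offset_ms: int) -> List[Segment]:
    """Return copies of the segments with every timestamp moved later by offset_ms."""
    if offset_ms < 0:
        raise ValueError("offset_ms must be non-negative")
    return [
        replace(s, start_ms=s.start_ms + offset_ms, end_ms=s.end_ms + offset_ms)
        for s in segments
    ]


def format_srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments: List[Segment], first_index: int = 1) -> str:
    """Render segments as SRT subtitle blocks, numbered from first_index."""
    blocks = []
    for i, s in enumerate(segments, start=first_index):
        times = f"{format_srt_time(s.start_ms)} --> {format_srt_time(s.end_ms)}"
        blocks.append(f"{i}\n{times}\n{s.text}\n")
    return "\n".join(blocks)


# Example: sentences recognized from the second part of a video that was
# split at the 10-minute mark, so its timestamps start again from zero.
part_two = [
    Segment(0, 1_500, "Hello."),
    Segment(1_500, 4_000, "Welcome back."),
]
shifted = shift_timeline(part_two, offset_ms=10 * 60 * 1_000)
print(to_srt(shifted))
```

Keeping the shift separate from the subtitle formatting means the same offset logic can be reused for any export format, and the subtitles of each part can be concatenated once their timelines line up with the original video.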

CLOVA Speech user guide

CLOVA Speech is available in the Korea Region. Use this guide to get the most out of CLOVA Speech.

CLOVA Speech related resources

Beyond the user guide, these resources provide additional context and support for CLOVA Speech. Whether you're considering CLOVA Speech or need in-depth information for development, marketing, and other purposes, these resources can help: