Utilize APIs

    The latest service changes have not yet been reflected in this content. We will update the content as soon as possible. Please refer to the Korean version for information on the latest updates.

    Available in Classic and VPC

    This page describes the APIs available from the Explorer menu. Click the [Get Started] button for each API to see the details of that API.

    Tokenizer APIs

    The Tokenizer APIs count the number of tokens in a sentence you enter. You can use the tokenizer to find the optimal number of tokens or to create efficient prompts.
    The tokenizer (HCX) is an API that counts the number of tokens in a given sentence for the HyperCLOVA X model.
    The tokenizer (embedding v2) is an API that counts the number of tokens in a sentence you enter for the bge-m3 model of the Embedding v2 API.

    Sliding window APIs

    The Sliding window APIs help keep conversations flowing in chat mode by fitting prompts and results within the maximum number of tokens the HyperCLOVA X language model can handle.

    In chat mode, if the user's conversation with the assistant exceeds the maximum number of tokens that the HyperCLOVA X language model can process, new conversations cannot be created. To prevent this, the Sliding window APIs delete the oldest conversation turns in the conversation history of the user and the assistant. Deletion starts with the conversation turn immediately following the system directive, i.e., the earliest entered turn.

    Note
    • The Sliding window APIs only work for models in chat mode (Chat completions API).
    • Configure the call order so that the results of the Sliding window APIs are passed to the Chat completions APIs as they are.
    • Other settings, such as modelName and maxTokens, should be set to the same values as the Chat completions APIs settings you are using.
    Caution
    • Because the chat history between the user and the assistant is deleted sequentially from the beginning of the conversation, newly created chats may not reflect previous chats.
    • In particular, if the maximum number of tokens generated by the result is set to a large number (the maxTokens value in the API), the conversation history will be deleted proportionally based on the number, so newly created conversations may not fully reflect previous conversations.

    How the Sliding window works

    In chat mode, if the sum of the total number of tokens in the entered conversations (A) and the maximum number of tokens in the new conversation (B=maxTokens) is greater than the maximum number of tokens the model can handle (X) (i.e., A+B>X), the Chat completions APIs will not generate any more conversations. To work around this, the Sliding window APIs delete conversation turns from existing conversations based on the number of excess tokens (A+B-X). It deletes on a conversation turn basis to avoid deleting only part of a conversation turn (deleting the minimum number of conversation turns based on the number of excess tokens).
    For example, if the number of excess tokens is 200, as shown in the figure below, using the Sliding window APIs will delete the two oldest conversation turns (100 and 200 tokens) of the existing conversation history. If the number of excess tokens is 100 tokens or fewer, only the oldest conversation turn in the existing conversation history will be deleted. This means that individual conversation turns are deleted, not pairs of conversations between the user and the assistant.
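    The whole-turn deletion described above can be sketched locally as follows. This is a minimal illustration, not the service implementation; token counts per turn are assumed to be precomputed (e.g., via the Tokenizer APIs):

```python
def apply_sliding_window(turns, max_tokens, model_limit):
    """Delete whole conversation turns, oldest first, until the prompt
    tokens (A) plus the response budget maxTokens (B) fit within the
    model limit (X), i.e. A + B <= X. The system directive is kept."""
    kept = list(turns)
    # The first deletable turn is the one right after the system directive.
    start = 1 if kept and kept[0]["role"] == "system" else 0
    while sum(t["tokens"] for t in kept) + max_tokens > model_limit and len(kept) > start:
        del kept[start]  # whole-turn deletion, never a partial turn
    return kept

# Hypothetical history with precomputed token counts.
history = [
    {"role": "system", "content": "directive", "tokens": 50},
    {"role": "user", "content": "turn 1", "tokens": 100},
    {"role": "assistant", "content": "turn 2", "tokens": 200},
    {"role": "user", "content": "turn 3", "tokens": 300},
]
# A = 650, B = 150, X = 600: the excess is 200 tokens, so the two oldest
# turns (100 and 200 tokens) are deleted, as in the example above.
trimmed = apply_sliding_window(history, max_tokens=150, model_limit=600)
```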
    [Figure: clovastudio-explorer_slidingwindow_en.png]

    Sliding window APIs workflow

    By using the Sliding window APIs, you can use the Chat completions APIs continuously without having to separately adjust the number of tokens for the entire conversation.

    The following describes how the Sliding window APIs work.

    1. Before using the Chat completions APIs, first call the Sliding window APIs and provide the prompt you want to enter (conversation content; body > messages) in the request.
    2. Enter the result in the Sliding window APIs response (result > messages) into the Chat completions APIs request as is.
    3. The model name and maximum number of tokens should be the same for the Chat completions APIs and the Sliding window APIs.
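    The handoff in steps 2 and 3 can be sketched as follows. The field names (modelName, maxTokens, messages, result > messages) follow the parameters named on this page; the model name is illustrative, and the API response is stubbed rather than fetched over HTTP:

```python
# Shared settings, used identically in both requests (step 3).
settings = {"modelName": "HCX-003", "maxTokens": 256}  # model name is illustrative

sliding_window_request = {
    **settings,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
}

# Assume the Sliding window APIs returned this response (result > messages).
sliding_window_response = {"result": {"messages": sliding_window_request["messages"]}}

# Step 2: enter the result into the Chat completions request as is.
chat_completions_request = {
    **settings,
    "messages": sliding_window_response["result"]["messages"],
}
```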

    Segmentation APIs

    The Segmentation APIs separate paragraphs by topic by calculating the similarity between sentences. You can specify the number of tokens that can fit in a paragraph. The Segmentation APIs can also split paragraphs based on context, even if there are no blank lines in the text or the breaks are unclear. The number of paragraphs can be adjusted with the segCount value. If you want automatic segmentation, set segCount to -1. If you enter a value of 0 or greater, the text is split into that number of paragraphs.
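    The segCount semantics can be sketched as a request-body builder. This is a hedged sketch: only the segCount behavior comes from this page, and the other field names are assumptions:

```python
def make_segmentation_payload(text, seg_count=-1):
    """Build a Segmentation request body (field names are assumptions).
    segCount = -1 requests automatic segmentation; a value of 0 or
    greater requests that number of paragraphs."""
    if seg_count < -1:
        raise ValueError("segCount must be -1 (automatic) or 0 or greater")
    return {"text": text, "segCount": seg_count}

auto = make_segmentation_payload("Long document text...", seg_count=-1)   # automatic
fixed = make_segmentation_payload("Long document text...", seg_count=3)   # 3 paragraphs
```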

    The following describes how the Segmentation APIs work.

    [Figure: clovastudio-explorer03_segment_ko]

    Summarization APIs

    The Summarization APIs can split a given set of sentences into paragraphs and then summarize each of the paragraphs.

    It breaks long documents into contextualized paragraphs and summarizes each paragraph. You can reduce the length of your text by removing unnecessary parts while retaining important information. You can also control the size of the summary by using the segMaxSize and segMinSize of the segmentation to limit the number of characters that can be included in a paragraph.

    [Figure: clovastudio-explorer03_summarize_ko]

    The Summarization APIs can be utilized as follows:

    1. Summarize long meeting minutes and understand the content of the minutes to generate key takeaways.
    2. Summarize a long email to make it easier to understand what is important.
    3. Summarize a report or script. Summarize paragraphs in context, making it easy to create a table of contents.
    Note

    Summarizing may not work well if the text is published on the web.

    Embedding APIs

    The Embedding APIs convert input text to a vector of numeric values. You can select one of three models depending on the task and purpose. Each model gives different similarity results for the same pair of sentences.

    Tool         | Model            | Number of tokens | Dimension of vector space | Recommended distance metric   | Note
    Embedding    | clir-emb-dolphin | 500 tokens       | 1024                      | IP (inner/dot/scalar product) | -
    Embedding    | clir-sts-dolphin | 500 tokens       | 1024                      | Cosine similarity             | -
    Embedding v2 | bge-m3           | 8192 tokens      | 1024                      | Cosine similarity             | Open source model*

    Each embedding API model has the following characteristics.
    [Figure: clovastudio-explorer_embedding1_en.png]
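    The two recommended distance metrics can be computed as follows (a plain-Python sketch; cosine similarity is the inner product of length-normalized vectors, which is why unnormalized models such as clir-emb-dolphin recommend the raw inner product instead):

```python
import math

def inner_product(a, b):
    """IP (inner/dot/scalar product), recommended for clir-emb-dolphin."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Recommended for clir-sts-dolphin and bge-m3 (Embedding v2)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return inner_product(a, b) / (norm_a * norm_b)

print(inner_product([1.0, 2.0], [3.0, 4.0]))      # 11.0
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```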

    Note

    Consider the following to obtain the same output from embedding v2 as from the open-source bge-m3 model.

    • Embedding v2 returns "dense" vectors out of the three methods (sparse, dense, multi-vector/ColBERT) of the bge-m3 model.
    • Embedding v2 does not apply FP16 or normalization.

    Utilize embedding

    The embedding API can be used for the following tasks.

    • Compute vector similarity between sentences to improve search performance. For example, you can measure the vector similarity of documents to a search keyword entered by a user and return the most relevant documents.
    • Compute the similarity between two sentences to determine the similarity of related documents, or compare the semantic similarity between sentences.
    • Cluster documents with similar characteristics.
    • Categorize documents. Vectorized text data can be used in trained models to perform a variety of classification tasks, such as classifying text based on topic or sentiment.

    The embedding workflow consists of preparing the data, performing the embedding, saving the vector output, developing the API, and calling the output. Through this process, you can save the embedded output and use it in your database.

    Embedding workflow

    1. Convert the file to text for embedding.
    2. Depending on the type of text and its purpose, break up the text appropriately using the Segmentation or Summarization APIs.
    3. After selecting an appropriate embedding model, perform the embedding operation by converting the text to a vector.
    4. Store the embedded result and the original text together in a vector DB.
    5. Create an API that converts the user's input query to a vector, compares its similarity with the vectors stored in the DB to find a matching vector, and calls the mapped original text to generate the final result.
    6. You can use the Chat completions APIs to output the final result by putting the API result into a prompt and generating a response in the appropriate format that the user wants.
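    Steps 4 and 5 of the workflow can be illustrated with a toy in-memory store. This is a sketch under simplified assumptions: a real deployment would obtain vectors from the Embedding API and store them in a vector DB, and the two-dimensional vectors and document texts here are invented for illustration:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Step 4: store each embedded vector together with its original text.
store = [
    ([0.9, 0.1], "Document about billing"),
    ([0.1, 0.9], "Document about API limits"),
]

def retrieve(query_vector):
    """Step 5: compare the query vector with the stored vectors and
    return the original text mapped to the most similar one."""
    return max(store, key=lambda item: cosine(item[0], query_vector))[1]

print(retrieve([0.2, 0.8]))  # Document about API limits
```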

    [Figure: clovastudio-explorer_embedding2_en.png]

    Process long text

    The embedding API can process up to 500 tokens (clir-emb-dolphin, clir-sts-dolphin) or 8192 tokens (Embedding v2, bge-m3) at a time. If embedding is difficult due to the token limit, we recommend using chunking to break up long texts appropriately. When chunking text, it is important to break the text into semantic units in order to extract the correct information. Chunking is the process of breaking text into smaller pieces, and includes the Sliding window, Segmentation, and Summarization operations.
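    For example, sliding-window chunking splits text into fixed-length pieces. The sketch below is character-based for simplicity; in practice you would count tokens instead, e.g., with the tokenizer (embedding v2) API:

```python
def sliding_window_chunks(text, size, overlap=0):
    """Split text into fixed-size chunks; consecutive chunks share
    `overlap` characters so boundary content is less likely to be lost."""
    if size <= overlap:
        raise ValueError("size must be greater than overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

print(sliding_window_chunks("abcdefghij", size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```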

    Here are the types of chunking methods and the advantages and disadvantages of each of them.

    • Sliding window
      Description: Splits text into units of constant length.
      Advantages: Easily extracts the correct answer to a query by breaking the text into smaller pieces.
      Disadvantages: Because text is divided by length rather than by meaning, the beginning and end of the text are handled poorly, and it is difficult to understand the meaning of the entire text.
    • Segmentation
      Description: Separates text into meaningful paragraphs that make sense in context.
      Advantages: Text can be grouped into meaningful units for better embedding performance.
      Disadvantages: Long paragraphs make it difficult to identify where the query is answered.
    • Summarization
      Description: Summarizes a long piece of text into a shorter version, focusing on the main points.
      Advantages: Summarizes longer contextual text than Segmentation, making it easier to embed on a per-document basis.
      Disadvantages: Long paragraphs make it difficult to identify where the query is answered.
    Note

    CLOVA Studio provides the tokenizer (embedding v2) APIs, segmentation APIs, and summarization APIs. For more information, see Tokenizer APIs, Segmentation APIs, and Summarization APIs.

    See the citation information for the bge-m3 model below.

    @misc{bge-m3,
        title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation},
        author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
        year={2024},
        eprint={2402.03216},
        archivePrefix={arXiv},
        primaryClass={cs.CL}
    }
    

    Create test app

    We provide API guides and application creation tools for integrating the services provided by CLOVA Studio. After creating a test app, you can use curl and Python code to call the APIs provided by the Explorer.

    • After clicking the [Create test app] button, you can call the API using curl and Python code.
    • The information about the created test apps can be found on the Test apps tab of the App application status.
      [Figure: summarization_app-request_ko]
    • For more information on how to issue a test app, see Utilize samples and manage tasks.
