Scanner

    Available in VPC

    A scanner infers the schema of the source data and uses a classifier to create a table that fits the data. Set an execution cycle so that the scanner collects data periodically and keeps the metadata up to date. You can create, execute, and manage scanners from the Scanner menu.

    Scanner screen

    The following is the basic description of the Scanner menu for Data Catalog.

    Area: Description
    ① Menu name: name of the menu currently being viewed and the number of scanners in the list
    ② Basic features: features displayed when you first enter the Scanner menu
      • [Create scanner] button: click to create a scanner (see Create scanner)
      • [Learn more about the product] button: click to go to the Data Catalog page
      • [Refresh] button: click to refresh the scanner list
    ③ Post-creation functions: features enabled after a scanner is created
      • [Execute] button: click to execute the scanner (see Execute scanner)
      • [Edit] button: click to edit the scanner settings (see Edit scanner)
      • [Delete] button: click to delete the scanner (see Delete scanner)
      • [Manage execution] button: click to display the scanner execution management menu
    ④ Search bar: search for a scanner by name or description
    ⑤ Scanner list: click a scanner in the list to check its details
    ⑥ Information tab: click each tab to check the information of the selected scanner

    Create scanner

    To create a scanner, specify the source data from which metadata is collected and the scan execution options. Follow these steps:

    1. From the NAVER Cloud Platform console, click Services > Big Data & Analytics > Data Catalog, in order.
    2. Click the Scanner menu.
    3. Click the [Create scanner] button.
    4. Enter the source data information to be scanned.
      • Data type: select data source
      • Connection: select connection to the data source
        • You can click [Create connection] to create a connection. For more information, see Create connection.
        • If the data type is Cloud DB, selecting a connection displays the [Test connection] button. Be sure to click [Test connection] to check the connection.
      • Path: enter the path of the source data to scan
        • The scan also covers the sub-paths of the entered path.
        • If the source data type is Cloud DB, enter the name of the table to be scanned.
          • If you enter %, the whole database is scanned and a metadata table is created from every scanned table.
    5. Enter the execution option.
      • Execution cycle: enter the cycle for the scan to be executed
        • On-demand: run the scanner manually from the console, with no execution cycle
        • Daily/Weekly/Monthly: execute the scan at the set date and time
        • Cron: enter execution cycle in the cron format
      • Pattern: include or exclude specific data when collecting metadata
        • Enter the pattern in glob format.
        • The exclude setting is prioritized over the include setting.
      • Classifier: select the classifier according to the data type and click [Add] to add it
        • This setting applies only when the source data type is Object Storage.
        • You can click [Create classifier] to create a classifier. For more information, see Create classifier.
        • Click the delete icon to delete an added classifier.
    6. Click the [Next] button.
    7. Enter the table update handling type and the output data information.
      • Database: select the database to which the table created by the scanner run is connected
        • You can click [Create database] to create a database. For more information, see Create database.
      • Prefix: enter the string to be added to the beginning of the created table's name
        • If it is not entered, the table name will be automatically created based on the name of the source data.
      • Schema addition time: select how the table is updated when the schema of the source data changes
        • Table definition update: update the table to the new schema and delete the metadata of data that was removed
        • Add new columns only: add the new columns from the new schema while keeping the existing ones
        • Ignore: maintain the existing schema
    8. Click the [Next] button.
    9. Enter the scanner name and description, review the settings, and click the [Save] button.
    Note

    You can create up to 30 scanners with the Object Storage data type.
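
    The include and exclude rules for the Pattern option can be illustrated with a minimal sketch. It uses Python's standard fnmatch module as a stand-in for the service's glob matching. The function name is_scanned, the sample paths, and the rule that an empty include list selects everything are assumptions made for illustration, not documented behavior.

```python
from fnmatch import fnmatchcase


def is_scanned(path, include=(), exclude=()):
    """Return True if `path` would be scanned under the given glob patterns.

    Assumed semantics (not confirmed by the service): an empty include list
    means "include everything", and any exclude match wins over include.
    """
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False  # the exclude setting is prioritized over the include setting
    if not include:
        return True
    return any(fnmatchcase(path, pattern) for pattern in include)


# Hypothetical object paths under the scanned path.
paths = ["logs/2024/app.csv", "logs/2024/tmp/app.csv", "logs/readme.txt"]
selected = [p for p in paths if is_scanned(p, include=["*.csv"], exclude=["*/tmp/*"])]
print(selected)
```

    Note that fnmatch lets * match across / separators, which may differ from how the service interprets glob patterns.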

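    The three Schema addition time options can be read as three ways of combining the existing table schema with a newly scanned one. The following sketch shows one possible interpretation, assuming schemas are simple column-to-type mappings. The function name and the mode identifiers are hypothetical and do not correspond to any API of the service.

```python
def apply_schema_change(existing, scanned, mode):
    """Return the table schema after a scan as a {column: type} dict.

    The `mode` strings are hypothetical identifiers for the console options.
    """
    if mode == "table_definition_update":
        # Follow the source: columns no longer present in the source are dropped.
        return dict(scanned)
    if mode == "add_new_columns_only":
        # Keep every existing column unchanged and append newly seen columns.
        merged = dict(existing)
        for name, column_type in scanned.items():
            merged.setdefault(name, column_type)
        return merged
    if mode == "ignore":
        # Keep the existing schema as it is.
        return dict(existing)
    raise ValueError(f"unknown mode: {mode}")


existing = {"id": "int", "name": "string", "legacy": "string"}
scanned = {"id": "bigint", "name": "string", "email": "string"}

for mode in ("table_definition_update", "add_new_columns_only", "ignore"):
    print(mode, apply_schema_change(existing, scanned, mode))
```

    In this interpretation, only Table definition update removes the legacy column, and only Add new columns only ends up with both legacy and email.
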
    Search for scanners and check information

    To search for a created scanner and check its information, follow these steps:

    1. From the NAVER Cloud Platform console, click Services > Big Data & Analytics > Data Catalog, in order.
    2. Click the Scanner menu.
    3. Enter the scanner name or description in the search bar and click the search icon to search for the scanner.
    4. Click the scanner and check the information.
      • Name: scanner name
      • Status: scanner status
      • Last scan result: result of the most recent scan
      • Last scan time: time of the most recent scan
      • Execution cycle: set execution cycle
      • Update time: time of the most recent change to the scanner settings
      • [Settings] tab: click to view the scanner settings
      • [Scan history] tab: click to view the records of the last 10 scans
        • Start time/End time: scan start and end times
        • Scan duration: time spent for the entire scan
        • Scan result: scan result
        • Result summary: summary of the scan result including the number of added or changed tables, cause of scan failure, scan cancellation records, etc.
        • [See more] button: click to display a pop-up window with more of the execution history
          • You can check the execution history of the past year, or specify a date and time to view the execution history for that period.
        • [View details] button: click to view the scan details in the CLA service

    Execute scanner

    The scanner can be manually executed from the console.

    Caution

    Partition keys are created only during the initial scan and are not added in subsequent scans. Therefore, if partition keys need to be added, the table must be deleted and scanned again. However, partition values can continue to be added between scans.
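
    The difference between partition keys and partition values can be sketched as follows. This example assumes Hive-style key=value directories in the source path, which this guide does not confirm. The Table class and parse_partitions function are hypothetical and only model the behavior described in the caution above.

```python
def parse_partitions(object_key):
    """Split 'sales/year=2024/part-0.csv' into [('year', '2024')]."""
    pairs = []
    for segment in object_key.split("/")[:-1]:  # skip the file name
        if "=" in segment:
            key, value = segment.split("=", 1)
            pairs.append((key, value))
    return pairs


class Table:
    def __init__(self):
        self.partition_keys = None     # fixed by the first object of the initial scan
        self.partition_values = set()  # may keep growing on later scans

    def scan(self, object_keys):
        for object_key in object_keys:
            pairs = parse_partitions(object_key)
            if self.partition_keys is None:
                self.partition_keys = tuple(key for key, _ in pairs)
            found = dict(pairs)
            # Keys that were not part of the initial scan (e.g. region) are ignored.
            if all(key in found for key in self.partition_keys):
                self.partition_values.add(
                    tuple(found[key] for key in self.partition_keys)
                )


table = Table()
table.scan(["sales/year=2023/part-0.csv"])
table.scan(["sales/year=2024/part-0.csv", "sales/year=2024/region=kr/part-0.csv"])
print(table.partition_keys, sorted(table.partition_values))
```

    In this model, the second scan adds the value 2024 for the existing key year, but the new key region only appears if the table is deleted and scanned again.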

    Note

    A scanner with an execution cycle runs automatically according to that cycle, and can also be executed manually from the console at any time.
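
    As a point of reference for execution cycles, the Daily, Weekly, and Monthly options can be thought of as shorthand for cron expressions. The sketch below assumes the standard five-field cron layout (minute, hour, day of month, month, day of week); the format accepted by the console may differ, and the helper names are hypothetical.

```python
def daily(hour, minute):
    """Cron expression for a run every day at hour:minute."""
    return f"{minute} {hour} * * *"


def weekly(day_of_week, hour, minute):
    """Cron expression for a weekly run; 0 = Sunday ... 6 = Saturday in standard cron."""
    return f"{minute} {hour} * * {day_of_week}"


def monthly(day_of_month, hour, minute):
    """Cron expression for a run on a fixed day of every month."""
    return f"{minute} {hour} {day_of_month} * *"


print(daily(2, 30))        # every day at 02:30
print(weekly(1, 2, 30))    # every Monday at 02:30
print(monthly(15, 2, 30))  # on the 15th of every month at 02:30
```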

    To execute a scanner, follow these steps:

    1. From the NAVER Cloud Platform console, click Services > Big Data & Analytics > Data Catalog, in order.
    2. Click the Scanner menu.
    3. Select the scanner to execute, and then click the [Execute] button.
      • When the execution is complete, the scanner's Status becomes Execute standby, and Recent execute result is displayed as Success.
      • To cancel a scan, select the scanner in progress and click [Manage execution] > Cancel execute, in order.

    Suspend and restart the scanner execution cycle

    You can temporarily suspend the automatic execution of a scanner that is set to run periodically, and restart it later. To configure this, follow these steps:

    1. From the NAVER Cloud Platform console, click Services > Big Data & Analytics > Data Catalog, in order.
    2. Click the Scanner menu.
    3. Click the [Manage execution] button.
    4. Depending on the setting you need, click Temporarily suspend execution cycle or Restart execution cycle.
      • Temporarily suspend execution cycle: pause the automatic scans that run according to the set execution cycle
      • Restart execution cycle: resume the paused automatic scans

    Edit scanner

    To edit the information of a created scanner, follow these steps:

    Note

    A scanner that is in progress cannot be edited.

    1. From the NAVER Cloud Platform console, click Services > Big Data & Analytics > Data Catalog, in order.
    2. Click the Scanner menu.
    3. Select the scanner to edit and click the [Edit] button.
    4. Edit the information of the scanner from the scanner edit screen.
    5. Once you have made necessary changes, click [Save].

    Delete scanner

    To delete a created scanner, follow these steps:

    Caution

    A deleted scanner cannot be recovered.

    Note

    A scanner that is in progress cannot be deleted.

    1. From the NAVER Cloud Platform console, click Services > Big Data & Analytics > Data Catalog, in order.
    2. Click the Scanner menu.
    3. Select the scanner to delete and click the [Delete] button.
    4. When the pop-up notification window appears, check the cautions and click [Delete].
