Data supply

    Available in Classic and VPC

    The following describes how to request data supply, how to request additional data that was not selected when creating the databox, and how to request subscription to Insight Option for receiving the latest data for a contract period.

    Request data supply

    After setting the databox connection, you must request data supply to receive the data requested. Once data supply is requested, external communications are blocked and the data requested by the user is supplied.
    The following describes how to request data supply.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Cloud Data Box > My Space menus, in that order.
    2. Select the databox created, and click the [Request data supply] button.
    3. When the data supply request window appears, enter the databox name, and then click the [Confirm] button.
    4. It takes 5 - 10 minutes for the data supply to be completed. Once the data supply is complete, the databox status will change from Data supply requested to Data supply complete.
    Caution
    • Once data supply is requested, external communications are blocked and can't be reverted.
    • After data supply is requested, TensorFlow Docker and Jupyter will be restarted. Make sure to complete any work in progress first.
    • Connections to SSL VPN or servers are lost when you request data supply, request subscription to Insight Option, or request additional data after data supply is completed. Reconnect after the supply is completed.

    Add data

    You can add new half-year data. Additional data can't be requested while data supply is in progress, so submit the additional data request after the data supply is completed.
    The following describes how to add the latest half-year data.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Cloud Data Box > My Space menus, in that order.
    2. Click [View server details] for the databox to which you wish to add data.
    3. Click the [Add] button in the [Data] tab.
    4. Select the data to add and click the [Confirm] button.
    5. If the data is added while the databox status is Data supply complete, then it takes 5 - 10 minutes for the supply to be completed. Once the additional data supply is completed, the data provision status changes to Data search available.
      If data is added while the databox status is Infrastructure creation complete, then data will only be provided after you request data supply.

    Insight Option

    Insight Option is a feature that provides the latest data, from 2 years ago up to the previous month, for the duration of the contract period (12 months). After you request subscription to Insight Option, a cancellation fee is charged if the databox is terminated within 12 months of the request date. External network communication must be blocked to use the Insight Option data, so you must request data supply first.

    Subscribe to Insight Option

    The following is how you request subscription to Insight Option.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Cloud Data Box > My Space menus, in that order.

    2. Make sure that you have requested data supply.

      • The [Upgrade] button is only enabled when the data supply request is completed.
    3. Select the databox created.

    4. Click the [Upgrade] > [Subscribe to Insight Option] buttons, in that order.

    5. Once the Insight Option subscription request window appears, read the provided data standard and cancellation fee notice, and then click the [Subscribe to Insight Option] button.

    6. Read the information message about TensorFlow Docker and Jupyter restart, and click the [Confirm] button.

      • It takes 5 - 10 minutes for the Insight Option data supply to be completed.
      • Once the data supply is completed, the databox status will change to Data supply complete.
    7. After confirming that the Insight Option data is mounted on the Ncloud TensorFlow Server and the Hadoop node, connect to the Ncloud TensorFlow Server and restart Docker and Jupyter. Both must be restarted before Jupyter Notebook can see the data directory.

      • Restart TensorFlow CPU

        docker restart tf-server-mkl  
        
      • Restart TensorFlow GPU

        docker restart tf-server-gpu
        
      • Restart Jupyter Notebook

        jup restart
        # or: jup stop, then jup start
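The restart commands above can be wrapped in a small helper so that the same script works on both CPU and GPU servers. This is a minimal sketch, not an official tool: the function names `pick_tf_container` and `restart_tf_and_jupyter` are hypothetical, and it assumes the container is named either `tf-server-mkl` or `tf-server-gpu` as in the commands above and that the `jup` command is available on the Ncloud TensorFlow Server.

```shell
# Hypothetical helper: reads container names on stdin (one per line)
# and prints the first one that is a TensorFlow server container.
pick_tf_container() {
  grep -x -m1 -E 'tf-server-(mkl|gpu)'
}

# Hypothetical helper: restarts the detected container, then Jupyter.
# Assumes docker and jup are installed, as on the Ncloud TensorFlow Server.
restart_tf_and_jupyter() {
  local name
  name="$(docker ps -a --format '{{.Names}}' | pick_tf_container)" || {
    echo "No tf-server-mkl or tf-server-gpu container found" >&2
    return 1
  }
  docker restart "$name" && jup restart
}
```

After connecting to the Ncloud TensorFlow Server, you would run `restart_tf_and_jupyter` once the databox status shows Data supply complete.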
        

    Insight Pro Option

    You can request subscription to Insight Pro Option after subscribing to Insight Option. It is a feature that provides search and shopping data by user group. To subscribe, you must first obtain authorization by contacting sales. As with Insight Option, the latest data from 2 years ago up to the previous month is provided. After you request subscription to Insight Pro Option, a cancellation fee is charged if you cancel the Insight Pro Option subscription within 12 months of the Insight Option request date.

    Subscribe to Insight Pro Option

    The following describes how you can request subscription to Insight Pro Option.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Cloud Data Box > My Space menus, in that order.

    2. Select the databox created.

    3. Click the [Upgrade] > [Subscribe to Insight Pro Option] buttons, in that order.

    4. Read the notice on the cancellation fee, and then click the [Subscribe to Insight Pro Option] button.

      • You can only request subscription to Insight Pro Option while you are subscribed to Insight Option.
      • A cancellation fee is charged if you cancel the subscription to Pro Option within the Insight Option contract period.
    5. If you don't have the permission for Pro Option, then click the [Sales inquiries] button to send the inquiry.

    6. If you do have the permission for Pro Option, then the Insight Pro Option subscription request window appears. Select Pro Option that you want to request, and then click the [Request] button.

    7. Read the content of the pledge about protecting anonymous information, mark the checkboxes for the pledge and agreement to it, and then click the [Confirm] button.

      • It takes 5 - 10 minutes for the Insight Pro Option data supply to be completed.
      • Once the Insight Pro Option data supply is completed, the databox status will change to Data supply complete.
    8. After confirming that the Insight Pro Option data is mounted on the Ncloud TensorFlow Server and the Hadoop node, connect to the Ncloud TensorFlow Server and restart Docker and Jupyter. Both must be restarted before Jupyter Notebook can see the data directory.

      • Restart TensorFlow CPU

        docker restart tf-server-mkl  
        
      • Restart TensorFlow GPU

        docker restart tf-server-gpu
        
      • Restart Jupyter Notebook

        jup restart
        # or: jup stop, then jup start
        

    View cancellation fee for Insight Pro Option

    The following describes how you can view the cancellation fee for Insight Pro Option.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Cloud Data Box > My Space menus, in that order.
    2. Select the databox for which you want to see the Pro Option cancellation fee.
    3. Click the [Cancel Pro Option and view cancellation fee] > [View cancellation fee] buttons, in that order, to view the estimated cancellation fee.

    Cancel Insight Pro Option

    The following describes how you can cancel Insight Pro Option.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Cloud Data Box > My Space menus, in that order.
    2. Select the databox for which you wish to cancel the Pro Option.
    3. Click the [Cancel Pro Option and view cancellation fee] > [Cancel Insight Pro Option] buttons, in that order.
    4. Read the notice about the cancellation fee for canceling Pro Option mid-term and estimated cancellation fee. Select Agree and click the [Continue canceling the option] button if you want to proceed with the cancellation.

    Upload provided data to Hadoop cluster

    To use the default requested data, Insight Option data, or Insight Pro Option data on Hadoop, you must first upload it to the Hadoop cluster. Make sure that there is enough space on Hadoop before uploading, and use distcp to upload.
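One way to check for enough space before uploading is to compare the local data size against the available HDFS capacity. The sketch below is illustrative only: the function names are hypothetical, and it assumes a replication factor of 3 (the stock HDFS default, which may differ on your cluster), GNU `du` on the edge node, and that the second line of `hdfs dfs -df /` lists size, used, and available bytes in that order.

```shell
# Hypothetical helper: bytes the data will occupy in HDFS.
# $1 = local data size in bytes, $2 = replication factor
required_bytes() {
  echo $(( $1 * $2 ))
}

# Hypothetical helper: succeeds if the data fits.
# $1 = local data size, $2 = available HDFS bytes,
# $3 = replication factor (assumed default: 3)
has_enough_space() {
  [ "$(required_bytes "$1" "${3:-3}")" -le "$2" ]
}

# Hypothetical wrapper for the edge node.
# $1 = local data directory, for example /mnt/shopping20y1h/shopping
check_volume() {
  local size avail
  size="$(du -sb "$1" | awk '{print $1}')"             # local size in bytes
  avail="$(hdfs dfs -df / | awk 'NR==2 {print $4}')"   # "Available" column
  if has_enough_space "$size" "$avail"; then
    echo "OK: $size bytes (before replication) fit into $avail available bytes"
  else
    echo "Not enough HDFS space for $1" >&2
    return 1
  fi
}
```

On the edge node, `check_volume /mnt/shopping20y1h/shopping` would run the check for the example data used in the steps below.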

    1. Connect to the Cloud Hadoop edge node by using PuTTY and check the data to upload to Hadoop.
      (shopping20y1h is an example.)

      $ ls -al /mnt/shopping20y1h/shopping
      $ find /mnt/shopping20y1h -type f | wc -l 
      $ du -sh /mnt/shopping20y1h
      


    2. Upload data using distcp to Hadoop.

      • A Hadoop cluster name is in the format of nv0###-hadoop. You can check the name in the NAVER Cloud Platform console, or from the name of the Hadoop node you are connected to.
      • Uploading the .snapshot directory under the requested data volume (/mnt/shopping20y1h/ in this example) together with the data may cause an error, so make sure to upload the data directory only.
      • If the error "hadoop-distcp.sh was not found" appears, you can ignore it.
      • Half-year shopping data is about 60 - 70 GB, and it takes about 10 minutes to upload it to Hadoop.
      • Half-year search data is bigger at around 5 - 8 TB, so it takes 5 - 10 hours to upload it to Hadoop. (The time this will take may vary, depending on the Hadoop node specifications.)
        $ hadoop distcp file:///mnt/shopping20y1h/shopping hdfs://nv0###-hadoop/user/ncp/shopping20y1h
    3. Check the data uploaded to Hadoop.

      $ hdfs dfs -ls /user/ncp/shopping20y1h 
      $ hdfs dfs -count /user/ncp/shopping20y1h
      $ hdfs dfs -du -h /user/ncp
      


