Data Forest usage scenarios

    Available in VPC

    The following describes usage scenarios for Data Forest.

    Step 1. Create account

    1. Connect to the NAVER Cloud Platform console.
    2. Click VPC from the Platform menu to switch to the VPC environment.
    3. Click the Services > Big Data & Analytics > Data Forest menus in order.
    4. Click the [Create account] button under Account.
    5. Enter "df-test" as the account name, enter an account password, and then click the [Create] button.

    Step 2. Create notebook

    Preparations

    Create a VPC and a subnet to establish effective network access control.

    1. Click the Services > Big Data & Analytics > Data Forest menus in order.
    2. Click the [Create notebook] button from Notebooks.
    3. Enter the notebook settings information, and then click the [Next] button.
      • Account name: enter "df-test" (the account created in Step 1)
      • Notebook name: enter "my-notebook"
      • VPC/Subnet: select the VPC and subnet you created during the preparations
    4. If user settings are required, enter the relevant information.
    5. Under Set authentication key, select an existing authentication key or create a new one, and then click the [Next] button.
    6. After the final check, click the [Create] button.
    Note
    • If you create a notebook from Data Forest, only a public subnet is supported.
    • For more information on how to create a notebook, see Create and manage account.

    Step 3. Create app

    1. Click Data Forest > Apps, and then click the [Create app] button.
    2. Enter app information.
      • Account name: enter "df-test" (the account created in Step 1)
      • App type: select "HUE-4.7.0"
      • App name: enter "my-hue"
      • Uptime: enter "604800"
      • Queue: select "longlived"
    3. Complete basic settings, and then click the [Next] button.
    4. After the final check, click the [Create] button.
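
    The Uptime value of 604800 equals the number of seconds in seven days. This suggests the field is expressed in seconds, although this guide does not state the unit, so treat that as an assumption. A minimal sketch for deriving such a value from a number of days (days_to_seconds is a hypothetical helper name):

```shell
# Hypothetical helper: convert a number of days to seconds.
# Assumes the Uptime field is expressed in seconds (not confirmed by this guide).
days_to_seconds() {
  echo $(( $1 * 24 * 60 * 60 ))
}

days_to_seconds 7   # prints 604800
```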

    Step 4. SSH tunneling

    1. To create an SSH tunnel with your notebook node, enter the following command in the terminal of your PC.

      • Use the -D {port number} option to specify any free port on your PC.
      • Access the notebook node using the authentication key configured when creating the notebook.
       $ ssh -i <pem-key-file> -C2qTnNf -D 9494 forest@<Notebook-Domain>
      
    2. Access your notebook.
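
    The tunnel command above can be assembled from its three variable parts: the key file, the local SOCKS port, and the notebook domain. The following is a minimal sketch, not an official tool. build_tunnel_cmd is a hypothetical helper name, and the key file and domain in the example call are placeholders.

```shell
# Hypothetical helper: print the tunnel command for a given key, port, and host.
# Flags used by the guide's command:
#   -i  identity (private key) file     -C  compress traffic
#   -2  select SSH protocol version 2 (a legacy flag)
#   -q  quiet mode                      -T  do not allocate a terminal
#   -n  do not read from stdin          -N  do not run a remote command
#   -f  go to the background            -D  local SOCKS (dynamic) port forward
build_tunnel_cmd() {
  key_file=$1
  socks_port=$2
  notebook_domain=$3
  printf 'ssh -i %s -C2qTnNf -D %s forest@%s\n' "$key_file" "$socks_port" "$notebook_domain"
}

# Example with placeholder values; substitute your own key file and notebook domain.
build_tunnel_cmd my-key.pem 9494 notebook.example.com
```

    Whichever port is passed to -D is the port the browser proxy settings must use.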

    Configure proxy in browser

    Set proxy in Firefox browser

    The following describes how to set up a proxy in the Firefox browser.

    1. Open a Firefox browser window.
    2. Click the menu icon at the upper right of the browser window, and then click Settings > Network settings > [Settings].
    3. Click Internet proxy access settings > Manual proxy settings.
    4. Enter the SOCKS host information.
      • Select SOCKS v5
      • SOCKS host: enter 127.0.0.1
      • Port: 9494
    5. Select the "Do not prompt for authentication if password is saved" and "Proxy DNS when using SOCKS v5" check boxes.
    6. Once the proxy setting is completed, click the [OK] button.
    Note

    When not accessing the Data Forest server, you must change the proxy setting to "No Proxy" to use the Internet normally.
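
    The tunnel can also be checked from the terminal, independently of any browser. The sketch below assumes that curl is installed and that the tunnel from Step 4 listens on port 9494. socks_proxy_url and check_via_proxy are hypothetical helper names, and the URL passed in would be one of your app's quick links.

```shell
# Hypothetical helper: build the proxy URL for the local SOCKS tunnel.
# The socks5h scheme asks curl to resolve host names through the proxy,
# which mirrors the "proxy DNS when using SOCKS v5" browser option.
socks_proxy_url() {
  printf 'socks5h://%s:%s\n' "${1:-127.0.0.1}" "${2:-9494}"
}

# Hypothetical helper: request a URL through the tunnel and print the HTTP status code.
check_via_proxy() {
  curl -sS -o /dev/null -w '%{http_code}\n' --proxy "$(socks_proxy_url)" "$1"
}

# Usage (replace the placeholder with a quick link URL from the console):
# check_via_proxy "https://<quick-link-url>"
```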

    Set proxy in the Chrome browser on macOS

    The following describes how to set up a proxy in the Chrome browser on macOS.

    Execute the following command in Terminal. The port must match the one specified with the -D option in Step 4.

    $ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --proxy-server="socks5://127.0.0.1:9494"
    

    Set proxy in the Chrome browser on Windows

    The following describes how to set up a proxy in the Chrome browser on Windows.

    1. Right-click the Chrome icon, and then click Properties.
    2. When the Chrome properties window appears, open the [Shortcut] tab and add --proxy-server="socks5://127.0.0.1:9494" to the end of the text in Target (T). The port must match the one specified with the -D option in Step 4.

    The following describes how to check that you can connect to the app through its quick links.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > Apps menus in order.
    2. Click the Hue app to open the Details area.
    3. From the app details area, click the link under Quick links.
    4. Check that the page opens normally.
      • If the connection succeeds, the tunnel is working and you can manage HDFS files through the HUE app.
    Note

    For more information about quick links, see Access quick links.

    Step 6. Integrate Zeppelin and HiveServer apps

    The following describes how to set up integration with the Apache Zeppelin and Apache HiveServer apps.

    1. Create the following apps: "HIVESERVER2-LDAP-3.1.0," "DEV-1.0.0," and "ZEPPELIN-0.10.1".
    2. Click the Zeppelin app you created, and open the Zeppelin URL under Quick links.
    3. Log in using the account name and the password entered when creating the account.
    4. Click your account located at the upper right on the screen, and then click Interpreter.
    5. Search for the JDBC interpreter.
    6. Click the [Edit] button at the upper right of the screen.
    7. Add the hive.password field to Properties as follows:
      • hive.driver: enter the JDBC driver class name (org.apache.hive.jdbc.HiveDriver)
      • hive.password: enter the password of the account
      • hive.proxy.user.property: enter hive.server2.proxy.user
      • hive.splitQueries: enter true
      • hive.url: enter the JDBC connection string example provided when creating the HIVESERVER2-LDAP app
        • If the password set when creating the account includes any special characters, replace them with their URL-encoded form when entering the string.
      • hive.user: enter the account name (df-test)
    8. Enter /usr/hdp/current/hive-client/jdbc/hive-jdbc-3.1.0.3.1.0.0-78-standalone.jar in the Dependencies > artifact field.
    9. Click the [Save] button.
    10. Click [Notebook] > Create new note to create a new note.
    11. You can now read and write the Hive databases and tables created in Data Forest from code written in the note.
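
    The hive.url setting in step 7 requires special characters in the password to be URL-encoded. The following is a minimal sketch of percent-encoding in POSIX shell, assuming an ASCII-only password. urlencode is a hypothetical helper name and the sample password is made up.

```shell
# Hypothetical helper: percent-encode a string for use inside a URL.
# Letters, digits, and . ~ _ - are kept; every other character becomes %XX.
# Assumes ASCII input. The body runs in a subshell so its variables stay local.
urlencode() (
  rest=$1
  out=
  while [ -n "$rest" ]; do
    tail=${rest#?}          # everything after the first character
    c=${rest%"$tail"}       # the first character itself
    rest=$tail
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;   # "'c" yields the character code
    esac
  done
  printf '%s\n' "$out"
)

urlencode 'p@ss w0rd!'   # prints p%40ss%20w0rd%21
```

    Paste the encoded value into the connection string in place of the raw password.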

    Step 7. Delete app

    You can delete apps that are no longer in use. The following describes how to delete an app.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > Apps menus in order.
    2. Select the app you want to delete from the app list, and then click the [Finish] button.
    3. Select the finished app and click the [Delete] button.

    Step 8. Delete notebook

    You can delete notebooks that are no longer in use. The following describes how to delete a notebook.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > Notebooks menus in order.
    2. Select the notebook to delete from the notebook list, and then click the [Delete] button.

    Step 9. Delete account

    You can delete accounts that are no longer in use. The following describes how to delete an account.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > Accounts menus in order.
    2. Select the account you want to delete, and then click the [Delete] button.
