Using Zeppelin

    Available in VPC

    The ZEPPELIN-0.10.1 app provides Apache Zeppelin, a notebook-based data visualization tool that simplifies data analysis. Each user runs their own Zeppelin instance.

    Check Zeppelin app details

    Once the app has been created, you can view its details. A Status of Stable in the app details means the app is running normally.

    The following describes how to check the app details.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest menus, in that order.
    2. Click the Data Forest > Apps menu on the left.
    3. Select an account.
    4. Click the app whose details you want to view.
    5. View the app details.
      • Quick links
        • AppMaster: URL where the container logs can be viewed. Every app is submitted to a YARN queue when it is created, and YARN provides a Web UI where each app's details can be viewed.
        • shell: URL of the Web Shell, which gives access to the Docker environment where Zeppelin is running so that you can inspect it and edit settings. Log in with the account name and password used to create the app.
        • supervisor: URL where you can manage Zeppelin.
        • zeppelin: URL of the Zeppelin web UI. Log in with the account name and password used to create the app.
      • Component: The ZEPPELIN-0.10.1 type consists of a single zeppelin component.
        • zeppelin: The default values are the recommended resources. By default, it requests 1 core and 12 GB of memory.

    <Examples>

    [Figure: the shell screen after connection]

    [Figure: the Zeppelin screen after connection]

    Note

    Refer to Interpreters in Apache Zeppelin if you need to adjust detailed settings when performing a job.

    Set Interpreter

    Spark

    The default Spark version is currently 3.0.1, so you can create a notebook and use it right away. Jobs executed in Zeppelin are assigned to the Dev queue by default. To run them on a different queue, search for Spark in Interpreters, click the [edit] button, and add the spark.yarn.queue property to Properties.

    Note

    If you submit a job to a queue where you don't have permissions, then it may fail.
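
After editing the interpreter, it can help to confirm which queue a notebook will actually use. The following is a minimal sketch, not an official procedure: it assumes a paragraph that starts with the %spark.pyspark directive, where Zeppelin exposes the SparkContext as `sc`. The helper name `configured_queue` and the queue name `my_queue` are hypothetical.

```python
# Meant for a Zeppelin paragraph that starts with the %spark.pyspark directive,
# where Zeppelin provides the SparkContext as `sc`.

QUEUE_KEY = "spark.yarn.queue"  # property named in the Spark section


def configured_queue(conf, fallback="(not set)"):
    """Return the spark.yarn.queue value from a SparkConf-like object.

    `conf` only needs a .get(key, default) method, so both a SparkConf
    and a plain dict work.
    """
    return conf.get(QUEUE_KEY, fallback)


# In Zeppelin:  print(configured_queue(sc.getConf()))
# Outside Zeppelin, a dict stands in for the interpreter properties.
example_properties = {QUEUE_KEY: "my_queue"}  # "my_queue" is a hypothetical name
print(configured_queue(example_properties))
```

If the property has not been added, the helper returns the fallback text, and the job goes to whatever queue the platform assigns by default.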

    Note

    To keep using Spark 2, select 'spark248' as the Default Interpreter when creating a notebook.

    JDBC

    To use Hive, start the paragraph with %jdbc(hive).

    Note

    For a description of Hive rules and permissions, see Use public Hive.

    The following example, run in a newly created notebook, queries the database called test02__db_test.

    %jdbc(hive)
    use test02__db_test;
    show tables;
    select * from test;
    

    Back up notebook

    The Zeppelin app backs up notebooks and some of the settings, so they stay in sync even when the machine running Zeppelin changes. Backups run every 10 minutes.

    • To back up manually, log in to the web shell and run backup.sh, which backs up the notebooks and settings immediately.
    • After logging in to the Zeppelin container, you can view the backup logs in the hdfs://koya/user/${USER}/zeppelin/${SERVICE_NAME}/backup directory.
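
The backup path contains two placeholders, ${USER} and ${SERVICE_NAME}. As a small sketch of how they expand, the snippet below fills them in with Python's `string.Template`. The function name `backup_log_dir` and the sample values are hypothetical, and it assumes that ${SERVICE_NAME} refers to the name given to the app.

```python
from string import Template

# Path pattern quoted from the backup section; ${USER} and ${SERVICE_NAME}
# are the placeholders to fill in.
BACKUP_DIR_PATTERN = Template(
    "hdfs://koya/user/${USER}/zeppelin/${SERVICE_NAME}/backup"
)


def backup_log_dir(user, service_name):
    """Fill in the account name and the app (service) name."""
    return BACKUP_DIR_PATTERN.substitute(USER=user, SERVICE_NAME=service_name)


# "example-user" and "my-zeppelin" are hypothetical values.
print(backup_log_dir("example-user", "my-zeppelin"))
```

The resulting path can then be listed with a standard HDFS command such as `hdfs dfs -ls`, assuming the HDFS client is available in the container.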
