Using Zeppelin


Available in Classic

NAVER Cloud Platform's Cloud Hadoop installs Zeppelin Notebook only when Cluster Type is Presto or Spark.
This guide explains how to access the Zeppelin Notebook UI and how to run a simple example.
For more information on Zeppelin, see the official Apache Zeppelin homepage.

Access Zeppelin Notebook UI

You can access the Zeppelin Notebook UI in any of the following ways.

Connect via the console's web UI list

On the Cloud Hadoop console, you can access the Zeppelin Notebook UI through [View by application]. For more information, see View by application.

Direct access via web browser

Open a web browser and enter the following address in the address field, replacing {Public domain} with the public domain address assigned to the cluster.

http://{Public domain}:9995
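
As a rough illustration, the address above can also be built and probed from Scala. This is a minimal sketch and not part of the original procedure: zeppelinUrl and isReachable are hypothetical helper names, and the only value taken from this guide is port 9995. Any HTTP response is treated as "reachable", which is an assumption made for simplicity.

```scala
import java.net.{HttpURLConnection, URL}

// Builds the Zeppelin address from the cluster's public domain.
// Port 9995 is the port shown in this guide.
def zeppelinUrl(publicDomain: String, port: Int = 9995): String =
  s"http://$publicDomain:$port"

// Returns true if the server sends back any HTTP response within the timeout.
// (Assumption: any response, even a login redirect, means the UI is reachable.)
def isReachable(url: String, timeoutMs: Int = 5000): Boolean = {
  try {
    val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    conn.setConnectTimeout(timeoutMs)
    conn.setReadTimeout(timeoutMs)
    try { conn.getResponseCode(); true } finally conn.disconnect()
  } catch {
    case _: java.io.IOException => false
  }
}

// Example call (performs a network request, so it is left commented out):
// isReachable(zeppelinUrl("your-public-domain"))
```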

Access via Ambari Web UI

The following describes how to access via the Ambari Web UI.

  1. Access the Ambari UI.
  2. On the Ambari UI screen, click Zeppelin Notebook > Quick Links > Zeppelin UI in order.
  3. Once the login page is displayed in the browser, enter the admin account and password set upon cluster creation to log in.
    • If the connection succeeds, a green dot appears next to [Login] at the top right of the Zeppelin page.

Getting started with Zeppelin Notebook

You can create a Zeppelin notebook to load data and view the results as graphs.
This guide is based on the Zeppelin Tutorial (Basic Features) notebook provided with Zeppelin Notebook.

Create Notebook

The following describes how to create a Notebook.

  1. Click [Notebook] > Create new note at the top of the Zeppelin page.

  2. Enter a note name and the other requested information, then click [Create note].

    • You can change Default Interpreter even after creating a note.
Note

To set Default Interpreter to Spark, you need to set Cluster Type to Spark when creating a cluster.

Load data to table

The following sample code loads data from bank.csv into the bank table.

%spark.spark
import org.apache.commons.io.IOUtils
import java.net.URL
import java.nio.charset.Charset

// Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext or SQLContext),
// so you don't need to create them manually

// load bank data
val bankText = sc.parallelize(
    IOUtils.toString(
        new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
        Charset.forName("utf8")).split("\n"))

case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
    s => Bank(s(0).toInt,
            s(1).replaceAll("\"", ""),
            s(2).replaceAll("\"", ""),
            s(3).replaceAll("\"", ""),
            s(5).replaceAll("\"", "").toInt
        )
).toDF()
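// register the DataFrame as a temporary table so that %spark.sql paragraphs can query it
// (registerTempTable is deprecated since Spark 2.0; createOrReplaceTempView is its replacement)
bank.registerTempTable("bank")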
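
To make the parsing step easier to follow, the sketch below applies the same split and quote-stripping logic to a single line, without Spark. It is an illustration only: BankRow, unquote, and parseLine are hypothetical names, and the sample line is made up in the same semicolon-separated format. Note that the field at index 4 (the default column in the tutorial file) is skipped, which is why the balance is read from index 5.

```scala
// Hypothetical stand-in for the Bank case class above (Int used instead of Integer).
case class BankRow(age: Int, job: String, marital: String, education: String, balance: Int)

// Removes the double quotes that surround text fields in the CSV.
def unquote(field: String): String = field.replaceAll("\"", "")

// Splits one semicolon-separated line and picks the same columns as the sample code.
def parseLine(line: String): BankRow = {
  val s = line.split(";")
  BankRow(s(0).toInt, unquote(s(1)), unquote(s(2)), unquote(s(3)), unquote(s(5)).toInt)
}

// Made-up sample line: age;job;marital;education;default;balance
val sampleLine = "30;\"unemployed\";\"married\";\"primary\";\"no\";1787"
val row = parseLine(sampleLine)
```

The header line is not parsed this way; the sample code filters it out first by checking whether the first field equals "age" in quotes.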

Run code and view results

The following describes how to run code in a Zeppelin notebook and check the result.

  1. Run the code by pressing [Shift] + [Enter] or by clicking the run icon.

    • If the code ran successfully, the paragraph shows the FINISHED status together with the elapsed time, for example Took 5 sec.
  2. In a new paragraph, write a Spark SQL statement that queries the data, then press [Shift] + [Enter] or click the run icon to run it.

    • The query result is displayed on the screen. You can use the graph buttons to view the SQL results as various types of graphs.
    %spark.sql
    select age, count(1) value
    from bank
    where age < 30
    group by age
    order by age
    

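
To show what the query above computes, the following sketch reproduces the same filter, grouping, count, and ordering with plain Scala collections. It is only an illustration under stated assumptions: the ages are made-up sample values, whereas in the notebook the values come from the bank table.

```scala
// Made-up sample ages standing in for the age column of the bank table.
val ages = Seq(25, 31, 19, 25, 29, 40, 19, 25)

val valueByAge: Seq[(Int, Int)] =
  ages
    .filter(_ < 30)                               // where age < 30
    .groupBy(identity)                            // group by age
    .map { case (age, xs) => (age, xs.size) }     // count(1) value
    .toSeq
    .sortBy(_._1)                                 // order by age

valueByAge.foreach { case (age, value) => println(s"$age\t$value") }
```

Each resulting pair corresponds to one row of the SQL result: the age and the number of records with that age.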

Back up Zeppelin Notebook

Zeppelin Notebook is stored on Server 1 of the cluster's master node, so deleting the cluster also deletes its notebooks.
To use the same notebook in another cluster, you must export it after you finish your work.

The following describes how to back up Zeppelin Notebook.

  1. Click the download icon at the top of the screen.

  2. Specify a file name and location on your local PC, then save the file.

    • The exported file is saved in JSON format.
Note

Zeppelin notebooks are backed up one notebook at a time.

Note

To install the jdbc interpreter on Zeppelin, you must first edit the jdbc version in the /etc/zeppelin/conf/interpreter-list file on the Edge server.

Before change: jdbc org.apache.zeppelin:zeppelin-jdbc:0.11.0-SNAPSHOT Jdbc interpreter
After change: jdbc org.apache.zeppelin:zeppelin-jdbc:0.10.1 Jdbc interpreter

For more information, check the library dependencies of the jdbc interpreter (library name, version, and so on) in Maven.
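
The version change above can be sketched as a small string transformation. This is a hedged illustration rather than an official procedure: pinJdbcVersion is a hypothetical helper, and it assumes that each interpreter-list line consists of a name, a Maven coordinate, and a description separated by whitespace. It rewrites only the jdbc line and normalizes the separators to single spaces.

```scala
// Sets the version of the zeppelin-jdbc artifact on an interpreter-list line.
// Assumed line layout: <name> <group:artifact:version> <description>
def pinJdbcVersion(line: String, version: String): String = {
  val prefix = "org.apache.zeppelin:zeppelin-jdbc:"
  line.trim.split("\\s+", 3) match {
    case Array("jdbc", artifact, desc) if artifact.startsWith(prefix) =>
      s"jdbc $prefix$version $desc"
    case _ => line // every other line is returned unchanged
  }
}

val before = "jdbc org.apache.zeppelin:zeppelin-jdbc:0.11.0-SNAPSHOT Jdbc interpreter"
val after = pinJdbcVersion(before, "0.10.1")
println(after)
```

Applying the function to every line of the file and writing the result back is left out here, because editing the file by hand on the Edge server achieves the same result.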