Using Zeppelin
Available in VPC
Zeppelin Notebook is installed on Cloud Hadoop clusters in NAVER Cloud Platform.
This guide explains how to access the Zeppelin Notebook UI and how to run a simple example.
For more information on Zeppelin, see the official Apache Zeppelin homepage.
Access Zeppelin Notebook UI
The following describes how to access Zeppelin Notebook UI.
Connect via the console's web UI list
You can access the Zeppelin Notebook UI through [View by application] on the Cloud Hadoop console. For more information, see View by application.
Direct access via web browser
Open a web browser and enter the following URL in the address bar. Use the domain address assigned to the cluster.
https://{domain address}:9996
Access via Ambari Web UI
The following describes how to access via the Ambari Web UI.
- Access the Ambari UI.
  - For more information on accessing the Ambari UI, see Ambari UI Guide.
- On the Ambari UI screen, click Zeppelin Notebook > Quick Links > Zeppelin UI in order.
  - For how to access the Zeppelin UI, see the guide on Accessing Web UI using tunneling.
- Once the login page is displayed in the browser, enter the admin account and password set upon cluster creation to log in.
  - If the connection succeeds, a green dot appears next to [Login] at the top right of the Zeppelin page.
Getting started with Zeppelin Notebook
You can create a Zeppelin notebook to load data and view the results as graphs.
This guide is based on the Zeppelin Tutorial (Basic Features) notebook provided with Zeppelin Notebook.
Create Notebook
The following describes how to create a Notebook.
Click [Notebook] > Create new note at the top of the Zeppelin page.
Set the note name and information of the Notebook, and then click [Create].
- The Default Interpreter can be changed after creating a note.
Load data to table
The following sample code loads data from bank.csv into the bank table.
%spark.spark
import org.apache.commons.io.IOUtils
import java.net.URL
import java.nio.charset.Charset
// Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext or SqlContext),
// so you don't need to create them manually
// load bank data
val bankText = sc.parallelize(
  IOUtils.toString(
    new URL("https://cdn.document360.io/6998976f-9d95-4df8-b847-d375892b92c2/Images/Documentation/bank.csv"),
    Charset.forName("utf8")).split("\n"))
case class Bank(age: Integer, job: String, marital: String, education: String)
val bank = bankText.map(s => s.split(",")).filter(s => s(0) != "age").map(
  s => Bank(s(0).toInt,
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", ""),
    s(3).replaceAll("\"", "")
  )
).toDF()
bank.registerTempTable("bank")
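To see what the parsing step does without touching the cluster, the same split, filter, and map logic can be tried on a few in-memory lines. This is a minimal sketch in plain Scala (no Spark required); the sample lines, the `BankRow` class, and the `parseLines` helper are hypothetical names introduced for illustration and assume the comma-separated, quoted layout that the code above expects.

```scala
// Hypothetical sample lines in the shape the paragraph above expects:
// a header row, then comma-separated records with quoted string fields.
val sampleLines = Seq(
  "age,job,marital,education",
  "30,\"unemployed\",\"married\",\"primary\"",
  "25,\"services\",\"single\",\"secondary\""
)

// Same fields as Bank above; named differently so it does not clash
// with the Bank class in a shared notebook session.
case class BankRow(age: Int, job: String, marital: String, education: String)

def parseLines(lines: Seq[String]): Seq[BankRow] =
  lines
    .map(_.split(","))
    .filter(fields => fields(0) != "age")   // drop the header row
    .map(f => BankRow(
      f(0).toInt,
      f(1).replaceAll("\"", ""),            // strip the double quotes
      f(2).replaceAll("\"", ""),
      f(3).replaceAll("\"", "")))

parseLines(sampleLines).foreach(println)
// BankRow(30,unemployed,married,primary)
// BankRow(25,services,single,secondary)
```

The snippet is written in script style, so it can be pasted into a Scala REPL or a Zeppelin Spark paragraph as-is.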
Run code and view results
The following describes how to run code in a Zeppelin notebook and check the result.
Run the code by pressing [Shift] + [Enter] or clicking the run button.
- If the code ran successfully, the paragraph shows the FINISHED status and an elapsed-time message such as Took 4 sec.
In a new paragraph, write a Spark SQL statement that queries the table data, and then press [Shift] + [Enter] or click the run button to run it.
- The query result is displayed on the screen. You can use the graph buttons to view the SQL results as various types of graphs.
%spark.sql select age, count(1) value from bank where age < 30 group by age order by age
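The query above keeps rows with age under 30, counts the rows per age, and sorts by age. As a rough illustration of that logic, the following plain Scala sketch applies the same steps to an in-memory list; the `ages` values and the `countUnder` helper are hypothetical and stand in for the age column of the bank table.

```scala
// Hypothetical ages standing in for the age column of the bank table.
val ages = Seq(19, 25, 25, 31, 28, 25, 19, 45)

def countUnder(ages: Seq[Int], limit: Int): Seq[(Int, Int)] =
  ages
    .filter(_ < limit)                                // where age < 30
    .groupBy(identity)                                // group by age
    .map { case (age, group) => (age, group.size) }   // count(1) value
    .toSeq
    .sortBy(_._1)                                     // order by age

countUnder(ages, 30).foreach { case (age, value) =>
  println(s"age=$age value=$value")
}
// age=19 value=2
// age=25 value=3
// age=28 value=1
```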
Back up Zeppelin Notebook
Zeppelin notebooks are stored on Server 1 of the cluster's master node. Therefore, if you delete a cluster, its notebooks are also deleted.
To use the same notebook in another cluster, you must export the notebook after completing your work.
The following describes how to back up a Zeppelin notebook.
- Click the export button at the upper left of the Notebook screen.
- Specify the file name and path on the local PC and save.
- The exported file is saved in JSON format.
Zeppelin notebooks are backed up one at a time: each export saves a single notebook.
To install the JDBC interpreter on Zeppelin, you must first edit the jdbc version in the /etc/zeppelin/conf/interpreter-list file on the edge server.
Before change: jdbc org.apache.zeppelin:zeppelin-jdbc:0.11.0-SNAPSHOT Jdbc interpreter
After change: jdbc org.apache.zeppelin:zeppelin-jdbc:0.10.1 Jdbc interpreter
For more information, check the library dependencies (version, library name, etc.) of the JDBC interpreter in Maven.
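The version change can also be expressed as a small text transformation. The following is a minimal sketch, assuming each interpreter-list entry is a single line in the form shown in the before/after lines; `pinJdbcVersion` is a hypothetical helper that works on the file content as a string, and reading or writing the actual file (which may need elevated permissions) is left out.

```scala
import java.util.regex.Matcher

// Hypothetical helper: replaces the version on the jdbc line only and
// leaves every other line of the interpreter-list content unchanged.
def pinJdbcVersion(content: String, version: String): String =
  content.split("\n", -1).map { line =>
    if (line.startsWith("jdbc "))
      line.replaceFirst(
        "zeppelin-jdbc:\\S+",
        Matcher.quoteReplacement("zeppelin-jdbc:" + version))
    else line
  }.mkString("\n")

val before = "jdbc org.apache.zeppelin:zeppelin-jdbc:0.11.0-SNAPSHOT Jdbc interpreter"
println(pinJdbcVersion(before, "0.10.1"))
// jdbc org.apache.zeppelin:zeppelin-jdbc:0.10.1 Jdbc interpreter
```

Editing the file by hand gives the same result; the sketch only makes explicit which part of the line changes.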