The latest service changes have not yet been reflected in this content. We will update the content as soon as possible. Please refer to the Korean version for information on the latest updates.
Available in VPC
Cloud Hadoop notebooks provide a serverless form of Jupyter Notebooks for running the queries and code needed to analyze your data.
You can create and delete notebook nodes through the Notebooks console.
You can access the JupyterLab and Jupyter Notebook web pages of the created notebook node to work on analyzing your data.
The queries and code you use in your notebooks run through the kernel of your Cloud Hadoop cluster and are stored as notebook files in Object Storage for flexible reuse.
Notebook interface
Here are some basic instructions for using the Notebook service.

| Component | Description |
|---|---|
| ① Create a Notebook | Create a new Notebook. |
| ② Delete | Delete a Notebook in use. |
| ③ Open in JupyterLab | Access the JupyterLab Web UI. |
| ④ Open in Jupyter | Access the Jupyter Web UI. |
| ⑤ Notebook list | View a list of created notebooks and details. |
Preliminary tasks
- Create Object Storage.
Before creating a Cloud Hadoop cluster, an Object Storage bucket must be created to store and retrieve data.
    - For more information about creating Object Storage, refer to the Object Storage guide.
- Create a Cloud Hadoop cluster.
A Cloud Hadoop cluster must be created to work with the Notebook node.
    - For more information on how to create a Cloud Hadoop cluster, see Getting started with Cloud Hadoop.
- Select a node type.
Consider the expected usage and select a node type in advance.
Create a Notebook
Here's how to create a Notebook.
You can link multiple Notebook nodes to a single Cloud Hadoop cluster; each Notebook node is linked to one cluster.
- In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Cloud Hadoop.
- Click the Notebooks menu.
- Click [Create Notebook].
- When the Create Notebook screen appears, follow these steps in order.
1. Notebook settings
After specifying the notebook settings information, click [Next].
- Notebook name: Provide a name for the Notebook node.
- Notebook version: Select the version of Notebook.
- Notebook Components: View component information by version.
- Cluster: Select the Cloud Hadoop cluster with which you want to link the Notebook. Only Cloud Hadoop 1.8 or higher can be linked with Notebooks.
- ACG settings: The Cloud Hadoop Notebook ACG is automatically generated whenever you create a Notebook. If you want to set up a network ACL, you can modify the rules by selecting the ACG that was automatically created when you created the Notebook. For more information about setting up ACGs, see the Firewall settings (ACG) guide.
2. Set storage and server
After specifying the storage and node server settings information, click [Next].
- Object Storage bucket: Select the Object Storage bucket you created as a preliminary task. The Notebook reads and writes data from this bucket.
- Notebook node subnet: Select the subnet to locate the notebook node.
- Creating a Notebook node on a public subnet enables web domain access based on a public IP.
- If you create a Notebook node on a private subnet, connecting with an SSL VPN is mandatory to access the web domain.
- Notebook node server type: Select a server type to use as a notebook node. You cannot change the server type after you create a Notebook node. For server specifications that can be used as Notebook nodes, see Supported server specifications by cluster node.
- Number of Notebook nodes: The number of Notebook nodes is fixed at 1.
- Whether to add Notebook node storage: You can add a separate Block Storage for use.
- Notebook node storage type: Select a storage type. You can select between SSD and HDD. You cannot change the storage type after you create a cluster.
- Notebook node storage capacity: Select the storage capacity. You can choose from a minimum of 100 GB to 6 TB, which may be specified in 10 GB increments.
- Pricing plan: The pricing plan you selected when you created your account applies. For more pricing information, see Pricing information.
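The storage capacity rule above (100 GB to 6 TB, in 10 GB increments) can be sketched as a simple validation check. This is an illustrative helper, not part of any NAVER Cloud API; the 6 TB upper bound is assumed to be 6,000 GB here.

```python
def is_valid_notebook_storage_gb(size_gb: int) -> bool:
    """Illustrative check of the documented notebook node storage limits:
    100 GB to 6 TB (assumed 6,000 GB), selectable in 10 GB increments."""
    return 100 <= size_gb <= 6000 and size_gb % 10 == 0
```

For example, 100 GB and 6,000 GB pass, while 105 GB fails because it is not a multiple of 10.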
3. Set authentication key
To connect directly to a Notebook node with SSH, you need to set up an authentication key (.pem).
When creating a notebook, select your authentication key or create a new one and click [Next].
- To generate a new authentication key, select Generate new authentication key, enter an authentication key name, and click [Generate and save authentication key].

The authentication key is required to verify the admin password. Keep the saved PEM file in a safe location on your PC.
4. Final confirmation
Check the details and click [Create].
- The Cloud Hadoop Notebook ACG is automatically generated whenever you create a Notebook. To set up network ACLs, you can select an automatically generated ACG and modify the rules. For more information about setting up ACGs, see the Firewall settings (ACG) guide.
- It takes about 5 to 10 minutes to create a notebook. When creation is complete and the notebook starts operating, you'll see In Operation in the Status column of the notebook list.
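Creation status is checked in the console. Purely as an illustration of this waiting pattern, the sketch below polls a placeholder status function until it reports In Operation; `get_status` is hypothetical and not part of any API shown in this guide.

```python
import time

def wait_until_running(get_status, poll_seconds=30, timeout_seconds=900):
    """Poll get_status() until it returns 'In Operation' or the timeout expires.

    get_status is a placeholder callable (hypothetical, not a NAVER Cloud API).
    Returns True once the notebook reports 'In Operation', False on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if get_status() == "In Operation":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```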
Access notebook
Access notebook web page
In the Cloud Hadoop (Notebooks) console, click [Open in JupyterLab] or [Open in Jupyter] to access the JupyterLab or Jupyter Notebook web page installed on the Notebook node.
- In the Cloud Hadoop Notebook ACG, allow port 8889 for the JupyterLab web page and port 8888 for the Jupyter web page.
- For Notebook nodes created in a public subnet, web access is available through the public IP.
- For Notebook nodes created in a private subnet, you must connect via SSL VPN to access the web.
For detailed instructions on how to set up SSL VPN and ACG, see UI access and password setup guide by service.
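After setting up the ACG rules, you can quickly confirm that the notebook web ports are reachable. This is a minimal sketch using Python's standard socket module; the domain in the commented examples is a placeholder for your notebook node's actual web domain.

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder domain: replace with your notebook node's web domain
# port_reachable("your-notebook-domain", 8889)  # JupyterLab
# port_reachable("your-notebook-domain", 8888)  # Jupyter
```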
Access notebook node directly via SSH
You can access the Notebook node directly with SSH using the authentication key you set in the authentication key setting step when creating the notebook. For more information, see the Access a cluster node with SSH guide.
Use notebook
Notebooks can be used and integrated in a variety of ways.
- Integrate with Spark on a Cloud Hadoop cluster from a Notebook
- Train a TensorFlow MNIST model in your notebook
- Integrate Object Storage data in notebooks
Preliminary tasks
To integrate, the ACGs of the Cloud Hadoop cluster and the Notebook node must allow traffic between them. Add the Notebook node's ACG to the ACG of the Cloud Hadoop cluster you want to integrate with, as follows.
- Click the [Inbound] tab in the Default ACG of the Cloud Hadoop cluster you want to integrate with. Add the Notebook node's ACG to the access source and all ports 1-65535 to the allowed ports, then click [Apply].

We recommend creating the Cloud Hadoop cluster and the Notebook in the same subnet so they can communicate within the same VPC.
Integrate with Spark on a Cloud Hadoop cluster from a Notebook
After accessing the Jupyter Notebook Web UI, you can use PySpark to integrate with your Cloud Hadoop cluster.
PySpark example code:
Select PySpark in Set Kernel to proceed in Notebook.
```python
import pyspark
import socket
from pyspark.sql import SQLContext, SparkSession

# Create (or reuse) a Spark session connected to the Cloud Hadoop cluster
sc = SparkSession \
    .builder \
    .appName("SparkFromJupyter") \
    .getOrCreate()
sqlContext = SQLContext(sparkContext=sc.sparkContext, sparkSession=sc)

print("Spark Version: " + sc.version)
print("PySpark Version: " + pyspark.__version__)

# Build and display a small DataFrame
df = sqlContext.createDataFrame(
    [(1, 'foo'), (2, 'bar')],  # records
    ['col1', 'col2']           # column names
)
df.show()

# Show which host the driver is running on
print(socket.gethostname(), socket.gethostbyname(socket.gethostname()))
```

Train a TensorFlow MNIST model in your notebook
After accessing the Jupyter Notebook Web UI, you can perform TensorFlow MNIST Training using Python3.
Python3 example code:
Select Python3 in Set Kernel to proceed in Notebook.
```python
# Import the TensorFlow library
# If TensorFlow needs to be installed first:
# !pip install tensorflow_version
import tensorflow as tf

# Load the MNIST dataset and normalize pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add layers to define the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train and evaluate the model
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

# Save the trained model as an HDF5 file
model.save('my_model.h5')
```

Integrate Object Storage data in notebooks
You can integrate with an Object Storage bucket from the Jupyter Notebook Web UI by entering your Object Storage access information.
- To check the access key ID and secret key, see Ncloud API. For more information, see Getting started with Object Storage.
Python3 example code:
Select Python3 in Set Kernel to proceed in Notebook.
```python
# Import the required boto3 module
import boto3

# Enter Object Storage information
service_name = 's3'
endpoint_url = 'https://kr.object.private.ncloudstorage.com'
region_name = 'kr-standard'
access_key = "User's access key"
secret_key = "User's secret key"

# Integrate Object Storage with boto3
if __name__ == "__main__":
    s3 = boto3.client(service_name,
                      endpoint_url=endpoint_url,
                      aws_access_key_id=access_key,
                      aws_secret_access_key=secret_key)
    # Upload the trained model to the "best" bucket under the model/ prefix
    s3.upload_file("my_model.h5", "best", "model/my_model.h5")
```
Delete notebook
You can delete a notebook when you're done using it. The notebook files (files with the .ipynb extension) used by Jupyter Notebook are stored under the Object Storage bucket linked to the Cloud Hadoop cluster.
Deleting a notebook doesn't delete the Notebook files you've used.