Getting started with Cloud Hadoop


Available in Classic

Once you have checked the supported environments and required specifications for Cloud Hadoop and reviewed the usage scenarios and terms, you are ready to start using Cloud Hadoop. The first step is to create a Cloud Hadoop cluster. Clusters are created and managed from the NAVER Cloud Platform console.
This guide covers the following topics.

Preparations
Create cluster
Delete cluster

Preparations

Creating Object Storage
Before creating a cluster, you need to create an Object Storage bucket for storing and searching data. For more information, see Object Storage guide.
Node type selection

Choose your node type in advance considering your expected usage.

Create cluster

To use NAVER Cloud Platform's Cloud Hadoop, you must create a cluster first.

The following describes how to create a Cloud Hadoop cluster.

Access the NAVER Cloud Platform console.
In the Platform menu, click Classic to switch to the Classic environment.
Click Services > Big Data & Analytics > Cloud Hadoop in order.
Click the [Create cluster] button.
When the Create cluster page appears, proceed with the following steps in order.

1. Set cluster

Specify the cluster settings information, and then click the [Next] button.

Cluster version: currently, Cloud Hadoop 1.0, 1.1 and 1.2 are available. For more information on cluster versions, see Cloud Hadoop release note.
Cluster type: currently, four cluster types are available: Core Hadoop, Presto, HBase, and Spark. Select the type that installs the components you need. If you need additional services, you can add them with the Add Service feature in Ambari, the cluster management tool.
Cluster admin account: set the account used to access the Ambari, Hue, and Zeppelin management consoles.
Cluster admin account password: enter the cluster admin account's password.
ACG settings: a Cloud Hadoop ACG is created automatically each time you create a cluster. If you want to set up a network ACL, select the automatically created ACG and edit its rules. For more information on ACG settings, see Firewall settings (ACG).

2. Set storage and server

Specify the storage and node server settings information, and then click the [Next] button.

Object Storage buckets: Cloud Hadoop can read and write data in the Object Storage bucket created in the Preparations step. When creating a cluster, select that bucket. Locked buckets cannot be connected to Cloud Hadoop. Keep this in mind when creating an Object Storage bucket.
Support high availability: Cloud Hadoop provides redundancy by default for HDFS Namenode, YARN Resource Manager, Oozie Server, and HiveServer. Since this is the minimum required specification, it can't be deselected.
Edge node server type: select the server type to be used for the edge node. The server type can't be changed after the cluster is created. For specifications of servers that can be used as edge nodes, see Supported server specifications by cluster node.
Number of edge nodes: the number of edge nodes is fixed at 1.
Master node server type: select the server type to be used for the master node. The server type can't be changed after the cluster is created.
Number of master nodes: since Cloud Hadoop provides high availability as a minimum specification, the number of master nodes is fixed at 2.
Master node storage type: select the storage type. You can choose between SSD and HDD. The storage type can't be changed after the cluster is created. For the specifications of servers that can be used as master nodes, see Supported server specifications by cluster node.
Master node storage capacity: select the storage capacity. You can select from 100 GB to 2 TB, and adjust in 10 GB increments.
Worker node server type: select the server type to be used for the worker nodes. The server type can't be changed after the cluster is created. For specifications of servers that can be used as worker nodes, see Supported server specifications by cluster node.
Number of worker nodes: the number of worker nodes can be between 2 and 8. Worker nodes can be added or deleted even after the cluster is created.
Worker node storage type: select the storage type. You can choose between SSD and HDD. The storage type can't be changed after the cluster is created.
Worker node storage capacity: select the storage capacity. You can select from 100 GB to 2 TB, and adjust in 10 GB increments.
Pricing plan: the pricing plan selected at account creation is applied. For more information on pricing, see Pricing information.

Note

After a cluster is created, the server specifications cannot be changed (scaling up or down is unavailable). When configuring a cluster, make your selections taking into account the expected usage and node roles (edge/master/worker).
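
Because server and storage types cannot be changed after creation, it can help to check a planned configuration against the limits listed above before opening the console. The following Python sketch is only an illustration: `ClusterPlan`, `check_storage`, and `validate_plan` are hypothetical helper names, not part of any NAVER Cloud Platform API, and it assumes that "2 TB" means 2000 GB.

```python
from dataclasses import dataclass
from typing import List

MIN_STORAGE_GB = 100
MAX_STORAGE_GB = 2000  # assumption: "2 TB" is read as 2000 GB
STORAGE_STEP_GB = 10
MIN_WORKERS, MAX_WORKERS = 2, 8
STORAGE_TYPES = ("SSD", "HDD")


@dataclass
class ClusterPlan:
    """Hypothetical container for planned settings (not a console or API object)."""
    master_storage_type: str
    master_storage_gb: int
    worker_storage_type: str
    worker_storage_gb: int
    worker_count: int


def check_storage(label: str, kind: str, size_gb: int) -> List[str]:
    """Return a list of problems for one node group's storage settings."""
    problems = []
    if kind not in STORAGE_TYPES:
        problems.append(f"{label}: storage type must be SSD or HDD")
    if not MIN_STORAGE_GB <= size_gb <= MAX_STORAGE_GB:
        problems.append(f"{label}: capacity must be 100 GB to 2 TB")
    elif size_gb % STORAGE_STEP_GB != 0:
        problems.append(f"{label}: capacity must be a multiple of 10 GB")
    return problems


def validate_plan(plan: ClusterPlan) -> List[str]:
    """Check a plan against the limits described in this guide."""
    problems = check_storage("master", plan.master_storage_type, plan.master_storage_gb)
    problems += check_storage("worker", plan.worker_storage_type, plan.worker_storage_gb)
    if not MIN_WORKERS <= plan.worker_count <= MAX_WORKERS:
        problems.append("worker: node count must be between 2 and 8")
    return problems
```

An empty list means the plan fits the documented ranges. Edge and master node counts are fixed (1 and 2), so they are not part of the plan.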

3. Set authentication key

Set the SSH authentication key required for connecting directly to the node.
Select an authentication key you have or create a new one and click [Next].

To create a new authentication key, select Create new authentication key, enter the authentication key name, and then click the [Create and save authentication key] button.

Note

The authentication key is required to get the admin password. Keep the saved PEM file in a safe location on your PC.
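
On Linux or macOS, one way to protect the saved PEM file is to make it readable and writable by its owner only, which OpenSSH also expects for private key files. The sketch below assumes a POSIX file system; `restrict_key_permissions` is a hypothetical helper name and the path in the usage note is a placeholder.

```python
import os
import stat


def restrict_key_permissions(pem_path: str) -> int:
    """Set the key file to owner read/write only (mode 0600) and return the new mode."""
    os.chmod(pem_path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(pem_path).st_mode)
```

For example, `restrict_key_permissions("/path/to/your-key.pem")` would be called once after the file is downloaded.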

4. Final confirmation

After checking the request details, click the [Create] button.

Note

A Cloud Hadoop ACG is created automatically each time you create a cluster. If you want to set up a network ACL, select the automatically created ACG and edit its rules. For more information on ACG settings, see Firewall settings (ACG).
It takes about 30 to 50 minutes for a cluster to be created. Once the cluster is created and starts running, you can see Running displayed in the Status column of the cluster list.
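
If you script around cluster creation, a simple polling loop can wait for the Running status instead of checking the console by hand. The sketch below does not call any NAVER Cloud Platform API: `get_status` is a caller-supplied function (an assumption) that returns the current status text, and the default interval and timeout are illustrative values chosen to cover the 30 to 50 minute creation time.

```python
import time


def wait_until_running(get_status, poll_seconds=60, timeout_seconds=3600,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "Running" or the timeout expires.

    Returns True if the cluster reached Running, False on timeout.
    """
    deadline = clock() + timeout_seconds
    while True:
        if get_status() == "Running":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.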

Delete cluster

The following describes how to delete a Cloud Hadoop cluster.

In the Classic environment on the NAVER Cloud Platform console, click Services > Big Data & Analytics > Cloud Hadoop in order.
Select the cluster to delete from the cluster list, and then click the [Delete] button.
Enter the cluster name in the pop-up window to confirm deletion, and then click the [Yes] button.

Note

It takes several minutes to delete a cluster. Once the cluster is deleted, the cluster disappears from the cluster list.

Caution

Deleting a Cloud Hadoop cluster also deletes all data saved in the nodes' local file systems and in HDFS. Back up any files you need beforehand, for example by copying them to the Object Storage bucket.
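
One possible way to copy HDFS data to the bucket before deletion is Hadoop's `distcp` tool, run from a cluster node. The sketch below only builds the command string. It assumes the bucket is reachable through the `s3a://` scheme, which this guide does not confirm; `build_backup_command` is a hypothetical helper, and the paths and bucket name in the usage note are placeholders.

```python
import shlex


def build_backup_command(hdfs_path: str, bucket: str, prefix: str = "") -> str:
    """Build a `hadoop distcp` command that copies hdfs_path into the bucket.

    Assumption: the Object Storage bucket is addressed as s3a://<bucket>/<prefix>.
    """
    destination = f"s3a://{bucket}/{prefix.strip('/')}".rstrip("/")
    return shlex.join(["hadoop", "distcp", hdfs_path, destination])
```

For example, `build_backup_command("hdfs:///user/example/data", "example-bucket", "backup/data")` produces `hadoop distcp hdfs:///user/example/data s3a://example-bucket/backup/data`. Verify that the copy completed before deleting the cluster.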

Delete Object Storage bucket and file

Select the file to delete from the Object Storage console and click [Edit] > Delete.
For more information on deleting files or buckets, see Object Storage user guide.

Caution

Deleted Object Storage files or buckets cannot be restored. Consider carefully before proceeding.
