Using Dev


Available in VPC

The Dev app acts as a client for all services provided in Data Forest. In the Dev app, you can run Hadoop commands or submit Spark jobs to the YARN cluster.

Note

For more information on how to create apps, see Create and manage apps.

Check Dev app details

Once the app is created, you can view its details. If the Status in the app details is Stable, the app is running normally.
To view app details:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Forest.
  2. Click Data Forest > Apps on the left.
  3. Select an account.
  4. Click the app to view its details.
  5. Review the app details.
    • Quick links: You can access the following links.
      • supervisor: Supervisor URL for monitoring and managing container app processes
      • shell: URL for accessing the GNU/Linux terminal (TTY) via a web browser
    • Component: the DEV-1.0.0 type consists of a single shell component.
      • shell: memory, CPU, and container count default to the minimum recommended values

Example:
The following shows the shell access interface.

Authenticate with Kerberos

You must authenticate with Kerberos before running Hadoop commands or submitting Spark jobs to the cluster.
Perform Kerberos authentication using the keytab in your home directory.

$ ls -al
total 20
drwxr-s--- 4 test01 hadoop  138 Dec 16 17:57 .
drwxr-s--- 4 test01 hadoop   74 Dec 16 17:44 ..
-rw-r--r-- 1 test01 hadoop  231 Dec 16 17:36 .bashrc
-rw------- 1 test01 hadoop  302 Dec 16 17:36 container_tokens
-rw-r--r-- 1 test01 hadoop  245 Dec 16 17:57 test01.service.keytab
lrwxrwxrwx 1 test01 hadoop  101 Dec 16 17:36 gotty -> /data1/hadoop/yarn/local/usercache/test01/appcache/application_1607671243914_0024/filecache/10/gotty
-rwx------ 1 test01 hadoop 6634 Dec 16 17:36 launch_container.sh
drwxr-S--- 3 test01 hadoop   19 Dec 16 17:53 .pki
drwxr-s--- 2 test01 hadoop    6 Dec 16 17:36 tmp
$ kinit test01/app -kt test01.service.keytab
$ klist 
Ticket cache: FILE:/tmp/krb5cc_20184
Default principal: test01/app@KR.DF.NAVERNCP.COM

Valid starting       Expires              Service principal
12/16/2020 17:39:57  12/17/2020 17:39:56  krbtgt/KR.DF.NAVERNCP.COM@KR.DF.NAVERNCP.COM
        renew until 12/23/2020 17:39:56
Caution

The error message kinit: Password incorrect while getting initial credentials occurs when the given keytab does not match the account. You can check which principals a keytab contains by running klist -kt test01.service.keytab.

Check environment variables

The environment variables required to use the Data Forest cluster's services are preconfigured in the Dev app.
To view the environment variables:

$ echo $HADOOP_HOME
/usr/nch/current/hadoop-client
$ echo $SPARK_HOME
/usr/nch/current/spark2-client

Use Hadoop dfs command

The dfs command runs the file system shell, which provides commands for interacting directly with HDFS and the other file systems Hadoop supports, such as the local FS, WebHDFS, and S3. The shell can be invoked in three equivalent forms: hdfs dfs, hadoop fs, and hadoop dfs. To run a file system task:

[test01@shell-0.dev.test01.kr.ch.naverncp.com ~][df]$ hadoop fs -ls
Found 30 items
…
-rw-r--r--   3 test01 services        215 2021-04-09 11:35 test01.service.keytab
drwx------   - test01 services          0 2021-05-11 12:21 grafana
drwx------   - test01 services          0 2021-05-07 14:55 hue
…
Note

For more information on the file system shell, see the Apache Hadoop FileSystem Shell documentation.

Submit Spark job using spark-shell

The Dev app comes with the Data Forest client configuration already in place, so it can run REPLs such as spark-shell or PySpark.

To submit a Spark job to a cluster using spark-shell:

[test01@shell-0.dev.test01.kr.ch.naverncp.com ~][df]$ spark-shell
Warning: Ignoring non-spark config property: history.server.spnego.keytab.file=/etc/security/keytabs/spnego.service.keytab
Warning: Ignoring non-spark config property: history.server.spnego.kerberos.principal=HTTP/_HOST@KR.DF.NAVERNCP.COM
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://shell-0.dev.test01.kr.ch.naverncp.com:4040
Spark context available as 'sc' (master = yarn, app id = application_1619078733441_0566).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.2.3.1.0.0-78
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val rdd1 = sc.textFile("file:///usr/nch/current/spark2-client/README.md")
rdd1: org.apache.spark.rdd.RDD[String] = file:///usr/nch/current/spark2-client/README.md MapPartitionsRDD[1] at textFile at <console>:24

scala> val rdd2 = rdd1.flatMap(_.split(" "))
rdd2: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at flatMap at <console>:25

scala> val rdd3= rdd2.map((_, 1))
rdd3: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[3] at map at <console>:25

scala> val rdd4 = rdd3.reduceByKey(_+_)
rdd4: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:25

scala> rdd4.take(10)
res0: Array[(String, Int)] = Array((package,1), (this,1), (Version"](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version),1), (Because,1), (Python,2), (page](http://spark.apache.org/documentation.html).,1), (cluster.,1), ([run,1), (its,1), (YARN,,1))

scala> rdd4.saveAsTextFile("hdfs://dataforest/user/test01/result")
  ...
org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1478)
  ... 49 elided
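Conceptually, the pipeline above is a word count: flatMap splits each line into words, map pairs each word with a count of 1, and reduceByKey sums the counts per word. As a sketch, the same logic can be reproduced with standard shell tools, no Spark required (the sample input here is made up for illustration):

```shell
# Word count without Spark: split on whitespace, then count per-word occurrences.
# Mirrors flatMap (tr), then map + reduceByKey (sort | uniq -c).
printf 'a b a c b a\n' | tr -s ' ' '\n' | sort | uniq -c | sort -rn
# prints:  3 a / 2 b / 1 c
```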

To view the result using the Hadoop command:

[test01@shell-0.dev.test01.kr.ch.naverncp.com ~][df]$ hadoop fs -ls /user/test01/result
Found 3 items
-rw-------   3 test01 services          0 2021-04-21 14:06 /user/test01/result/_SUCCESS
-rw-------   3 test01 services        886 2021-04-21 14:06 /user/test01/result/part-00000.gz
-rw-------   3 test01 services        888 2021-04-21 14:06 /user/test01/result/part-00001.gz

Access HiveServer2

You can access both the common HS2 and an individual HS2 with the following command format:

$ beeline -u {JDBC connection string} -n {username} -p {password} 
Note
  • For an individual HS2, use the JDBC connection string shown in the HS2 app details under Quick links > Connection String.
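As a sketch, a connection might look like the following; the host, port, and database here are hypothetical placeholders, so substitute the actual connection string, username, and password from your environment:

```shell
# Hypothetical values: replace the JDBC URL, username, and password with your own.
beeline -u "jdbc:hive2://hs2.example.com:10000/default" -n test01 -p 'password'
```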

Configure client environment for apps

To configure the client environment for an app:

  1. Create a directory to install the client into (secure-hbase in this example).
  2. Run sh /home/forest/get-app-env.sh {the user's HBase app name} {directory name} as in the following example.
    $ mkdir secure-hbase 
    $ sh /home/forest/get-app-env.sh hbase ~/secure-hbase
    [/home/forest/get-app-env.sh] Apptype: HBASE-2.2.3
    [/home/forest/get-app-env.sh] Download install-client script for HBASE-2.2.3
    [/home/forest/get-app-env.sh] Install client on /data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase
    current hbase: .yarn/services/hbase/components/v1
    --2021-05-20 14:37:51--  http://dist.kr.df.naverncp.com/repos/release/hbase/hbase-2.2.3-client-bin.tar.gz
    Resolving dist.kr.df.naverncp.com (dist.kr.df.naverncp.com)... 10.213.208.69
    Connecting to dist.kr.df.naverncp.com (dist.kr.df.naverncp.com)|10.213.208.69|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 233293221 (222M) [application/octet-stream]
    Saving to: ‘/data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client-bin.tar.gz’
    
    100%[=============================================================================================>] 233,293,221  390MB/s   in 0.6s
    
    2021-05-20 14:37:51 (390 MB/s) - ‘/data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client-bin.tar.gz’ saved [233293221/233293221]
    
    HBase-2.2.3 Client has been installed on /data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client
    ==============================================================================================
    export HBASE_HOME=/data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client
    $HBASE_HOME/bin/hbase shell 
    
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
    Version 2.2.3, rUnknown, Wed Jan 29 22:11:21 KST 2020
    Took 0.0025 seconds                                                                                                                                                                         
    hbase(main):001:0> 
    hbase(main):002:0* version
    2.2.3, rUnknown, Wed Jan 29 22:11:21 KST 2020
    Took 0.0007 seconds                                                                                                                                                                         
    hbase(main):003:0> status
    1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
    Took 0.5934 seconds