Using Dev

    Available in VPC

    The Dev app plays the role of a client for all services provided in Data Forest. You can run Hadoop commands or submit Spark jobs to YARN clusters in the Dev app.

    Note

    For more information about creating apps, see Create and manage apps.

    Check Dev app details

    When the app creation is completed, you can view the details. When the Status is Stable under the app's details, it means the app is running normally.
    The following describes how to check the app details.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest menu, in that order.
    2. Click the Data Forest > Apps menus on the left.
    3. Select an account.
    4. Click the app whose details you want to view.
    5. View the app details.
      [screenshot: df-dev_2-1_ko]
      • Quick links: you can connect to the app through the following quick link addresses.
        • AppMaster: URL for viewing container logs. When created, every app is submitted to the YARN queue, and YARN provides a web UI for checking each app's details.
        • Supervisor: supervisor URL where you can monitor and manage the container's app processes
        • Shell: URL that provides access to a GNU/Linux terminal (TTY) through a web browser
      • Component: the DEV-1.0.0 type consists of a single shell component.
        • shell: memory and CPU are set to defaults, and the number of containers is the recommended minimum
    Note

    For information on how to log in to the AppMaster UI and view the logs of each container, see Access quick links.

    Example

    The shell screen after connection is as follows:
    [screenshot: df-dev_004_vpc_ko]

    Kerberos authentication

    Kerberos authentication must take place before executing a Hadoop command or submitting a Spark job to a cluster. A keytab file is used for Kerberos authentication. However, since the Dev app can't access the user's local file system, you must first upload the keytab to HDFS and then download it from inside the Dev app.
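    As a sketch of the conventions used in the transcripts below (the gateway host, path pattern, and realm are taken from those examples and may differ in your environment), the keytab location, its WebHDFS download URL, and the Kerberos principal can be derived from the account name:

    ```python
    # Naming conventions observed in this article's examples; treat the
    # gateway host and realm as assumptions for your own environment.
    GATEWAY = "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1"
    REALM = "KR.DF.NAVERNCP.COM"

    def keytab_hdfs_path(username: str) -> str:
        """HDFS path where the keytab is uploaded: /user/{username}/df.{username}.keytab"""
        return f"/user/{username}/df.{username}.keytab"

    def keytab_download_url(username: str) -> str:
        """WebHDFS OPEN URL used to download the keytab inside the Dev app."""
        return f"{GATEWAY}{keytab_hdfs_path(username)}?op=OPEN"

    def principal(username: str) -> str:
        """Kerberos principal passed to kinit."""
        return f"{username}@{REALM}"
    ```

    For the test01 account used throughout this article, keytab_download_url("test01") yields the same URL as the curl command in the transcript below.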

    Download keytab

    The following describes how to download the keytab.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > Accounts menus, in that order.
    2. Select an account and then click [Cluster access information] > Download Kerberos keytab.
    3. When the Download keytab window appears, click the [Download] button.
    4. Store the downloaded file safely.

    Upload keytab to HDFS

    The following describes how to upload the keytab to HDFS.

    1. From the file browser of the koya namespace, upload the keytab df.{username}.keytab under path /user/{username}/.
      [screenshot: df-dev_5-1_vpc_ko]

    2. Download the keytab in HDFS, and run the Kerberos authentication using the keytab.

      • Replace $PASSWORD with the password set when creating the account
      • If the password contains special characters, enclose it in single quotes (' ')
      $ curl -s -L -u test01:$PASSWORD -o df.test01.keytab "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1/user/test01/df.test01.keytab?op=OPEN"
      $ ls -al
      total 20
      drwxr-s--- 4 test01 hadoop  138 Dec 16 17:57 .
      drwxr-s--- 4 test01 hadoop   74 Dec 16 17:44 ..
      -rw-r--r-- 1 test01 hadoop  231 Dec 16 17:36 .bashrc
      -rw------- 1 test01 hadoop  302 Dec 16 17:36 container_tokens
      -rw-r--r-- 1 test01 hadoop  245 Dec 16 17:57 df.test01.keytab
      lrwxrwxrwx 1 test01 hadoop  101 Dec 16 17:36 gotty -> /data1/hadoop/yarn/local/usercache/test01/appcache/application_1607671243914_0024/filecache/10/gotty
      -rwx------ 1 test01 hadoop 6634 Dec 16 17:36 launch_container.sh
      drwxr-S--- 3 test01 hadoop   19 Dec 16 17:53 .pki
      drwxr-s--- 2 test01 hadoop    6 Dec 16 17:36 tmp
      $ kinit test01 -kt df.test01.keytab
      $ klist 
      Ticket cache: FILE:/tmp/krb5cc_20184
      Default principal: test01@KR.DF.NAVERNCP.COM
      
      Valid starting       Expires              Service principal
      12/16/2020 17:39:57  12/17/2020 17:39:56  krbtgt/KR.DF.NAVERNCP.COM@KR.DF.NAVERNCP.COM
              renew until 12/23/2020 17:39:56
      
    Caution

    The error message kinit: Password incorrect while getting initial credentials occurs when the given keytab does not match the account.

    Check environment variables

    The environment variables required to use the Data Forest cluster's services are already specified in the Dev app.
    The following describes how to view the environment variables.

    $ echo $HADOOP_HOME
    /usr/hdp/current/hadoop-client
    $ echo $SPARK_HOME
    /usr/hdp/current/spark2-client
    

    Use Hadoop dfs command

    The dfs command runs the file system shell. The file system shell includes a variety of shell-like commands that directly interact with HDFS as well as the other file systems Hadoop supports, such as the local FS, WebHDFS, and S3 FS.
    dfs can be invoked in three equivalent forms: hdfs dfs, hadoop fs, and hadoop dfs. The following describes how to execute a file system job.

    [test01@shell-0.dev.test01.kr.df.naverncp.com ~][df]$ hadoop fs -ls
    Found 30 items
    …
    -rw-r--r--   3 test01 services        215 2021-04-09 11:35 df.test01.keytab
    drwx------   - test01 services          0 2021-05-11 12:21 grafana
    drwx------   - test01 services          0 2021-05-07 14:55 hue
    …
    
    Note

    For more information about file system shells, see here.

    Submit Spark job using spark-shell

    Since the Dev app comes with the Data Forest client settings preconfigured, it can run REPLs such as spark-shell and PySpark.

    The following describes how to submit a Spark job to a cluster using spark-shell.

    [test01@shell-0.dev.test01.kr.df.naverncp.com ~][df]$ spark-shell
    Warning: Ignoring non-spark config property: history.server.spnego.keytab.file=/etc/security/keytabs/spnego.service.keytab
    Warning: Ignoring non-spark config property: history.server.spnego.kerberos.principal=HTTP/_HOST@KR.DF.NAVERNCP.COM
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    Spark context Web UI available at http://shell-0.dev.test01.kr.df.naverncp.com:4040
    Spark context available as 'sc' (master = yarn, app id = application_1619078733441_0566).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.3.2.3.1.0.0-78
          /_/
    
    Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> val rdd1 = sc.textFile("file:///usr/hdp/current/spark2-client/README.md")
    rdd1: org.apache.spark.rdd.RDD[String] = file:///usr/hdp/current/spark2-client/README.md MapPartitionsRDD[1] at textFile at <console>:24
    
    scala> val rdd2 = rdd1.flatMap(_.split(" "))
    rdd2: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at flatMap at <console>:25
    
    scala> val rdd3= rdd2.map((_, 1))
    rdd3: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[3] at map at <console>:25
    
    scala> val rdd4 = rdd3.reduceByKey(_+_)
    rdd4: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:25
    
    scala> rdd4.take(10)
    res0: Array[(String, Int)] = Array((package,1), (this,1), (Version"](http://spark.apache.org/docs/en/latest/building-spark.html#specifying-the-hadoop-version),1), (Because,1), (Python,2), (page](http://spark.apache.org/documentation.html).,1), (cluster.,1), ([run,1), (its,1), (YARN,,1))
    
    scala> rdd4.saveAsTextFile("hdfs://koya/user/test01/result")
      ...
    org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1478)
      ... 49 elided
    
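    The RDD chain above is a standard word count. For reference, the same logic in plain Python (no Spark required), mirroring the flatMap, map, and reduceByKey steps:

    ```python
    from collections import Counter

    def word_count(lines):
        """Mirror of the spark-shell example: split each line on spaces
        (flatMap), pair each token with 1 (map), and sum the counts per
        token (reduceByKey)."""
        counts = Counter()
        for line in lines:
            counts.update(line.split(" "))
        return dict(counts)

    # Small illustrative input (not the Spark README used above)
    sample = ["to be or", "not to be"]
    result = word_count(sample)
    ```

    Like the Scala split(" "), consecutive spaces would produce empty-string tokens; Spark simply distributes the same per-partition counting across the cluster.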

    The following describes how to check the result using the Hadoop command.

    [test01@shell-0.dev.test01.kr.df.naverncp.com ~][df]$ hadoop fs -ls /user/test01/result
    Found 3 items
    -rw-------   3 test01 services          0 2021-04-21 14:06 /user/test01/result/_SUCCESS
    -rw-------   3 test01 services        886 2021-04-21 14:06 /user/test01/result/part-00000.gz
    -rw-------   3 test01 services        888 2021-04-21 14:06 /user/test01/result/part-00001.gz
    

    Access HiveServer2

    The following describes how to access HS2.
    You can access common HS2 and individual HS2 through the following command format:

    $ beeline -u {JDBC connection string} -n {username} -p {password} 
    
    Note
    • For common HS2, use the address matching the HiveServer2 (Batch) or HiveServer2 (Interactive) type shown in Quick links under App details > View access details as the JDBC connection string.
    • For individual HS2, refer to Quick links > Connection String in the HS2 app's details.
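    As a minimal sketch of the command format above (the connection string shown is a hypothetical placeholder; copy the real one from the app's quick links):

    ```python
    def beeline_command(jdbc_url: str, username: str, password: str) -> str:
        """Assemble the beeline invocation in the format described above:
        beeline -u {JDBC connection string} -n {username} -p {password}"""
        return f'beeline -u "{jdbc_url}" -n {username} -p {password}'

    # Hypothetical values for illustration only
    cmd = beeline_command("jdbc:hive2://example-hs2-host:10000/default",
                          "test01", "PASSWORD")
    ```

    Quoting the JDBC URL keeps characters such as ; and ? from being interpreted by the shell.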

    Configure client environment for apps

    The following describes how to configure the client environment for an app.

    1. Create a directory called secure-hbase.
    2. Run sh /home/forest/get-app-env.sh {your hbase app name} {directory name} as in the following example:
      $ mkdir secure-hbase  
      $ sh /home/forest/get-app-env.sh hbase ~/secure-hbase  
      [/home/forest/get-app-env.sh] Apptype: HBASE-2.2.3  
      [/home/forest/get-app-env.sh] Download install-client script for HBASE-2.2.3  
      [/home/forest/get-app-env.sh] Install client on /data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase  
      current hbase: .yarn/services/hbase/components/v1  
      --2021-05-20 14:37:51--  http://dist.kr.df.naverncp.com/repos/release/hbase/hbase-2.2.3-client-bin.tar.gz  
      Resolving dist.kr.df.naverncp.com (dist.kr.df.naverncp.com)... 10.213.208.69  
      Connecting to dist.kr.df.naverncp.com (dist.kr.df.naverncp.com)|10.213.208.69|:80... connected.  
      HTTP request sent, awaiting response... 200 OK  
      Length: 233293221 (222M) [application/octet-stream]  
      Saving to: '/data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client-bin.tar.gz'  
      
      100%[=============================================================================================>] 233,293,221  390MB/s   in 0.6s  
      
      2021-05-20 14:37:51 (390 MB/s) - '/data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client-bin.tar.gz' saved [233293221/233293221]  
      
      HBase-2.2.3 Client has been installed on /data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client  
      ==============================================================================================  
      export HBASE_HOME=/data10/hadoop/yarn/local/usercache/test01/appcache/application_1619078733441_0563/container_e84_1619078733441_0563_01_000002/secure-hbase/hbase-2.2.3-client  
      $HBASE_HOME/bin/hbase shell  
      
      Use "help" to get list of supported commands.  
      Use "exit" to quit this interactive shell.  
      For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell  
      Version 2.2.3, rUnknown, Wed Jan 29 22:11:21 KST 2020  
      Took 0.0025 seconds                                                                                                                                                                         
      hbase(main):001:0>  
      hbase(main):002:0* version  
      2.2.3, rUnknown, Wed Jan 29 22:11:21 KST 2020  
      Took 0.0007 seconds                                                                                                                                                                         
      hbase(main):003:0> status  
      1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load  
      Took 0.5934 seconds                       
      
