Using Hive

    Available in VPC

    Data Forest lets each user build an independent Apache HiveServer2 (hereinafter referred to as HS2) service environment. HS2 requires a Hive Metastore, and the app uses the Hive Metastore provided by Data Forest. Hive is an SQL-based data warehouse solution that analyzes and processes high-volume data saved in a data storage system.

    Note
    • The HIVESERVER2-LDAP app uses the LDAP method to authenticate users who access HS2. Kerberos authentication is not supported.
    • Only the cluster operator and the user who launched the HS2 app can log in to it. If other users need to log in, the HS2 app owner can grant them login permissions.

    Check HIVESERVER2-LDAP app details

    Once the app has been created, you can view its details. A Status of Stable in the app's details means the app is running normally.
    The following describes how to check the app details.

    1. From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > App menus, in that order.
    2. Select the account that owns the app.
    3. Click the app whose details you want to view.
    4. View the app details.
      df-hive_2-1_en
      • Quick links
        • AppMaster: URL where the container log can be viewed. When apps are created, they are all submitted to the YARN queue, and YARN provides a web UI where each app's details can be viewed.
        • supervisor-hs2-auth-ldap-0: Supervisor URL for managing HS2
        • shell-hs2-auth-ldap-0: HS2 web shell URL
        • webui-hs2-auth-ldap-0: URL for accessing the HS2 web UI
          • Home: You can view the sessions being executed or Hive queries that were executed recently.
          • Configuration: You can view the configuration of HS2 in XML format.
          • Metrics Dump: You can view the real-time JMX metrics in JSON format.
          • Stack Trace: You can view the stack traces of all active threads.
          • LLAP Daemon: You can view the status of Hive LLAP daemons.
          • Local logs: You can view logs. Only operators can access them.
      • Connection String: URL that allows access to the HS2 app
        • JDBC connection string (inside-of-cluster): connection string used when connecting over JDBC from Beeline, Zeppelin, HUE, and user-defined programs. Use this address when accessing HS2 from the Data Forest internal network.
        • JDBC connection string: address used when accessing HS2 from a network external to Data Forest. The inside-of-cluster connection string can't be reached from a user PC. You can also use this connection string when it is difficult to distinguish between internal and external networks.
        • JDBC connection string (inside-of-cluster) Example: Before using the example link, change changeme in the password parameter to the user account password.
        • JDBC connection string Example: Before using the example link, change changeme in the password parameter to the user account password.
      • Component: The value specified by default is the recommended resource. The HIVESERVER2-LDAP-3.1.0 type consists of a single component: hs2-auth-ldap.
        • hs2-auth-ldap: component to process LDAP authentication for users
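
    The changeme replacement described for the example connection strings can be scripted. The following is a minimal sketch, assuming the example string contains the literal text changeme exactly once. The helper name fill_password and the sample string in the usage line are hypothetical and aren't copied from the console.

```shell
# Hypothetical helper: replace the first "changeme" in a connection string
# with the given password. Parameter expansion is used instead of sed so that
# characters such as "/" or "&" in the password need no escaping.
# The body runs in a subshell "( ... )" so its variables don't leak out.
fill_password() (
  url=$1
  pw=$2
  case $url in
    *changeme*)
      # text before the placeholder + password + text after the placeholder
      printf '%s%s%s\n' "${url%%changeme*}" "$pw" "${url#*changeme}"
      ;;
    *)
      # no placeholder found: return the string unchanged
      printf '%s\n' "$url"
      ;;
  esac
)
```

    For example, fill_password 'jdbc:hive2://hs2.example.com:10001/;password=changeme' "$PASSWORD" prints the same string with changeme replaced by the value of PASSWORD.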

    Example
    The following is the HS2 screen after connection.
    df_hiveserver_vpc_ko

    Connect to the HS2 app with Beeline

    The following describes how to access HS2 using the beeline -u {JDBC connection string} -n {username} -p {password} command.

    # Authenticate Kerberos
    $ curl -s -L -u test01:$PASSWORD -o df.test01.keytab "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1/user/test01/df.test01.keytab?op=OPEN"
    $ ls -al
    total 20
    drwxr-s--- 4 test01 hadoop  138 Dec 16 17:57 .
    drwxr-s--- 4 test01 hadoop   74 Dec 16 17:44 ..
    -rw-r--r-- 1 test01 hadoop  231 Dec 16 17:36 .bashrc
    -rw------- 1 test01 hadoop  302 Dec 16 17:36 container_tokens
    -rw-r--r-- 1 test01 hadoop  245 Dec 16 17:57 df.test01.keytab
    lrwxrwxrwx 1 test01 hadoop  101 Dec 16 17:36 gotty -> /data1/hadoop/yarn/local/usercache/test01/appcache/application_1607671243914_0024/filecache/10/gotty
    -rwx------ 1 test01 hadoop 6634 Dec 16 17:36 launch_container.sh
    drwxr-S--- 3 test01 hadoop   19 Dec 16 17:53 .pki
    drwxr-s--- 2 test01 hadoop    6 Dec 16 17:36 tmp
    $ kinit test01 -kt df.test01.keytab
    
    # Connect to HS2
    $ beeline -u "jdbc:hive2://hs2-auth-ldap.new-hiveserver2.example.kr.df.naverncp.com:10001/;transportMode=http;httpPath=cliservice" -n test01 -p '{password}'
    

    If the connection succeeds, the following result is displayed.

    Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
    Driver: Hive JDBC (version 3.1.0.3.1.0.0-78) 
    Transaction isolation: TRANSACTION_REPEATABLE_READ
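
    For scripted use, Beeline can also run a single statement with the -e option and then exit. The sketch below wraps the connection command shown above. hs2_jdbc_url and hs2_query are hypothetical helper names, and HS2_PASSWORD is an assumed environment variable holding the account password. The port and HTTP parameters are copied from the example above, so adjust them if your connection string differs.

```shell
# Build the HTTP-mode JDBC URL used in the Beeline example.
# $1: HS2 host name, $2: port (assumed to default to 10001, as in the example)
hs2_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/;transportMode=http;httpPath=cliservice\n' \
    "$1" "${2:-10001}"
}

# Run one statement and exit instead of opening an interactive prompt.
# $1: HS2 host name, $2: user name, $3: SQL statement
# Assumes beeline is on PATH and HS2_PASSWORD is set in the environment.
hs2_query() {
  beeline -u "$(hs2_jdbc_url "$1")" -n "$2" -p "$HS2_PASSWORD" -e "$3"
}
```

    With these helpers defined, hs2_query hs2-auth-ldap.new-hiveserver2.example.kr.df.naverncp.com test01 'SHOW DATABASES;' would list the databases without entering the Beeline prompt.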
    

    Using Hive in Zeppelin

    1. Connect to Zeppelin, click the account name at the upper right corner of the page, and then click the Interpreter menu.
      df-hive_12_vpc_ko
    2. Search for the JDBC interpreter.
      df-quick-start_zeppelin03_ko
    3. Add the configuration to hive url properties, as shown below, by referring to JDBC connection string Example.
      df-hive_011_vpc_ko(1)
    Note

    If the password set when the account was created includes special characters, URL-encode them when entering the password.
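
    The URL encoding mentioned in the note can be done in a shell before pasting the password. The following is a minimal sketch, assuming an ASCII-only password. urlencode is a hypothetical helper name, not a Data Forest or Zeppelin command. It percent-encodes every character other than letters, digits, and the characters - . _ ~.

```shell
# Hypothetical helper: percent-encode a string for use inside a URL.
# Assumes the input is ASCII only. The body runs in a subshell "( ... )"
# so the locale change and working variables don't leak into the caller.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                 # unreserved: keep as-is
      *) out=$out$(printf '%%%02X' "'$c") ;;         # others: %XX (hex code)
    esac
  done
  printf '%s\n' "$out"
)
```

    For example, urlencode 'p@ss w0rd!' prints p%40ss%20w0rd%21.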

    Access shell from container

    You can use a web browser to access the shell (/bin/bash) of an HS2 app container. The shell makes it easy to check the status of the container and to perform write tasks such as modifying configuration files or downloading files.

    1. Access shell-hs2-auth-ldap-0 from the Quick links list.
    2. Log in to access the shell.
      • User name: Enter the account name of the user who launched the HS2 app.
      • Password: Enter the account password.
    Caution
    • Anything written in the shell is lost if the HS2 container is restarted on a different node, for example because the node it was running on failed. Written content is not retained permanently, so treat the shell as read-only.
    • For security, only the cluster operator and the owner account that launched the HS2 app can access the shell. For example, user "bar" can't log in to the shell of an HS2 app launched by user "foo".

    Cautions for using Hive

    Hive rules

    Unlike shared HS2, the HS2 app lets you use the SHOW DATABASES; command to list the databases of all users, including other users. However, you can't access another user's database without permission.
    Read the Using shared Hive guide before using the HS2 app, and comply with its rules.

    Database name limitations

    The same database name limitations as Hive apply. However, in the HS2 app the CREATE DATABASE command does not fail even for a name that violates the rules, because the system can't enforce the naming rules there. Make sure you follow the rules when creating databases. A database created in the HS2 app can't be found from shared HS2 if its name violates the database naming rules.

    Note

    A name that violates the rules causes an error in shared HS2. We therefore recommend creating databases in shared HS2 rather than in the HS2 app.

