Using Hive
Available in VPC
Data Forest supports building an independent Apache HiveServer2 (hereinafter referred to as HS2) service environment for each user. A Hive Metastore is required, and you'll use the Hive Metastore provided by Data Forest. Hive is an SQL-based data warehouse solution for analyzing and processing high-volume data stored in a data storage system.
- The HIVESERVER2-LDAP app uses the LDAP method to authenticate users who access HS2. Kerberos authentication is not supported.
- Only the cluster operator and the user who launched the app can log in to the HS2 app. If other users need to log in, the HS2 app owner can grant them login permissions.
Check HIVESERVER2-LDAP app details
When the app creation is completed, you can view the details. When the Status is Stable under the app's details, it means the app is running normally.
The following describes how to check the app details.
- From the NAVER Cloud Platform console, click the Services > Big Data & Analytics > Data Forest > App menus, in that order.
- Select the account that owns the app.
- Click the app whose details you want to view.
- View the app details.
- Quick links
- AppMaster: URL where the container logs can be viewed. When an app is created, it is submitted to the YARN queue, and YARN provides a web UI where each app's details can be viewed.
- supervisor-hs2-auth-ldap-0: supervisor URL that can manage HS2
- shell-hs2-auth-ldap-0: HS2 web shell URL
- webui-hs2-auth-ldap-0: URL that can access HS2 web UI
- Home: You can view the sessions being executed or Hive queries that were executed recently.
- configuration: You can view the configuration of HS2 in the XML format.
- Metrics Dump: You can view the real-time JMX metrics in the JSON format.
- Stack Trace: You can view the stack traces of all active threads.
- LLAP Daemon: You can view the status of Hive LLAP daemons.
- Local logs: You can view logs. Only operators can access.
- Connection String: URL that allows access to the HS2 app
- JDBC connection string (inside-of-cluster) Example: connection string used when connecting to HS2 over JDBC from Beeline, Zeppelin, HUE, or user-defined programs. Use this address when accessing HS2 from the Data Forest internal network.
- JDBC connection string: address used when accessing HS2 from outside the Data Forest network. The inside-of-cluster connection string is not reachable from a user PC. Use this connection string when it is difficult to distinguish between internal and external networks.
- JDBC connection string (inside-of-cluster): Before using the example link, change the password parameter's changeme to the user account's password.
- JDBC connection string Example: Before using the example link, change the password parameter's changeme to the user account's password.
- Component: The value specified by default is the recommended resource. The HIVESERVER2-LDAP-3.1.0 type consists of a single component: hs2-auth-ldap.
- hs2-auth-ldap: component to process LDAP authentication for users
- Quick links
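The password substitution described above can be sketched in a few lines of Python. This is only an illustration: the hostname in the template below is a made-up placeholder, and the password parameter is assumed to be appended to the connection string exactly as shown in the console.

```python
# Sketch: replace the "changeme" placeholder in an example JDBC
# connection string with the real account password.
# The hostname below is a placeholder, not a real endpoint.
conn_template = (
    "jdbc:hive2://hs2-auth-ldap.example.kr.df.naverncp.com:10001/"
    ";transportMode=http;httpPath=cliservice;password=changeme"
)

def fill_password(template: str, password: str) -> str:
    """Replace the changeme placeholder with the user's password."""
    return template.replace("password=changeme", f"password={password}")

print(fill_password(conn_template, "s3cret"))
```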
Example
The following is the HS2 screen after connection.
Connect to the HS2 app with Beeline
The following describes how to access HS2 using the beeline -u {JDBC connection string} -n {username} -p {password} command.
# Authenticate Kerberos
$ curl -s -L -u test01:$PASSWORD -o df.test01.keytab "https://sso.kr.df.naverncp.com/gateway/koya-auth-basic/webhdfs/v1/user/test01/df.test01.keytab?op=OPEN"
$ ls -al
total 20
drwxr-s--- 4 test01 hadoop 138 Dec 16 17:57 .
drwxr-s--- 4 test01 hadoop 74 Dec 16 17:44 ..
-rw-r--r-- 1 test01 hadoop 231 Dec 16 17:36 .bashrc
-rw------- 1 test01 hadoop 302 Dec 16 17:36 container_tokens
-rw-r--r-- 1 test01 hadoop 245 Dec 16 17:57 df.test01.keytab
lrwxrwxrwx 1 test01 hadoop 101 Dec 16 17:36 gotty -> /data1/hadoop/yarn/local/usercache/test01/appcache/application_1607671243914_0024/filecache/10/gotty
-rwx------ 1 test01 hadoop 6634 Dec 16 17:36 launch_container.sh
drwxr-S--- 3 test01 hadoop 19 Dec 16 17:53 .pki
drwxr-s--- 2 test01 hadoop 6 Dec 16 17:36 tmp
$ kinit test01 -kt df.test01.keytab
# Connect to HS2
$ beeline -u "jdbc:hive2://hs2-auth-ldap.new-hiveserver2.example.kr.df.naverncp.com:10001/;transportMode=http;httpPath=cliservice" -n test01 -p '{password}'
If the connection is made successfully, then the following result is displayed.
Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Using Hive in Zeppelin
- Connect to Zeppelin, click the account name at the upper right corner of the page, and then click the Interpreter menu.
- Search for the JDBC interpreter.
- Add the configuration to the hive url properties by referring to the JDBC connection string Example.
If any special character is included in the password set when creating the account, then substitute it with URL encoding when entering it.
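As an example of the URL encoding mentioned above, a password containing special characters can be percent-encoded with Python's standard library (the password below is made up for illustration):

```python
from urllib.parse import quote

# A hypothetical password containing special characters.
password = "p@ss!word"

# quote() with safe="" percent-encodes every reserved character,
# which is what a URL parameter expects.
encoded = quote(password, safe="")
print(encoded)  # p%40ss%21word
```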
Access shell from container
Users can use a web browser to access the shell (/bin/bash) in an HS2 app container. Accessing the shell lets you easily check the container's status and perform write tasks within the shell, such as modifying configuration files or downloading files.
- Access shell-hs2-auth-ldap-0 from the Quick links list.
- Log in to access the shell.
- User name: Enter the account name of the user who launched the HS2 app.
- Password: Enter the account password.
- If the device hosting the HS2 container fails and the container is relaunched on a different node, any content you wrote will be lost. The results of write operations are not retained permanently, so please treat the shell as read-only.
- For security, only the cluster operator and the app owner account who has the HS2 app open can access the shell. For example, user "bar" can't log in to the shell of the HS2 app opened by user "foo".
Cautions for using Hive
Hive rules
Unlike shared HS2, you can use the SHOW DATABASES; command to display the entire DB list, including other users' databases. However, you can't access another user's DB without permission.
Make sure to read and learn the Using shared Hive guide when using the HS2 app, and comply with the rules.
Database name limitations
The same database name limitations as in shared Hive apply. On the HS2 app, however, the CREATE DATABASE command will not fail even for a name that violates the rules: the naming rules can't be enforced by the system in the HS2 app, so please make sure you follow them when creating databases. A DB created in the HS2 app won't be found in shared HS2 if its name violates the database naming rules.
A name that goes against the rules will cause an error in the shared HS2. Therefore, we recommend that you perform database creation jobs in shared HS2, rather than in the HS2 app.
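If you do create databases in the HS2 app, a pre-flight check can help catch invalid names before running CREATE DATABASE. The rule below is an assumption for illustration (unquoted Hive identifiers containing only letters, digits, and underscores, not starting with a digit); consult the Using shared Hive guide for the authoritative naming rules.

```python
import re

# Assumed rule for illustration: letters, digits, and underscores only,
# and the name must not start with a digit. The actual Data Forest
# naming rules are defined in the Using shared Hive guide.
NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def is_valid_db_name(name: str) -> bool:
    """Return True if the name follows the assumed naming rule."""
    return bool(NAME_RE.match(name))

print(is_valid_db_name("test01_sales"))  # True
print(is_valid_db_name("my-db"))         # False (hyphen not allowed)
```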