Available in VPC
This guide describes how to configure Data Catalog as the metastore for Apache Hive.
These settings are available only with the main account of NAVER Cloud Platform.
Preparations
A self-managed environment where Hive is operational must already be set up.
With NAVER Cloud Platform's Cloud Hadoop, you can integrate Data Catalog as the Hive Metastore storage through the settings available when creating a cluster.
1. Apply the Apache Hive patch and install
- Clone Apache Hive.
git clone https://github.com/apache/hive.git
- Download the branch_3.1.patch file and apply the patch to the Apache Hive 3.1 version. After that, proceed with a new build.
- Download link: https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore/blob/branch-3.4.0/branch_3.1.patch
cd <your local hive source path>
git checkout branch-3.1
git apply -3 branch_3.1.patch
mvn clean install -DskipTests
- After building, place hive-exec-3.1.3.jar and hive-common-3.1.3.jar files in Hive's CLASSPATH.
- Hive's CLASSPATH is typically specified as {HIVE_HOME}/lib/.
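The copy step can be sketched as a small shell helper. This is only a sketch under stated assumptions: it assumes the Maven build leaves hive-exec-3.1.3.jar under ql/target/ and hive-common-3.1.3.jar under common/target/ in the Hive source tree (the usual module layout, but check your own build output), and the function name install_patched_jars is hypothetical.

```shell
# Hypothetical helper: copy the patched jars into Hive's lib directory.
# Assumption: hive-exec is built in ql/target/ and hive-common in common/target/.
install_patched_jars() {
    src="$1"    # path to the patched Hive source tree
    home="$2"   # HIVE_HOME of the target Hive installation
    for jar in "ql/target/hive-exec-3.1.3.jar" "common/target/hive-common-3.1.3.jar"; do
        # Stop early if the build did not produce the expected jar.
        if [ ! -f "$src/$jar" ]; then
            echo "missing build output: $src/$jar" >&2
            return 1
        fi
        cp "$src/$jar" "$home/lib/" || return 1
    done
}
```

It could be called as install_patched_jars "<your local hive source path>" "$HIVE_HOME". Back up the existing jars in {HIVE_HOME}/lib/ first if you want to be able to revert.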
2. Download Hive Client for Data Catalog
- Download Hive Client for Data Catalog.
- Place the jar files in Hive's CLASSPATH.
Hive's CLASSPATH is typically specified as {HIVE_HOME}/lib/.
3. Download Object Storage-related libraries
wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.2.4/hadoop-aws-3.2.4.jar
wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.375/aws-java-sdk-bundle-1.11.375.jar
- Place the jar files in Hive's CLASSPATH.
Hive's CLASSPATH is typically specified as {HIVE_HOME}/lib/.
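Before moving on, it can help to confirm that the jars actually landed in the lib directory. The helper below is a minimal sketch; the function name check_lib_jars is hypothetical, and the jar names passed to it are the ones downloaded above.

```shell
# Hypothetical helper: report which expected jars are missing from a lib directory.
# Usage: check_lib_jars <lib-dir> <jar-name>...
check_lib_jars() {
    lib_dir="$1"
    shift
    missing=0
    for jar in "$@"; do
        if [ ! -f "$lib_dir/$jar" ]; then
            echo "missing: $jar"
            missing=1
        fi
    done
    # Returns 0 when every jar is present, 1 otherwise.
    return "$missing"
}
```

For example: check_lib_jars "$HIVE_HOME/lib" hadoop-aws-3.2.4.jar aws-java-sdk-bundle-1.11.375.jar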
4. Change hive-site.xml
To use NAVER Cloud Platform's Data Catalog and Object Storage buckets with Hive, add the following to hive-site.xml:
<configuration>
  <!-- Data Catalog settings -->
  <property>
    <name>hive.metastore.client.factory.class</name>
    <value>com.navercorp.ncp.catalog.metastore.NCPCatalogMetastoreClientFactory</value>
  </property>
  <property>
    <name>hive.metastore.api.endpoint</name>
    <value>https://datacatalog.apigw.ntruss.com</value>
  </property>
  <!-- Object Storage settings -->
  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://kr.objectstorage.ncloud.com</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>{your-access-key}</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>{your-secret-key}</value>
  </property>
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
</configuration>
The hive-site.xml is typically located under {HIVE_HOME}/conf/.
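A quick way to catch a missing property is to search hive-site.xml for each property name listed above. The following is a rough sketch, not a full XML validation: it assumes each <name> element sits on one line with no extra whitespace, and the function name check_hive_site is hypothetical.

```shell
# Hypothetical helper: check that hive-site.xml mentions every property from this guide.
# Plain-text search only; it does not parse XML or validate the values.
check_hive_site() {
    site_file="$1"
    status=0
    for name in \
        hive.metastore.client.factory.class \
        hive.metastore.api.endpoint \
        fs.s3a.endpoint \
        fs.s3a.access.key \
        fs.s3a.secret.key \
        fs.s3a.connection.ssl.enabled \
        fs.s3a.impl
    do
        # -F treats the pattern as a fixed string, so dots are matched literally.
        if ! grep -qF "<name>$name</name>" "$site_file"; then
            echo "property not found: $name"
            status=1
        fi
    done
    return "$status"
}
```

For example: check_hive_site "$HIVE_HOME/conf/hive-site.xml"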
Verify integrations
Run the Hive CLI and check if the commands are working properly.
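One possible smoke test is to run a single statement through the Hive CLI's -e option and check the exit status. This is a sketch under assumptions: SHOW DATABASES is used only as an example statement, and the function name verify_hive and the HIVE_BIN override variable are hypothetical.

```shell
# Hypothetical helper: run one statement through the Hive CLI and report the result.
# HIVE_BIN is an assumed override for the hive executable; it defaults to "hive" on PATH.
verify_hive() {
    hive_bin="${HIVE_BIN:-hive}"
    if "$hive_bin" -e 'SHOW DATABASES;'; then
        echo "Hive CLI check passed"
    else
        echo "Hive CLI check failed" >&2
        return 1
    fi
}
```

If the statement fails, the Hive CLI output and logs are the first place to look for classpath or endpoint errors.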
- Integrations with DBMS, such as MySQL, MSSQL, and PostgreSQL, are not supported.
- Integration with Iceberg tables is not supported either.