- Print
- PDF
Connect JSON file with Hive External Table
- Print
- PDF
Available in VPC
Hive External Table are the tables created with the external keyword when creating tables in Hive. Hive External Table does not store data in the directory specified by the hive.metastore.warehouse.dir
property, but it stores data in the path set when the table is created.
Since tables are created based on data that already exists in Hadoop, you can create them by simply setting the schema.
You can also see examples related to Hive External Tables in the Using Hive guide.
This guide explains how to use JSON data by connecting it to Hive External Table provided by Cloud Hadoop.
Prepare sample data
The following describes how to prepare sample data.
- Convert to hive user.
$ sudo su - hive
- Download the relevant data.
The relevant data have the ndjson (Newline Delimited JSON) data format.
Upload the JSON file to HDFS
The method to upload a HDFS JSON file is as follows:
- Create a directory in HDFS.
$ hdfs dfs -mkdir /user/hive/bookstat
- Upload the JSON file to HDFS.
$ hdfs dfs -put bookstat.json /user/hive/bookstat
- Check the HDFS file list.
$ hdfs dfs -ls -R /user/hive/bookstat
-rw-r--r-- 2 hive hdfs 193 2023-06-28 14:41 /user/hive/bookstat/bookstat.json
Create Hive External Table using JSON format
The following describes how to create Hive External Tables.
Connect to the edge node of the Cloud Hadoop cluster through SSH.
- For more information about how to connect through SSH to cluster nodes, see the Connecting to cluster nodes through SSH guide.
Create an External Table using the Hive client command.
- Create database
hive> create database book;
- Use created database
hive> use book;
- Create table
hive> CREATE EXTERNAL TABLE IF NOT EXISTS bookstat_serde1 ( statTime STRING, bizCode STRING, saleInfo STRUCT < bookName:STRING, saleCount:INT > ) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE LOCATION '/user/hive/bookstat' ;
View Hive External Table using JSON format
- When the table is created, you can check the data by running the
SELECT
query.
- View table
hive> select * from bookstat_serde1;
OK
20130710180059 BOOK {"bookname":"book61","salecount":749}
20230628140059 BOOK {"bookname":"book800","salecount":899}
Delete Hive External Table
The following describes how to delete Hive External Tables.
- Delete the Hive External Table created using the following command:
- An example of deleting the previously created Hive External Table (bookstat_serde1) is as follows:
DROP TABLE bookstat_serde1;
- Run the SELECT query on the deleted table. You can see that the table has been successfully deleted, even though an error has occurred. In addition, you can check if the table is deleted through Show Tables.
- Use SELECT query
SELECT * FROM bookstat_serde1;
Error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'bookstat_serde1' (state=42S02,code=10001)
- Check the table list using SHOW TABLES
SHOW TABLES;