Connect JSON file with Hive External Table
    • PDF

    Connect JSON file with Hive External Table

    • PDF

    Article Summary

    Available in VPC

    Hive External Table are the tables created with the external keyword when creating tables in Hive. Hive External Table does not store data in the directory specified by the hive.metastore.warehouse.dir property, but it stores data in the path set when the table is created.

    Since tables are created based on data that already exists in Hadoop, you can create them by simply setting the schema.

    Note

    You can also see examples related to Hive External Tables in the Using Hive guide.

    This guide explains how to use JSON data by connecting it to Hive External Table provided by Cloud Hadoop.

    Prepare sample data

    The following describes how to prepare sample data.

    1. Convert to hive user.
    $ sudo su - hive
    
    1. Download the relevant data.
    Note

    The relevant data have the ndjson (Newline Delimited JSON) data format.

    Upload the JSON file to HDFS

    The method to upload a HDFS JSON file is as follows:

    1. Create a directory in HDFS.
    $ hdfs dfs -mkdir /user/hive/bookstat
    
    1. Upload the JSON file to HDFS.
    $ hdfs dfs -put bookstat.json /user/hive/bookstat
    
    1. Check the HDFS file list.
    $ hdfs dfs -ls -R /user/hive/bookstat
    -rw-r--r--   2 hive hdfs        193 2023-06-28 14:41 /user/hive/bookstat/bookstat.json
    

    Create Hive External Table using JSON format

    The following describes how to create Hive External Tables.

    1. Connect to the edge node of the Cloud Hadoop cluster through SSH.

    2. Create an External Table using the Hive client command.

      • Create database
      hive> create database book;
      
      • Use created database
      hive> use book;
      
      • Create table
      hive> CREATE EXTERNAL TABLE IF NOT EXISTS bookstat_serde1
      (
          statTime STRING,
          bizCode STRING,
          saleInfo STRUCT <
              bookName:STRING,
              saleCount:INT
          >
      )
      ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
      STORED AS TEXTFILE
      LOCATION '/user/hive/bookstat' ;
      

    View Hive External Table using JSON format

    • When the table is created, you can check the data by running the SELECT query.
    • View table
    hive> select * from bookstat_serde1;
    OK
    20130710180059  BOOK    {"bookname":"book61","salecount":749}
    20230628140059  BOOK    {"bookname":"book800","salecount":899}
    
    

    Delete Hive External Table

    The following describes how to delete Hive External Tables.

    1. Delete the Hive External Table created using the following command:
      • An example of deleting the previously created Hive External Table (bookstat_serde1) is as follows:
    DROP TABLE bookstat_serde1;
    
    1. Run the SELECT query on the deleted table. You can see that the table has been successfully deleted, even though an error has occurred. In addition, you can check if the table is deleted through Show Tables.
    • Use SELECT query
    SELECT * FROM bookstat_serde1;
    
    Error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'bookstat_serde1' (state=42S02,code=10001)
    
    • Check the table list using SHOW TABLES
    SHOW TABLES;
    

    Was this article helpful?

    What's Next
    Changing your password will log you out immediately. Use the new password to log back in.
    First name must have atleast 2 characters. Numbers and special characters are not allowed.
    Last name must have atleast 1 characters. Numbers and special characters are not allowed.
    Enter a valid email
    Enter a valid password
    Your profile has been successfully updated.