Tableau integrations


Available in VPC

Tableau is one of today's most popular BI solutions, letting you visualize your data quickly and easily.
This guide shows how to integrate Tableau with NAVER Cloud Platform's Cloud Hadoop.

For more information on Tableau, visit the official Tableau website.

Preparations

  1. Create a Cloud Hadoop cluster.
  2. Create Object Storage.
  3. Create a Windows server.
Note

We recommend that you create the Cloud Hadoop and Windows servers in the same VPC.

  4. Set up an ACG.
    • Enter the Windows server IP in the Cloud Hadoop ACG access source and add port 8286 to the allowed ports.
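After the ACG rule is in place, you may want to confirm that the Windows server can actually reach the cluster on port 8286 before moving on. A minimal sketch using only the standard library (the host IP is a placeholder you must replace with your own node's address):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Replace the placeholder with your Cloud Hadoop node IP before running:
# print(port_open("<CLOUD-HADOOP-NODE-IP>", 8286))
```

If this returns False, recheck the ACG access source and allowed-port settings above.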

Create a table in Hive

  1. Upload a sample data file to Object Storage.

    • Download the sample data from here, unzip it, and upload the AllstarFull.csv file to Object Storage > Bucket Management.
  2. Create a table in the Hive Editor.

DROP TABLE IF EXISTS allstarfull;

CREATE EXTERNAL TABLE IF NOT EXISTS `allstarfull` (
        `playerID` VARCHAR(20),
        `yearID` INT,
        `gameNum` INT,
        `gameID` VARCHAR(30),
        `teamID` VARCHAR(4),
        `lgID` VARCHAR(4),
        `GP` INT,
        `startingPos` INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION 's3a://deepdrive-hue/input/lahman2012/allstarfull';
  3. Use a simple query to verify that the table was created correctly.
SELECT * FROM allstarfull;
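Since the table reads comma-delimited text directly from the bucket, each row of AllstarFull.csv should split into exactly the eight columns declared above. A quick local sanity check of the file format (the sample row here is illustrative, not taken from the actual dataset):

```python
import csv
import io

# Column names matching the Hive table definition above.
COLUMNS = ["playerID", "yearID", "gameNum", "gameID",
           "teamID", "lgID", "GP", "startingPos"]

def parse_rows(text: str):
    """Parse comma-delimited rows into dicts keyed by the Hive column names."""
    return [dict(zip(COLUMNS, row)) for row in csv.reader(io.StringIO(text))]

# Illustrative row in the same shape as AllstarFull.csv (not real data).
sample = "doe01,1955,0,ALS195507120,NYA,AL,1,1"
rows = parse_rows(sample)
```

Rows with a different column count would show up as NULLs or shifted values in the `SELECT` above, so this kind of check can save a round trip to the cluster.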

Add the Presto connector

  1. Add the connector in Presto > [CONFIGS] > Advanced trino.connectors.properties.
    • The Hive connector is required, so enter the following in connectors.to.add:
{"hive":["connector.name=hive-hadoop2",
        "hive.metastore.uri=thrift://<METASTORE-HOST-IP>:9083",
        "hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml",
        "hive.s3.use-instance-credentials=false",
        "hive.s3.aws-access-key=<API-ACCESS-KEY>",
        "hive.s3.aws-secret-key=<API-SECRET-KEY>",
        "hive.s3.endpoint=https://kr.object.private.ncloudstorage.com"]
        }


Note

<METASTORE-HOST-IP> is the private IP address of the master node (m-001). You can check it in the Ambari UI > Hosts menu.
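The value entered in connectors.to.add is plain JSON, so a quick way to catch quoting or comma mistakes before restarting the cluster is to parse it locally. A minimal sketch, with the placeholders left in exactly as above:

```python
import json

# Connector entry as entered in connectors.to.add; <...> values are placeholders.
connector = '''
{"hive":["connector.name=hive-hadoop2",
        "hive.metastore.uri=thrift://<METASTORE-HOST-IP>:9083",
        "hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml",
        "hive.s3.use-instance-credentials=false",
        "hive.s3.aws-access-key=<API-ACCESS-KEY>",
        "hive.s3.aws-secret-key=<API-SECRET-KEY>",
        "hive.s3.endpoint=https://kr.object.private.ncloudstorage.com"]
}
'''

parsed = json.loads(connector)  # raises ValueError if the JSON is malformed
# Each list entry is a key=value property line for the Hive connector.
properties = dict(p.split("=", 1) for p in parsed["hive"])
```

A malformed entry here would otherwise only surface as a Presto startup failure after the restart in the next step.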

  2. A restart is required for the changed configuration to take effect. Click [ACTIONS] > Restart All in the upper right corner and click [CONFIRM RESTART ALL] in the popup window.
Note

For more information about using Presto to analyze data stored in a Hive data warehouse, see the Analyzing Hive warehouse data with Presto guide.

Install Tableau

The following steps are all performed on the Windows server.

  1. Download Tableau Desktop from the Tableau website.
  2. Download the Presto JDBC driver (presto-jdbc-0.268.jar) from the Presto website.
  3. Move the Presto JDBC driver to the Drivers directory in the path where Tableau is installed.
Note

Tableau can connect to a wide variety of data sources beyond Presto, from spreadsheets to databases and more. See the Tableau Desktop and web authoring help for supported connectors.

Install the nginx-ssl.crt certificate

  1. Import the /etc/nginx/ssl/nginx-ssl.crt certificate from the Cloud Hadoop Edge node to your Windows server.

  2. Double-click the nginx-ssl.crt certificate and click [Install certificate].

  3. Select Local Computer and click [Next].

  4. Select Place all certificates in the following store and choose a certificate store:

    • Set the certificate store to Trusted Root Certification Authorities.
  5. Click [Finish] to complete the certificate installation.

Access Presto from Tableau Desktop

  1. Launch Tableau Desktop and select Connect to server > More > Presto.
  2. Enter your access information and click [Log in].
Access information 
 - Server: domain of the Presto cluster
 - Port: port of the Presto coordinator (8286)
 - Catalog: name of the catalog to use
 - Schema: name of the schema to use
 - Username: name of the user to use
 - SSL required: check


Check tables and records

  1. Click Include and search to check the tables in the schema.

  2. Click [Update now] to load the data into the tables.

  3. You can create a new sheet and visualize the data simply by dragging and dropping fields.

Note

For more information on using Tableau, visit the official Tableau website.