Using Trino


Available in VPC

Trino is a distributed SQL query engine for big data analysis. Because PrestoSQL was rebranded as Trino, the Presto app that was based on PrestoSQL has been renamed to the Trino app. As with the Presto app provided in Data Forest, the Trino app lets each user configure an individual Trino server environment.

Note
  • For more information on Trino, see TRINO.
  • Data Forest provides Trino version 437. Because log4j was removed from the ES connector in Trino 366 and later, this version is not affected by the log4j vulnerability (CVE-2021-44228). For more information, see the official documentation.
  • The Trino app provides the Trino server, Trino CLI, and Supervisor features.
  • A feature for setting custom configurations in the Trino app is currently in preparation.

Check Trino app details

Once the app is created, you can view its details. If the Status in the app details is Stable, the app is running normally.

To view app details:

  1. In the VPC environment on the NAVER Cloud Platform console, click the menu icon and navigate to Services > Big Data & Analytics > Data Forest.
  2. Click Data Forest > Apps on the left.
  3. Select the account that owns the app.
  4. Click the app to view its details.
  5. Review the app details.
    • Quick links
      • Trino Coordinator: Provides access to the Trino Coordinator Web UI. Log in with the user account name and password.
      • Trino Cli Web: Allows use of the web-based CLI through a browser. Log in with the user account name and password.
      • Supervisor: Provides the supervisor for monitoring and managing the container's app processes. Enables process management through the Supervisor Web UI.
    • Component: The default values are the recommended resources.
      • coordinator: Component that acts as a coordinator.
      • worker: Component that acts as a worker.

Example:

  • Trino Coordinator access interface
  • Trino Cli Web access interface
  • Supervisor access interface

Check Catalog and Schema

The Trino server created in this app has the Hive Connector configured by default, allowing access to the Hive Warehouse.

You can verify the existence of the hive and system catalogs as shown below. The system catalog provides cluster information and metrics.

Password:
trino> show catalogs;
Catalog
---------
hive
system
(2 rows)

Query 20220208_050852_00000_4xb62, FINISHED, 3 nodes
Splits: 53 total, 53 done (100.00%)
1.23 [0 rows, 0B] [0 rows/s, 0B/s]

You can view the tables under the system catalog, as well as information about the nodes that make up the Trino cluster.

trino> SHOW SCHEMAS FROM system;
Schema
--------------------
information_schema
jdbc
metadata
runtime
(4 rows)

Query 20220208_051532_00001_4xb62, FINISHED, 4 nodes
Splits: 53 total, 53 done (100.00%)
0.28 [4 rows, 57B] [14 rows/s, 201B/s]

trino> SHOW TABLES FROM system.runtime;
Table
----------------------
nodes
optimizer_rule_stats
queries
tasks
transactions
(5 rows)

Query 20220208_051549_00002_4xb62, FINISHED, 4 nodes
Splits: 53 total, 53 done (100.00%)
0.32 [5 rows, 134B] [15 rows/s, 425B/s]

trino> SELECT * FROM system.runtime.nodes;
                              node_id                              |          http_uri          | node_version | coordinator | state
-------------------------------------------------------------------+----------------------------+--------------+-------------+--------
 test-01-worker-0-container_e814_1643186470613_0056_01_000003      | http://10.250.31.228:10292 | 367          | false       | active
 test-01-coordinator-0-container_e814_1643186470613_0056_01_000002 | http://10.250.31.224:10302 | 367          | true        | active
 test-01-worker-1-container_e814_1643186470613_0056_01_000004      | http://10.250.31.225:10292 | 367          | false       | active
 test-01-worker-2-container_e814_1643186470613_0056_01_000005      | http://10.250.31.227:10292 | 367          | false       | active
(4 rows)

Query 20220208_051555_00003_4xb62, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
0.23 [4 rows, 389B] [17 rows/s, 1.65KB/s]

trino>
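
If you export the rows of system.runtime.nodes to a script, a quick check of the cluster composition might look like the following minimal sketch. The Node type and the summarize function are hypothetical names introduced for illustration. The rows are copied from the sample output above rather than fetched from a live cluster.

```python
from typing import Dict, List, NamedTuple


class Node(NamedTuple):
    """One row of system.runtime.nodes (column names taken from the output above)."""
    node_id: str
    http_uri: str
    node_version: str
    coordinator: bool
    state: str


def summarize(nodes: List[Node]) -> Dict[str, int]:
    """Count coordinators and active workers in a list of node rows."""
    coordinators = [n for n in nodes if n.coordinator]
    active_workers = [n for n in nodes if not n.coordinator and n.state == "active"]
    return {"coordinators": len(coordinators), "active_workers": len(active_workers)}


# Rows copied from the sample query result; not read from a real cluster.
SAMPLE_NODES = [
    Node("test-01-worker-0-container_e814_1643186470613_0056_01_000003",
         "http://10.250.31.228:10292", "367", False, "active"),
    Node("test-01-coordinator-0-container_e814_1643186470613_0056_01_000002",
         "http://10.250.31.224:10302", "367", True, "active"),
    Node("test-01-worker-1-container_e814_1643186470613_0056_01_000004",
         "http://10.250.31.225:10292", "367", False, "active"),
    Node("test-01-worker-2-container_e814_1643186470613_0056_01_000005",
         "http://10.250.31.227:10292", "367", False, "active"),
]

print(summarize(SAMPLE_NODES))  # {'coordinators': 1, 'active_workers': 3}
```

For the sample cluster this reports one coordinator and three active workers, which matches the four rows returned by the query.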

You can also view the schemas in the hive catalog. With Trino, you can run queries on data registered in the Hive Metastore.

trino> SHOW SCHEMAS FROM hive;
Schema
--------------------------
default
df_test__db_foo
information_schema
(3 rows)

Query 20220208_054001_00004_4xb62, FINISHED, 4 nodes
Splits: 53 total, 53 done (100.00%)
0.92 [7 rows, 141B] [7 rows/s, 153B/s]

Change the number of workers

You can change the number of workers while using the app.

To change the number of workers:

  1. In the VPC environment on the NAVER Cloud Platform console, click the menu icon and navigate to Services > Big Data & Analytics > Data Forest.
  2. Click Data Forest > Apps on the left.
  3. Select your account, select the app, and click [Flex].
  4. When the Flex window appears, edit the number of workers, and click [Edit].
Note

When you reduce the number of workers using the Flex feature, workers are stopped starting with the highest {{COMPONENT_ID}}. For example, if there are 5 workers, they are removed in descending order: worker-4 first, then worker-3, then worker-2.
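
The removal order described in the note can be illustrated with a short sketch. It assumes that worker IDs are numbered from 0 and named worker-N, as in the sample node listing earlier in this guide. The function name workers_to_stop is hypothetical and is not part of Data Forest.

```python
from typing import List


def workers_to_stop(current: int, target: int) -> List[str]:
    """Return the workers stopped when scaling from `current` to `target` workers.

    Assumes workers are named worker-0 .. worker-(current - 1) and that the
    highest-numbered workers are stopped first.
    """
    if target < 0 or target > current:
        raise ValueError("target must be between 0 and the current worker count")
    # Walk from the highest index down to the first index that is no longer kept.
    return [f"worker-{i}" for i in range(current - 1, target - 1, -1)]


print(workers_to_stop(5, 2))  # ['worker-4', 'worker-3', 'worker-2']
```

Reducing 5 workers to 2 stops worker-4, worker-3, and worker-2 in that order, leaving worker-0 and worker-1 running.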