Available in VPC
A table is a metadata definition that contains the details and schema of the data. You can create a table through a scanner or by defining your own schema. In the Table menu, you can create and manage tables and view the collected metadata.
Table list interface
A basic description of the Table menu for using Data Catalog is as follows:

| Component | Description |
|---|---|
| ① Menu name | Current menu name and the number of tables being viewed. |
| ② Basic features | Features displayed when you enter the Table menu for the first time. |
| ③ Search bar | You can search by database name, table name, location, table type, data format, and tag, and you can also sort by order. |
| ④ Table list | Displays the list of tables being viewed. |
| ⑤ Table name | Click to go to the Table details interface. |
| ⑥ Database name | Click to go to the Database details interface. |
| ⑦ Location | Click to go to the location of the corresponding file in Object Storage. |
Table details interface
A basic description of the Table details interface is as follows:

| Component | Description |
|---|---|
| ① Table name | Name of the selected table. |
| ② Basic information component | Displays the name of the database where the table belongs, table description, table location, update date and time, name of the created scanner, creation date and time, table type, and data format information. |
| ③ Details tab component | Consists of the table's schema, schema version, partition, tag, property information, and analytics tabs; you can view details for each item. See Search for tables and view information. |
| ④ Delete button | Delete a table. |
| ⑤ Edit basic information button | Edit the basic information of the table. |
| ⑥ Edit schema button | Edit the schema information. You can edit the tag information on the Tag tab. |
| ⑦ View data button | Go to the Data Query service and view the data in the table. |
Create table
You can create a table in either of the following ways:
- Create table with manual schema definition: Create tables by setting up your own database and schema.
- Create table through scanner: Automatically define the schema through the scanner to create a table.
Create table with manual schema definition
You can create tables by setting up your own database and schema.
To create a table with manual schema definition:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Catalog.
- Click the Table menu.
- Click [Create table].
- Click Create table with manual schema definition, and then click [Next].
- Enter basic information.
- Database: Click the dropdown menu to select a database to connect to the table.
- Click [Create database] to create a new database (see Create database).
- Table name: Enter a table name.
- Location: Enter the Object Storage location where the table's data exists.
- Description: Enter a table description.
- Table type
- Catalog Default: The default Hive table type provided by Data Catalog.
- Apache Iceberg: An open table format for large analytic datasets. It supports ACID transactions, schema evolution, and time travel queries, and enables safe, concurrent work in Spark, Trino, and Hive.
- Select a data format.
- If you select the Apache Iceberg table type, you do not select a data format.
- When you select CSV, you can select or enter the delimiter, data recognition symbol, and character to delete. Also, you can enter the number of header lines to exclude.
- When you select XML, you can enter Row Tag.
- Click [Add] and enter the schema information to add a user-defined schema.
- For more information about data types, see Schema data type.
- To delete an added schema, select its check box and click [Delete].
- If you do not add a user-defined schema, a schema with the field name "default" is added automatically.
- Spaces are allowed in field names.
- If you need to enter a partition key, click the Partition component and add one.
- After clicking [Add], enter the partition key name in the input field to add the partition key.
- To delete a partition key, select its check box and click [Delete].
- Spaces are allowed in partition key names.
- You do not need to enter a partition for the Apache Iceberg table type.
- If a tag is necessary, click the Set tag component to add tags.
- After clicking [Add], enter the tag information in the input field to add the tag.
- For more information about data types, see Tag data types.
- To delete a tag, select its check box and click [Delete].
- Click [Load tag template] to display the popup for loading tag templates.
- Select a tag template, and then click [Add] to add the tags of that template.
- For more information about tag templates, see Tag template.
- Click [Create].
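Once created, the table can be queried from the Data Query service via the [View data] button. As a minimal sketch (the database name demo_db and table name sales are hypothetical):

```sql
-- Hypothetical example: preview the data of a newly created table from Data Query
SELECT *
FROM demo_db.sales
LIMIT 10;
```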
Schema data type
The data types in the schema that can be defined manually and a description of each type are as follows:
| Data type | Description | Catalog Default support | Apache Iceberg support |
|---|---|---|---|
| tinyint | Integer data (1 byte). | Y | N |
| smallint | Integer data (2 bytes). | Y | N |
| int | Integer data (4 bytes). | Y | Y |
| bigint | Integer data (8 bytes). | Y | N |
| long | Integer data (8 bytes). | N | Y |
| float | Floating decimal data (4 bytes). | Y | Y |
| double | Floating decimal data (8 bytes). | Y | Y |
| decimal | Fixed decimal data. | Y | Y |
| string | String data. | Y | Y |
| char | Fixed-length character type data. | Y | N |
| varchar | Variable-length character type data. | Y | N |
| boolean | Data with true or false values. | Y | Y |
| binary | Binary data in char format. | Y | Y |
| timestamp | Date and time representation data (timestamp). | Y | Y |
| time | Time representation data. | N | Y |
| datetime | Date and time representation data (YYYY-MM-DD HH:MM:SS). | Y | N |
| date | Date representation data (YYYY-MM-DD). | Y | Y |
| fixed | Fixed-length byte array. | N | Y |
| uuid | Uniqueness-guaranteed ID (Universally Unique IDentifier). | N | Y |
| list | Collection of data of the same type. | N | Y |
| array | Collection of data of the same type. | Y | N |
| map | Data made of pairs of key and value. | Y | Y |
| struct | Data including various types of data and the related schema. | Y | Y |
| uniontype | Type for storing various structured data types. | Y | N |
Examples of entering detailed settings for each data type are as follows:
- Example: Detailed settings of array type: ARRAY<STRUCT<place: STRING, start_year: INT>>
- Example: Detailed settings of map type: MAP<STRING, ARRAY<STRING>>
- Example: Detailed settings of struct type: STRUCT<place: STRING, start_year: INT>
- Example: Detailed settings of uniontype type: UNIONTYPE<INT, DOUBLE, ARRAY<STRING>, STRUCT<a:INT, b:STRING>>
- Example: Detailed settings of list type: LIST<STRUCT<place: STRING, start_year: INT>>
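The complex types above follow Hive-style DDL syntax. For illustration only, a sketch of a table definition combining several of them (all names are hypothetical; in Data Catalog you enter these types through the console form rather than by running DDL):

```sql
-- Hypothetical Hive-style DDL illustrating the complex type syntax above
CREATE EXTERNAL TABLE demo_db.events (
  id         BIGINT,
  name       STRING,
  price      DECIMAL(10,2),
  tags       ARRAY<STRING>,
  venue      STRUCT<place: STRING, start_year: INT>,
  attributes MAP<STRING, STRING>
)
PARTITIONED BY (event_date STRING)
STORED AS PARQUET;
```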
- When you create an Iceberg table from the Data Catalog console, the table is stored in Object Storage with the field types you selected. However, the console displays the information stored in the Metastore, so data types not supported by Hive are displayed as converted types.
- Converted types: list -> array, long -> bigint, time -> string, fixed -> binary, and uuid -> string.
- Partition update
- When you create a table with a partition key, the partition information does not exist yet, so you must perform a partition update task. You can perform the task as follows:
- Data Query: Run the call data_catalog.system.sync_partition_metadata('{database name}', '{table name}', 'ADD') syntax.
- Cloud Hadoop Hive: Run msck repair table {table name}.
- This feature is coming soon to Data Catalog, with direct use planned for the second half of 2025.
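For instance, with a hypothetical database demo_db and a table events created with a partition key, the two update tasks above would look as follows:

```sql
-- Data Query: register partitions newly added under the table location
call data_catalog.system.sync_partition_metadata('demo_db', 'events', 'ADD');

-- Cloud Hadoop Hive: rescan the table location and repair partition metadata
msck repair table events;
```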
Create table through scanner
To create a table by automatically defining the schema through the scanner:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Catalog.
- Click the Table menu.
- Click [Create table].
- Click Create table through scanner, and then click [Next].
- Go to the Scanner creation interface.
- Tables are created automatically when you create and run the scanner.
- The table name is automatically set based on the name of the source data.
- For more information about how to create and run a scanner, see Scanner.
- Data files are supported only in the UTF-8 encoding format.
- If you use other encoding formats, data scanning and querying may not work properly.
Search for tables and view information
To search for the created tables and view the information:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Catalog.
- Click the Table menu.
- Enter the search conditions you want, and then click the search button to search for the table.
- Click the table to view the information.
- Database: Name of the database where the table belongs.
- Table: Table name.
- Location: The Object Storage location where the data of the table exists.
- Table type: Type of the table (Catalog Default and Apache Iceberg).
- Data format: Format of the scanned data (CSV, XML, JSON, Parquet, ORC, AVRO, MySQL, MongoDB, MSSQL, and PostgreSQL).
- Creation date and time: The date and time when a table was first created.
- Update date and time: The most recent date and time when you edited a table's information.
- [Schema]: View the schema registered to the table.
- For more information about Data type, see Schema data type.
- You can edit the schema by clicking the Edit button. You cannot edit the schema of view tables or Apache Iceberg tables.
- [Schema version]: Click to view the schema version list, and then click a version to view the schema of that version.
- [Partition]: Click to check the partition key and value registered to the table.
- Partition update feature
- The partition value update feature is provided only for tables whose data format is CSV, XML, JSON, Parquet, AVRO, or ORC.
- Update is available only for the Hive partition type; it is not available for the Directory partition type. Also, if you run All synchronization or Delete-only synchronization, partition values may be deleted.
- Only the partition value is updated, and the partition key is not added. If you want to add a partition key, you must scan the table again.
- Synchronization options: All synchronization (updates all added and deleted partition values), Add-only synchronization (updates added partition values only), and Delete-only synchronization (updates deleted partition values only).
- [Tag]: Click to view the tags registered to the table.
- Click the Settings button to add or delete tags.
- [Property information]: Click to view the property information about the table and source data.
- For more information about Property keys, see Property information.
- [Analytics]: Click to view the analytics information in the field/partition unit.
- If you subscribe to Data Catalog, you can run and view the analytics feature and extract analytics data, such as minimum value, maximum value, average, and so on, in field units.
- Supported data types: Parquet, AVRO, ORC, CSV, and JSON.
- The data you can view is from the most recent successful run.
- If you extract all analytics and then extract analytics for specific columns, those columns are updated, but the other columns keep the values from the previous run.
Caution
- Unique value estimates the approximate number of data points within an average error range of 5%.
- For CSV files, you cannot estimate the number of null values or true/false entries.
- [Optimization]: Perform the Iceberg file optimization feature (displayed only for the Iceberg table format).
- Merge files: A feature that combines data files that have been split into multiple files, for more efficient file management and improved performance. Only files smaller than the merge threshold are selected for merging. (The default merge threshold is 100 MB; the unit is MB.)
Caution
- If the merge threshold is too large, files are merged every time the merge operation runs, causing snapshots to be created even when no file changes have occurred.
- It is recommended to set the merge threshold to an appropriate value in proportion to the data file size.
- Manage snapshots: Delete snapshots you don't need to use. Snapshots within the maximum retention period are retained, while those beyond the retention period are deleted. (The minimum value for the maximum retention period is 7 days.)
- Manage orphan files: Organize unused files, such as merged files or incorrectly written files. Orphan files within the maximum retention period are retained, while those beyond the retention period are deleted. (The minimum value for the maximum retention period is 7 days.)
Property information
If you click the [Property information] tab in the table details component, you can view the property information of the table and source data. The information items and description for each item are as follows:
| Property key | Description |
|---|---|
| EXTERNAL | Whether the table is an external table (external storage) |
| clusterNo | Cluster number of the scanned Cloud database product |
| connectionId | Scanner connection ID that created a table |
| connectionName | Connection name used to scan the data |
| created_time | Unix time display of the table creation date and time |
| dataFormat | Format of the data source |
| dataType | Type of the data source |
| delimiter | Delimiter if the source data is a CSV file |
| inputFormat | Format for reading files from Object Storage |
| isDirectory | TRUE if the scan target is a directory |
| last_modified_time | Unix time display of the table update date and time |
| numFiles | Total number of files scanned when the scan target is a directory |
| objectstorageContentLength | Sum of ContentLength for files within the scanned Object Storage directory |
| objectstorageContentType | Common ContentType for the scanned Object Storage directory |
| objectstorageLastModified | Edited time of the most recently edited file in the scanned Object Storage directory |
| outputFormat | Format for writing files to Object Storage |
| rowTag | XML tag defining a row |
| scannerId | Scanner ID that created a table |
| scannerName | Scanner name that created a table |
| serializationLib | Serializer and Deserializer Library |
| serde.separatorChar | Delimiter to determine a schema of the data |
| serde.quoteChar | Symbol to recognize a string as data |
| serde.escapeChar | Character to delete the character included in the string value recognized as data |
| skip.header.line.count | Number of header lines to exclude |
| totalSize | Total amount of data scanned when the scan target is a directory |
| transient_lastDdlTime | Unix time display of the table DDL's last change date and time |
| mysqlCollation | String sort settings of a MySQL table |
| mysqlDataSize | Data size of a MySQL table |
| mysqlIndexSize | Index size of a MySQL table |
| mysqlIndexes | Number of indexes of a MySQL table |
| mysqlRows | Number of saved rows (records) of a MySQL table |
| mysqlTableSize | Total size of a MySQL table |
| mssqlCollation | String sort settings of a MSSQL table |
| mssqlDataSize | Data size of a MSSQL table |
| mssqlIndexSize | Index size of a MSSQL table |
| mssqlIndexes | Number of indexes of a MSSQL table |
| mssqlRows | Number of saved rows (records) of a MSSQL table |
| mssqlTableSize | Total size of a MSSQL table |
| postgresqlCollation | String sort settings of a PostgreSQL table |
| postgresqlDataSize | Data size of a PostgreSQL table |
| postgresqlIndexSize | Index size of a PostgreSQL table |
| postgresqlIndexes | Number of indexes of a PostgreSQL table |
| postgresqlRows | Number of saved rows (records) of a PostgreSQL table |
| postgresqlTableSize | Total size of a PostgreSQL table |
| mongodbAvgObjSize | Average document size of a MongoDB collection |
| mongodbFreeStorageSize | Size of available storage space in a MongoDB database |
| mongodbIndexSize | Index size of a MongoDB collection |
| mongodbIndexes | Number of indexes of a MongoDB collection |
| mongodbRowCount | Number of saved documents (records) of a MongoDB collection |
| mongodbSize | Size of a MongoDB database |
| mongodbStorageSize | Storage size of a MongoDB database |
| mongodbTotalSize | Total size of a MongoDB database |
| compressionType | Zipped file extension when the scanned file is a zipped file |
| metadata_location | Directory of the metadata file added when using an Iceberg table |
For more information about other property information of an Iceberg table beyond the list above, see the Iceberg document.
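Properties such as created_time, last_modified_time, and transient_lastDdlTime are shown as Unix time. Assuming second-precision values and a Trino-compatible Data Query engine, a hedged conversion sketch:

```sql
-- Hypothetical: convert a Unix-time property value (in seconds) to a timestamp
SELECT from_unixtime(1700000000);
```

The result depends on the session time zone; in a UTC session, 1700000000 corresponds to 2023-11-14 22:13:20.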
Edit table
To edit the information of the created table or to select the schema version:
The table name and the database that contains the table cannot be edited.
If the table type is Apache Iceberg or the table is a view, you cannot edit the schema.
Edit basic information
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Catalog.
- Click the Table menu.
- Click the table name to go to the Table details interface.
- Click [Edit] on the basic information component.
- Edit the information of the table in the Basic information edit popup.
- You can edit the location of the source data, table description, and source data format.
- Click [Save].
Edit schema information
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Catalog.
- Click the Table menu.
- Click the table name to go to the Table details interface.
- Click [Edit] on the schema tab.
- When the Edit schema interface appears, you can edit the field name, data type, and description. You can also edit them manually using the Edit JSON button.
- Click the Version dropdown menu in the Schema component to select the schema version you want to edit.
- When you edit the JSON, each entry must contain the name, type, typeValue, and description items, as follows:
  [ { "name": "col_name", "type": "decimal", "typeValue": "(10,2)", "description": "catalog decimal" } ]
- Click [Save].
Edit property information
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Catalog.
- Click the Table menu.
- Click the table name to go to the Table details interface.
- Click [Edit] on the property information tab.
- You can edit inputFormat, outputFormat, and serializationLib to use in the table.
- Note that if you edit it to a library that is not compatible with the data format, you will not be able to run queries in Data Query, Hive, or Spark.
- Click [Save].
Delete table
To delete the created table:
- When you click Delete, all related meta information, such as the table's version information, tags, and properties, is deleted.
- If the property information does not contain EXTERNAL=true (that is, if it is a managed table), the actual data in Object Storage may be deleted.
- You cannot recover deleted tables and data.
If you delete an Iceberg type table, the actual Object Storage data is not deleted.
- In the VPC environment of the NAVER Cloud Platform console, navigate to Services > Big Data & Analytics > Data Catalog.
- Click the Table menu.
- Click the name of the table you want to delete to go to the Table details interface.
- Click [Delete].
- When the notification popup appears, read the cautions and click [Delete].