Available in VPC
Describes the Job menu interface configuration, action editor interface configuration, action creation procedure, and settings procedure of action run options.
Action is a data processing task that extracts, converts, and loads vast amounts of data.
Data Flow supports the following data conversions: define property, select property, merge columns, filter, merge rows, aggregation, edit property name, remove duplicates, fill in empty values, and SQL query.
The source node and target node can specify Object Storage and Data Catalog from NAVER Cloud Platform. We plan to support the integration of NAVER Cloud Platform's Cloud DB and the on-premise database of the customers in the future.
The action editor is a GUI that lets you configure an ETL task without code. An action is composed of source nodes, convert nodes, and target nodes shown in a diagram.
Job interface
The Job interface includes the following components:

| Component | Description |
|---|---|
| ① Menu name | Current menu name. |
| ② Basic features | Features displayed when initially entering the Job menu. |
| ③ Post-creation features | Features provided after creating an action. |
| ④ Job list | Created Job list. Click [View details] for each action to go to the Job editor interface. |
| ⑤ Search bar | Search created actions based on action name. |
View action information
To view the created action information:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Menu > Services > Big Data & Analytics > Data Flow.
- Click Job menu.
- When the action list appears, check the summary.
- Action name: The unique name you entered when creating the Job.
- Last run time: Date and time of the most recent Job run, whether it was an on-demand run or a reserved run started by a trigger.
- Last run status: Status of the most recent Job run.
- READY: Ready to run Job.
- RUNNING: Job is currently running.
- COMPLETED: Successfully ran Job.
- FAILED: Failed to run Job.
- Status: Current Job status.
- RUNNABLE: Job in runnable status.
- RUNNING: Job is currently running.
- DELETED: Job is being deleted or has been deleted.
- DRAFT: Job editing is incomplete. Click [Temporary save] on the editor interface to save temporarily.
- EDITING: Job is being edited (validation required).
- STOPPED: Job is being stopped.
- Update date and time: Date and time when the action components were last edited in the action editor.
- [Details]: View Job details.
- Click [View details] to view the details on the action configuration.
- Move to Action editor interface configuration to check the settings criteria and node configuration of the action.
Create action
You can configure an action in either Visual Mode or Script Mode.
In Visual Mode, you can configure an action by adding and configuring source nodes, convert nodes, and target nodes.
In Script Mode, you can write and run PySpark code directly.
To specify the source node and target node, Data Catalog and Object Storage must be subscribed to. If you are not subscribed to Data Catalog and Object Storage, subscribe to the services first.
Create an action in Visual Mode
To create a new Visual Mode action:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Menu > Services > Big Data & Analytics > Data Flow.
- Click Job menu.
- Click [Create actions].
- Select Visual Mode and click [Create actions].
- When the action editor interface appears, add the source node, convert node, and target node in the [Action configuration] tab to configure the action description.
- For more information about the editor interface configuration, see Action editor interface configuration.
- Click [Source] on the action editor interface and select Object Storage, Data Catalog, or JDBC from the menu that appears.
- Object Storage: Use a NAVER Cloud Platform Object Storage bucket as the data source.
- Data Catalog: Use a NAVER Cloud Platform Data Catalog table as the data source.
- JDBC: Use a JDBC connection registered in NAVER Cloud Platform’s Data Catalog as the data source.
- The availability of detailed types for tables stored in Data Catalog is as follows:
- Among the tables located in Object Storage, tables in Parquet, JSON, and CSV data formats are supported.
- Tables in MySQL, PostgreSQL, and MongoDB data formats scanned through a JDBC connection are supported.
- Tables scanned through a Cloud_db_for connection are not supported.
- The Iceberg table type is not supported.
- Select the source node added in the previous step, and enter the Properties information and Detailed settings of the source node on the right side of the interface.
- For more information about the items to enter, see Source node configuration.
- The number of source nodes you can add depends on the convert node. For more information, see Configure convert node.
- Click [Convert] on the action editor interface and select the convert action from the menu that appears.
- Define property: Define the schema of the target data using the source data. For more information on the settings item, see Define property.
- Select property: Select the target data configuration property from the property key of source data set. For more information on the settings item, see Select property.
- Merge columns: Merge the columns of 2 sets of data. For more information on the settings item, see Merge columns.
- Filter: Filter the entered data set and create a new data set. For more information on the settings item, see Filter.
- Merge rows: Merge 2 or more rows of data sets with identical schema. For more information on the settings item, see Merge rows.
- Aggregation: Calculate the average, total, maximum, and minimum of the selected field and row and create a new field with the result value. For more information on the settings item, see Aggregation.
- Edit property name: Edit the name of a specific property key in the data. For more information on the settings item, see Rename property.
- Remove duplicates: Remove duplicated rows from the data source. For more information on the settings item, see Remove duplicates.
- Fill in empty values: Fill in missing column values in the data with a specified value. For more information on the settings item, see Fill in empty values.
- SQL query: Use a SQL SELECT statement to select the columns to convert. For more information on the settings item, see SQL query.
- Select the convert node added in the previous step and enter Properties information and Detailed settings on the right side of the interface.
- For more information about the item to enter, see Convert node configuration.
- The number of added convert nodes is 1 per action.
- Click [Target] on the action editor interface and select Object Storage, Data Catalog, or JDBC from the menu that appears.
- Object Storage: Use a NAVER Cloud Platform Object Storage bucket as the data storage.
- Data Catalog: Use NAVER Cloud Platform Data Catalog as the data storage.
- JDBC: Use a JDBC connection registered in NAVER Cloud Platform's Data Catalog as the data storage.
- Select the target node added in the previous step and enter Properties information on the right side of the interface.
- For more information about the item to enter, see Target node configuration.
- Check the set schema from the Preview column.
- Click [Complete] on the action editor interface.
- When the action is created, the interface returns to the action list.
- The action created above is added to the action list.
- The created action is registered as NAVER Cloud Platform resource. For more information, see Resource Manager concepts.
Create an action in Script Mode
To create a new Script Mode action:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Menu > Services > Big Data & Analytics > Data Flow.
- Click Job menu.
- Click [Create actions].
- Select Script Mode and click [Create actions].
- When the editor interface appears, write PySpark code directly in the [Script] tab.
- Click [Complete] in the editor interface.
- When the action is created, the interface returns to the action list.
- The action created above is added to the action list.
- The created action is registered as NAVER Cloud Platform resource. For more information, see Resource Manager concepts.
- Select an action from the action list and click [Run] or click [View details] > [Run] to run the action on-demand.
- To reserve and run actions, create a workflow and connect to a trigger. For more information about workflow creation, see Create workflow.
- A bucket is automatically created in Object Storage when creating an action. The running log files and script files of the action are saved in the bucket.
When you create a Script Mode Job, the Spark code is stored in plain text in your Object Storage script path.
To keep your JDBC URL, ID, and password from being stored in plain text, we recommend using a Catalog Connection whenever possible. Storing the script in an encrypted bucket also enables safer management of sensitive information.
Use Data Catalog Connection in Script Mode
In Script Mode, you can read and write DBMS data using Data Catalog JDBC Connection.
- Examples by DBMS
- MySQL
# Example of a MySQL JDBC connection
df = (
    spark.read.format('jdbc')
    .option('catalog_connection_name', 'connection_name')
    .option('dbtable', 'table_name')
    .option('driver', 'com.mysql.cj.jdbc.Driver')
    .load()
)
- PostgreSQL
# Example of a PostgreSQL JDBC connection
df = (
    spark.read.format('jdbc')
    .option('catalog_connection_name', 'data_catalog_connection_name')
    .option('dbtable', 'schema.table_name')
    .option('driver', 'org.postgresql.Driver')
    .load()
)
- MongoDB
# Example of a MongoDB connection
df = (
    spark.read.format('jdbc')
    .option('catalog_connection_name', 'data_catalog_connection_name')
    .option('collection', 'collection_name')
    .load()
)
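The reads above share one shape, so they can be wrapped in a small helper. The following is a minimal sketch, not part of Data Flow: `jdbc_options` and `read_with_connection` are hypothetical names, and only the option keys shown in the examples above (`catalog_connection_name`, `dbtable`, `collection`, `driver`) are used. It assumes a `spark` session like the one used in Script Mode.

```python
def jdbc_options(connection_name, table=None, collection=None, driver=None):
    """Build the reader options used in the examples above (hypothetical helper).

    Pass exactly one of `table` (MySQL, PostgreSQL) or `collection` (MongoDB).
    `driver` is optional and passed through unchanged.
    """
    if (table is None) == (collection is None):
        raise ValueError("pass exactly one of table or collection")
    options = {"catalog_connection_name": connection_name}
    if table is not None:
        options["dbtable"] = table
    else:
        options["collection"] = collection
    if driver is not None:
        options["driver"] = driver
    return options


def read_with_connection(spark, **kwargs):
    """Read a DataFrame through a Data Catalog connection (hypothetical helper)."""
    return spark.read.format("jdbc").options(**jdbc_options(**kwargs)).load()
```

For example, `read_with_connection(spark, connection_name="connection_name", table="table_name", driver="com.mysql.cj.jdbc.Driver")` would issue the same read as the MySQL example.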
Use Iceberg tables in script mode
You can use the Iceberg table type in script mode. See the following sample code.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, DoubleType, TimestampType
from pyspark.sql.functions import col, lit, current_timestamp
spark = SparkSession.builder.appName("Iceberg Guide").enableHiveSupport().getOrCreate()
# Create a database in the Data Catalog
# Modify the S3 bucket path to match your environment.
S3_BUCKET = "test-bucket"
S3_DB_PATH = "s3a://" + S3_BUCKET + "/iceberg_db"
spark.sql("CREATE DATABASE IF NOT EXISTS iceberg_db LOCATION '" + S3_DB_PATH + "'")
spark.sql("USE iceberg_db")
# Create an Iceberg table
# Create using SQL DDL
S3_TABLE_PATH = "s3a://" + S3_BUCKET + "/iceberg_db/orders"
spark.sql("""
CREATE TABLE IF NOT EXISTS orders (
order_id INT,
customer STRING,
product STRING,
quantity INT,
price DOUBLE,
order_date TIMESTAMP
)
USING iceberg
PARTITIONED BY (days(order_date))
LOCATION '""" + S3_TABLE_PATH + """'
TBLPROPERTIES (
'write.format.default' = 'parquet',
'write.metadata.delete-after-commit.enabled' = 'true',
'write.metadata.previous-versions-max' = '3'
)
""")
print("Table creation completed: orders")
# Insert data (INSERT)
# Method A: SQL INSERT
spark.sql("""
INSERT INTO orders VALUES
(1, 'Kim Cheolsu', 'Laptop', 1, 1500000.0, timestamp '2026-04-01 10:00:00'),
(2, 'Lee Younghee', 'Keyboard', 2, 89000.0, timestamp '2026-04-01 11:30:00'),
(3, 'Park Minsu', 'Monitor', 1, 450000.0, timestamp '2026-04-02 09:00:00'),
(4, 'Choi Jieun', 'Mouse', 3, 35000.0, timestamp '2026-04-02 14:20:00'),
(5, 'Jung Haneul', 'Headset', 1, 120000.0, timestamp '2026-04-03 16:45:00')
""")
# Method B: Insert using the DataFrame API
new_data = [
(6, "Han Soyoung", "Webcam", 1, 85000.0, "2026-04-04 10:00:00"),
(7, "Oh Junhyeok", "USB Hub", 2, 25000.0, "2026-04-04 13:00:00"),
]
schema = StructType([
StructField("order_id", IntegerType(), False),
StructField("customer", StringType(), False),
StructField("product", StringType(), False),
StructField("quantity", IntegerType(), False),
StructField("price", DoubleType(), False),
StructField("order_date", StringType(), False),
])
df_new = spark.createDataFrame(new_data, schema)
df_new = df_new.withColumn("order_date", col("order_date").cast(TimestampType()))
df_new.writeTo("orders").append()
print(" Data input completed: 7 records")
# View data (SELECT)
# View all records
print("\n Full order list:")
spark.sql("SELECT * FROM orders ORDER BY order_id").show(truncate=False)
# View using the DataFrame API
print("Orders of KRW 100,000 or more:")
spark.table("orders") \
.filter(col("price") >= 100000) \
.select("order_id", "customer", "product", "price") \
.orderBy("price", ascending=False) \
.show(truncate=False)
# View aggregated query
print(" Total order amount by customer:")
spark.sql("""
SELECT customer,
COUNT(*) AS order_count,
SUM(price * quantity) AS total_amount
FROM orders
GROUP BY customer
ORDER BY total_amount DESC
""").show(truncate=False)
print(" Snapshot history:")
spark.sql("SELECT * FROM spark_catalog.iceberg_db.orders.snapshots").show(truncate=False)
# View data at a specific point in time (time travel)
# spark.sql("SELECT * FROM orders TIMESTAMP AS OF '2026-04-02 00:00:00'").show()
# View using a specific snapshot ID
# spark.sql("SELECT * FROM orders VERSION AS OF <snapshot_id>").show()
# Delete specific rows
spark.sql("DELETE FROM orders WHERE order_id = 7")
print("Deletion of order_id=7 completed")
# Tip: MERGE INTO (Upsert)
# Rows that already exist in the table are updated (UPDATE), and new rows are inserted (INSERT).
# The orders table was created without a status column, so add it before merging.
spark.sql("ALTER TABLE orders ADD COLUMNS (status STRING)")
upsert_data = [
(1, "Kim Cheolsu", "Laptop", 1, 1400000.0, "2026-04-01 10:00:00", "Refund"), # Existing → UPDATE
(8, "Yoon Seojun", "Tablet", 1, 680000.0, "2026-04-05 09:30:00", "Preparing"), # New → INSERT
]
# Build a separate schema; schema.add() would modify the original schema in place.
upsert_schema = StructType(schema.fields + [StructField("status", StringType(), True)])
df_upsert = spark.createDataFrame(upsert_data, upsert_schema)
df_upsert = df_upsert.withColumn("order_date", col("order_date").cast(TimestampType()))
df_upsert.createOrReplaceTempView("incoming_orders")
spark.sql("""
MERGE INTO orders t
USING incoming_orders s
ON t.order_id = s.order_id
WHEN MATCHED THEN
UPDATE SET t.price = s.price, t.status = s.status
WHEN NOT MATCHED THEN
INSERT *
""")
Action editor interface
The action editor interface appears when you click [Create actions] or click [View details] in the action list.
The action editor interface includes the following components:

| Component | Description |
|---|---|
| ① Basic information | Enter the action name. |
| ② Feature tab | Select a feature you want to use. |
| ③ Show node component | Add source node, convert node, and target node. Each node is expressed as a box, and the boxes with connecting lines depict the parent node and sub node. |
| ④ Settings component | Property settings of each node. Detailed settings if required. For more information about the node settings, see Source node configuration, Convert node configuration, and Target node configuration. |
| ⑤ Toggle button | Depending on the edit status, toggle between [Temporary storage] and [Run]. |
In the [Action configuration] tab of the action editor, add the action configuration components (source, convert, or target) in the show node component (③), and then enter their properties and detailed settings in the settings component (④).
The [Complete] button becomes enabled when at least 1 of the source node, convert node, or target node is added. The number of source nodes you can add depends on the convert node.
Configure source node
Specify the source of the data to convert by configuring a source node.
On the action editor, add a [Source] node. Then, enter Properties information and Detailed settings on the right side of the interface.
Selectable source nodes are Object Storage, Data Catalog, and JDBC (MySQL, PostgreSQL, and MongoDB). (As of August 2025)
We plan to support the integration of NAVER Cloud Platform's Cloud DB and the on-premise database of the customers in the future.
Source node properties information
The property items to enter differ by source node type.
- If the source node is Object Storage
- Name: Enter the name of the source node.
- Data store: Object Storage has been selected. When changed, the input fields will update accordingly.
- Bucket: Select the bucket that includes the original data to work on from Object Storage.
- Path: Specify a path within the Object Storage bucket. Data is extracted from the specified path and its sub paths. If no path is entered, data is extracted from all sub paths of the bucket.
- Data type: Enter the format of the original data. Select from JSON (NDJSON), CSV, or Parquet.
- When the source node is Data Catalog
- Name: Enter the name of the source node.
- Data store: Data Catalog has been selected.
- Database: Select database. A database is a set of tables that define metadata.
- Select table: Select a table. Table provides metadata that defines the schema of the data.
- Schema version: Select the schema version.
- When the source node is JDBC
- Name: Enter the name of the source node.
- Data store: JDBC has been selected. When changed, the input fields will update accordingly.
- Connection: Select a connection in Data Catalog.
- Table: Enter the DB's table name.
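The JSON (NDJSON) data type for Object Storage sources refers to newline-delimited JSON, where each line of the file is one complete JSON object. The following is a minimal sketch in plain Python that shows the expected file shape. The function name `parse_ndjson` and the sample records are illustrative assumptions, not part of Data Flow.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Two records, each on its own line (illustrative data).
sample = (
    '{"order_id": 1, "product": "Laptop"}\n'
    '{"order_id": 2, "product": "Keyboard"}\n'
)
records = parse_ndjson(sample)
print(records[1]["product"])  # Keyboard
```

A file that wraps all records in a single top-level JSON array is not NDJSON, so it may need to be converted to one object per line first.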
Detailed settings of source node
The detailed settings criteria differ by the type of source node.
- If the source node is Object Storage: Configure the schema table to use as source data.
- Click [Add] to add a field, and specify the data type and field name.
- For more information about data types, see Schema data type.
- If the source node is Data Catalog: The schema table read from Data Catalog is shown.
- You cannot add or edit the schema table configuration field. You can delete a specific property key.
- If the source node is JDBC: Configure the schema table to use as source data.
- If you selected JDBC, the access IP 211.188.48.218 must be registered as an allowed IP on the database you want to access.
- List of accessible DBMS: MySQL, PostgreSQL, and MongoDB.
- Click [Add] to add a field, and specify the data type and field name.
- For more information about data types, see Schema data type.
Configure convert node
On the action editor, add a [Convert] node. Then, enter Properties information and Detailed settings to define the data conversion action on the right side of the interface.
The settings items differ by type of conversion action. The following sections describe the settings items for each type.
Define property
Define the schema of the target data using the source data.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- [Detailed settings] tab: Map the source node schema to the target node schema.
- The source node property key that appears in the Parent node field and the sub node property key that appears in the Sub node field are mapped.
- The Sub node field is enabled only when there is a target node added. If a target node is not added, the selected value does not appear.
- The Data type can be edited. The data type of the source node can be edited from the target node.
- For more information about data types, see Schema data type.
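The mapping in the [Detailed settings] tab can be pictured as a table that pairs each parent node key with a sub node key and a data type. The following is a minimal sketch in plain Python rather than Spark. The function `define_properties` and the `CASTS` table are hypothetical, and only three of the supported type names (Int, Double, String) are modeled.

```python
# Cast functions for a few of the supported type names (illustrative subset).
CASTS = {"Int": int, "Double": float, "String": str}


def define_properties(rows, mapping):
    """Apply a schema mapping to each row.

    mapping: {parent_key: (sub_key, data_type_name)}
    Keys that are not in the mapping are dropped from the output.
    """
    return [
        {sub: CASTS[type_name](row[parent])
         for parent, (sub, type_name) in mapping.items()}
        for row in rows
    ]


source_rows = [{"id": "1", "price": "9.5", "memo": "x"}]
mapping = {"id": ("order_id", "Int"), "price": ("price", "Double")}
print(define_properties(source_rows, mapping))  # [{'order_id': 1, 'price': 9.5}]
```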
Select property
Select the properties to configure in the target data from the source data property key. The property key that is not selected is excluded from the target data.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- [Detailed settings] tab: From the parent node's property keys, select at least 1 property key to send to the sub node.
Merge columns
Merge 2 sets of data columns. You can select up to 2 parent nodes.
The schema of the data changes after merging.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 2 nodes to connect to the convert node. 2 source nodes must be created in advance.
- [Detailed settings] tab: Set the column merge rules.
- Type: Select 1 type of column merging from Internal join, Left join, Right join, or External join.
- Internal join: Merge the columns of the 2 data sets for rows that satisfy the merging condition. Rows that do not satisfy the merging condition are excluded. If no condition is added, columns are merged for all rows of the 2 data sets.
- Left join: Merge columns based on the rows of the left data set. The result includes all rows of the left data set, together with the rows of the right data set that satisfy all of the merging conditions.
- Right join: Merge columns based on the rows of the right data set. The result includes all rows of the right data set, together with the rows of the left data set that satisfy all of the merging conditions.
- External join: Merge columns, including all rows of both data sets.
- Condition: Select the property key for mutual comparison from each data set. A condition need not be set.
- Click [Add] to create the left node field / comparison operator / right node field table.
- In the Left node field, select the property key of the left data set.
- In the Right node field, enter the property key of the right data set.
- If the property key of the right node field and that of the left node field are identical, merge columns for the corresponding rows.
- Prefix: The name of the left node field and right node field cannot be duplicated, so a prefix is added automatically to the right node field name. You can change the name of the prefix.
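The four merge types differ only in which unmatched rows are kept. The following is a minimal sketch in plain Python rather than Spark, assuming that Internal, Left, Right, and External join correspond to the usual inner, left, right, and full outer joins with a single equality condition. The function `merge_columns` and the default prefix `r_` are hypothetical.

```python
def merge_columns(left, right, left_key, right_key, how="inner", prefix="r_"):
    """Join two lists of dicts where left[left_key] == right[right_key].

    how: "inner", "left", "right", or "outer".
    Right-side field names get `prefix` so they never clash with left names.
    """
    left_cols = sorted({k for row in left for k in row})
    right_cols = sorted({k for row in right for k in row})

    def combine(l_row, r_row):
        # Start with every column set to None, then fill in the sides we have.
        merged = {k: None for k in left_cols}
        merged.update({prefix + k: None for k in right_cols})
        if l_row is not None:
            merged.update(l_row)
        if r_row is not None:
            merged.update({prefix + k: v for k, v in r_row.items()})
        return merged

    result, matched_right = [], set()
    for l_row in left:
        hits = [i for i, r_row in enumerate(right)
                if r_row[right_key] == l_row[left_key]]
        matched_right.update(hits)
        result.extend(combine(l_row, right[i]) for i in hits)
        if not hits and how in ("left", "outer"):
            result.append(combine(l_row, None))  # keep unmatched left row
    if how in ("right", "outer"):
        result.extend(combine(None, r_row) for i, r_row in enumerate(right)
                      if i not in matched_right)  # keep unmatched right rows
    return result
```

With a left set of ids 1 and 2 and a right set of ids 2 and 3, this sketch returns 1 row for "inner", 2 rows for "left" and "right", and 3 rows for "outer".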
Filter
The source data gets filtered and creates the target data. Rows that do not satisfy the filter condition are excluded from the target data.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- [Detailed settings] tab: Set the filtering condition.
- Filter type: Select AND or OR. When multiple conditions are set, they are combined using the selected type.
- Condition: Set the filtering condition.
- Click [Add] to create Field / Condition / Value tables.
- Example: value == 0.7: If the value field is a numeric type and equals 0.7, the row is added to the target data.
- Example: value > Car: If the value field is a character type, values are compared by character code (ASCII) order, starting from the first letter of the condition ("C"). Rows whose value is greater are added to the target data.
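The two examples above and the AND/OR filter type can be sketched in plain Python. This is an illustration under assumptions, not Data Flow's implementation: the function `apply_filter` is hypothetical, the operator set is assumed, and Python's built-in string comparison (character by character, in code point order) stands in for the character comparison described above.

```python
import operator

# Assumed set of comparison operators for the illustration.
OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def apply_filter(rows, conditions, filter_type="AND"):
    """Keep rows that satisfy the conditions.

    conditions: list of (field, operator_symbol, value) tuples.
    filter_type: "AND" keeps rows matching every condition,
                 "OR" keeps rows matching at least one.
    """
    combine = all if filter_type == "AND" else any
    return [row for row in rows
            if combine(OPS[op](row[field], value)
                       for field, op, value in conditions)]


rows = [
    {"name": "Bus", "value": 0.7},
    {"name": "Cat", "value": 0.2},
    {"name": "Car", "value": 0.7},
]
print([r["name"] for r in apply_filter(rows, [("value", "==", 0.7)])])  # ['Bus', 'Car']
print([r["name"] for r in apply_filter(rows, [("name", ">", "Car")])])  # ['Cat']
```

In the second call, "Bus" is excluded because "B" sorts before "C", and "Cat" is kept because its third letter "t" sorts after "r".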
Merge rows
Merge 2 sets of source data with an identical schema. Before merging rows, check that the schema structures of the 2 sets are the same.
If the schemas are the same, the merged data has the same columns as before the merge, and only rows are added.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 2 source nodes to connect to the convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- Detailed settings > Type: Set the rule for row merging.
- Merge all: Does not exclude duplicated rows, and combines all the rows. It is case-sensitive when determining if a row is duplicated.
- Merging after removing duplicates: Combine all the rows after removing duplicated rows.
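The difference between the two types can be sketched in plain Python. The function `merge_rows` is hypothetical and only illustrates the described behavior, including the case-sensitive comparison of rows.

```python
def merge_rows(first, second, remove_duplicates=False):
    """Append the rows of `second` to `first`; both must share one schema."""
    if first and second and set(first[0]) != set(second[0]):
        raise ValueError("the 2 data sets must have an identical schema")
    combined = list(first) + list(second)
    if not remove_duplicates:
        return combined  # "Merge all": duplicates are kept
    seen, unique = set(), []
    for row in combined:
        signature = tuple(sorted(row.items()))  # exact, case-sensitive match
        if signature not in seen:
            seen.add(signature)
            unique.append(row)
    return unique


a = [{"id": 1, "name": "kim"}]
b = [{"id": 1, "name": "kim"}, {"id": 1, "name": "Kim"}]
print(len(merge_rows(a, b)))                          # 3
print(len(merge_rows(a, b, remove_duplicates=True)))  # 2
```

"kim" and "Kim" count as different rows here because the comparison is case-sensitive.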
Aggregation
Calculate the average, total, maximum, or minimum of the selected field and rows from the source data, and store the result in a new field.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- [Detailed settings] tab: Select the data field to aggregate and set the aggregation function and result field to apply to the row.
- Grouping standard: Specify the reference field that determines the aggregation range. Example: Aggregates data where the value field is AAA.
- Aggregation condition: Specify the aggregation function and result field.
- Click [Add] to create Field, Condition, and Result field tables.
- Field: Select the source data property key to apply aggregation to.
- Condition: Select the aggregation function to apply to the data in the selected range. Select from AVG, SUM, MAX, or MIN.
- Result field: Specify the new field name to store the aggregation result.
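The relationship between the grouping standard, the aggregation function, and the result field can be sketched in plain Python. The function `aggregate` is hypothetical and assumes that one output row is produced per distinct value of the grouping field.

```python
from collections import defaultdict

FUNCS = {
    "AVG": lambda values: sum(values) / len(values),
    "SUM": sum,
    "MAX": max,
    "MIN": min,
}


def aggregate(rows, group_by, conditions):
    """Aggregate rows per distinct value of `group_by`.

    conditions: list of (field, function_name, result_field) tuples.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row)
    output = []
    for key, members in groups.items():
        result = {group_by: key}
        for field, func, result_field in conditions:
            result[result_field] = FUNCS[func]([m[field] for m in members])
        output.append(result)
    return output


rows = [
    {"value": "AAA", "price": 10},
    {"value": "AAA", "price": 30},
    {"value": "BBB", "price": 5},
]
summary = aggregate(rows, "value",
                    [("price", "AVG", "avg_price"), ("price", "MAX", "max_price")])
print(summary[0])  # {'value': 'AAA', 'avg_price': 20.0, 'max_price': 30}
```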
Rename property
Edit the specific property key name in the data.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- [Detailed settings] tab: On the Current key name / Edited key name table read from the source node schema, edit the Edited key name of the desired property key.
Remove duplicates
Remove duplicated rows from the data source. Duplicates are determined case-sensitively. Because only rows are deleted, the schema is not changed by this conversion.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- Detailed settings > Duplicate type: Select duplicate removal options.
- Delete if all rows are identical: Delete rows only when all of the field values are identical. It is case-sensitive when determining if a row is duplicated.
- Delete if specific field values match: Remove rows when only the specified field values match. Which of the matching rows is removed is arbitrary and does not depend on order.
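The two duplicate types can be sketched in plain Python. The function `remove_duplicates` is hypothetical. Note one deliberate difference from the description above: this sketch always keeps the first matching row so that its result is reproducible, whereas the guide states that the surviving row is not determined by order.

```python
def remove_duplicates(rows, fields=None):
    """Drop duplicated rows.

    fields=None  : rows are duplicates only when every field value matches.
    fields=[...] : rows are duplicates when the listed field values match.
    Comparison is exact, so "kim" and "Kim" are different values.
    """
    seen, kept = set(), []
    for row in rows:
        keys = fields if fields is not None else sorted(row)
        signature = tuple(row[k] for k in keys)
        if signature not in seen:
            seen.add(signature)
            kept.append(row)
    return kept


rows = [
    {"id": 1, "name": "kim"},
    {"id": 1, "name": "kim"},
    {"id": 1, "name": "Kim"},
    {"id": 2, "name": "lee"},
]
print(len(remove_duplicates(rows)))                 # 3
print(len(remove_duplicates(rows, fields=["id"])))  # 2
```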
Fill in empty values
Fill in missing column values in the data with a specified value.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- Detailed settings: Specify the property keys that have missing data and set a replacement value.
- Target key of omitted data: Keep only the property keys that have missing data, and delete the others from the list.
- Replacement value: Enter the replacement value for the omitted data.
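The effect of this conversion can be sketched in plain Python, assuming that a missing value is represented as `None`. The function `fill_empty` is hypothetical.

```python
def fill_empty(rows, replacements):
    """Replace missing (None) values for the listed property keys.

    replacements: {property_key: replacement_value}
    Keys that are not listed are left unchanged, even when they are None.
    """
    return [
        {key: (replacements[key] if value is None and key in replacements else value)
         for key, value in row.items()}
        for row in rows
    ]


rows = [{"id": 1, "city": None}, {"id": 2, "city": "Seoul"}]
print(fill_empty(rows, {"city": "unknown"}))
# [{'id': 1, 'city': 'unknown'}, {'id': 2, 'city': 'Seoul'}]
```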
SQL query
Use a SQL Select statement to select the columns to convert.
- [Properties information] tab: Define the properties of the conversion action.
- Name: Enter the name of the convert node.
- Convert: The type of conversion action is selected. When changed, the input fields will update accordingly.
- Parent node: Specify 1 source node to be connected with a convert node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- Detailed settings: Write a SQL query statement.
- Convert SQL: Enter a table nickname to use in the query statement.
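A SELECT statement that refers to its input through a table nickname (alias) might look like the query below. This sketch uses Python's built-in `sqlite3` module only as a stand-in to show the shape of such a query. Data Flow runs on Spark, so the supported SQL dialect may differ, and the table name `orders` and the alias `o` are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Kim", 1500000.0), (2, "Lee", 89000.0)],
)

# "o" is the table nickname used inside the query statement.
query = """
SELECT o.order_id, o.customer
FROM orders AS o
WHERE o.price >= 100000
ORDER BY o.order_id
"""
selected = conn.execute(query).fetchall()
print(selected)  # [(1, 'Kim')]
```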
Configure target node
Specify the target node of the data to be converted through the target node configuration.
On the action editor, add a [Target] node. Then, enter Properties information and Detailed settings on the right side of the interface.
Selectable target nodes are Object Storage, Data Catalog, and Cloud DB for MySQL (As of January 2024)
We plan to support the integration of NAVER Cloud Platform's Cloud DB and the on-premise database of the customers in the future.
Target node property data
The property items to enter differ by target node type.
- If the target node is Object Storage
- Name: Enter the name of the target node.
- Data store: Object Storage has been selected. When changed, the input fields will update accordingly.
- Bucket: Select the bucket to save the conversion data from Object Storage.
- Prefix: Specify the specific path of the Object Storage bucket. Save the result data on the specified sub path.
- Data type: Enter the format of the target data. Select from JSON, CSV, or Parquet.
- Duplicate processing option: Select how to handle data that already exists in the target path. Add data appends the data to the path you entered. Overwrite deletes the path and then writes the data. Ignore (no update) writes the data only when the path is empty.
- Parent node: Specify 1 parent node to connect to the target node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- Number of output files: You can specify the number of output files.
- If the target node is Data Catalog
- Name: Enter the name of the target node.
- Data store: Data Catalog has been selected. When changed, the input fields will update accordingly.
- Database: Select database. A database is a set of tables that define metadata.
- Select table: Select the table to save the schema modified through the convert node.
- Schema version: Select the schema version.
- Duplicate processing option: Select how to handle data that already exists in the target path (table). Add data appends the data to the path (table) you entered. Overwrite deletes the path (table) and then writes the data. Ignore (no update) writes the data only when the path (table) is empty or does not exist.
- Parent node: Specify 1 convert node to be connected with a target node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
- Number of output files: When the Data Catalog table is of the Object Storage type, you can specify the number of output files.
- When the target node is JDBC
- Name: Enter the name of the target node.
- Data store: JDBC has been selected. When changed, the input fields will update accordingly.
- Connection: Select a connection in Data Catalog.
- Table: Enter the table to save the schema modified through the convert node.
- Duplicate processing option: Select how to handle data that already exists in the target table. Add data appends data to the table you entered. Overwrite deletes the table and then writes the data. Ignore (no update) writes data only when the table does not exist.
- Parent node: Specify 1 parent node to connect to the target node. When selecting a Data node, you can select 1 of the source nodes; when selecting Process node, you can select 1 of the convert nodes.
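The three duplicate processing options can be sketched in Python as follows. This is only an illustration of the behavior described above, not Data Flow's implementation: the `DuplicateOption` enum and the `write_to_target` function are hypothetical names, and the target is modeled as a plain list of rows. The sketch follows the Object Storage and Data Catalog wording, where Ignore (no update) writes when the target is empty or missing; for a JDBC target the description says it writes only when the table does not exist, which this sketch does not distinguish.

```python
from enum import Enum


class DuplicateOption(Enum):
    # Hypothetical names mirroring the three options in the console.
    ADD_DATA = "Add data"
    OVERWRITE = "Overwrite"
    IGNORE = "Ignore (no update)"


def write_to_target(existing_rows, new_rows, option):
    """Return the rows at the target after a write with the given option.

    existing_rows is None when the target path or table does not exist.
    """
    current = list(existing_rows or [])
    if option is DuplicateOption.ADD_DATA:
        # Append to whatever is already at the target.
        return current + list(new_rows)
    if option is DuplicateOption.OVERWRITE:
        # Discard the existing contents, then write the new data.
        return list(new_rows)
    if option is DuplicateOption.IGNORE:
        # Write only when the target is empty or missing; otherwise leave it as is.
        return list(new_rows) if not current else current
    raise ValueError(f"unknown option: {option!r}")
```

For example, writing `["b"]` to a target that already holds `["a"]` leaves `["a", "b"]` with Add data, `["b"]` with Overwrite, and `["a"]` with Ignore (no update).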
Preview column
You can preview the schema of data to be saved in the target node.
The supported types of the source and target are as follows (As of August 2025):
String, Binary, Boolean, Date, Timestamp, Byte, Short, Int, Long, Float, Double, Decimal, Array, Map, Struct, Null
When converting to MySQL (Postgres), the following types are mapped to fixed column types:
Varchar -> varchar(250); Char -> char(64); Array, Map, Struct, and String -> mediumtext
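The fixed mappings above can be written as a simple lookup table, which may help when checking a schema before you load it. This is a minimal Python sketch: `FIXED_TARGET_TYPES` and `fixed_target_type` are hypothetical names, and the table contains only the mappings listed above. How other types are converted is not stated here, so the function returns `None` for them.

```python
# Fixed column types listed above; other types are intentionally left out.
FIXED_TARGET_TYPES = {
    "varchar": "varchar(250)",
    "char": "char(64)",
    "array": "mediumtext",
    "map": "mediumtext",
    "struct": "mediumtext",
    "string": "mediumtext",
}


def fixed_target_type(source_type):
    """Return the fixed target type, or None when no fixed mapping is listed."""
    return FIXED_TARGET_TYPES.get(source_type.strip().lower())
```

For example, `fixed_target_type("Struct")` returns `"mediumtext"`, while `fixed_target_type("Int")` returns `None`.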
Preview data
You can preview some data to be processed in the source node and the convert node.
Action parameters
For some items in an action node, you can use the global parameters provided when a task runs, or define and use your own parameters.
The supported items are as follows (As of July 2024):
Object Storage source: Path
Object Storage target: Prefix
JDBC source/target: Table
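The general idea of filling a path, prefix, or table name from parameters can be sketched in Python as follows. Everything here is an assumption made for illustration: the `${name}` placeholder syntax comes from Python's `string.Template`, not from Data Flow, the parameter name `run_date` is made up, and the rule that user-defined parameters override global ones is a guess. Check the console for the actual parameter syntax and names.

```python
from string import Template


def resolve(value, global_params, user_params):
    """Fill placeholders in a node setting such as a path, prefix, or table name.

    Assumption: user-defined values take precedence over global ones.
    """
    merged = {**global_params, **user_params}
    # substitute() raises KeyError when a placeholder has no value.
    return Template(value).substitute(merged)
```

With these assumptions, `resolve("sales/${run_date}/", {"run_date": "2024-07-01"}, {})` returns `"sales/2024-07-01/"`.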
Set action run options
You can set the action run option after you create an action. To set an action run option:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Menu > Services > Big Data & Analytics > Data Flow.
- Click Job menu.
- Select a specific action from the action list and click [Run].
- When the run option popup appears, set the run options.
- Run container: Set how many containers to use for distributed actions.
- Number of retries: Set the maximum number of retries upon action failure.
- Timeout: Set how long to wait for the result of a single action run.
- Script path: Path where the action command script is saved. It is set automatically to a sub path of the Object Storage bucket that is created when the action is created.
- Run log: Path where the action run history is saved. It is set automatically to a sub path of the Object Storage bucket that is created when the action is created.
- Role name: Sub Account role used to run the action.
- Click [Run] or [Save option without running].
- If you click [Run], the action's status is changed to Running on the action list.
If you are using Cloud DB as the source or target node, check that the DB server's network environment and user settings permit access from the following Data Flow access IP addresses:
10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
- In Server > ACG > ACG settings, add to inbound rules.
- In VPC > Network ACL > ACL Rule > Rule settings, add to inbound rules.
- In Cloud DB for MySQL > Manage DB > Manage DB user, add DB users with the following hosts:
- 10.%, 172.%, 192.168.%
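To check whether an address falls inside the access ranges listed above, for example when reviewing ACG or Network ACL rules, a short Python sketch using the standard `ipaddress` module is shown below. The names `DATAFLOW_ACCESS_RANGES` and `in_access_ranges` are hypothetical; only the three CIDR blocks come from this guide.

```python
import ipaddress

# The three access ranges listed in this guide.
DATAFLOW_ACCESS_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_access_ranges(address):
    """Return True when the address is inside one of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DATAFLOW_ACCESS_RANGES)
```

For example, `in_access_ranges("172.20.1.5")` is true and `in_access_ranges("172.32.0.1")` is false. The DB user host pattern `172.%` is broader than `172.16.0.0/12`, so an address such as 172.32.0.1 matches the host pattern even though it lies outside the listed CIDR block.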
View action run list
To view the action run history:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Menu > Services > Big Data & Analytics > Data Flow.
- Click Job menu.
- Click [View details] for the specific action from the action list.
- When the action editor interface appears, click the [Run list] tab.
- You can check the action run list for the most recent month. The action run list is kept for 90 days.
- The run list shows the following items:
- Job name (ID): Unique action name (Job ID) entered by the user when creating the action.
- Run status: Job run result. Displays one of the following values: Succeeded, Failed, Running, or Pending.
- Run log: Click [Details] to move to the location of the action run history file.
- Container: Number of containers set in the action run option.
- Trigger: If the action is connected to a trigger (schedule), you can search for that trigger.
- Run start date and time: Date and time the Job started, whether it was run on demand or by a trigger.
- Run end date and time: Date and time the Job ended, whether it was run on demand or by a trigger.
- Run preparation time: The time required to prepare before the Job runs.
- Run time: Total time taken for the Job run.
- Number of retries: Number of Job run retries made.
- If a Job runs on demand only, without a workflow configuration, its run history can be viewed only from the run list of the action interface.
- The run history of actions with a workflow configuration can be viewed from the run list of the workflow interface as well as the run list of the Job interface.
Delete action
To delete an action:
- In the VPC environment of the NAVER Cloud Platform console, navigate to Menu > Services > Big Data & Analytics > Data Flow.
- Click Job menu.
- Select the specific action from the action list and click [Delete].
- The action is deleted from the action list.
- A workflow that includes a deleted action does not run, even if a trigger has scheduled it.
- Data files support UTF-8 encoding only.
- If a file uses a different encoding, the Job may not work properly.