Available in VPC
Use Object Storage bucket access log
You can analyze access logs collected in Object Storage using Data Query.
This example shows the process of analyzing Object Storage bucket access logs with various template queries and visualizing them with business intelligence (BI) tools.
Feature summary
- Create table with scanner - Using data as table by Data Catalog
- Run query and view results by Data Query
- Visualize through integration with BI by Data Query
1. Configure Object Storage bucket access log
You can check the activation status and creation of Object Storage bucket access logs.
Activate bucket access log
- When you activate the bucket access log in Object Storage, you can save the access history of the bucket in log files.
- You can use this log data as an analysis target in Data Query.
- Example of access log settings
Item Set value Bucket name Bucket to analyze access log Bucket to save to Bucket where the access logs are to be saved Log file prefix bucket_access_log_ - For more information on access log settings, see Object Storage user guide.
Create access log
Access the bucket through the client so that the access log data is actually created.
- When selecting the bucket where the access logs are to be saved, it is recommended to use a new bucket with no other data included. If there are other files in the bucket, queries to create and view tables may not operate as expected.
- Access log files are not created in real time. For more information on the creation time and cycle of the access log, see Object Storage user guide.
2. Create data table to be analyzed
Create data table (use data as table)
- Use the scanner feature of Data Catalog to create a source data table that refers to data saved in Object Storage.
- Refer to the following setting example to enter the values, and click the Start button to add a table:
- Scanner settings example
Item Set value Data type Object Storage Path s3a://[Bucket specified from access log configuration]/ Scan range Partial, 10 items Execution cycle On-demand Pattern Use: Include, bucket_access_log_* Partition setting All partitioning form Collection option Update table definition Table merging True Table number limit 1 - For more information about scanner, see the Data Catalog user guide.
Check creation of data table
It may take several seconds to minutes to create a table with a scanner. In the table list on the left, you can search to verify whether a table with the storage bucket name specified in the access log settings has been created.
3. Use template query to analyze access log data
Use provided template queries to analyze the access log data. You can run queries through the console editor, BI, or CLI, or integrate with the user application business logics using JDBC or SDK.
- CLI user guide : How to connect and use CLI.
- JDBC user guide : Supports DB tool and BI connection.
- SDK user guide : Integrates with user application.
- External BI connection guide : Integrates with BIs, such as Superset and Zeppelin.
Create VIEW table
To use template queries, run the following query to create a VIEW table: Instead of creating a VIEW table, you can use the scanned data table name in the template query to see the same result.
CREATE OR REPLACE VIEW
"data_catalog"."default"."bucket_access_log"
AS SELECT *
FROM
"data_catalog"."default"."scanned_table"; -- replace database/table name with yours
Provided template query
- View data for certain period (Range query)
SELECT
date_trunc('second', from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0)
AT TIME ZONE 'Asia/Seoul') AS log_time,
remote_user,
request_method,
request_type,
status,
object_name,
response_length
FROM
data_catalog.default.bucket_access_log
WHERE
from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0) AT TIME ZONE 'Asia/Seoul'
BETWEEN TIMESTAMP '2025-01-01 00:00:00 Asia/Seoul'
AND TIMESTAMP '2025-12-31 23:59:59 Asia/Seoul'
ORDER BY
log_time;
- Top 10 most accessed objects
SELECT
object_name,
COUNT(*) AS access_count
FROM
data_catalog.default.bucket_access_log
WHERE
object_name IS NOT NULL
GROUP BY
object_name
ORDER BY
access_count DESC
LIMIT 10;
- Top 10 most accessed remote_user (including access_key type)
SELECT
remote_user,
COUNT(*) AS request_count
FROM
data_catalog.default.bucket_access_log
GROUP BY
remote_user
ORDER BY
request_count DESC
LIMIT 10;
- Request distribution by HTTP status code
SELECT
status,
COUNT(*) AS request_count
FROM
data_catalog.default.bucket_access_log
GROUP BY
status
ORDER BY
request_count DESC;
- Number of request by method per 10 minutes
SELECT
from_unixtime(floor(CAST(timestamp_start AS DOUBLE) / 1000.0 / 600.0) * 600.0)
AT TIME ZONE 'Asia/Seoul' AS time_slot,
request_method,
COUNT(*) AS request_count
FROM
data_catalog.default.bucket_access_log
GROUP BY
1, request_method
ORDER BY
1;
- Maximum traffic by time period (peak time)
SELECT
date_trunc('hour', from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0)
AT TIME ZONE 'Asia/Seoul') AS hour_slot,
COUNT(*) AS access_count
FROM
data_catalog.default.bucket_access_log
GROUP BY
date_trunc('hour', from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0)
AT TIME ZONE 'Asia/Seoul')
ORDER BY
access_count DESC
LIMIT 5;
BI use examples
You can visualize the data tables you created and template queries with BI tools, such as Superset. For more information on how to integrate with external BIs, see the Data Query BI integration guide.
- Example: Access volume trend chart, traffic analysis by time period, top requested objects dashboard, etc.
