Use Object Storage bucket access log

Prev Next

Available in VPC

Use Object Storage bucket access log

You can analyze access logs collected in Object Storage using Data Query.
This example shows the process of analyzing Object Storage bucket access logs with various template queries and visualizing them with business intelligence (BI) tools.

Feature summary


1. Configure Object Storage bucket access log

You can check the activation status and creation of Object Storage bucket access logs.

Activate bucket access log

  • When you activate the bucket access log in Object Storage, you can save the access history of the bucket in log files.
  • You can use this log data as an analysis target in Data Query.
  • Example of access log settings
    Item Set value
    Bucket name Bucket to analyze access log
    Bucket to save to Bucket where the access logs are to be saved
    Log file prefix bucket_access_log_
  • For more information on access log settings, see Object Storage user guide.

Create access log

Access the bucket through the client so that the access log data is actually created.

Caution
  • When selecting the bucket where the access logs are to be saved, it is recommended to use a new bucket with no other data included. If there are other files in the bucket, queries to create and view tables may not operate as expected.
  • Access log files are not created in real time. For more information on the creation time and cycle of the access log, see Object Storage user guide.

2. Create data table to be analyzed

Create data table (use data as table)

  • Use the scanner feature of Data Catalog to create a source data table that refers to data saved in Object Storage.
  • Refer to the following setting example to enter the values, and click the Start button to add a table:
  • Scanner settings example
    Item Set value
    Data type Object Storage
    Path s3a://[Bucket specified from access log configuration]/
    Scan range Partial, 10 items
    Execution cycle On-demand
    Pattern Use: Include, bucket_access_log_*
    Partition setting All partitioning form
    Collection option Update table definition
    Table merging True
    Table number limit 1
  • For more information about scanner, see the Data Catalog user guide.

Check creation of data table

It may take several seconds to minutes to create a table with a scanner. In the table list on the left, you can search to verify whether a table with the storage bucket name specified in the access log settings has been created.

3. Use template query to analyze access log data

Use provided template queries to analyze the access log data. You can run queries through the console editor, BI, or CLI, or integrate with the user application business logics using JDBC or SDK.

Create VIEW table

To use template queries, run the following query to create a VIEW table: Instead of creating a VIEW table, you can use the scanned data table name in the template query to see the same result.

CREATE OR REPLACE VIEW 
  "data_catalog"."default"."bucket_access_log"
AS SELECT *
FROM
  "data_catalog"."default"."scanned_table"; -- replace database/table name with yours

Provided template query

  • View data for certain period (Range query)
SELECT
  date_trunc('second', from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0) 
    AT TIME ZONE 'Asia/Seoul') AS log_time,
  remote_user,
  request_method,
  request_type,
  status,
  object_name,
  response_length
FROM
  data_catalog.default.bucket_access_log
WHERE
  from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0) AT TIME ZONE 'Asia/Seoul'
    BETWEEN TIMESTAMP '2025-01-01 00:00:00 Asia/Seoul'
        AND TIMESTAMP '2025-12-31 23:59:59 Asia/Seoul'
ORDER BY
  log_time;
  • Top 10 most accessed objects
SELECT
  object_name,
  COUNT(*) AS access_count
FROM
  data_catalog.default.bucket_access_log
WHERE 
  object_name IS NOT NULL
GROUP BY
  object_name
ORDER BY
  access_count DESC
LIMIT 10;
  • Top 10 most accessed remote_user (including access_key type)
SELECT
  remote_user,
  COUNT(*) AS request_count
FROM
  data_catalog.default.bucket_access_log
GROUP BY
  remote_user
ORDER BY
  request_count DESC
LIMIT 10;
  • Request distribution by HTTP status code
SELECT
  status,
  COUNT(*) AS request_count
FROM
  data_catalog.default.bucket_access_log
GROUP BY
  status
ORDER BY
  request_count DESC;
  • Number of request by method per 10 minutes
SELECT
  from_unixtime(floor(CAST(timestamp_start AS DOUBLE) / 1000.0 / 600.0) * 600.0)
    AT TIME ZONE 'Asia/Seoul' AS time_slot,
  request_method,
  COUNT(*) AS request_count
FROM
  data_catalog.default.bucket_access_log
GROUP BY
  1, request_method
ORDER BY
  1;
  • Maximum traffic by time period (peak time)
SELECT
  date_trunc('hour', from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0) 
    AT TIME ZONE 'Asia/Seoul') AS hour_slot,
  COUNT(*) AS access_count
FROM
  data_catalog.default.bucket_access_log
GROUP BY
  date_trunc('hour', from_unixtime(CAST(timestamp_start AS BIGINT) / 1000.0) 
    AT TIME ZONE 'Asia/Seoul')
ORDER BY
  access_count DESC
LIMIT 5;

BI use examples

You can visualize the data tables you created and template queries with BI tools, such as Superset. For more information on how to integrate with external BIs, see the Data Query BI integration guide.

  • Example: Access volume trend chart, traffic analysis by time period, top requested objects dashboard, etc.

dataquery-example-obs-accesslog-bi.png