Integrate Cloud Database (MySQL) using Logstash

Available in VPC

You can build a data pipeline between Search Engine Service and Cloud DB for MySQL using Logstash.

Preliminary tasks

Before following this guide, complete these prerequisites:

  • Create a VPC and server.
  • Create a Cloud DB for MySQL server.
  • Create a Search Engine Service cluster.

The example demonstrates how to run Logstash on a server to periodically send data from Cloud DB for MySQL to Search Engine Service.

Configure network

This section introduces a network configuration example.

STEP 1. Configure Cloud DB for MySQL User

In Manage DB > Manage DB user, configure the settings to allow the DB user to access from the server's IP range.

STEP 2. Configure ACG

To configure the ACG to allow access to the port of the Cloud DB for MySQL server (default: 3306):

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Compute > Server > ACG.
  2. Select "cloud-mysql-xxxx" from the ACG list and click [Set ACG].
  3. Enter ACG rules, and then click [Add].
    • Protocol: TCP
    • Access source: IP of the server where Logstash is to run
    • Allowed port: the port configured in the DB (default: 3306)
  4. Click [Apply].

To configure ACG to allow access to port 9200 of the Search Engine Service manager node:

  1. In the VPC environment on the NAVER Cloud Platform console, navigate to Services > Compute > Server > ACG.
  2. Select "searchengine-m-xxxxx" from the ACG list and click [Set ACG].
  3. Enter ACG rules, and then click [Add].
    • Protocol: TCP
    • Access source: IP of the server where Logstash is to run
    • Allowed port: 9200
  4. Click [Apply].
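
Once both ACG rules are applied, you can verify from the Logstash server that each port is actually reachable. The following is a minimal sketch using only the Python standard library; the host and port values in the commented calls are placeholders for your own Cloud DB for MySQL endpoint and Search Engine Service manager node IP.

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholders: substitute your own endpoints before running.
# print(is_port_open("db-xxxx.vpc-cdb.ntruss.com", 3306))  # Cloud DB for MySQL
# print(is_port_open("10.0.0.10", 9200))                   # SES manager node
```

If a check returns False, re-examine the ACG rules and the DB user's allowed IP range before moving on to the Logstash installation.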

Install Logstash

This section introduces an example of installing Logstash on a server. Steps are provided for both Elasticsearch and OpenSearch; install the version that matches the service your cluster uses.

STEP 1. Install Java

To install Java:

  1. Enter the following command to install Java:
yum install java-devel -y

STEP 2. Install Logstash

To install Logstash:

Note

Use the OSS-licensed version to ensure proper operation.

  1. Enter the following command to download Logstash to the /root directory.
# For Elasticsearch version (install OSS version)
wget https://artifacts.elastic.co/downloads/logstash/logstash-oss-7.7.0.rpm

# For OpenSearch version
wget https://artifacts.opensearch.org/logstash/logstash-oss-with-opensearch-output-plugin-7.16.3-linux-x64.tar.gz
  2. Enter the following command to install the downloaded file.
# For Elasticsearch version
rpm -ivh logstash-oss-7.7.0.rpm

# For OpenSearch version
tar -zxvf logstash-oss-with-opensearch-output-plugin-7.16.3-linux-x64.tar.gz
  3. Install the JDBC driver.
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.30.zip
unzip mysql-connector-java-8.0.30.zip
mkdir /etc/logstash/plugin
mv /root/mysql-connector-java-8.0.30/mysql-connector-java-8.0.30.jar /etc/logstash/plugin/ 
  4. Before starting Logstash, edit the logstash.conf file.
    • For Elasticsearch version
      mv /etc/logstash/logstash-sample.conf /etc/logstash/conf.d/logstash.conf
      vi /etc/logstash/conf.d/logstash.conf
      
    • Elasticsearch version logstash.conf
    input {
      jdbc {
        jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
        jdbc_driver_library => "/etc/logstash/plugin/mysql-connector-java-8.0.30.jar"
        jdbc_connection_string => "jdbc:mysql://${cdb mysql endpoint}:3306/${cdb mysql database name}?useSSL=false"
        jdbc_user => "${cdb user name}"
        jdbc_password => "${cdb mysql password}"
        jdbc_paging_enabled => true
        jdbc_page_size => 50
        statement => "SELECT *, UNIX_TIMESTAMP(update_time) AS unix_ts_in_secs FROM logstash_test WHERE (UNIX_TIMESTAMP(update_time) > :sql_last_value AND update_time < NOW()) ORDER BY update_time ASC"
        record_last_run => true
        clean_run => true
        tracking_column_type => "numeric"
        tracking_column => "unix_ts_in_secs"
        use_column_value => true
        last_run_metadata_path => "/etc/logstash/data/student"
        schedule => "*/5 * * * * *"
      }
    }

    output {
      elasticsearch {
        hosts => ["http://${ses manager node1 ip}:9200", "http://${ses manager node2 ip}:9200"]
        index => "mysql-%{+YYYY.MM.dd}"
      }
    }

    
    • For OpenSearch version
      # When installing to /root/, the {installation path} is /root/logstash-7.16.3. 
      mv {installation path}/config/logstash-sample.conf {installation path}/config/logstash.conf
      vi {installation path}/config/logstash.conf
      
    • OpenSearch version logstash.conf
      input {
        jdbc {
          jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
          jdbc_driver_library => "/etc/logstash/plugin/mysql-connector-java-8.0.30.jar"
          jdbc_connection_string => "jdbc:mysql://${cdb mysql endpoint}:3306/${cdb mysql database name}?useSSL=false"
          jdbc_user => "${cdb user name}"
          jdbc_password => "${cdb mysql password}"
          jdbc_paging_enabled => true
          jdbc_page_size => 50
          statement => "SELECT *, UNIX_TIMESTAMP(update_time) AS unix_ts_in_secs FROM logstash_test WHERE (UNIX_TIMESTAMP(update_time) > :sql_last_value AND update_time < NOW()) ORDER BY update_time ASC"
          record_last_run => true
          clean_run => true
          tracking_column_type => "numeric"
          tracking_column => "unix_ts_in_secs"
          use_column_value => true
          last_run_metadata_path => "/etc/logstash/data/student"
          schedule => "*/5 * * * * *"
        }
      }

      output {
        opensearch {
          hosts => ["https://${ses manager node1 ip}:9200", "https://${ses manager node2 ip}:9200"]
          index => "mysql-%{+YYYY.MM.dd}"
          user => "${userID}"
          password => "${password}"
          ssl_certificate_verification => false
        }
      }
      
    • Notes on logstash.conf
      The sample logstash.conf file above runs the "statement" query every 5 seconds.
      The "statement" query selects rows that were updated more recently than the update_time of the last row fetched.
      ${cdb mysql endpoint} - Replace with the private domain of Cloud DB for MySQL.
      ${cdb mysql database name} - Enter the name of the database you created in Cloud DB for MySQL.
      ${cdb user name} - Enter the name of the account you created in Cloud DB for MySQL.
      ${cdb mysql password} - Enter the password for the account you created in Cloud DB for MySQL.
      ${ses manager node1 ip} - Enter the IP address of the Search Engine Service manager node.
      ${ses manager node2 ip} - Enter the IP address of the second Search Engine Service manager node, or remove this entry if the manager nodes are not redundant.
      ${userID} - For OpenSearch, enter the user ID you set when creating the cluster.
      ${password} - For OpenSearch, enter the password you set when creating the cluster.
      
    • Create the Logstash metadata directory
      mkdir /etc/logstash/data
      chown -R logstash:logstash /etc/logstash/data
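
Both output blocks above use an index name containing the date pattern %{+YYYY.MM.dd}, so Logstash routes documents to one index per day, based on each event's @timestamp (UTC). The expansion can be previewed with the Python standard library; the "mysql-" prefix here is illustrative and comes from whatever precedes the pattern in your config.

```python
from datetime import datetime, timezone

def daily_index(prefix: str, when: datetime) -> str:
    # Logstash's %{+YYYY.MM.dd} date pattern corresponds to strftime's %Y.%m.%d
    return prefix + when.strftime("%Y.%m.%d")

# An event timestamped 2022-08-08 UTC lands in "<prefix>2022.08.08".
print(daily_index("mysql-", datetime(2022, 8, 8, tzinfo=timezone.utc)))  # -> mysql-2022.08.08
```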
      

Configure Cloud DB for MySQL

The following is an example of a Cloud DB for MySQL configuration.

  • Create table
    create table logstash_test(
    id BIGINT(20) UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    contents nvarchar(255),
    create_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    update_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    );
    
  • Enter data
    insert into logstash_test(id, contents) values(1, "this is first contents");
    insert into logstash_test(id, contents) values(2, "this is second contents");
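
With this table in place, the incremental behavior of the "statement" query can be illustrated with an in-memory SQLite sketch. SQLite stands in for MySQL here, and update_time is stored directly as epoch seconds instead of converting with UNIX_TIMESTAMP; the point is how :sql_last_value advances between polls, which Logstash persists in last_run_metadata_path.

```python
import sqlite3

# Stand-in for the Cloud DB for MySQL table, with update_time as epoch seconds.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logstash_test (id INTEGER PRIMARY KEY, contents TEXT, update_time INTEGER)"
)
conn.executemany(
    "INSERT INTO logstash_test VALUES (?, ?, ?)",
    [(1, "this is first contents", 100), (2, "this is second contents", 200)],
)

def poll(sql_last_value: int):
    """One scheduled run: fetch only rows newer than the tracked value."""
    rows = conn.execute(
        "SELECT id, contents, update_time FROM logstash_test "
        "WHERE update_time > :sql_last_value ORDER BY update_time ASC",
        {"sql_last_value": sql_last_value},
    ).fetchall()
    # Advance the tracking value to the last row's update_time, if any rows came back.
    return rows, (rows[-1][2] if rows else sql_last_value)

rows, last = poll(0)       # first run picks up both rows
rows2, last2 = poll(last)  # second run finds nothing new
```

Updating a row bumps its update_time past the tracked value, so the next poll re-sends only that row.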
    

Run Logstash

The following is an example of running Logstash.

# For Elasticsearch version
systemctl start logstash

# For OpenSearch version
# Use nohup for background execution.
# You need to specify the path for logstash.conf by using the -f option.
nohup {installation path}/bin/logstash -f {installation path}/config/logstash.conf &

View Cloud DB for MySQL data

The following example demonstrates how to query the transferred Cloud DB for MySQL data in Search Engine Service; the JSON below the request is the response.

GET mysql-2022.08.08/_search
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "mysql-2022.08.08",
        "_type" : "_doc",
        "_id" : "VNkwe4IBicE7MyrTCKIL",
        "_score" : 1.0,
        "_source" : {
          "contents" : "this is first contents",
          "update_time" : "2022-08-08T01:55:47.000Z",
          "unix_ts_in_secs" : 1659923747,
          "id" : 1,
          "create_time" : "2022-08-08T01:55:47.000Z",
          "@version" : "1",
          "@timestamp" : "2022-08-08T02:02:01.082Z"
        }
      },
      {
        "_index" : "mysql-2022.08.08",
        "_type" : "_doc",
        "_id" : "3yEwe4IBBeKbW6yXB2l_",
        "_score" : 1.0,
        "_source" : {
          "contents" : "this is second contents",
          "update_time" : "2022-08-08T01:59:05.000Z",
          "unix_ts_in_secs" : 1659923945,
          "id" : 2,
          "create_time" : "2022-08-08T01:59:05.000Z",
          "@version" : "1",
          "@timestamp" : "2022-08-08T02:02:01.093Z"
        }
      }
    ]
  }
}
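
A search response like the one above is plain JSON, and the original table rows travel in each hit's _source field. The following is a minimal standard-library sketch that extracts them from a trimmed copy of the response.

```python
import json

# Trimmed copy of the search response shown above.
response = json.loads("""
{
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "hits": [
      {"_source": {"id": 1, "contents": "this is first contents"}},
      {"_source": {"id": 2, "contents": "this is second contents"}}
    ]
  }
}
""")

# Pull the original rows out of each hit.
docs = [hit["_source"] for hit in response["hits"]["hits"]]
print(docs[0]["contents"])  # -> this is first contents
```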