Transferring CDB MySQL data

    Available in VPC

    Sqoop (SQL to Hadoop) is a tool that supports efficient data transfer between relational databases and Hadoop.
    You can import data stored in an RDBMS into HDFS for analysis, and export the analysis results back to the RDBMS. Data can also be sent to HDFS, Hive, HBase, Accumulo, and other stores.
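
    For orientation, the commands below are a minimal sketch of the two directions of transfer. The connection domain, database name (migdb), account (mig_acct), table name, and HDFS paths are placeholder assumptions; substitute your own values.

    # Import a single RDBMS table into HDFS (placeholder values)
    sqoop import \
      --connect jdbc:mysql://<db-private-domain>:3306/migdb \
      --username mig_acct -P \
      --table mig_test_tb \
      --target-dir /user/example/mig_test_tb

    # Export data from HDFS back into an existing RDBMS table
    sqoop export \
      --connect jdbc:mysql://<db-private-domain>:3306/migdb \
      --username mig_acct -P \
      --table mig_test_tb \
      --export-dir /user/example/mig_test_tb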

    This guide introduces how to migrate data from NAVER Cloud Platform's Cloud DB for MySQL to Cloud Hadoop using Sqoop.
    For more information about Sqoop and its commands, refer to Using Sqoop.

    Preparations

    1. Create an Object Storage bucket.
    2. Create a Cloud Hadoop cluster.
    3. Create a Cloud DB for MySQL instance.

    Data migration using Sqoop

    The Sqoop client is installed by default on the master node and data node servers (but not on the edge nodes), so you can proceed with the migration without installing it separately.

    Access the server by referring to Connecting to cluster nodes through SSH, and proceed with the following steps in order.
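
    For reference, a connection typically looks like the sketch below; the key file path, account name, and node address are placeholders (see the linked guide for the exact values for your cluster).

    # Connect to a cluster node over SSH (placeholder key and address)
    ssh -i ~/cluster-key.pem sshuser@<edge-or-master-node-ip>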

    1. Create a Sqoop account to access Cloud DB for MySQL

    Use the Manage DB user menu in the Cloud DB for MySQL console to enter the settings information and create a Cloud DB for MySQL account.
    The IP addresses of the Cloud Hadoop servers can be checked in the Ambari web UI.

    <Example>
    DB settings information

    • User_ID: mig_acct
    • Host (IP): IP range of the Cloud Hadoop server
      <Example> 10.10.10.%
    • Password: Set arbitrarily

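    If you need to confirm the IP range to enter in the Host (IP) field, a quick check on a cluster node is shown below (a sketch; the output format can vary by OS).

    # On a Cloud Hadoop node: print the node's private IP address(es)
    # to derive the range to allow, e.g. 10.10.10.%
    hostname -I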

    2. Set ACG

    Set the ACG so that the Cloud Hadoop master node server 2 can access Cloud DB for MySQL.

    • Add the IP of the Cloud Hadoop master node server 2 to the access source in the Cloud-DB ACG settings.
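    After the ACG rule is applied, you can verify connectivity from master node server 2. The check below is a sketch: the private domain is a placeholder, and it assumes the mysql client is available on the node.

    # From master node server 2: confirm the DB is reachable on port 3306
    mysql -h <db-private-domain> -P 3306 -u mig_acct -p -e "SELECT 1;"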

    3. Create sample table and data

    Create the following sample table and data in the Cloud DB for MySQL instance you created.

    Note

    MySQL access command (run from a cluster node, connecting to the DB's private domain)

    mysql -h [Cloud DB for MySQL private domain] -u [Cluster Administrator Account Name] -p
    
    -- Run against the migdb database referenced by the Sqoop command below
    -- (create the database first if it does not exist)
    USE migdb;

    -- Sample table with a composite primary key on (col1, col2)
    CREATE TABLE `mig_test_tb` (
    `col1` int(11) NOT NULL,
    `col2` char(2) NOT NULL,
    `col3` char(15) DEFAULT NULL,
    `col4` char(4) DEFAULT NULL,
    `col5` char(1) DEFAULT NULL,
    PRIMARY KEY (`col1`,`col2`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    -- 24 rows of sample data
    insert into mig_test_tb values
    (1,'a','mig-test1','run','y'),(2,'a','mig-test1','init','n'),
    (3,'b','mig-test1','run','y'),(4,'b','mig-test1','run','y'),
    (5,'c','test-mysql','init','n'),(6,'c','test-mysql','run','n'),
    (7,'d','test-mysql','init','n'),(8,'d','test-mysql','run','y'),
    (9,'e','sqoop-test','run','n'),(10,'e','sqoop-test','run','y'),
    (11,'f','sqoop-test','run','n'),(12,'f','sqoop-test','init','y'),
    (13,'g','mysql-sqoop','run','n'),(14,'g','mysql-sqoop','init','y'),
    (15,'h','mysql-sqoop','run','y'),(16,'h','mysql-sqoop','init','n'),
    (17,'i','sqp-to-my','run','n'),(18,'i','sqp-to-my','init','n'),
    (19,'j','sqp-to-my','init','y'),(20,'j','sqp-to-my','init','y'),
    (21,'k','my-mig-sq','run','n'),(22,'k','my-mig-sq','run','y'),
    (23,'i','my-mig-sq','run','y'),(24,'i','my-mig-sq','run','n');
    
    Note

    If a syntax error occurs while creating the table, check the quoted sections: copying and pasting can turn the single quotation marks ('') and backticks (`) into characters that MySQL does not accept.
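
    To confirm that the sample data was loaded, a quick check (same placeholder connection values as above) is:

    # Count the sample rows; the insert above creates 24
    mysql -h <db-private-domain> -u mig_acct -p migdb -e "SELECT COUNT(*) FROM mig_test_tb;"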

    4. Migrate data with the Sqoop command

    1. Use the import-all-tables option of the Sqoop CLI to migrate the data to Hive.
      Cloud DB for MySQL's private domain and port are used for the JDBC connection information.

      sqoop import-all-tables --connect jdbc:mysql://test-db.mysql.domain.com:3306/migdb --username mig_acct -P --hive-import
      
    Note

    Check whether each table has a primary key if an error occurs: import-all-tables uses the primary key to split the work across mappers. A workaround for tables without one is sketched below.
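
    For tables without a primary key, one option is Sqoop's --autoreset-to-one-mapper flag, which falls back to a single mapper for such tables (a sketch reusing the connection values from the command above):

      # Import all tables, using one mapper for tables without a primary key
      sqoop import-all-tables \
        --connect jdbc:mysql://test-db.mysql.domain.com:3306/migdb \
        --username mig_acct -P \
        --hive-import \
        --autoreset-to-one-mapper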

    The migration is registered and runs as a resource manager job:
    
    ```
    20/11/30 16:27:10 INFO mapreduce.Job: Job job_1606446027726_0007 completed successfully
    20/11/30 16:27:10 INFO mapreduce.Job: Counters: 30
    File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=685116
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=414
    HDFS: Number of bytes written=520
    HDFS: Number of read operations=16
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=8
    Job Counters
    Launched map tasks=4
    Other local map tasks=4
    Total time spent by all maps in occupied slots (ms)=13269
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=13269
    Total vcore-milliseconds taken by all map tasks=13269
    Total megabyte-milliseconds taken by all map tasks=9049458
    Map-Reduce Framework
    Map input records=24
    Map output records=24
    Input split bytes=414
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=393
    CPU time spent (ms)=4780
    Physical memory (bytes) snapshot=813985792
    Virtual memory (bytes) snapshot=10053660672
    Total committed heap usage (bytes)=341311488
    File Input Format Counters
    Bytes Read=0
    File Output Format Counters
    Bytes Written=520
    20/11/30 16:27:10 INFO mapreduce.ImportJobBase: Transferred 520 bytes in 15.7176 seconds (33.084 bytes/sec)
    20/11/30 16:27:10 INFO mapreduce.ImportJobBase: Retrieved 24 records.
    20/11/30 16:27:10 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
    20/11/30 16:27:10 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mig_test_tb` AS t LIMIT 1
    20/11/30 16:27:10 INFO hive.HiveImport: Loading uploaded data into Hive
    20/11/30 16:27:10 WARN conf.HiveConf: HiveConf of name hive.server2.enable.doAs.property does not exist
    20/11/30 16:27:10 WARN conf.HiveConf: HiveConf of name hive.server2.enable.doAs.property does not exist
    Logging initialized using configuration in jar:file:/home1/cdp/usr/hdp/2.6.5.0-292/hive/lib/hive-common-1.2.1000.2.6.5.0-292.jar!/hive-log4j.properties
    OK
    Time taken: 1.796 seconds
    Loading data to table default.mig_test_tb
    Table default.mig_test_tb stats: [numFiles=4, numRows=0, totalSize=520, rawDataSize=0]
    OK
    Time taken: 0.418 seconds
    ```
    
    2. Check that the migration completed successfully, indicated by the OK messages.
      • You can view the tables and data migrated to Hive from Hive View 2.0 or the HUE UI. For more information about using Hive, refer to Using Hive.

    SELECT * FROM mig_test_tb;
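
    You can also run the same check from the command line on a cluster node; the sketch below uses the Hive CLI (the import loads tables into the default database):

    # Query the imported table from a cluster node
    hive -e "SELECT * FROM default.mig_test_tb;"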
    


