需求:
用datax同步mysql: 192.168.236.134中test1庫的user表到192.168.236.136中test1庫的user表
步驟:
下載安裝包
https://github.com/alibaba/DataX/blob/master/userGuid.md
進入引導頁
https://github.com/alibaba/DataX/blob/master/userGuid.md
前置準備
安裝jdk1.8,并配置環境變量
安裝maven
yum install maven -y
安裝python
我這邊直接用寶塔安裝了
準備兩個mysql
準備測試數據
CREATE TABLE IF NOT EXISTS `user` (`id` INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,`username` VARCHAR(50) NOT NULL,`password` VARCHAR(100) NOT NULL,`email` VARCHAR(100) NOT NULL,`created_at` TIMESTAMP DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
批量生成測試數據的存儲過程
DELIMITER $$CREATE PROCEDURE GenerateUserTestData(IN num_records INT)
BEGINDECLARE i INT DEFAULT 1;DECLARE v_username VARCHAR(50);DECLARE v_password VARCHAR(100);DECLARE v_email VARCHAR(100);WHILE i <= num_records DO-- 生成隨機用戶名(如 User_12345)SET v_username = CONCAT('User_', FLOOR(10000 + RAND() * 90000));-- 生成隨機密碼(簡化示例,實際應加密)SET v_password = CONCAT('Pass_', FLOOR(100000 + RAND() * 900000));-- 生成隨機郵箱(如 user12345@example.com)SET v_email = CONCAT('user', FLOOR(10000 + RAND() * 90000), '@example.com');INSERT INTO `user` (username, password, email)VALUES (v_username, v_password, v_email);-- 每1000條提交一次,避免事務過大IF i % 1000 = 0 THENCOMMIT;END IF;SET i = i + 1;END WHILE;
END$$DELIMITER ;
-- 生成100條測試數據
CALL GenerateUserTestData(100);
準備datax同步腳本
可使用DeepSeek或者qwen.ai 生成
提示詞如下:
datax已經安裝好了。現在我需要使用datax將如下配置的mysql
ip: 192.168.236.134
用戶名:test1
密碼:E7WEC4ZJnHGhHBMH
中的user表數據同步到 如下mysql中。
ip: 192.168.236.136
用戶名:test1
密碼:G3edhwhCSAYa82EK
得到json腳本
{"job": {"content": [{"reader": {"name": "mysqlreader","parameter": {"username": "test1","password": "E7WEC4ZJnHGhHBMH","connection": [{"jdbcUrl": ["jdbc:mysql://192.168.236.134:3306/數據庫名"],"table": ["user"]}],"column": ["*"],"splitPk": ""}}},{"writer": {"name": "mysqlwriter","parameter": {"username": "test1","password": "G3edhwhCSAYa82EK","connection": [{"jdbcUrl": "jdbc:mysql://192.168.236.136:3306/數據庫名","table": "user"}],"column": ["*"],"writeMode": "insert"}}}],"setting": {"speed": {"channel": "1"}}}
}
執行報錯
喂給千問
檢測安裝是否正常
驗證步驟
ls $DATAX_HOME/plugin/reader/mysqlreader
應輸出類似:libs、mysqlreader-0.0.1-SNAPSHOT.jar等文件
生成配置文件模板 (測試用):
python datax.py -r mysqlreader -w mysqlwriter > test.json
編輯test.json后執行,驗證是否為配置問題
若問題仍未解決,可參考
中的錯誤分析方法,檢查日志或通過調試工具定位具體插件加載失敗的原因。
根據生成模板重新配置
json格式不正確
可以如下網頁驗證
https://jsonlint.com/?spm=a2ty_o01.29997173.0.0.1f39c9213lfeok
修改腳本
{"job": {"content": [{"reader": {"name": "mysqlreader","parameter": {"username": "test1","password": "E7WEC4ZJnHGhHBMH","column": ["*"],"connection": [{"jdbcUrl": ["jdbc:mysql://192.168.236.134:3306/test1?useSSL=false&serverTimezone=UTC"],"table": ["user1"]}]}},"writer": {"name": "mysqlwriter","parameter": {"username": "test1","password": "G3edhwhCSAYa82EK","connection": [{"jdbcUrl": "jdbc:mysql://192.168.236.136:3306/test1?useSSL=false&serverTimezone=UTC","table": ["user1"]}],"obWriteMode": "insert","column": ["*"],"preSql": [],"postSql": []}}}],"setting": {"speed": {"channel": "5"}}}
}
執行,顯示如下表示成功
[root@localhost bin]# python datax.py test9.jsonDataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.2025-04-03 19:57:24.786 [main] INFO MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2025-04-03 19:57:24.788 [main] INFO MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2025-04-03 19:57:24.812 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2025-04-03 19:57:24.816 [main] INFO Engine - the machine info =>osInfo: Linux amd64 3.10.0-1160.el7.x86_64jvmInfo: Red Hat, Inc. 1.8 25.412-b08cpu num: 8totalPhysicalMemory: -0.00GfreePhysicalMemory: -0.00GmaxFileDescriptorCount: -1currentOpenFileDescriptorCount: -1GC Names [PS MarkSweep, PS Scavenge]MEMORY_NAME | allocation_size | init_sizePS Eden Space | 256.00MB | 256.00MBCode Cache | 240.00MB | 2.44MBCompressed Class Space | 1,024.00MB | 0.00MBPS Survivor Space | 42.50MB | 42.50MBPS Old Gen | 683.00MB | 683.00MBMetaspace | -0.00MB | 0.00MB2025-04-03 19:57:24.825 [main] INFO Engine -
{"content":[{"reader":{"name":"mysqlreader","parameter":{"username":"test1","password":"****************","column":["*"],"connection":[{"jdbcUrl":["jdbc:mysql://192.168.236.134:3306/test1?useSSL=false&serverTimezone=UTC"],"table":["user1"]}]}},"writer":{"name":"mysqlwriter","parameter":{"username":"test1","password":"****************","connection":[{"jdbcUrl":"jdbc:mysql://192.168.236.136:3306/test1?useSSL=false&serverTimezone=UTC","table":["user1"]}],"obWriteMode":"insert","column":["*"],"preSql":[],"postSql":[]}}}],"setting":{"speed":{"channel":"5"}}
}2025-04-03 19:57:24.839 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false
2025-04-03 19:57:24.839 [main] INFO JobContainer - DataX jobContainer starts job.
2025-04-03 19:57:24.840 [main] INFO JobContainer - Set jobId = 0
2025-04-03 19:57:25.079 [job-0] INFO OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:mysql://192.168.236.134:3306/test1?useSSL=false&serverTimezone=UTC&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true.
2025-04-03 19:57:25.080 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置存在一定的風險. 因為您未配置讀取數據庫表的列,當您的表字段個數、類型有變動時,可能影響任務正確性甚至會運行出錯。請檢查您的配置并作出修改.
2025-04-03 19:57:25.275 [job-0] INFO OriginalConfPretreatmentUtil - table:[user1] all columns:[
id,user_name,pass_word,email,created_at
].
2025-04-03 19:57:25.275 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置信息存在風險. 因為您配置的寫入數據庫表的列為*,當您的表字段個數、類型有變動時,可能影響任務正確性甚至會運行出錯。請檢查您的配置并作出修改.
2025-04-03 19:57:25.276 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
INSERT INTO %s (id,user_name,pass_word,email,created_at) VALUES(?,?,?,?,?)
], which jdbcUrl like:[jdbc:mysql://192.168.236.136:3306/test1?useSSL=false&serverTimezone=UTC&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&rewriteBatchedStatements=true&tinyInt1isBit=false]
2025-04-03 19:57:25.277 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2025-04-03 19:57:25.277 [job-0] INFO JobContainer - DataX Reader.Job [mysqlreader] do prepare work .
2025-04-03 19:57:25.277 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2025-04-03 19:57:25.277 [job-0] INFO JobContainer - jobContainer starts to do split ...
2025-04-03 19:57:25.278 [job-0] INFO JobContainer - Job set Channel-Number to 5 channels.
2025-04-03 19:57:25.280 [job-0] INFO JobContainer - DataX Reader.Job [mysqlreader] splits to [1] tasks.
2025-04-03 19:57:25.281 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2025-04-03 19:57:25.301 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2025-04-03 19:57:25.303 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2025-04-03 19:57:25.304 [job-0] INFO JobContainer - Running by standalone Mode.
2025-04-03 19:57:25.320 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2025-04-03 19:57:25.323 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2025-04-03 19:57:25.323 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2025-04-03 19:57:25.346 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2025-04-03 19:57:25.445 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select * from user1
] jdbcUrl:[jdbc:mysql://192.168.236.134:3306/test1?useSSL=false&serverTimezone=UTC&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2025-04-03 19:57:25.456 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from user1
] jdbcUrl:[jdbc:mysql://192.168.236.134:3306/test1?useSSL=false&serverTimezone=UTC&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2025-04-03 19:57:25.749 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[421]ms
2025-04-03 19:57:25.750 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2025-04-03 19:57:35.445 [job-0] INFO StandAloneJobContainerCommunicator - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2025-04-03 19:57:35.445 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2025-04-03 19:57:35.445 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2025-04-03 19:57:35.445 [job-0] INFO JobContainer - DataX Reader.Job [mysqlreader] do post work.
2025-04-03 19:57:35.445 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2025-04-03 19:57:35.446 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /usr/local/datax/hook
2025-04-03 19:57:35.447 [job-0] INFO JobContainer -[total cpu info] =>averageCpu | maxDeltaCpu | minDeltaCpu -1.00% | -1.00% | -1.00%[total gc info] =>NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTimePS MarkSweep | 1 | 1 | 1 | 0.042s | 0.042s | 0.042sPS Scavenge | 1 | 1 | 1 | 0.040s | 0.040s | 0.040s2025-04-03 19:57:35.447 [job-0] INFO JobContainer - PerfTrace not enable!
2025-04-03 19:57:35.448 [job-0] INFO StandAloneJobContainerCommunicator - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2025-04-03 19:57:35.449 [job-0] INFO JobContainer -
任務啟動時刻 : 2025-04-03 20:04:41
任務結束時刻 : 2025-04-03 20:04:51
任務總計耗時 : 10s
任務平均流量 : 2B/s
記錄寫入速度 : 0rec/s
讀出記錄總數 : 100
讀寫失敗總數 : 0