# 一、背景
对于使用单数据库运行的系统来说,如何安全简单地将数据迁移至水平分片的数据库上,一直以来都是一个迫切的需求。对于已经使用了ShardingSphere
的用户来说,随着业务规模的快速变化,也可能需要对现有的分片集群进行弹性扩容或缩容。
ShardingSphere
在分片算法上提供给用户极大的自由度,但却给弹性伸缩造成了极大的挑战。
1、找寻既能支持自定义的分片算法,又能高效地将数据节点进行扩缩容的方式,是弹性伸缩面临的第一个挑战。
2、在伸缩过程中,不应该对正在运行的业务造成影响。尽可能减少伸缩时数据不可用的时间窗口,甚至做到用户完全无感知,是弹性伸缩的另一个挑战。
3、弹性伸缩不应该对现有的数据造成影响。如何保证数据的正确性,是弹性伸缩的第三个挑战。
弹性伸缩过程示意如下图:

支持自定义分片算法,减少数据伸缩及迁移时的业务影响,提供一站式的通用弹性伸缩解决方案,是ShardingSphere
弹性伸缩的主要设计目标。ShardingSphere-Scaling
是一个提供给用户的通用数据接入迁移及弹性伸缩的解决方案。ShardingSphere-Scaling
从4.1.0
版本开始向用户提供,当前处于alpha
开发阶段。
WARNING
支持将外围数据迁移至ShardingSphere
所管理的数据库,并对ShardingSphere
的数据节点进行扩容或缩容。不支持无主键表进行扩缩容,也不支持在当前存储节点之上做迁移,需要准备一个全新的数据库集群作为迁移目标库。
# 二、实现方案
# 原理说明
目前的弹性伸缩解决方案为:临时地使用两个数据库集群,伸缩完成后切换的方式实现。
优点:伸缩过程中原始数据没有任何影响。伸缩失败无风险。不受分片策略影响。
缺点:在一定时间内存在冗余服务器。所有数据都需要移动。
弹性伸缩模块会通过解析旧分片规则,提取配置中的数据源、数据节点等信息,之后创建伸缩作业工作流,将一次弹性伸缩拆解为 4 个主要阶段:
1、准备阶段。
2、存量数据迁移阶段。
3、增量数据同步阶段。
4、规则切换阶段。
弹性伸缩工作流如下图所示:

# 执行阶段说明
准备阶段: 在准备阶段,弹性伸缩模块会进行数据源连通性及权限的校验,同时进行存量数据的统计、日志位点的记录,最后根据数据量和用户设置的并行度,对任务进行分片。
存量数据迁移阶段: 执行在准备阶段拆分好的存量数据迁移作业,存量迁移阶段采用JDBC
查询的方式,直接从数据节点中读取数据,并使用新规则写入到新集群中。
增量数据同步阶段: 由于存量数据迁移耗费的时间受到数据量和并行度等因素影响,此时需要对这段时间内业务新增的数据进行同步。不同的数据库使用的技术细节不同,但总体上均为基于复制协议或WAL
日志实现的变更数据捕获功能。
WARNING
MySQL
:订阅并解析binlog
。
PostgreSQL
:采用官方逻辑复制test_decoding
。
这些捕获的增量数据,同样会由弹性伸缩模块根据新规则写入到新数据节点中。当增量数据基本同步完成时(由于业务系统未停止,增量数据是不断的),则进入规则切换阶段。
规则切换阶段: 在此阶段,可能存在一定时间的业务只读窗口期,通过设置数据库只读或ShardingSphere
的熔断机制,让旧数据节点中的数据短暂静态,确保增量同步已完全完成。
这个窗口期时间短则数秒,长则数分钟,取决于数据量和用户是否需要对数据进行强校验。确认完成后,ShardingSphere
可通过配置中心修改配置,将业务导向新规则的集群,弹性伸缩完成。
# 限流与熔断
超负荷流量下,针对某一节点进行熔断和限流,以保证整个数据库集群得以继续运行,是分布式系统下对单一节点控制能力的挑战。熔断指的是阻断ShardingSphere
和数据库的连接。当某个ShardingSphere
节点超过负载后,停止该节点对数据库的访问,使数据库能够保证足够的资源为其他节点提供服务。限流是指面对超负荷的请求开启流量限制,以保护部分请求可以得以高质量的响应。
目前实现的限流是在数据迁移或扩缩容过程中,限制源端或目标端的流量。下表为目前提供的熔断语句。
语句 | 说明 | 示例 |
---|---|---|
[ENABLE / DISABLE] READWRITE_SPLITTING (READ)? resourceName [FROM databaseName] | 启用/禁用读库 | ENABLE READWRITE_SPLITTING READ resource_0 |
[ENABLE / DISABLE] INSTANCE instanceId | 启用/禁用 Proxy 实例 | DISABLE INSTANCE instance_1 |
SHOW INSTANCE LIST | 查询 Proxy 实例信息 | SHOW INSTANCE LIST |
SHOW READWRITE_SPLITTING (READ)? resourceName [FROM databaseName] | 查询所有读库的状态 | SHOW READWRITE_SPLITTING READ RESOURCES |
实例级熔断用于多Proxy
实例场景下,启用/禁用某个Porxy
实例。库级熔断主要用于读写分离场景下启用/禁用读库。
# 三、用例测试
本节演示两个例子,例1是从MySQL
实例迁移到ShardingSphere-Proxy
,例2是对现有数据库节点进行扩容。两个例子都用ShardingSphere-Scaling
实现,过成类似。
ShardingSphere-Scaling
目前不是一个独立的产品,而是以ShardingSphere-Proxy
中的一个配置项提供相应功能。如果后端连接以下数据库,需要下载相应JDBC
驱动jar
包,并将其放入${shardingsphere-proxy}/lib
目录中。
数据库 | JDBC驱动 | 参考 |
---|---|---|
MySQL | mysql-connector-java-5.1.47.jar (opens new window) | Connector/J Versions (opens new window) |
# 引入 JDBC 驱动
cp ~/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar ~/apache-shardingsphere-5.1.1-shardingsphere-proxy-bin/lib/
# 重启Proxy
/root/apache-shardingsphere-5.1.1-shardingsphere-proxy-bin/bin/stop.sh
/root/apache-shardingsphere-5.1.1-shardingsphere-proxy-bin/bin/start.sh
2
3
4
5
6
# 数据迁移
需求:现有一个正在使用的MySQL
数据库,需要将其中的两个表迁移到Proxy
下的数据节点。两个表的数据实时变化,要求尽量缩短业务影响时间。
源:172.18.26.198:3306/migrating_db
目标:172.18.10.66:3306/db1、172.18.10.66:3306/db2、172.18.18.102:3306\db1、172.18.18.102:3306\db2
Proxy:172.18.10.66:3307、172.18.18.102:3307,Cluster运行模式
2
3
准备测试用例环境:目标66、102执行
drop database if exists db1;
drop database if exists db2;
create database db1;
create database db2;
2
3
4
源198执行:
-- 建库
drop database if exists migrating_db;
create database migrating_db;
use migrating_db;
-- 建表
create table t_order (
order_id bigint auto_increment primary key,
order_datetime datetime not null,
user_id bigint not null,
order_amount decimal(10,2) not null default 0,
key idx_order_datetime (order_datetime),
key idx_user_id (user_id));
create table t_order_item (
order_item_id bigint auto_increment primary key,
order_id bigint not null,
item_id int null,
item_quantity int not null default 0,
key idx_order_id (order_id));
-- 创建负载模拟存储过程
delimiter //
create procedure sp_generate_order_data(p_seconds int)
begin
set @start_ts := now();
set @start_date := unix_timestamp('2023-03-01');
set @end_date := unix_timestamp('2023-06-01');
while timestampdiff(second,@start_ts,now()) <= p_seconds do
start transaction;
set @order_datetime := from_unixtime(@start_date + rand() * (@end_date - @start_date));
set @user_id := floor(1 + rand() * 100000000);
set @order_amount := round((10 + rand() * 2000),2);
insert into t_order (order_datetime, user_id, order_amount)
values (@order_datetime, @user_id, @order_amount);
set @order_id := last_insert_id();
set @quantity := floor(1 + rand() * 50);
set @i := 1;
while @i <= @quantity do
set @item_id := floor(1 + rand() * 10000);
set @item_quantity := floor(1 + rand() * 20);
insert into t_order_item (order_id, item_id, item_quantity) values (@order_id, @item_id, @item_quantity);
set @i:=@i+1;
end while;
commit;
end while;
end
//
delimiter ;
-- 执行存储过程
call sp_generate_order_data(1800);
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
# 创建数据迁移作业
【1】创建逻辑库
# 连接Proxy
mysql -u root -h 172.18.10.66 -P 3307 -p123456
# 创建逻辑库
drop database if exists migrating_db;
create database migrating_db;
use migrating_db;
2
3
4
5
6
【2】添加源资源
add resource
resource_source (host=172.18.26.198, port=3306, db=migrating_db, user=wxy, password=mypass);
show schema resources\G
2
3
【3】确认自动创建单表规则
count schema rules;
【4】将单表规则改为单片规则
create sharding table rule
t_order (datanodes("resource_source.t_order")),
t_order_item (datanodes("resource_source.t_order_item"));
2
3
【5】预览当前分片规则
preview select count(1) from t_order;
preview select count(1) from t_order_item;
2
【6】执行SQL
确认可以正确查询到数据:
select count(1) from t_order;
select count(1) from t_order_item;
2
【7】创建手动模式scaling
规则
create sharding scaling rule scaling_manual (
input(
worker_thread=40,
batch_size=1000
),
output(
worker_thread=40,
batch_size=1000
),
stream_channel(type(name=memory, properties("block-queue-size"=10000))),
data_consistency_checker(type(name=data_match, properties("chunk-size"=1000)))
);
2
3
4
5
6
7
8
9
10
11
12
【8】查看sharding_scaling
规则
show sharding scaling rules\G
WARNING
参数说明:
worker_thread:从源端摄取 / 写入到目标端全量数据的线程池大小。
batch_size:一次查询操作返回的最大记录数。
stream_channel:数据通道,连接生产者和消费者,用于 input 和 output 环节。type
指定算法类型,可选项MEMORY
;lock-queue-size
算法属性指定阻塞队列大小。
data_consistency_checker
:数据一致性校验算法。type
指定算法类型,可选项DATA_MATCH、CRC32_MATCH
;chunk-size
算法属性指定一次查询操作返回的最大记录数。
dataconsistencychecker
的type
可以通过执行show scaling check algorithms
查询到:
mysql> show scaling check algorithms;
+-------------+----------------------------+--------------------------------------------------------------+----------------+
| type | description | supported_database_types | provider |
+-------------+----------------------------+--------------------------------------------------------------+----------------+
| CRC32_MATCH | Match CRC32 of records. | MySQL | ShardingSphere |
| DATA_MATCH | Match raw data of records. | SQL92,openGauss,PostgreSQL,MySQL,MariaDB,H2,Oracle,SQLServer | ShardingSphere |
+-------------+----------------------------+--------------------------------------------------------------+----------------+
2 rows in set (0.01 sec)
2
3
4
5
6
7
8
data_match
支持所有数据库,但是性能不是最好的;crc32_match
只支持MySQL
,但是性能更好。这里还有completion_detector
、rateLimiter
两个遗留配置问题没有解决。除手动模式外,官方文档中说明还支持自动模式配置:
create sharding scaling rule scaling_auto (
input(
worker_thread=40,
batch_size=1000
),
output(
worker_thread=40,
batch_size=1000
),
stream_channel(type(name=memory, properties("block-queue-size"=10000))),
completion_detector(type(name=idle, properties("incremental-task-idle-seconds-threshold"=1800))),
data_consistency_checker(type(name=data_match, properties("chunk-size"=1000)))
);
2
3
4
5
6
7
8
9
10
11
12
13
completion_detector
指定作业是否接近完成检测算法。如果不配置则无法自动进行后续步骤,可以通过DistSQL
手动操作。type
指定算法类型,可选项IDLE
;incremental-task-idle-seconds-threshold
算法属性指定如果增量同步任务不再活动超过一定时间(秒数),那么可以认为增量同步任务接近完成,适用算法类型IDLE
。
但是,我在配置自动模式触发迁移时报错:
[ERROR] 2022-06-06 06:35:21.727 [0130317c30317c3054317c7363616c696e675f6462_Worker-1] o.a.s.e.e.h.g.LogJobErrorHandler - Job '0130317c30317c3054317c7363616c696e675f6462' exception occur in job processing
java.lang.IllegalArgumentException: incremental task idle threshold can not be null.
2
第二个遗留问题是限流配置。Proxy 5.1.1
文档中关于scaling
配置项中有如下说明:
rateLimiter: # 限流算法。如果不配置则不限流。
type: # 算法类型。可选项:
props: # 算法属性
2
3
可以看到文档中没有给出可选的算法类型和相关算法属性。DistSQL
实例中也没有rateLimiter
限流配置项。
【9】添加目标资源:
add resource
resource_1 (host=172.18.10.66, port=3306, db=db1, user=wxy, password=mypass),
resource_2 (host=172.18.10.66, port=3306, db=db2, user=wxy, password=mypass),
resource_3 (host=172.18.18.102, port=3306, db=db1, user=wxy, password=mypass),
resource_4 (host=172.18.18.102, port=3306, db=db2, user=wxy, password=mypass);
show schema resources\G
2
3
4
5
6
【10】修改分片规则,触发迁移:
alter sharding table rule
t_order (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_id,type(name=snowflake))),
t_order_item (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_item_id,type(name=snowflake)));
2
3
4
5
6
7
8
9
目前只有通过执行ALTER SHARDING TABLE RULE DistSQL
来触发迁移。本例中两个迁移表的分片规则从单片datanodes
改为四个数据源的8个分片,会触发迁移。
【11】监控迁移作业:
mysql> show scaling list;
+------------------------------------------------+----------------------+----------------------+--------+---------------------+-----------+
| id | tables | sharding_total_count | active | create_time | stop_time |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+-----------+
| 0130317c30317c3054317c6d6967726174696e675f6462 | t_order,t_order_item | 1 | true | 2022-06-06 18:19:47 | NULL |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+-----------+
1 row in set (0.01 sec)
mysql> show scaling status 0130317c30317c3054317c6d6967726174696e675f6462;
+------+-----------------+------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source | EXECUTE_INVENTORY_TASK | true | 80 | 0 |
+------+-----------------+------------------------+--------+-------------------------------+--------------------------+
1 row in set (0.00 sec)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SHOW SCALING LIST
查询所有迁移作业。这是个全局DistSQL
命令,返回所有状态的Scaling
作业,而不是只针对回当前逻辑数据库的。SHOW SCALING STATUS
命令查询某个迁移作业的进度。
# 割接
割接步骤包括:
【1】源数据库应用停写,以免丢失数据。
【2】查看迁移作业进度。
【3】Proxy
停写,即熔断。
【4】数据一致性校验。
【5】切换元数据。
【6】确认目标分片规则生效。
【7】创建绑定表规则(可选)。
【8】确认迁移作业已完成。
【9】应用连接到Proxy
访问数据库。
应用停写:Ctrl + c 或kill 掉正在运行的存储过程。
查询迁移作业进度:
mysql> show scaling status 0130317c30317c3054317c6d6967726174696e675f6462;
+------+-----------------+--------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+--------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source | EXECUTE_INCREMENTAL_TASK | true | 100 | 5 |
+------+-----------------+--------------------------+--------+-------------------------------+--------------------------+
1 row in set (0.00 sec)
2
3
4
5
6
7
当status
达到EXECUTE_INCREMENTAL_TASK
,全量迁移已完成,在增量迁移阶段。inventory_finished_percentage
表示存量数据完成百分比,incremental_idle_seconds
表示增量空闲秒数,指示增量是否接近完成。
Proxy
停写,选择一个业务低峰期,对源端库或数据操作入口做停写:
stop scaling source writing 0130317c30317c3054317c6d6967726174696e675f6462;
数据一致性校验,根据数据量,这步可能执行较长时间:
mysql> check scaling 0130317c30317c3054317c6d6967726174696e675f6462 by type (name=crc32_match);
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| table_name | source_records_count | target_records_count | records_count_matched | records_content_matched |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| t_order | 57599 | 57599 | true | true |
| t_order_item | 1468140 | 1468140 | true | true |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
2 rows in set (2.17 sec)
2
3
4
5
6
7
8
切换元数据:
apply scaling 0130317c30317c3054317c6d6967726174696e675f6462;
预览目标分片是否已生效:
mysql> preview select count(1) from t_order;
+------------------+-------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_0 UNION ALL select count(1) from t_order_4 |
| resource_2 | select count(1) from t_order_1 UNION ALL select count(1) from t_order_5 |
| resource_3 | select count(1) from t_order_2 UNION ALL select count(1) from t_order_6 |
| resource_4 | select count(1) from t_order_3 UNION ALL select count(1) from t_order_7 |
+------------------+-------------------------------------------------------------------------+
4 rows in set (0.01 sec)
mysql> preview select count(1) from t_order_item;
+------------------+-----------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-----------------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_item_0 UNION ALL select count(1) from t_order_item_4 |
| resource_2 | select count(1) from t_order_item_1 UNION ALL select count(1) from t_order_item_5 |
| resource_3 | select count(1) from t_order_item_2 UNION ALL select count(1) from t_order_item_6 |
| resource_4 | select count(1) from t_order_item_3 UNION ALL select count(1) from t_order_item_7 |
+------------------+-----------------------------------------------------------------------------------+
4 rows in set (0.00 sec)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
可以看到数据已经分片到新的数据库资源。
create sharding binding table rules (t_order,t_order_item);
确认绑定表规则生效:
mysql> preview select i.* from t_order o join t_order_item i on o.order_id=i.order_id where o.order_id in (10, 11);
+------------------+---------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+---------------------------------------------------------------------------------------------------------+
| resource_3 | select i.* from t_order_2 o join t_order_item_2 i on o.order_id=i.order_id where o.order_id in (10, 11) |
| resource_4 | select i.* from t_order_3 o join t_order_item_3 i on o.order_id=i.order_id where o.order_id in (10, 11) |
+------------------+---------------------------------------------------------------------------------------------------------+
2 rows in set (0.06 sec)
2
3
4
5
6
7
8
确认迁移作业已完成:
mysql> show scaling status 0130317c30317c3054317c7363616c696e675f6462;
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| 0 | resource_source | FINISHED | false | 100 | 251 |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
1 rows in set (0.00 sec)
2
3
4
5
6
7
# 四、数据库节点扩容
# 准备测试环境用例
需求:现有两个正在使用的8分片表,需要将它们的数据节点扩容到16分片。两个表的数据实时变化,要求尽量缩短业务影响时间。
WARNING
源:172.18.26.198:3306/db1
、172.18.26.198:3306/db2
目标:172.18.10.66:3306/db1
、172.18.10.66:3306/db2
、172.18.18.102:3306\db1
、172.18.18.102:3306\db2
Proxy
:172.18.10.66:3307
、172.18.18.102:3307
,Cluster
运行模式
【1】准备测试用例环境:源198
,目标66
、102
执行:
drop database if exists db1;
drop database if exists db2;
create database db1;
create database db2;
2
3
4
【2】连接Proxy
:
mysql -u root -h 172.18.10.66 -P 3307 -p123456
【3】创建逻辑库:
drop database if exists scaling_db;
create database scaling_db;
use scaling_db;
2
3
【4】添加源资源:
add resource
resource_source1 (host=172.18.26.198, port=3306, db=db1, user=wxy, password=mypass),
resource_source2 (host=172.18.26.198, port=3306, db=db2, user=wxy, password=mypass);
show schema resources\G
2
3
4
【5】创建规则:
-- 分片表
create sharding table rule
t_order (
resources(resource_source1,resource_source2),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_id,type(name=snowflake))),
t_order_item (
resources(resource_source1,resource_source2),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_item_id,type(name=snowflake)));
-- 绑定表
create sharding binding table rules (t_order,t_order_item);
2
3
4
5
6
7
8
9
10
11
12
13
【6】建表:
create table t_order (
order_id bigint auto_increment primary key,
order_datetime datetime not null,
user_id bigint not null,
order_amount decimal(10,2) not null default 0,
key idx_order_datetime (order_datetime),
key idx_user_id (user_id));
create table t_order_item (
order_item_id bigint auto_increment primary key,
order_id bigint not null,
item_id int null,
item_quantity int not null default 0,
key idx_order_id (order_id));
2
3
4
5
6
7
8
9
10
11
12
13
14
【7】预览规则是否已生效:
mysql> preview select count(1) from t_order;
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_source1 | select count(1) from t_order_0 UNION ALL select count(1) from t_order_2 UNION ALL select count(1) from t_order_4 UNION ALL select count(1) from t_order_6 |
| resource_source2 | select count(1) from t_order_1 UNION ALL select count(1) from t_order_3 UNION ALL select count(1) from t_order_5 UNION ALL select count(1) from t_order_7 |
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> preview select count(1) from t_order_item;
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_source1 | select count(1) from t_order_item_0 UNION ALL select count(1) from t_order_item_2 UNION ALL select count(1) from t_order_item_4 UNION ALL select count(1) from t_order_item_6 |
| resource_source2 | select count(1) from t_order_item_1 UNION ALL select count(1) from t_order_item_3 UNION ALL select count(1) from t_order_item_5 UNION ALL select count(1) from t_order_item_7 |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> preview select i.* from t_order o join t_order_item i on o.order_id=i.order_id where o.order_id in (10, 11);
+------------------+---------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+---------------------------------------------------------------------------------------------------------+
| resource_source1 | select i.* from t_order_2 o join t_order_item_2 i on o.order_id=i.order_id where o.order_id in (10, 11) |
| resource_source2 | select i.* from t_order_3 o join t_order_item_3 i on o.order_id=i.order_id where o.order_id in (10, 11) |
+------------------+---------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
【8】添加存量数据:
insert into t_order (order_id, order_datetime, user_id, order_amount)
values
(1, now(), 1, 100),(2, now(), 2, 200),(3, now(), 3, 300),(4, now(), 4, 400),
(5, now(), 5, 500),(6, now(), 6, 600),(7, now(), 7, 700),(8, now(), 8, 800);
insert into t_order_item (order_item_id, order_id, item_id, item_quantity)
values
(1,1,1,10),(2,1,2,20),(3,2,3,30),(4,2,4,40),(5,3,1,10),(6,3,2,20),(7,4,3,30),(8,4,4,40),
(9,5,1,10),(10,5,2,20),(11,6,3,30),(12,6,4,40),(13,7,1,10),(14,7,2,20),(15,8,3,30),(16,8,4,40);
2
3
4
5
6
7
8
9
ShardingSphere-Proxy
只支持对非分片表使用MySQL
存储函数、存储过程操作,因此这里只能使用普通的insert
语句。
# 创建数据迁移作业
创建数据迁移作业包含以下步骤:
【1】创建逻辑库。
【2】添加源资源。
【3】把现有系统中的表配置到规则里。
【4】创建sharding scaling
规则。
【5】添加目标资源。
【6】修改分片规则,触发迁移。
【7】监控迁移作业
本例中因为我们的源和目标使用同一个Proxy
集群,第1-3步已经在上一步“准备测试用例环境”中完成了。
创建手动模式scaling
规则:
create sharding scaling rule scaling_manual (
input(
worker_thread=40,
batch_size=1000
),
output(
worker_thread=40,
batch_size=1000
),
stream_channel(type(name=memory, properties("block-queue-size"=10000))),
data_consistency_checker(type(name=data_match, properties("chunk-size"=1000)))
);
2
3
4
5
6
7
8
9
10
11
12
添加目标资源:
add resource
resource_1 (host=172.18.10.66, port=3306, db=db1, user=wxy, password=mypass),
resource_2 (host=172.18.10.66, port=3306, db=db2, user=wxy, password=mypass),
resource_3 (host=172.18.18.102, port=3306, db=db1, user=wxy, password=mypass),
resource_4 (host=172.18.18.102, port=3306, db=db2, user=wxy, password=mypass);
show schema resources\G
2
3
4
5
6
修改分片规则,触发迁移,注意绑定表只能一块迁移:
alter sharding table rule
t_order (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=16)),
key_generate_strategy(column=order_id,type(name=snowflake))),
t_order_item (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=16)),
key_generate_strategy(column=order_item_id,type(name=snowflake)));
2
3
4
5
6
7
8
9
监控迁移作业:
mysql> show scaling list;
+------------------------------------------------+----------------------+----------------------+--------+---------------------+---------------------+
| id | tables | sharding_total_count | active | create_time | stop_time |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+---------------------+
| 0130317c30317c3054317c6d6967726174696e675f6462 | t_order,t_order_item | 1 | false | 2022-06-06 18:19:47 | 2022-06-06 18:22:56 |
| 0130317c30317c3054317c7363616c696e675f6462 | t_order,t_order_item | 2 | true | 2022-06-07 07:13:30 | NULL |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+---------------------+
2 rows in set (0.01 sec)
mysql> show scaling status 0130317c30317c3054317c6d6967726174696e675f6462;
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| 0 | resource_source | FINISHED | false | 100 | 46550 |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
1 row in set (0.01 sec)
mysql> show scaling status 0130317c30317c3054317c7363616c696e675f6462;
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source1 | EXECUTE_INCREMENTAL_TASK | true | 100 | 215 |
| 1 | resource_source2 | EXECUTE_INCREMENTAL_TASK | true | 100 | 215 |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
2 rows in set (0.01 sec)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
添加增量数据:
insert into t_order (order_id, order_datetime, user_id, order_amount)
values (9, now(), 1, 100),(10, now(), 2, 200);
insert into t_order_item (order_item_id, order_id, item_id, item_quantity)
values
(17,9,1,10),(18,9,2,20),(19,9,3,30),(20,9,4,40),(21,10,1,10),(22,10,2,20),(23,10,3,30),(24,10,4,40);
2
3
4
5
6
# 割接
割接步骤包括:
【1】源数据库应用停写,以免丢失数据。
【2】查看迁移作业进度。
【3】Proxy
停写,即熔断。
【4】数据一致性校验。
【5】切换元数据。
【6】确认目标分片规则生效。
【7】确认迁移作业已完成。
【8】应用连接到Proxy
访问数据库。
查询迁移作业进度:
mysql> show scaling status 0130317c30317c3054317c7363616c696e675f6462;
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source1 | EXECUTE_INCREMENTAL_TASK | true | 100 | 51 |
| 1 | resource_source2 | EXECUTE_INCREMENTAL_TASK | true | 100 | 51 |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
2 rows in set (0.01 sec)
2
3
4
5
6
7
8
Proxy
停写,选择一个业务低峰期,对源端库或数据操作入口做停写:
stop scaling source writing 0130317c30317c3054317c7363616c696e675f6462;
数据一致性校验,根据数据量,这步可能执行较长时间:
mysql> check scaling 0130317c30317c3054317c7363616c696e675f6462 by type (name=crc32_match);
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| table_name | source_records_count | target_records_count | records_count_matched | records_content_matched |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| t_order | 10 | 10 | true | true |
| t_order_item | 24 | 24 | true | true |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
2 rows in set (0.46 sec)
2
3
4
5
6
7
8
切换元数据:
apply scaling 0130317c30317c3054317c7363616c696e675f6462;
预览分片规则是否已生效:
mysql> preview select count(1) from t_order;
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_0 UNION ALL select count(1) from t_order_4 UNION ALL select count(1) from t_order_8 UNION ALL select count(1) from t_order_12 |
| resource_2 | select count(1) from t_order_1 UNION ALL select count(1) from t_order_5 UNION ALL select count(1) from t_order_9 UNION ALL select count(1) from t_order_13 |
| resource_3 | select count(1) from t_order_2 UNION ALL select count(1) from t_order_6 UNION ALL select count(1) from t_order_10 UNION ALL select count(1) from t_order_14 |
| resource_4 | select count(1) from t_order_3 UNION ALL select count(1) from t_order_7 UNION ALL select count(1) from t_order_11 UNION ALL select count(1) from t_order_15 |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.01 sec)
mysql> preview select count(1) from t_order_item;
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_item_0 UNION ALL select count(1) from t_order_item_4 UNION ALL select count(1) from t_order_item_8 UNION ALL select count(1) from t_order_item_12 |
| resource_2 | select count(1) from t_order_item_1 UNION ALL select count(1) from t_order_item_5 UNION ALL select count(1) from t_order_item_9 UNION ALL select count(1) from t_order_item_13 |
| resource_3 | select count(1) from t_order_item_2 UNION ALL select count(1) from t_order_item_6 UNION ALL select count(1) from t_order_item_10 UNION ALL select count(1) from t_order_item_14 |
| resource_4 | select count(1) from t_order_item_3 UNION ALL select count(1) from t_order_item_7 UNION ALL select count(1) from t_order_item_11 UNION ALL select count(1) from t_order_item_15 |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.01 sec)
mysql> preview select i.* from t_order o join t_order_item i on o.order_id=i.order_id where o.order_id in (10, 11);
+------------------+-----------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-----------------------------------------------------------------------------------------------------------+
| resource_3 | select i.* from t_order_10 o join t_order_item_10 i on o.order_id=i.order_id where o.order_id in (10, 11) |
| resource_4 | select i.* from t_order_11 o join t_order_item_11 i on o.order_id=i.order_id where o.order_id in (10, 11) |
+------------------+-----------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
可以看到数据已经分片到新的数据库资源。确认迁移作业已完成:
mysql> show scaling status 0130317c30317c3054317c7363616c696e675f6462;
+------+------------------+----------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+------------------+----------+--------+-------------------------------+--------------------------+
| 0 | resource_source1 | FINISHED | false | 100 | 276 |
| 1 | resource_source2 | FINISHED | false | 100 | 276 |
+------+------------------+----------+--------+-------------------------------+--------------------------+
2 rows in set (0.01 sec)
mysql> preview select i.* from t_order o join t_order_item i on o.order_id=i.order_id where o.order_id in (10, 11);
+------------------+---------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+---------------------------------------------------------------------------------------------------------+
| resource_3 | select i.* from t_order_2 o join t_order_item_2 i on o.order_id=i.order_id where o.order_id in (10, 11) |
| resource_4 | select i.* from t_order_3 o join t_order_item_3 i on o.order_id=i.order_id where o.order_id in (10, 11) |
+------------------+---------------------------------------------------------------------------------------------------------+
2 rows in set (0.01 sec)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
# 参考文章
本文转载自 https://blog.csdn.net/wzy0623/article/details/125181920