mycat分库分表-CSDN博客

1. 分库实现

准备两台机器

192.168.109.132 node01
192.168.109.129 node03

准备表文件

#客户表
CREATE TABLE customer (
  id INT AUTO_INCREMENT,
  name VARCHAR (200),
  PRIMARY KEY (id)
) ;

#订单表
CREATE TABLE orders (
  id INT AUTO_INCREMENT,
  order_type INT,
  customer_id INT,
  amount DECIMAL (10, 2),
  PRIMARY KEY (id)
) ;

#订单详情表
CREATE TABLE orders_detail (
  id INT AUTO_INCREMENT,
  detail VARCHAR (2000),
  order_id INT,
  PRIMARY KEY (id)
) ;

#订单状态字典表
CREATE TABLE dict_order_type (
  id INT AUTO_INCREMENT,
  order_type VARCHAR (200),
  PRIMARY KEY (id)
) ;

修改schema.xml

<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">

	<schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1">
		<table name="customer" dataNode="dn2"></table>
	</schema>
	<dataNode name="dn1" dataHost="localhost1" database="chencc" />
	<dataNode name="dn2" dataHost="localhost2" database="chencc" />
	<dataHost name="localhost1" maxCon="1000" minCon="10" balance="0"
			  writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
		<heartbeat>select user()</heartbeat>
		<!-- can have multi write hosts -->
		<writeHost host="hostM1" url="192.168.109.132:3306" user="root"
				   password="123456">
	    </writeHost>
	</dataHost>
	
	<dataHost name="localhost2" maxCon="1000" minCon="10" balance="0"
			  writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
		<heartbeat>select user()</heartbeat>
		<!-- can have multi write hosts -->
        <writeHost host="hostM2" url="192.168.109.129:3306" user="root"
				   password="123456">
		</writeHost>
	</dataHost>
</mycat:schema>

在两个数据节点上分别创建chencc数据库
启动mycat的服务
在mycat上执行上述建表语句

会发现，customer在node03上，而其他的表在node01上，此时完成了分库的功能。
在插入完成之后会发现，表名是大写的，需要在mysql的配置文件中添加lower_case_table_names=1参数，来保证查询的时候能够进行正常的查询。

2. 分表实现

相对于垂直拆分，水平拆分不是将表做分类，而是按照某个字段的某种规则来分散到多个库之中，每个表中包含一部分数据。简单来说，我们可以将数据的水平切分理解为是按照数据行的切分，就是将表中的某些行切分到一个数据库，而另外的某些行又切分到其他的数据库中。

修改schema.xml文件

<schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1">
	<table name="customer" dataNode="dn2"></table>
	<table name="orders" dataNode="dn1,dn2" rule="mod_rule"></table>
</schema>

在这里插入图片描述

修改rule.xml文件

<tableRule name="mod_rule">
	<rule>
		<columns>customer_id</columns>
		<algorithm>mod-long</algorithm>
	</rule>
</tableRule>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
	<property name="count">2</property>
</function>

在node03上创建新的orders表
重启mycat
插入数据，实现分片

INSERT INTO orders(id,order_type,customer_id,amount) VALUES (1,101,100,100100);
INSERT INTO orders(id,order_type,customer_id,amount) VALUES(2,101,100,100300);
INSERT INTO orders(id,order_type,customer_id,amount) VALUES(3,101,101,120000);
INSERT INTO orders(id,order_type,customer_id,amount) VALUES(4,101,101,103000);
INSERT INTO orders(id,order_type,customer_id,amount) VALUES(5,102,101,100400);
INSERT INTO orders(id,order_type,customer_id,amount) VALUES(6,102,100,100020);

查看结果
node01：

node03：

3. 常用分片规则

3.1 取模运算（上述分表实现）

3.2 分片枚举

修改schema.xml文件

<table name="orders_ware_info" dataNode="dn1,dn2" rule="sharding_by_intfile" ></table>

在这里插入图片描述

修改rule.xml文件

<!--columns：分片字段， algorithm：分片函数-->
<tableRule name="sharding_by_intfile">
	<rule>
		<columns>areacode</columns>
		<algorithm>hash-int</algorithm>
	</rule>
</tableRule>
<!--mapFile： 标识配置文件名称， type： 0为int型、 非0为String，
defaultNode： 默认节点:小于 0 表示不设置默认节点，大于等于 0 表示设置默认节点，
设置默认节点如果碰到不识别的枚举值，就让它路由到默认节点，如不设置不识别就报错-->
<function name="hash-int" class="io.mycat.route.function.PartitionByFileMap">
	<property name="mapFile">partition-hash-int.txt</property>
	<property name="type">1</property>
	<property name="defaultNode">0</property>
</function>

修改partition-hash-int.txt文件

110=0
120=1

重启mycat服务

./mycat restart

在mycat上执行sql

CREATE TABLE orders_ware_info (
  `id` INT AUTO_INCREMENT COMMENT '编号',
  `order_id` INT COMMENT '订单编号',
  `address` VARCHAR (200) COMMENT '地址',
  `areacode` VARCHAR (20) COMMENT '区域编号',
  PRIMARY KEY (id)
) ;

INSERT INTO orders_ware_info(id, order_id,address,areacode) VALUES (1,1,'北京','110');
INSERT INTO orders_ware_info(id, order_id,address,areacode) VALUES (2,2,'天津','120');

3.3 范围分片

修改schema.xml文件

<table name="payment_info" dataNode="dn1,dn2" rule="auto_sharding_long" ></table>

修改rule.xml文件

<tableRule name="auto_sharding_long">
	<rule>
		<columns>order_id</columns>
		<algorithm>rang-long</algorithm>
	</rule>
</tableRule>

<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
	<property name="mapFile">autopartition-long.txt</property>
	<property name="defaultNode">0</property>
</function>

修改autopartition-long.txt文件

0-102=0
103-200=1

重启mycat服务
在mycat上执行sql

CREATE TABLE payment_info (
  `id` INT AUTO_INCREMENT COMMENT '编号',
  `order_id` INT COMMENT '订单编号',
  `payment_status` INT COMMENT '支付状态',
  PRIMARY KEY (id)
) ;
INSERT INTO payment_info (id,order_id,payment_status) VALUES (1,101,0);
INSERT INTO payment_info (id,order_id,payment_status) VALUES (2,102,1);
INSERT INTO payment_info (id,order_id ,payment_status) VALUES (3,103,0);
INSERT INTO payment_info (id,order_id,payment_status) VALUES (4,104,1);

3.4 范围求模算法

修改schema.xml文件

<table name="person" primaryKey="id" dataNode="dn1,dn2" rule="auto-sharding-rang-mod"></table>

修改rule.xml

<tableRule name="auto-sharding-rang-mod">
	<rule>
		<columns>id</columns>
		<algorithm>rang-mod</algorithm>
	</rule>
</tableRule>

<function name="rang-mod" class="io.mycat.route.function.PartitionByRangeMod">
	<property name="mapFile">partition-range-mod.txt</property>
	<property name="defaultNode">0</property>
</function>

修改partition-range-mod.txt文件

0-1M=1
1M1-2M=1

重启mycat服务
mycat执行sql

CREATE TABLE `person` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
insert into person(id,name) values(9999,'zhangsan1');
insert into person(id,name) values(10000,'zhangsan2');
insert into person(id,name) values(10001,'zhangsan3');
insert into person(id,name) values(20000,'zhangsan4');

3.5 固定分片hash算法

准备三台机器，node01,node02,node03

修改schema.xml文件
修改rule.xml

<!--columns标识将要分片的表字段,algorithm为分片函数-->
<tableRule name="rule1">
	<rule>
		<columns>id</columns>
		<algorithm>func1</algorithm>
	</rule>
</tableRule>
<!--partitionCount为分片个数列表,partitionLength为分片范围列表，分区长度默认最大为1024，即最大支持1024个分区
约束如下：
	count,length 两个数组的长度必须是一致的。
	1024 = sum((count[i]*length[i])). count 和 length 两个向量的点积恒等于 1024
-->
<function name="func1" class="io.mycat.route.function.PartitionByLong">
	<property name="partitionCount">2,1</property>
	<property name="partitionLength">256,512</property>
</function>

重启mycat服务
mycat操作sql

CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into user(id,name) values(1023,'zhangsan1');
insert into user(id,name) values(1024,'zhangsan2');
insert into user(id,name) values(266,'zhangsan3');
insert into user(id,name) values(255,'zhangsan4');

可以看到结果，1024,255在一个数据分片，266在一个数据分片，1023在一个数据分片。

3.6 取模范围算法

修改schema.xml文件

<table name="user2" primaryKey="id" dataNode="dn1,dn2,dn3" rule="sharding-by-pattern"></table>

修改rule.xml

<!--
columns标识将要分片的表字段
algorithm为分片函数
mapFile：切分规则配置文件
patternValue：求模基数
defaultNode：默认节点，小于0表示不设置默认节点，大于等于0表示设置默认节点，如果超出配置的范围，则使用默认节点；
-->
<tableRule name="sharding-by-pattern">
	<rule>
		<columns>id</columns>
		<algorithm>sharding-by-pattern</algorithm>
	</rule>
</tableRule>

<function name="sharding-by-pattern" class="io.mycat.route.function.PartitionByPattern">
	<property name="mapFile">partition-pattern.txt</property>
	<property name="patternVAlue">256</property>
       <property name="defaultNode">0</property>
</function>

创建partition-pattern.txt文件，并写入规则

创建文件
touch partition-pattern.txt
写入规则：
0-86=0
87-173=1
174-256=2
将文件变成可执行文件：
chmod +x partition-pattern.txt

重启mycat服务
mycat执行sql

CREATE TABLE `user2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into user2(id,name) values(85,'zhangsan1');
insert into user2(id,name) values(100,'zhangsan2');
insert into user2(id,name) values(200,'zhangsan3');

85在node01,100在node03,200在node02；

3.7 字符串hash求模范围算法

与取模范围算法类似，该算法支持数值、符号、字母取模，此方式就是将指定位数的字符的ascll码的和进行取模运算

修改schema.xml文件

<table name="user3" primaryKey="id" dataNode="dn1,dn2,dn3" rule="sharding-by-prefixpattern"></table>

修改rule.xml

<!--
columns标识将要分片的表字段
algorithm为分片函数
mapFile：切分规则配置文件
patternValue：求模基数
prefixLength:ASCII 截取的位数
-->
<tableRule name="sharding-by-prefixpattern">
	<rule>
		<columns>name</columns>
		<algorithm>sharding-by-prefixpattern</algorithm>
	</rule>
</tableRule>

<function name="sharding-by-prefixpattern" class="io.mycat.route.function.PartitionByPrefixPattern">
	<property name="mapFile">partition-pattern.txt</property>
	<property name="patternVAlue">256</property>
       <property name="prefixLength">5</property>
</function>

重启mycat服务
mycat执行sql

CREATE TABLE `user3` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into user3(id,name) values(1,'zhangsan');
insert into user3(id,name) values(2,'lisi');
insert into user3(id,name) values(3,'wangwu');
insert into user3(id,name) values(4,'zzzzzzz');
insert into user3(id,name) values(5,'z99');

可以看到数据被分散了，但是因为ascll码值是需要计算的，所以结果可能不是很明显，大家需要自己的拼字符串来试验。

3.8 应用指定的算法

在运行阶段由应用程序自主决定路由到哪个分片

修改schema.xml文件

<table name="user4" primaryKey="id" dataNode="dn1,dn2,dn3" rule="sharding-by-substring"></table>

修改rule.xml

<!--
columns标识将要分片的表字段
algorithm为分片函数
-->
<tableRule name="sharding-by-substring">
	<rule>
		<columns>name</columns>
		<algorithm>sharding-by-substring</algorithm>
	</rule>
</tableRule>

<!--startIndex：截取开始的坐标，size：长度，partitionCount：分片数量-->
<function name="sharding-by-substring" class="io.mycat.route.function.PartitionDirectBySubString">
	<property name="startIndex">0</property>
	<property name="size">1</property>
       <property name="partitionCount">3</property>
       <property name="defaultPartition">0</property>
</function>

重启mycat服务
mycat执行sql

CREATE TABLE `user4` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into user4(id,name) values(1,'0-zhangsan');
insert into user4(id,name) values(2,'1-lisi');
insert into user4(id,name) values(3,'2-wangwu');

可以看到数据0-zhangsan在node01上，1-lisi在node03上，2-wangwu在node02上

3.9 按照日期范围分片

修改schema.xml文件

<table name="login_info" dataNode="dn1,dn2" rule="sharding_by_date" ></table>

修改rule.xml文件

<tableRule name="sharding_by_date">
	<rule>
		<columns>login_date</columns>
		<algorithm>shardingByDate</algorithm>
	</rule>
</tableRule>
<!--
columns：分片字段， algorithm：分片函数
dateFormat ：日期格式
sBeginDate ：开始日期
sEndDate：结束日期,则代表数据达到了这个日期的分片后循环从开始分片插入
sPartionDay ：分区天数，即默认从开始日期算起，分隔 2 天一个分区
-->
<function name="shardingByDate" class="io.mycat.route.function.PartitionByDate">
	<property name="dateFormat">yyyy-MM-dd</property>
	<property name="sBeginDate">2020-06-01</property>
	<property name="sEndDate">2020-06-04</property>
    <property name="sPartionDay">2</property>
</function>

重启mycat服务
mycat执行sql

CREATE TABLE login_info
(
`id` INT AUTO_INCREMENT comment '编号',
`user_id` INT comment '用户编号',
`login_date` date comment '登录日期',
PRIMARY KEY(id)
);

INSERT INTO login_info(id,user_id,login_date) VALUES (1,101,'2020-06-01');
INSERT INTO login_info(id,user_id,login_date) VALUES (2,102,'2020-06-02');
INSERT INTO login_info(id,user_id,login_date) VALUES (3,103,'2020-06-03');
INSERT INTO login_info(id,user_id,login_date) VALUES (4,104,'2020-06-04');
INSERT INTO login_info(id,user_id,login_date) VALUES (5,103,'2020-06-05');
INSERT INTO login_info(id,user_id,login_date) VALUES (6,104,'2020-06-06');

3.10 按单月小时分片

此规则是单月内按照小时拆分，最小粒度是小时，可以一天最多24个分片，最少一个分片，一个月完成后下个月开始循环，每个月月尾，需要手工清理数据。

修改schema.xml文件

<table name="user6" dataNode="dn1,dn2,dn3" rule="sharding-by-hour"></table>

修改rule.xml文件

<tableRule name="sharding-by-hour">
	<rule>
		<columns>create_time</columns>
		<algorithm>sharding-by-hour</algorithm>
	</rule>
</tableRule>
<!--
columns： 拆分字段，字符串类型（yyyymmddHH）
splitOneDay ： 一天切分的分片数
-->
<function name="sharding-by-hour" class="io.mycat.route.function.LatestMonthPartion">
	<property name="splitOneDay">3</property>
</function>

重启mycat服务
mycat执行sql

create table user6(
id int not null,
name varchar(64),
create_time varchar(10)
);

insert into user6(id,name,create_time) values(1,'steven','2022060100');
insert into user6(id,name,create_time) values(2,'steven','2022060101');
insert into user6(id,name,create_time) values(3,'steven','2022060102');
insert into user6(id,name,create_time) values(4,'steven','2022060103');
insert into user6(id,name,create_time) values(5,'steven','2022060104');
insert into user6(id,name,create_time) values(6,'steven','2022060105');
insert into user6(id,name,create_time) values(7,'steven','2022060106');
insert into user6(id,name,create_time) values(8,'steven','2022060107');
insert into user6(id,name,create_time) values(9,'steven','2022060108');
insert into user6(id,name,create_time) values(10,'steven','2022060109');
insert into user6(id,name,create_time) values(11,'steven','2022060110');
insert into user6(id,name,create_time) values(12,'steven','2022060111');
insert into user6(id,name,create_time) values(13,'steven','2022060112');
insert into user6(id,name,create_time) values(14,'steven','2022060113');
insert into user6(id,name,create_time) values(15,'steven','2022060114');
insert into user6(id,name,create_time) values(16,'steven','2022060115');
insert into user6(id,name,create_time) values(17,'steven','2022060116');
insert into user6(id,name,create_time) values(18,'steven','2022060117');
insert into user6(id,name,create_time) values(19,'steven','2022060118');
insert into user6(id,name,create_time) values(20,'steven','2022060119');
insert into user6(id,name,create_time) values(21,'steven','2022060120');
insert into user6(id,name,create_time) values(22,'steven','2022060121');
insert into user6(id,name,create_time) values(23,'steven','2022060122');
insert into user6(id,name,create_time) values(24,'steven','2022060123');
insert into user6(id,name,create_time) values(25,'steven','2022060200');
insert into user6(id,name,create_time) values(26,'steven','2022060201');
insert into user6(id,name,create_time) values(27,'steven','2022060202');
insert into user6(id,name,create_time) values(28,'steven','2022060203');
insert into user6(id,name,create_time) values(29,'steven','2022060204');
insert into user6(id,name,create_time) values(30,'steven','2022060205');
insert into user6(id,name,create_time) values(31,'steven','2022060206');
insert into user6(id,name,create_time) values(32,'steven','2022060207');
insert into user6(id,name,create_time) values(33,'steven','2022060208');
insert into user6(id,name,create_time) values(34,'steven','2022060209');
insert into user6(id,name,create_time) values(35,'steven','2022060210');
insert into user6(id,name,create_time) values(36,'steven','2022060211');

在这里插入图片描述

当运行完成之后会发现，第一天的数据能够正常的插入成功，均匀的分散到3个分片上，但是第二天的数据就无法成功分散了，原因就在于我们的数据分片不够，所以这种方式几乎没有人使用。

3.11 日期范围hash分片

思想与范围求模一致，当由于日期在取模会有数据集中问题，所以改成了hash方法。先根据时间hash使得短期内数据分布的更均匀，有点可以避免扩容时的数据迁移，又可以一定程度上避免范围分片的热点问题，要求日期格式尽量精确，不然达不到局部均匀的目的。

修改schema.xml文件

<table name="user7" dataNode="dn1,dn2,dn3" rule="rangeDateHash"></table>

修改rule.xml文件

<tableRule name="rangeDateHash">
	<rule>
		<columns>create_time</columns>
		<algorithm>range-date-hash</algorithm>
	</rule>
</tableRule>
<!--
columns： 拆分字段，字符串类型（yyyymmddHH）
algorithm:分片函数
sBeginDate:指定开始的日期，与dateFormat格式一致
sPartionDay:代表多少天一组
dateFormat:指定的日期格式，符合java标准
-->
<function name="range-date-hash" class="io.mycat.route.function.PartitionByRangeDateHash">
	<property name="sBeginDate">2022-06-01 00:00:00</property>
	<property name="sPartionDay">3</property>
	<property name="dateFormat">yyyy-MM-dd HH:mm:ss</property>
	<property name="groupPartionSize">1</property>
</function>

重启mycat服务
mycat执行sql

create table user7(
id int not null,
name varchar(64),
create_time varchar(20)
);

insert into user7(id,name,create_time) values(1,'steven','2022-06-01 00:00:00');
insert into user7(id,name,create_time) values(2,'steven','2022-06-02 00:00:00');
insert into user7(id,name,create_time) values(3,'steven','2022-06-03 00:00:00');
insert into user7(id,name,create_time) values(4,'steven','2022-06-04 00:00:00');
insert into user7(id,name,create_time) values(5,'steven','2022-06-05 00:00:00');
insert into user7(id,name,create_time) values(6,'steven','2022-06-06 00:00:00');
insert into user7(id,name,create_time) values(7,'steven','2022-06-07 00:00:00');
insert into user7(id,name,create_time) values(8,'steven','2022-06-08 00:00:00');
insert into user7(id,name,create_time) values(9,'steven','2022-06-09 00:00:00');
insert into user7(id,name,create_time) values(10,'steven','2022-06-10 00:00:00');
insert into user7(id,name,create_time) values(11,'steven','2022-06-11 00:00:00');

在这里插入图片描述

通过结果也可以看出，每三天一个分片，那么我们只有三个数据节点，所以到10号的数据的时候，没有办法进行数据的插入了，原因就在于没有足够多的数据节点。

3.12 冷热数据分片

修改schema.xml文件

<table name="user8" dataNode="dn1,dn2,dn3" rule="sharding-by-hotdate" />

修改rule.xml文件

<tableRule name="sharding-by-hotdate">
	<rule>
		<columns>create_time</columns>
		<algorithm>sharding-by-hotdate</algorithm>
	</rule>
</tableRule>
<!--
dataFormat：时间格式化
sLastDay:热数据的天数
sPartionDay:冷数据的分片天数（按照天数分片）
-->
<function name="sharding-by-hotdate" class="io.mycat.route.function.PartitionByHotDate">
	<property name="dateFormat">yyyy-MM-dd</property>
	<property name="sLastDay">10</property>
	<property name="sPartionDay">30</property>
</function>

重启mycat服务
mycat执行sql

CREATE TABLE user8 (
  create_time TIMESTAMP NULL ON UPDATE CURRENT_TIMESTAMP,
  `db_nm` VARCHAR (20) NULL
) ;

INSERT INTO user8 (create_time,db_nm) VALUES ('2022-06-01', database());
INSERT INTO user8 (create_time,db_nm) VALUES ('2022-06-10', database());
INSERT INTO user8 (create_time,db_nm) VALUES ('2022-06-11', database());
INSERT INTO user8 (create_time,db_nm) VALUES ('2022-06-21', database());
INSERT INTO user8 (create_time,db_nm) VALUES ('2022-06-30', database());
INSERT INTO user8 (create_time,db_nm) VALUES ('2020-07-01', database());

3.13 自然月分片

修改schema.xml文件

<table name="user9" dataNode="dn1,dn2,dn3" rule="sharding-by-month" />

修改rule.xml文件

<tableRule name="sharding-by-month">
	<rule>
		<columns>create_time</columns>
		<algorithm>sharding-by-month</algorithm>
	</rule>
</tableRule>
<!--
columns： 分片字段，字符串类型
dateFormat ： 日期字符串格式
sBeginDate ： 开始日期
-->
<function name="sharding-by-month" class="io.mycat.route.function.PartitionByMonth">
	<property name="dateFormat">yyyy-MM-dd</property>
	<property name="sBeginDate">2022-01-01</property>
</function>

重启mycat服务
mycat执行sql

CREATE TABLE user9 (
  id INT,
  `name` VARCHAR (10),
  create_time VARCHAR (20)
) ;

insert into user9(id,name,create_time) values(1,'zhangsan','2022-01-01');
insert into user9(id,name,create_time) values(2,'zhangsan','2022-02-01');
insert into user9(id,name,create_time) values(3,'zhangsan','2022-03-01');
insert into user9(id,name,create_time) values(4,'zhangsan','2022-04-01');
insert into user9(id,name,create_time) values(5,'zhangsan','2022-05-01');
insert into user9(id,name,create_time) values(6,'zhangsan','2022-06-01');

3.14 一致性hash分片

实现方式：一致性hash分片，利用一个分片节点对应一个或者多个虚拟hash桶的思想，，尽可能减少分片扩展时的数据迁移。
优点：有效解决了分布式数据库的扩容问题。
缺点：在横向扩展的时候，需要迁移部分数据；由于虚拟桶倍数与分片节点数都必须是正整数，而且要服从"虚拟桶倍数×分片节点数=设计极限"，因此在横向扩容的过程中，增加分片节点并不是一台一台地加上去的，而是以一种因式分解的方式增加，因此有浪费物理计算力的可能性。

修改schema.xml文件

<table name="user10" dataNode="dn1,dn2,dn3" primaryKey="id" rule="sharding-by-murmur" />

修改rule.xml文件

<tableRule name="sharding-by-murmur">
	<rule>
		<columns>id</columns>
		<algorithm>murmur</algorithm>
	</rule>
</tableRule>
<function name="murmur" class="io.mycat.route.function.PartitionByMurmurHash">
	<property name="seed">0</property><!-- 默认是 0-->
	<property name="count">2</property><!-- 要分片的数据库节点数量，必须指定，否则没法分片-->
	<property name="virtualBucketTimes">160</property>
<!-- 一个实际的数据库节点被映射为这么多虚拟节点，默认是 160 倍，也就是虚拟节点数是物理节点数的 160 倍-->
<!--<property name="weightMapFile">weightMapFile</property>
节点的权重，没有指定权重的节点默认是 1。以 properties 文件的格式填写，以从 0 开始到 count-1 的整数值也就是节点索引为 key，以节点权重值为值。所有权重值必须是正整数，否则以 1 代替 -->
	<property name="bucketMapPath">/etc/mycat/bucketMapPath</property>
<!--用于测试时观察各物理节点与虚拟节点的分布情况，如果指定了这个属性，会把虚拟节点的 murmur hash 值与物理节点的映射按行输出到这个文件，没有默认值，如果不指定，就不会输出任何东西 -->
</function>

重启mycat服务
mycat执行sql

create table user10(id bigint not null primary key,name varchar(20));

insert into user10(id,name) values(1111111,database());
insert into user10(id,name) values(2222222,database());
insert into user10(id,name) values(3333333,database());
insert into user10(id,name) values(4444444,database());
insert into user10(id,name) values(8960000,database());

4. mycat的分片join

4.1 mycat ER表的分片

在上述中，我们已经将orders进行了数据分片，但是orders表跟orders_detail发生关联，如果只把orders_detail放到一个分片上，那么跨库的join很麻烦，所以提出了ER关系的表分片。什么意思呢？就是通过关联关系，将子表与父表关联的记录放在同一个数据分片上。

修改schema.xml文件

<table name ="orders" dataNode="dn1,dn2" rule="mod_rule">
			<childTable name="orders_detail" primaryKey="id" joinKey="order_id" parentKey="id" />		</table>

在node03上创建orders_detail表
重启mycat
mycat执行sql

INSERT INTO orders_detail(id,detail,order_id) values(1,'detail1',1);
INSERT INTO orders_detail(id,detail,order_id) VALUES(2,'detail1',2);
INSERT INTO orders_detail(id,detail,order_id) VALUES(3,'detail1',3);
INSERT INTO orders_detail(id,detail,order_id) VALUES(4,'detail1',4);
INSERT INTO orders_detail(id,detail,order_id) VALUES(5,'detail1',5);
INSERT INTO orders_detail(id,detail,order_id) VALUES(6,'detail1',6);

执行join操作

select * from orders o join orders_detail od on o.id = od.order_id;

在这里插入图片描述

4.2 全局表

在分片的情况下，当业务表因为规模而进行分片之后，业务表与这个字典表的之间关联会变得比较棘手，因此，在mycat中存在一种全局表，他具备以下特性：
1、全局表的插入、更新操作会实时的在所有节点上执行，保持各个分片的数据一致性
2、全局表的查询操作，只从一个节点获取
3、全局表可以跟任何一个表进行join操作

修改schema.xml文件

<table name="dict_order_type" dataNode="dn1,dn2" type="global"></table>

向node03添加dict_order_type表
重启mycat
mycat执行sql

INSERT INTO dict_order_type(id,order_type) VALUES(101,'type1');
INSERT INTO dict_order_type(id,order_type) VALUES(102,'type2');

在这里插入图片描述

4.3 Share join

ShareJoin是一个简单的跨分片join,基于HBT的方式实现。目前支持2个表的join，原理是解析SQL语句，拆分成单表的SQL语句执行，然后把各个节点的数据汇集。

修改schema.xml

<table name="company" primaryKey="id" dataNode="dn1,dn2,dn3" rule="mod-long" />
<table name="customers" primaryKey="id" dataNode="dn1,dn2" rule="sharding-by-intfile"/>

修改rule.xml

<tableRule name="mod-long">
        <rule>
                <columns>id</columns>
                <algorithm>mod-long</algorithm>
        </rule>
</tableRule>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
        <!-- how many data nodes -->
        <property name="count">3</property>
</function>
<tableRule name="sharding-by-intfile">
        <rule>
                <columns>sharding_id</columns>
                <algorithm>hash-int</algorithm>
        </rule>
</tableRule>
<function name="hash-int" class="io.mycat.route.function.PartitionByFileMap">
        <property name="mapFile">partition-hash-int.txt</property>
</function>

修改partition-hash-int.txt文件

10000=0
10010=1

重启mycat
mycat执行sql

CREATE TABLE company (
  id INT PRIMARY KEY,
  `name` VARCHAR (10)
) ENGINE = INNODB ;
insert company (id,name) values(1,'cloud');
insert company (id,name) values(2,'abc');
insert company (id,name) values(3,'mnp');

CREATE TABLE customers (
  id INT NOT NULL PRIMARY KEY,
  `name` VARCHAR (100),
  company_id INT NOT NULL,
  sharding_id INT NOT NULL
) ;
insert into customers(id,name,company_id,sharding_id)values(1,'zhang',1,10000),(2,'li',2,10010),(3,'wang',3,10000);

验证

-- 可以看到有时可以查出对应的结果，有时则查询不到
select a.*,b.ID,b.NAME as tit from customers a,company b where a.COMPANY_ID=b.ID;
--可以看到每次都可以直接查询到结果
/*!mycat:catlet=io.mycat.catlets.ShareJoin */select a.*,b.ID,b.NAME as tit from customers a,company b where a.COMPANY_ID=b.ID;
--其他写法
/*!mycat:catlet=io.mycat.catlets.ShareJoin */select a.*,b.ID,b.NAME as tit from customers a join company b on a.COMPANY_ID=b.ID;
/*!mycat:catlet=io.mycat.catlets.ShareJoin */select a.*,b.ID,b.NAME as tit from customers a join company b where a.COMPANY_ID=b.ID;

在这里插入图片描述

5. 全局序列号

5.1 本地文件方式

使用此方式的时候，mycat将sequence配置到文件中，当使用到sequence中的配置，mycat会更新sequence_conf.properties文件中sequence当前的值。
配置方式：

在 sequence_conf.properties 文件中做如下配置：

GLOBAL_SEQ.HISIDS=
GLOBAL_SEQ.MINID=10001
GLOBAL_SEQ.MAXID=20000
GLOBAL_SEQ.CURID=10000

其中 HISIDS 表示使用过的历史分段(一般无特殊需要可不配置)， MINID 表示最小 ID 值， MAXID 表示最大
ID 值， CURID 表示当前 ID 值。

server.xml 中配置

<system><property name="sequnceHandlerType">0</property></system>

注： sequnceHandlerType 需要配置为 0，表示使用本地文件方式。

重启mycat
mycat操作sql

create table tab1(id int primary key,name varchar(10));
insert into tab1(id,name) values(next value for mycatseq_global,'test1');
insert into tab1(id,name) values(next value for mycatseq_global,'test2');
insert into tab1(id,name) values(next value for mycatseq_global,'test3');

缺点：当mycat重新发布后，配置文件中的sequence会恢复到初始值
优点：本地加载，读取速度较快

5.2 数据库方式

在数据库中建立一张表，存放sequence名称（name）,sequence当前值(current_value),步长（increment int类型，每次读取多少个sequence，假设为K）等信息；

获取数据步骤：

1、当初次使用该sequence时，根据传入的sequence名称，从数据库这张表中读取current_value和increment到mycat中，并将数据库中的current_value设置为原current_value值+increment值。
2、mycat将读取到current_value+increment作为本次要使用的sequence值，下次使用时，自动加1，当使用increment次后，执行步骤1中的操作
3、mycat负责维护这张表，用到哪些sequence，只需要在这张表中插入一条记录即可，若某次读取的sequence没有用完，系统就停掉了，则这次读取的sequence剩余值不会再使用

配置方式：

修改server.xml文件

<system><property name="sequnceHandlerType">1</property></system>

修改schema.xml文件

<table name="test" primaryKey="id" autoIncrement="true" dataNode="dn1,dn2,dn3" rule="mod-long"/>
<table name="mycat_sequence" primaryKey="name" dataNode="dn2"/>

修改mycat配置文件sequence_db_conf.properties，添加属性值

GLOBAL=dn1
COMPANY=dn1
CUSTOMER=dn1
ORDERS=dn1
MYCAT=dn2

在dn2上添加mycat_sequence表

DROP TABLE IF EXISTS mycat_sequence;
CREATE TABLE mycat_sequence (name VARCHAR(50) NOT NULL,current_value INT NOT NULL,increment INT NOT NULL DEFAULT 100, PRIMARY KEY(name)) ENGINE=InnoDB;

在dn2上的mycat_sequence表中插入初始记录

INSERT INTO mycat_sequence(name,current_value,increment) VALUES ('mycat', -99, 100);

在dn2上创建函数

--创建函数
DROP FUNCTION IF EXISTS mycat_seq_currval;
DELIMITER $
CREATE FUNCTION mycat_seq_currval(seq_name VARCHAR(50)) RETURNS varchar(64)     CHARSET utf8
DETERMINISTIC
BEGIN
DECLARE retval VARCHAR(64);
SET retval="-999999999,null";
SELECT concat(CAST(current_value AS CHAR),",",CAST(increment AS CHAR)) INTO retval FROM mycat_sequence WHERE name = seq_name;
RETURN retval;
END $
DELIMITER ;
--设置sequence值
DROP FUNCTION IF EXISTS mycat_seq_setval;
DELIMITER $
CREATE FUNCTION mycat_seq_setval(seq_name VARCHAR(50),value INTEGER) RETURNS     varchar(64) CHARSET utf8
DETERMINISTIC
BEGIN
UPDATE mycat_sequence
SET current_value = value
WHERE name = seq_name;
RETURN mycat_seq_currval(seq_name);
END $
DELIMITER ;
--获取下一个sequence值
DROP FUNCTION IF EXISTS mycat_seq_nextval;
DELIMITER $
CREATE FUNCTION mycat_seq_nextval(seq_name VARCHAR(50)) RETURNS varchar(64)     CHARSET utf8
DETERMINISTIC
BEGIN
UPDATE mycat_sequence
SET current_value = current_value + increment WHERE name = seq_name;
RETURN mycat_seq_currval(seq_name);
END $
DELIMITER ;

数据测试：

插入数据表

create table test(id int,name varchar(10));

查询对应的序列数据表

SELECT * FROM mycat_sequence;

向表中插入数据，可以多执行几次

insert into test(id,name) values(next value for MYCATSEQ_MYCAT,(select database()));

查询数据

SELECT * FROM test order by id asc;

5.3 本地时间戳方式

ID= 64 位二进制 (42(毫秒)+5(机器 ID)+5(业务编码)+12(重复累加)。
换算成十进制为 18 位数的 long 类型，每毫秒可以并发 12 位二进制的累加。

配置server.xml文件

<property name="sequnceHandlerType">2</property>

修改sequence_time_conf.properties

WORKID=06 #任意整数
DATAACENTERID=06  #任意整数

修改schema.xml文件

<table name="test2" dataNode="dn1,dn2,dn3" primaryKey="id" autoIncrement="true" rule="mod-long" />

启动mycat，并且创建表进行测试

create table test2(id bigint auto_increment primary key,xm varchar(32));
insert into test2(id,xm) values(next value for MYCATSEQ_GLOBAL，'lisi') ;

此方式的优点是配置简单，但是缺点也很明显就是18位的id太长，需要耗费多余的存储空间。

mycat分库分表

文章目录

1. 分库实现

2. 分表实现

3. 常用分片规则

3.1 取模运算（上述分表实现）

3.2 分片枚举

3.3 范围分片

3.4 范围求模算法

3.5 固定分片hash算法

3.6 取模范围算法

3.7 字符串hash求模范围算法

3.8 应用指定的算法

3.9 按照日期范围分片

3.10 按单月小时分片

3.11 日期范围hash分片

3.12 冷热数据分片

3.13 自然月分片

3.14 一致性hash分片

4. mycat的分片join

4.1 mycat ER表的分片

4.2 全局表

4.3 Share join

5. 全局序列号

5.1 本地文件方式

5.2 数据库方式

5.3 本地时间戳方式