Innodb:RR隔离级别下insert...select 对select表加锁模型和死锁案列

最近有网友遇到了在RR隔离级别下insert A select B where B.COL=** 由于select表也就是B表引发的死锁的问题。分析死锁日志后,笔者进行模拟重现了这位网友遇到了2种场景并且在本文中进行详细的描述。

  • 本文使用版本percona 5.7.14修改版,能够打印出事务所有的行锁信息结构链(不包含隐含锁)
  • 本文中的测试是在RR隔离级别下完成的,RC不存在这样的问题
  • 笔者对源码的理解有限,如有错误请指正
  • 本文使用了自制工具innblock和bcview,前者用于扫描块结构后者用于更加方便的查看二进制文件信息获取地址
  1. innblock http://pan.baidu.com/s/1qYnyVWo
  2. bcview http://pan.baidu.com/s/1num76RJ


感谢叶金荣老师对本文的审核,笔者也曾是一名知数堂的学生

一、基本概念

在开始正文之前我打算介绍一下一些基本概念,特别是锁模型和兼容矩阵会对本文的阅读有相当大的帮助。

1、 innodb lock模型

  • [LOCK_ORDINARY[next_key_lock]:]
    源码定义:
#define LOCK_ORDINARY   0   /*!< this flag denotes an ordinary
next-key lock in contrast to LOCK_GAP
or LOCK_REC_NOT_GAP */ 

默认是LOCK_ORDINARY即普通的next_key_lock,锁住行及以前的间隙。

  • [LOCK_GAP:]
    源码定义:
#define LOCK_GAP    512 /*!< when this bit is set, it means that the
lock holds only on the gap before the record;
for instance, an x-lock on the gap does not
give permission to modify the record on which
the bit is set; locks of this type are created
when records are removed from the index chain 

间隙锁,锁住行以前的间隙,不锁住本行。

  • [LOCK_REC_NOT_GAP:]
    源码定义:
#define LOCK_REC_NOT_GAP 1024   /*!< this bit means that the lock is only on
the index record and does NOT block inserts
to the gap before the index record; this is
used in the case when we retrieve a record
with a unique key, and is also used in
locking plain SELECTs (not part of UPDATE
or DELETE) when the user has set the READ
COMMITTED isolation level */ 

行锁,锁住行而不锁住任何间隙。

  • [LOCK_INSERT_INTENTION:]
    源码定义:
#define LOCK_INSERT_INTENTION 2048 /*!< this bit is set when we place a waiting
gap type record lock request in order to let
an insert of an index record to wait until
there are no conflicting locks by other
transactions on the gap; note that this flag
remains set when the waiting lock is granted,
or if the lock is inherited record */ 

插入意向锁,如果插入的记录在某个已经锁定的间隙内为这个锁。

2、 innodb lock兼容矩阵

/* LOCK COMPATIBILITY MATRIX
*    IS IX S  X  AI
* IS +  +  +  -  +
* IX +  +  -  -  +
* S  +  -  +  -  -
* X  -  -  -  -  -
* AI +  +  -  -  - 

3、infimum和supremum

一个page中包含这两个伪记录。页中所有的行未删除(或删除未purge)的行逻辑上都连接到这两个虚列之间,表现为一个逻辑链表数据结构,其中supremum伪记录的锁始终为next_key_lock。

4、heap no

heap no存储在fixed_extrasize 中。heap no 为物理存储填充的序号,页的空闲空间挂载在page free链表中(头插法)可以重用,但是重用此heap no不变,如果一直是insert 则heap no 不断增加,并不是按照ROWID(主键)排序的逻辑链表顺序,而是物理填充顺序。

5、n bits

和这个page相关的锁位图的大小,每一行记录都有1 bit的位图信息与其对应,用来表示是否加锁,并且始终预留64bit。例如我的表有9条数据,同时包含infimum和supremum虚拟记录即 64+9+2 bits,即75bits但是必须被8整除向上取整为一个字节,结果也就是就是80 bits。注意不管是否加锁每行都会对应一bit的位图。

6、lock struct

这是LOCK的内存结构体源码中用lock_t表示其可以包含

lock_table_t    tab_lock;/*!< table lock */
lock_rec_t  rec_lock;/*!< record lock */ 

一般来说innodb上锁都会对表级加上IX,这占用一个结构体。然后分别对二级索引和主键进行加锁,每一个BLOCK会占用这样一个结构体。

7、row lock

这个信息描述了当前事务加锁的行数,他是所有lock struct结构体中排除table lock以外所有加锁记录的总和,并且包含了infimum和supremum伪列。

8、逐步加锁

如果细心的朋友应该会发现在show engine 中事务信息中的row lock在对大量行进行加锁的时候会不断的增加,因为加行锁最终会调用lock_rec_lock逐行加锁,这也会增加了大数据量加锁的触发死锁的可能性。

二、Innodb层对insert...select 中select表的加锁模式

RR隔离级别下insert A select B where B.COL=**,innodb层会对B表满足条件的数据进行加锁,但是RC模式下B表记录不会加任何innodb层的锁,表现如下:

  1. 如果B.COL有二级(非唯一),并且执行计划使用到了(非using index)
  • B表二级索引对选中记录加上LOCK_S|LOCK_ORDINARY[next_key_lock],并且对下一条记录加上LOCK_S|LOCK_GAP
  • B表PRIMARY加上LOCK_S|LOCK_REC_NOT_GAP
  1. 如果B.COL有二级(唯一),并且执行计划使用到了(非using index)
  • B表二级索引对选中记录加上LOCK_S|LOCK_REC_NOT_GAP
  • B表PRIMARY加上LOCK_S|LOCK_REC_NOT_GAP
  1. 如果B.COL没有二级索引
  • 对整个B表上的所有记录加上LOCK_S|LOCK_ORDINARY[next_key_lock]

三、Innodb层对insert...select中select表的加锁测试

下面我们分别对其进行测试和打印输出:

1. 如果B.COL有二级(唯一),并且执行计划使用到了(非using index)

使用语句:

drop table  t1;
drop table  t2;
create table t1(id int primary key,n1 varchar(20),n2 varchar(20),key(n1));
create table t2 like t1;
insert into t1 values(1,'gao1','gao'),(2,'gao1','gao'),(3,'gao1','gao'),(4,'gao2','gao'),(5,'gao2','gao'),(6,'gao2','gao'),(7,'gao3','gao'),(8,'gao4','gao'); 

查看执行计划:

mysql> desc insert into t2 select * from t1 force index(n1) where n1='gao2';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | INSERT      | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL  | NULL |     NULL | NULL  |
|  1 | SIMPLE      | t1    | NULL       | ref  | n1            | n1   | 23      | const |    3 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ 

执行语句:

begin;insert into t2 select * from t1 force index(n1) where n1='gao2'; 

观察结果:

  • 二级索引对记录加上LOCK_S|LOCK_ORDINARY[next_key_lock]
-----TRX NO:28470 LOCK STRUCT(1)(Add by gaopeng)
RECORD LOCKS space id 86 page no 4 n bits 80 index n1 of table `test`.`t1` trx id 28470 lock mode S(LOCK_S) locks gap and rec(LOCK_ORDINARY[next_key_lock])
Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 67616f32; asc gao2;;
1: len 4; hex 80000004; asc     ;;
Record lock, heap no 6 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 67616f32; asc gao2;;
1: len 4; hex 80000005; asc     ;;
Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 67616f32; asc gao2;;
1: len 4; hex 80000006; asc     ;; 
  • PRIMARY加上LOCK_S|LOCK_REC_NOT_GAP
-----TRX NO:28470 LOCK STRUCT(1)(Add by gaopeng)
RECORD LOCKS space id 86 page no 3 n bits 80 index PRIMARY of table `test`.`t1` trx id 28470 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000004; asc     ;;
1: len 6; hex 000000006f20; asc     o ;;
2: len 7; hex bc000001300134; asc     0 4;;
3: len 4; hex 67616f32; asc gao2;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000005; asc     ;;
1: len 6; hex 000000006f20; asc     o ;;
2: len 7; hex bc000001300140; asc     0 @;;
3: len 4; hex 67616f32; asc gao2;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 7 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000006; asc     ;;
1: len 6; hex 000000006f20; asc     o ;;
2: len 7; hex bc00000130014c; asc     0 L;;
3: len 4; hex 67616f32; asc gao2;;
4: len 3; hex 67616f; asc gao;; 
  • 对二级索引下一条记录加上LOCK_S|LOCK_GAP
-----TRX NO:28470 LOCK STRUCT(1)(Add by gaopeng)
RECORD LOCKS space id 86 page no 4 n bits 80 index n1 of table `test`.`t1` trx id 28470 lock mode S(LOCK_S) locks gap before rec(LOCK_GAP)
Record lock, heap no 8 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 67616f33; asc gao3;;
1: len 4; hex 80000007; asc     ;; 
  • 如图红色部分都是需要锁定的记录

1.png

2. 如果B.COL有二级(唯一),并且执行计划使用到了(非using index)

使用语句:

drop table  t1;
drop table  t2;
create table t1(id int primary key,n1 varchar(20),n2 varchar(20),unique key(n1));
create table t2 like t1;
insert into t1 values(1,'gao1','gao'),(2,'gao2','gao'),(3,'gao3','gao'),(4,'gao4','gao'),(5,'gao5','gao'),(6,'gao6','gao'),(7,'gao7','gao'),(8,'gao8','gao'); 

查看执行计划:

mysql> desc insert into t2 select * from t1 force index(n1) where n1 in ('gao2','gao3','gao4');
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
|  1 | INSERT      | t2    | NULL       | ALL   | NULL          | NULL | NULL    | NULL | NULL |     NULL | NULL                  |
|  1 | SIMPLE      | t1    | NULL       | range | n1            | n1   | 23      | NULL |    3 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+ 

执行语句:

begin;insert into t2 select * from t1 force index(n1) where n1 in ('gao2','gao3','gao4'); 

观察输出:

  • B表二级索引对选中记录加上LOCK_S|LOCK_REC_NOT_GAP
-----TRX NO:30514 LOCK STRUCT(1)(Add by gaopeng)
RECORD LOCKS space id 94 page no 4 n bits 80 index n1 of table `test`.`t1` trx id 30514 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 67616f32; asc gao2;;
1: len 4; hex 80000002; asc     ;;
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 67616f33; asc gao3;;
1: len 4; hex 80000003; asc     ;;
Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 67616f34; asc gao4;;
1: len 4; hex 80000004; asc     ;; 
  • B表PRIMARY加上LOCK_S|LOCK_REC_NOT_GAP
-----TRX NO:30514 LOCK STRUCT(1)(Add by gaopeng)
RECORD LOCKS space id 94 page no 3 n bits 80 index PRIMARY of table `test`.`t1` trx id 30514 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000002; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a200000115011c; asc        ;;
3: len 4; hex 67616f32; asc gao2;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000003; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150128; asc       (;;
3: len 4; hex 67616f33; asc gao3;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000004; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150134; asc       4;;
3: len 4; hex 67616f34; asc gao4;;
4: len 3; hex 67616f; asc gao;; 
  • 如图红色部分都是需要锁定的记录

2.png

3.如果B.COL没有二级索引

使用语句:

drop table  t1;
drop table  t2;
create table t1(id int primary key,n1 varchar(20),n2 varchar(20));
create table t2 like t1;
insert into t1 values(1,'gao1','gao'),(2,'gao2','gao'),(3,'gao3','gao'),(4,'gao4','gao'),(5,'gao5','gao'),(6,'gao6','gao'),(7,'gao7','gao'),(8,'gao8','gao'); 

查看执行计划:

mysql> desc insert into t2 select * from t1  where n1 in ('gao2','gao3','gao4');
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | INSERT      | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | NULL |     NULL | NULL        |
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    8 |    37.50 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ 

执行语句:

begin;insert into t2 select * from t1  where n1 in ('gao2','gao3','gao4'); 

观察输出:

-----TRX NO:30535 LOCK STRUCT(1)(Add by gaopeng)
RECORD LOCKS space id 94 page no 3 n bits 80 index PRIMARY of table `test`.`t1` trx id 30535 lock mode S(LOCK_S) locks gap and rec(LOCK_ORDINARY[next_key_lock])
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000001; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150110; asc        ;;
3: len 4; hex 67616f31; asc gao1;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000002; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a200000115011c; asc        ;;
3: len 4; hex 67616f32; asc gao2;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000003; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150128; asc       (;;
3: len 4; hex 67616f33; asc gao3;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000004; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150134; asc       4;;
3: len 4; hex 67616f34; asc gao4;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000005; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150140; asc       @;;
3: len 4; hex 67616f35; asc gao5;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 7 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000006; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a200000115014c; asc       L;;
3: len 4; hex 67616f36; asc gao6;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000007; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150158; asc       X;;
3: len 4; hex 67616f37; asc gao7;;
4: len 3; hex 67616f; asc gao;;
Record lock, heap no 9 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000008; asc     ;;
1: len 6; hex 000000007728; asc     w(;;
2: len 7; hex a2000001150164; asc       d;;
3: len 4; hex 67616f38; asc gao8;;
4: len 3; hex 67616f; asc gao;; 
  • 如图红色部分都是需要锁定的记录

3.png

四、insert...select由于select查询表引起的死锁

有了上面的理论,我们知道在RR隔离级别下insert...select会对select符合条件的数据加上LOCK_S锁,我曾经总结过出现死锁的条件:

  1. 至少2个独立的线程(会话)
  2. 单位操作中包含多个相对独立的加锁步骤,有一定的时间差
  3. 多个线程(会话)之间加锁对象必须有相互等待的情况发生,并且等待出现环状。

由于存在对select符合条件的数据加上LOCK_S锁的情况,RR模式下insert...select出现死锁的概率无疑更加高,我通过测试模拟出这种情况,严格意义上是相同的语句在高并发情况下表现为两种死锁情况。

测试脚本:

create table b(id int primary key,name1 varchar(20),name2 varchar(20));
alter table b add key(name1);
DELIMITER //
CREATE PROCEDURE test_i()
begin
 declare num int;
 set num = 1;
while num <= 3000 do
 insert into b values(num,concat('gao',num),'gaopeng');
 set num=num+1;
end while;
end//
call test_i()//
create table a like b//
DELIMITER ; 

语句都是一样的:

TX1 TX2
begin; -
update b set name2='test' where id=2999; -
- insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999);
update b set name2='test' where id=999; -

但是在高并发下相同的语句却表现出不同的死锁情况,现在进行分析:

情况1:

  • TX1:执行update将表b主键id=2999的记录加上LOCK_X
  • TX2:执行insert...select语句b表上的记录(996,997,998,999,2995,2996,2997,2998,2999)会申请加上LOCK_S,
    但是id=2999已经加上LOCK_X,显然不能获得只能等待.
  • TX1:执行update需要获得表b主键id=999的LOCK_X显然这个记录已经被TX2加锁LOCK_S,只能等待,触发死锁检测

如下图红色记录为不能获得锁的记录:

情况1.jpg

情况2:

这种情况比较极端只能在高并发上出现

  • TX1:执行update将表b主键id=2999的记录加上LOCK_X
  • TX2:执行insert...select语句b表上的记录(996,997,998,999,2995,2996,2997,2998,2999)会申请加上LOCK_S,因为上锁是有一个逐步加锁的过程,假设此时加锁到2997前那么TX2并不会等待
  • TX1:执行update需要获得表b主键id=999的LOCK_X显然这个记录已经被TX2加锁LOCK_S,只能等待
  • TX2:继续加锁LOCK_S 2997、2998、2999 发现2999已经被TX1加锁LOCK_X,只能等待,触发死锁检测

如下图红色记录为不能获得锁的记录:

情况2.jpg

五、源码修改和参数增加

情况2的测试需要在高并发下才会出现,因为insert...select语句是一条语句很难人为控制,也就是很让他在特定条件下停止。但是为了能够模拟出这种情况笔者对innodb增加了4个参数如下,为了方便识别我都加上了自己的名字的拼音:

mysql> show variables like '%gaopeng%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| innodb_gaopeng_sl_heap_no | 0     |
| innodb_gaopeng_sl_ind_id  | 0     |
| innodb_gaopeng_sl_page_no | 0     |
| innodb_gaopeng_sl_time    | 0     |
+---------------------------+-------+ 

默认情况都是0,即不启用。他们的意思如下:

  • innodb_gaopeng_sl_heap_no:记录所在的heap no
  • innodb_gaopeng_sl_ind_id:记录所在的index_id
  • innodb_gaopeng_sl_page_no:记录所在的page_no
  • innodb_gaopeng_sl_time:睡眠多少秒
    有了index_id、page_no、heap no就能唯一限定一条数据了,并且睡眠时间也是可以人为指定的。

并且在源码lock_rec_lock 开头增加如下代码:

 //add by gaopeng
   /*if find index_id heap no page no to sleep srv_gaopeng_sl_time secs*/
       if(srv_gaopeng_sl_ind_id && srv_gaopeng_sl_page_no && srv_gaopeng_sl_heap_no)
       {
           if(heap_no == (ulint)(srv_gaopeng_sl_heap_no) && (block->page.id).page_no() ==(ib_uint32_t)(srv_gaopeng_sl_page_no)
               && index->id ==(index_id_t)(srv_gaopeng_sl_ind_id) )
               {
                   lock_mutex_exit();
                   sleep(srv_gaopeng_sl_time);
                   lock_mutex_enter();
               }
       }
   //add end 

这样一旦判定为符合条件的记录,本条记录的加锁钱便会休眠指定的秒如果我们拟定在LOCK_S:id=2997之前睡眠30秒,那么情况2能够必定发生如下图:

情况3.jpg

六、实际测试

情况1:

TX1 TX2
begin; -
update b set name2='test' where id=2999;对id:2999加LOCK_X锁 -
- insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999);对id:996,997,998,999,2995,2996,2997,2998加LOCK_S锁,但是对id:2999加LOCK_S锁时发现已经加LOCK_X锁,需等待
update b set name2='test' where id=999;对id:999加LOCK_X锁,但是发现已经加LOCK_S锁,需等待,触发死锁检测 -
TX1触发死锁,TX1在权重判定下回滚 -

死锁报错语句:

mysql> update b set name2='test' where id=999;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction 

死锁日志:

*** (1) TRANSACTION:
TRANSACTION 48423, ACTIVE 7 sec starting index read
mysql tables in use 2, locked 2
LOCK WAIT 5 lock struct(s), heap size 1160, 9 row lock(s), undo log entries 8
MySQL thread id 4, OS thread handle 140737223177984, query id 9110 localhost root Sending data
insert into a  select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 119 page no 18 n bits 160 index PRIMARY of table `test`.`b` trx id 48423 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
Record lock, heap no 86 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000bb7; asc     ;;
1: len 6; hex 00000000bd26; asc      &;;
2: len 7; hex 21000001511e7d; asc !   Q };;
3: len 7; hex 67616f32393939; asc gao2999;;
4: len 4; hex 74657374; asc test;;

*** (2) TRANSACTION:
TRANSACTION 48422, ACTIVE 24 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1160, 2 row lock(s), undo log entries 1
MySQL thread id 3, OS thread handle 140737223444224, query id 9111 localhost root updating
update b set name2='test' where id=999
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 119 page no 18 n bits 160 index PRIMARY of table `test`.`b` trx id 48422 lock_mode X(LOCK_X) locks rec but not gap(LOCK_REC_NOT_GAP)
Record lock, heap no 86 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000bb7; asc     ;;
1: len 6; hex 00000000bd26; asc      &;;
2: len 7; hex 21000001511e7d; asc !   Q };;
3: len 7; hex 67616f32393939; asc gao2999;;
4: len 4; hex 74657374; asc test;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 119 page no 10 n bits 456 index PRIMARY of table `test`.`b` trx id 48422 lock_mode X(LOCK_X) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 800003e7; asc     ;;
1: len 6; hex 00000000b534; asc      4;;
2: len 7; hex bd000001310110; asc     1  ;;
3: len 6; hex 67616f393939; asc gao999;;
4: len 7; hex 67616f70656e67; asc gaopeng;;

*** WE ROLL BACK TRANSACTION (2) 

信息提取如下:

TRX1:48423
   LOCK HOLD:死锁信息不提供
   LOCK WAIT:
            表:b
            索引:`PRIMARY`
            锁模式:LOCK_S|LOCK_REC_NOT_GAP|LOCK_WAIT
            记录:主键为0Xbb7(2999)
            附加信息:space id 119 page no 18 heap no 86
   CURRENT SQL:
            insert into a  select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)

TRX2:48422(触发死锁、权重回滚)
    LOCK HOLD:
             表:b
             索引:`PRIMARY`
             锁模式:LOCK_X|LOCK_REC_NOT_GAP
             记录:主键为0Xbb7(2999)
             附加信息:pace id 119 page no 18 heap no 86
    LOCK WAIT:
             表:b
             索引:`PRIMARY`
             锁模式:LOCK_X|LOCK_REC_NOT_GAP|LOCK_WAIT
             记录:主键为0X3e7(999)
             附加信息:space id 119 page no 10 heap no 11
    CURRENT SQL:
             update b set name2='test' where id=999 

情况2:

如上我们拟定在语句

insert into a  select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999) 

对b表记录加锁时在2997加锁前停顿30秒,那么我就需要找到b表主键2997的index_id、page_no、heap_no三个信息,这里使用到我的innblock工具

./innblock b.ibd scan 16
===INDEX_ID:121
level1 total block is (1)
block_no:         3,level:   1|*|
level0 total block is (9)
block_no:         5,level:   0|*|block_no:         6,level:   0|*|block_no:         7,level:   0|*|
block_no:        10,level:   0|*|block_no:        11,level:   0|*|block_no:        13,level:   0|*|
block_no:        15,level:   0|*|block_no:        17,level:   0|*|block_no:        18,level:   0|*| 

因为为顺序插入那么2997必定到page 18中然后如下:

./innblock b.ibd 18 16
==== Block base info ====
block_no:18         space_id:121          index_id:121
....
(84) normal record offset:3287 heapno:83 n_owned 0,delflag:N minflag:0 rectype:0
(85) normal record offset:3326 heapno:84 n_owned 0,delflag:N minflag:0 rectype:0
(86) normal record offset:3365 heapno:85 n_owned 0,delflag:N minflag:0 rectype:0
(87) normal record offset:3404 heapno:86 n_owned 0,delflag:N minflag:0 rectype:0
(88) normal record offset:3443 heapno:87 n_owned 0,delflag:N minflag:0 rectype:0 

因为为顺序插入heap_no 84就是id为2997的记录。我们使用另外一个工具bcview进行验证

./bcview b.ibd 16 3326 4
current block:00000018--Offset:03326--cnt bytes:04--data is:80000bb5 

当然0Xbb5就是2997
因此设置参数为:

set global innodb_gaopeng_sl_heap_no=84;
set global innodb_gaopeng_sl_ind_id=121;
set global innodb_gaopeng_sl_page_no=18;
set global innodb_gaopeng_sl_time=30;

mysql> show variables like '%gaopeng%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| innodb_gaopeng_sl_heap_no | 84    |
| innodb_gaopeng_sl_ind_id  | 121   |
| innodb_gaopeng_sl_page_no | 18    |
| innodb_gaopeng_sl_time    | 30    |
+---------------------------+-------+ 

那么情况2执行顺序如下:

TX1 TX2
begin; -
update b set name2='test' where id=2999; 对id:2999加LOCK_X锁 -
- insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999);对id:在加锁到996,997,998,999,2995,2996加LOCK_S锁,在对id:2997加锁前睡眠30秒,为下面的update语句腾出时间)
update b set name2='test' where id=999;对id:999加LOCK_X锁等待但发现已经加LOCK_S锁,需等待 -
- 醒来后继续对2997、2998、2999加LOCK_S锁,但是发现id:2999已经加LOCK_X锁,需等待,触发死锁检测
TX1权重回滚 -

死锁报错语句:

mysql> update b set name2='test' where id=999;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction 

死锁日志:

*** (1) TRANSACTION:
TRANSACTION 51545, ACTIVE 41 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1160, 2 row lock(s), undo log entries 1
MySQL thread id 9, OS thread handle 140737223444224, query id 18310 localhost root updating
update b set name2='test' where id=999

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 121 page no 10 n bits 456 index PRIMARY of table `test`.`b` trx id 51545 lock_mode X(LOCK_X) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 800003e7; asc     ;;
1: len 6; hex 00000000c167; asc      g;;
2: len 7; hex bc000001300110; asc     0  ;;
3: len 6; hex 67616f393939; asc gao999;;
4: len 7; hex 67616f70656e67; asc gaopeng;;

*** (2) TRANSACTION:
TRANSACTION 51546, ACTIVE 30 sec starting index read
mysql tables in use 2, locked 2
5 lock struct(s), heap size 1160, 9 row lock(s), undo log entries 8
MySQL thread id 8, OS thread handle 140737223177984, query id 18309 localhost root Sending data
insert into a  select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 121 page no 10 n bits 456 index PRIMARY of table `test`.`b` trx id 51546 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 800003e4; asc     ;;
1: len 6; hex 00000000c164; asc      d;;
2: len 7; hex b90000012d0110; asc     -  ;;
3: len 6; hex 67616f393936; asc gao996;;
4: len 7; hex 67616f70656e67; asc gaopeng;;
Record lock, heap no 9 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 800003e5; asc     ;;
1: len 6; hex 00000000c165; asc      e;;
2: len 7; hex ba0000014f0110; asc     O  ;;
3: len 6; hex 67616f393937; asc gao997;;
4: len 7; hex 67616f70656e67; asc gaopeng;;
Record lock, heap no 10 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 800003e6; asc     ;;
1: len 6; hex 00000000c166; asc      f;;
2: len 7; hex bb0000012f0110; asc     /  ;;
3: len 6; hex 67616f393938; asc gao998;;
4: len 7; hex 67616f70656e67; asc gaopeng;;
Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 800003e7; asc     ;;
1: len 6; hex 00000000c167; asc      g;;
2: len 7; hex bc000001300110; asc     0  ;;
3: len 6; hex 67616f393939; asc gao999;;
4: len 7; hex 67616f70656e67; asc gaopeng;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 121 page no 18 n bits 160 index PRIMARY of table `test`.`b` trx id 51546 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
Record lock, heap no 86 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 80000bb7; asc     ;;
1: len 6; hex 00000000c959; asc      Y;;
2: len 7; hex 00000002240110; asc     $  ;;
3: len 7; hex 67616f32393939; asc gao2999;;
4: len 4; hex 74657374; asc test;;

*** WE ROLL BACK TRANSACTION (1) 

信息提取如下:

TRX1:51545
   LOCK HOLD:死锁信息不提供
   LOCK WAIT:
            表:b
            索引:`PRIMARY`
            锁模式:LOCK_MODE:LOCK_X|LOCK_REC_NOT_GAP|LOCK_WAIT
            记录:主键为0X3e7
            附加信息: space id 121 page no 10 heap no 11
   CURRENT SQL:
            update b set name2='test' where id=999

TRX2:51546
    LOCK HOLD:
             表:b
             索引:`PRIMARY`
             锁模式:LOCK_MODE:LOCK_S|LOCK_REC_NOT_GAP
             记录:主键为0X3e4到0X3e7的多个行锁
             附加信息:space id 121 page no 10

    LOCK WAIT:
             表:b
             索引:`PRIMARY`
             锁模式:LOCK_MODE:LOCK_S|LOCK_REC_NOT_GAP|LOCK_WAIT
             记录:主键为0Xbb7
             附加信息:space id 121 page no 10  heap no 86
    CURRENT SQL:
             insert into a  select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999) 

我们通过死锁日志明显的看出同样的语句报出来的死锁信息却不一样,在高并发下相同语句,两种死锁场景都是可能发生的。

七、总结

分析死锁一般要从死锁日志中获取如下信息

  • 1、加锁发生在主键还是辅助索引
  • 2、加锁的模式是什么
  • 3、是单行还是多行加锁
  • 4、触发死锁事务最后的语句
  • 5、死锁信息中事务顺序是怎么样的

在重现的时候,必须要做到和线上死锁信息完全匹配那么这个死锁场景才叫测试成功了,从这个例子我们就发现,同样的语句产生的死锁信息却不一样,我们当然就要按照不通的场景去考虑,本文中的情况2比较复杂一般只是在高并发先出现,测试也相对麻烦,本文通过修改源码的方式进行测试的,否则很难重现。找到原因后就需要采取必要的措施,比如本文中的例子需要考虑:

  • 对insert...select中select表的修改是否及时提交。
  • insert...select是否可以用其他方式代替。这种语句在自增锁上也存在一定风险。
  • 是否考虑使用RC隔离级别,在RC隔离级别下不存在对select表记录加锁的情况。

强调一点对于出现LOCK_S这样的锁最好深入分析,因为这种锁并不多见。

作者微信:

微信.jpg

时间: 2024-10-25 15:15:13

Innodb:RR隔离级别下insert...select 对select表加锁模型和死锁案列的相关文章

MySQL在RR隔离级别下的unique失效和死锁模拟

今天在测试MySQL事务隔离级别的时候,发现了一个有趣的问题,也参考了杨一之前总结的一篇.http://blog.itpub.net/22664653/viewspace-1612574/     问题的背景是在MySQL隔离级别为RR(Repeatable Read)时,唯一性约束没有失效,多并发的场景下能够复现出下面的问题.   这样一个看起来不可能的事情,能否复现呢.   我都这么问了,潜台词就是可以,要不今天的笔记就一个问题就结束了.   为了模拟这个问题,我们打开两个会话窗口,来模拟一

探讨innodb_locks_unsafe_for_binlog参数在RR隔离级别下取消了 gap lock产生的幻读

具体看例子:

mysql中不同事务隔离级别下数据的显示效果

  事务是一组原子性的SQL查询语句,也可以被看做一个工作单元.如果数据库引擎能够成功地对数据库应用所有的查询语句,它就会执行所有查询,如果任何一条查询语句因为崩溃或其他原因而无法执行,那么所有的语句就都不会执行.也就是说,事务内的语句要么全部执行,要么一句也不执行. 事务的特性:acid,也称为事务的四个测试(原子性,一致性,隔离性,持久性) automicity:原子性,事务所引起的数据库操作,要么都完成,要么都不执行 consisitency:一致性,事务执行前的总和和事务执行后的总和是不

浅析SQL Server在可序列化隔离级别下,防止幻读的范围锁的锁定问题

原文:浅析SQL Server在可序列化隔离级别下,防止幻读的范围锁的锁定问题   本文出处:http://www.cnblogs.com/wy123/p/7501261.html (保留出处并非什么原创作品权利,本人拙作还远远达不到,仅仅是为了链接到原文,因为后续对可能存在的一些错误进行修正或补充,无他)     数据库在处理并发事物的过程中,在不同的隔离级别下有不同的锁表现,在非可序列化隔离级别下,存在着脏读,不可重复读,丢失更新,幻读等情况.本文不讨论脏读和不可重复读以及丢失更新的情形,仅

【转】Innodb中的事务隔离级别和锁的关系

申明: 本文转自Innodb中的事务隔离级别和锁的关系,解决了我关于锁.事务隔离的一些误解和疑问.在高并发系统中,数据库对高并发的支持是非常重要的一个方面,本文主要描述高并发场景下,数据库如何保证数据一致性(同时保证良好的性能). 前言: 我们都知道事务的几种性质,数据库为了维护这些性质,尤其是一致性和隔离性,一般使用加锁这种方式.同时数据库又是个高并发的应用,同一时间会有大量的并发访问,如果加锁过度,会极大的降低并发处理能力.所以对于加锁的处理,可以说就是数据库对于事务处理的精髓所在.这里通过

MySQL中Innodb的事务隔离级别和锁的关系的讲解教程_Mysql

前言: 我们都知道事务的几种性质,数据库为了维护这些性质,尤其是一致性和隔离性,一般使用加锁这种方式.同时数据库又是个高并发的应用,同一时间会有大量的并发访问,如果加锁过度,会极大的降低并发处理能力.所以对于加锁的处理,可以说就是数据库对于事务处理的精髓所在.这里通过分析MySQL中InnoDB引擎的加锁机制,来抛砖引玉,让读者更好的理解,在事务处理中数据库到底做了什么. 一次封锁or两段锁?因为有大量的并发访问,为了预防死锁,一般应用中推荐使用一次封锁法,就是在方法的开始阶段,已经预先知道会用

MySQL 四种事务隔离级别详解及对比_Mysql

MySQL 四种事务隔离级别详解及对比 按照SQL:1992 事务隔离级别,InnoDB默认是可重复读的(REPEATABLE READ).MySQL/InnoDB 提供SQL标准所描述的所有四个事务隔离级别.你可以在命令行用--transaction-isolation选项,或在选项文件里,为所有连接设置默认隔离级别. 例如,你可以在my.inf文件的[mysqld]节里类似如下设置该选项: transaction-isolation = {READ-UNCOMMITTED | READ-CO

【MySQL】MySQL的四种事务隔离级别

[MySQL]MySQL的四种事务隔离级别 本文实验的测试环境:Windows 10+cmd+MySQL5.6.36+InnoDB 一.事务的基本要素(ACID) 1.原子性(Atomicity):事务开始后所有操作,要么全部做完,要么全部不做,不可能停滞在中间环节.事务执行过程中出错,会回滚到事务开始前的状态,所有的操作就像没有发生一样.也就是说事务是一个不可分割的整体,就像化学中学过的原子,是物质构成的基本单位.  2.一致性(Consistency):事务开始前和结束后,数据库的完整性约束

SQL Server 事务隔离级别详解

原文:SQL Server 事务隔离级别详解 标签: SQL SEERVER/MSSQL SERVER/SQL/事务隔离级别选项/设计数据库事务级别 SQL 事务隔离级别 概述      隔离级别用于决定如果控制并发用户如何读写数据的操作,同时对性能也有一定的影响作用. 步骤 事务隔离级别通过影响读操作来间接地影响写操作:可以在回话级别上设置事务隔离级别也可以在查询(表级别)级别上设置事务隔离级别.事务隔离级别总共有6个隔离级别:READ UNCOMMITTED(未提交读,读脏),相当于(NOL