昨天一同事发现线上系统在并发更新的时候出现了死锁,通过排查定位于update更新使用了两个索引导致,死锁信息如下:
*** (1) TRANSACTION:
TRANSACTION 29285454235, ACTIVE 0.001 sec fetching rows
mysql tables in use 3, locked 3
LOCK WAIT 6 lock struct(s), heap size 1184, 4 row lock(s)
MySQL thread id 6641616, OS thread handle 0x2b165c4b1700, query id 28190427937 10.103.180.86 test_ebs Searching rows for update
UPDATE test SET is_deleted = 1 WHERE group_id = 1332577 and test_id = 4580605
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 132 page no 37122 n bits 352 index `PRIMARY` of table `testdriver`.`test` trx id 29285454235 lock_mode X locks rec but not gap waiting
Record lock, heap no 179 PHYSICAL RECORD: n_fields 8; compact format; info bits 0
*** (2) TRANSACTION:
TRANSACTION 29285454237, ACTIVE 0.001 sec fetching rows, thread declared inside InnoDB 4980
mysql tables in use 3, locked 3
5 lock struct(s), heap size 1184, 3 row lock(s)
MySQL thread id 6639213, OS thread handle 0x2b1694cc2700, query id 28190427939 10.103.180.113 test_ebs Searching rows for update
UPDATE test SET is_deleted = 1 WHERE group_id = 1332577 and test_id = 4212859
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 132 page no 37122 n bits 352 index `PRIMARY` of table `testdriver`.`test` trx id 29285454237 lock_mode X locks rec but not gap
Record lock, heap no 179 PHYSICAL RECORD: n_fields 8; compact format;
表结构:
CREATE TABLE `test` (
`id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT ‘主键’,
`test_id` bigint(20) DEFAULT NULL,
`group_id` bigint(20) DEFAULT NULL COMMENT ‘Id,对应test_group.id’,
`gmt_created` datetime DEFAULT NULL COMMENT ‘创建时间’,
`gmt_modified` datetime DEFAULT NULL COMMENT ‘修改时间’,
`is_deleted` tinyint(4) DEFAULT ‘0’ COMMENT ‘删除。’,
PRIMARY KEY (`id`),
KEY `idx_testid` (`test_id`),
KEY `idx_groupid` (`group_id`)
) ENGINE=InnoDB AUTO_INCREMENT=7429111 ;
SQL执行计划:
mysql>explain UPDATE test SET is_deleted = 1 WHERE group_id = 1332577 and test_id = 4212859
+—-+————-+—————+————-+———————–+———————–+———+—–+——+———
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+—-+————-+—————+————-+———————–+———————–+———+—–+——+———
| 1 | SIMPLE | test | index_merge | idx_testid,idx_groupid | idx_testid,idx_groupid | 9,9 | | 1 | Using intersect(idx_testid,idx_groupid); Using where; Using temporary |
+—-+————-+—————+————-+———————–+———————–+———+—–+——+———
所以第一个事务先根据group_id索引,已经锁住primary id,然后再根据test_id索引,锁定primary id;
第二个事务先根据test_id索引,已经锁住primary id,然后再根据group_id索引,去锁primary id;
所以这样并发更新就可能出现死索引。
MySQL官方也已经确认了此bug:https://bugs.mysql.com/bug.php?id=77209
解决方法有两种:
第一、添加test_id+group_id的组合索引,这样就可以避免掉index merge;
第二、将优化器的index merge优化关闭;
建议选择第一种方法来避免此问题的发生。