MDEV-22666 galera.MW-328A hang

The hang can happen when a local connection issues KILL CONNECTION for a victim
that is in the committing phase.
This is a two-resource deadlock: the killer holds the victim's
LOCK_thd_data and requires the victim's trx mutex.
The victim, on the other hand, holds its own trx mutex but requires LOCK_thd_data
in wsrep_commit_ordered(). Hence a classic two-thread deadlock happens.

The fix in this commit changes the InnoDB commit so that wsrep_commit_ordered()
is not called while holding the trx mutex. With this, the commit-time mutex
locking in the wsrep patch does not violate the locking protocol of the KILL command
(i.e. LOCK_thd_data -> trx mutex).
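
The lock ordering described above can be illustrated with a small model. This is
only a sketch, not the server code: it replays one killer/victim interleaving in
a single thread, using two Python locks that stand in for LOCK_thd_data and the
trx mutex. The function name replay_interleaving and its parameter are
hypothetical and exist only for this illustration.

```python
import threading


def replay_interleaving(release_trx_mutex_before_wsrep_call: bool) -> bool:
    """Replay one killer/victim interleaving; return True if it deadlocks.

    Hypothetical model: the two locks only stand in for the server's
    LOCK_thd_data and trx mutex, they are not the real objects.
    """
    lock_thd_data = threading.Lock()  # models the victim's LOCK_thd_data
    trx_mutex = threading.Lock()      # models the victim's trx mutex

    # Victim: enters the commit path holding its own trx mutex.
    trx_mutex.acquire()
    # Killer: KILL takes the victim's LOCK_thd_data first.
    lock_thd_data.acquire()

    if release_trx_mutex_before_wsrep_call:
        # Ordering after the fix: the victim no longer holds the trx mutex
        # when it reaches the step that needs LOCK_thd_data.
        trx_mutex.release()

    # Each side now attempts its second lock without blocking.
    killer_can_proceed = trx_mutex.acquire(blocking=False)
    victim_can_proceed = lock_thd_data.acquire(blocking=False)

    # Deadlock: both sides wait on a lock that the other side holds.
    return not killer_can_proceed and not victim_can_proceed


print("old ordering deadlocks:", replay_interleaving(False))
print("new ordering deadlocks:", replay_interleaving(True))
```

In this model the old ordering leaves both sides blocked, while with the new
ordering the killer can take the trx mutex and finish, after which the victim
would be able to obtain LOCK_thd_data.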

Also, a new test case has been added in galera.galera_bf_kill.test for the scenario
where a client connection is killed in the committing phase.
sjaakola 2020-05-25 14:23:42 +03:00
commit 1af6e92f0b
4 changed files with 67 additions and 6 deletions

@@ -140,4 +140,48 @@ select * from t1;
drop table t1;
#
# Test case 7:
# run a transaction in node 2, and set a sync point to pause the transaction
# in commit phase.
# Through another connection to node 2, kill the committing transaction by
# KILL QUERY command
#
--connect node_2a, 127.0.0.1, root, , test, $NODE_MYPORT_2
--connection node_2a
--let $connection_id = `SELECT CONNECTION_ID()`
CREATE TABLE t1 (i int primary key);
# Set up sync point
SET DEBUG_SYNC = "before_wsrep_ordered_commit SIGNAL bwoc_reached WAIT_FOR bwoc_continue";
# Send insert which will block in the sync point above
--send INSERT INTO t1 VALUES (1)
--connection node_2
SET DEBUG_SYNC = "now WAIT_FOR bwoc_reached";
--disable_query_log
--disable_result_log
# victim has passed the point of no return, kill is not possible anymore
--eval KILL QUERY $connection_id
--enable_result_log
--enable_query_log
SET DEBUG_SYNC = "now SIGNAL bwoc_continue";
SET DEBUG_SYNC='RESET';
--connection node_2a
--error 0,1213
--reap
--connection node_2
# victim was able to complete the INSERT
select * from t1;
--disconnect node_2a
--connection node_2
drop table t1;