mirror of
https://github.com/MariaDB/server.git
synced 2025-01-18 21:12:26 +01:00
afd15c43a9
Add a wait-for graph based deadlock detector to the MDL subsystem.

Fixes bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock" and
bug #37346 "innodb does not detect deadlock between update and
alter table".

The first bug manifested itself as an unwarranted abort of a
transaction with ER_LOCK_DEADLOCK error by a concurrent ALTER
statement, when this transaction tried to repeat use of a table
which it had already used in a similar fashion before ALTER started.

The second bug showed up as a deadlock between table-level locks
and InnoDB row locks, which was "detected" only after the
innodb_lock_wait_timeout timeout.

A transaction would start using the table and modify a few rows.
Then ALTER TABLE would come in and start copying rows into a
temporary table. Eventually it would stumble on the modified
records and get blocked on a row lock. The first transaction would
try to do more updates and get blocked on a thr_lock.c lock.
This circular wait would only get resolved by a timeout.

Both these bugs stemmed from inadequate solutions to the problem of
deadlocks occurring between different locking subsystems.

In the first case we tried to avoid deadlocks between the metadata
locking and table-level locking subsystems when upgrading a shared
metadata lock to an exclusive one. Transactions holding the shared
lock on the table and waiting for some table-level lock used to be
aborted too aggressively.

We also allowed ALTER TABLE to start in the presence of transactions
that modify the subject table. ALTER TABLE acquires a
TL_WRITE_ALLOW_READ lock at start, which blocks all writes against
the table (naturally, we don't want any writes to be lost when
switching the old and the new table). The TL_WRITE_ALLOW_READ lock,
in turn, would block the already started transactions on a
thr_lock.c lock should they do more updates. This, again, led to
the need to abort such transactions.

The second bug occurred simply because we didn't have any mechanism
to detect deadlocks between the table-level locks in thr_lock.c and
row-level locks in InnoDB, other than innodb_lock_wait_timeout.

This patch solves both these problems by moving the lock conflicts
which cause these deadlocks into the metadata locking subsystem,
thus making it possible to avoid or detect such deadlocks inside MDL.

To do this we introduce new type-of-operation-aware metadata locks,
which allow the MDL subsystem to know not only the fact that a
transaction has used or is going to use some object, but also what
kind of operation it has carried out or is going to carry out on
the object.

This, along with the addition of a special kind of upgradable
metadata lock, allows ALTER TABLE to wait until all transactions
which have updated the table go away. This solves the second issue.

Another special type of upgradable metadata lock is acquired by
LOCK TABLE WRITE. This second lock type solves the first issue,
since aborting table-level locks in the event of DDL under
LOCK TABLES also becomes unnecessary.

Below follows the list of incompatible changes introduced by this
patch:

- From now on, ALTER TABLE and CREATE/DROP TRIGGER SQL (i.e. those
  statements that acquire a TL_WRITE_ALLOW_READ lock) wait for all
  transactions which have *updated* the table to complete.

- From now on, LOCK TABLES ... WRITE, REPAIR/OPTIMIZE TABLE (i.e. all
  statements which acquire a TL_WRITE table-level lock) wait for all
  transactions which have *updated or read from* the table to
  complete. As a consequence, the innodb_table_locks=0 option no
  longer applies to LOCK TABLES ... WRITE.

- DROP DATABASE, DROP TABLE, RENAME TABLE no longer abort statements
  or transactions which use the tables being dropped or renamed, and
  instead wait for these transactions to complete.

- Since LOCK TABLES WRITE now takes a special metadata lock, which is
  transaction-wide and not compatible with reads or writes against
  the subject table, the thr_lock.c deadlock avoidance algorithm that
  used to ensure absence of deadlocks between LOCK TABLES WRITE and
  other statements is no longer sufficient, even for MyISAM. The
  wait-for graph based deadlock detector of the MDL subsystem may
  sometimes be necessary, and is then invoked. This may lead to an
  ER_LOCK_DEADLOCK error produced for multi-statement transactions
  even if these only use MyISAM:

    session 1:           session 2:
    begin;

    update t1 ...        lock table t2 write, t1 write;
                         -- gets a lock on t2, blocks on t1

    update t2 ...
    (ER_LOCK_DEADLOCK)

- Finally, support of the LOW_PRIORITY option for LOCK TABLES ... WRITE
  was abandoned. LOCK TABLE ... LOW_PRIORITY WRITE from now on has
  the same priority as the usual LOCK TABLE ... WRITE.
  SELECT HIGH_PRIORITY no longer trumps LOCK TABLE ... WRITE in
  the wait queue.

- We do not take upgradable metadata locks on implicitly locked
  tables. So if one has, say, a view v1 that uses table t1, and
  issues:

    LOCK TABLE v1 WRITE;
    FLUSH TABLE t1; -- (or just 'FLUSH TABLES'),

  an error is produced. In order to be able to perform DDL on a
  table under LOCK TABLES, the table must be locked explicitly in
  the LOCK TABLES list.
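
The MyISAM example above can be read as a cycle in a wait-for graph.
The following is a minimal Python sketch of that idea only, not the
detector this patch adds to the server. The function name
find_deadlock, the dictionary layout, and the session labels are all
hypothetical, chosen for illustration. An edge "A waits for B" means
that session A is blocked on a lock held by session B; a deadlock is
any cycle of such edges.

```python
from typing import Dict, List, Optional, Set

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def find_deadlock(waits_for: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one cycle of sessions in the wait-for graph, or None.

    waits_for maps a session to the set of sessions whose locks it is
    blocked on (hypothetical representation, for illustration only).
    """
    colour: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for blocker in sorted(waits_for.get(node, ())):
            state = colour.get(blocker, WHITE)
            if state == GREY:
                # The blocker is already on the current path: circular wait.
                return path[path.index(blocker):]
            if state == WHITE:
                cycle = visit(blocker)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(waits_for):
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


# Before session 1 issues "update t2": only session 2 is waiting (on t1).
before = {"session 2": {"session 1"}}

# After "update t2": session 1 also waits, on t2 held by session 2.
after = {"session 1": {"session 2"}, "session 2": {"session 1"}}

print(find_deadlock(before))
print(find_deadlock(after))
```

In this sketch the first graph has no cycle, so nothing is reported,
while the second graph contains the circular wait between the two
sessions. A real detector also has to pick a victim to abort; the
sketch only finds the cycle.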
123 lines
4.8 KiB
Text
#
# Locking related tests which use DEBUG_SYNC facility.
#
--source include/have_debug_sync.inc
# We need InnoDB to be able to use TL_WRITE_ALLOW_WRITE type of locks in our tests.
--source include/have_innodb.inc
# Until bug#41971 'Thread state on embedded server is always "Writing to net"'
# is fixed this test can't be run on embedded version of server.
--source include/not_embedded.inc

# Save the initial number of concurrent sessions.
--source include/count_sessions.inc


--echo #
--echo # Test for bug #45143 "All connections hang on concurrent ALTER TABLE".
--echo #
--echo # Concurrent execution of statements which required weak write lock
--echo # (TL_WRITE_ALLOW_WRITE) on several instances of the same table and
--echo # statements which tried to acquire stronger write lock (TL_WRITE,
--echo # TL_WRITE_ALLOW_READ) on this table might have led to deadlock.
--disable_warnings
drop table if exists t1;
drop view if exists v1;
--enable_warnings
--echo # Create auxiliary connections used through the test.
connect (con_bug45143_1,localhost,root,,test,,);
connect (con_bug45143_3,localhost,root,,test,,);
connect (con_bug45143_2,localhost,root,,test,,);
connection default;
--echo # Reset DEBUG_SYNC facility before using it.
set debug_sync= 'RESET';
--echo # Turn off logging so calls to locking subsystem performed
--echo # for general_log table won't interfere with our test.
set @old_general_log = @@global.general_log;
set @@global.general_log= OFF;

create table t1 (i int) engine=InnoDB;
--echo # We have to use view in order to make LOCK TABLES avoid
--echo # acquiring SNRW metadata lock on table.
create view v1 as select * from t1;
insert into t1 values (1);
--echo # Prepare user lock which will be used for resuming execution of
--echo # the first statement after it acquires TL_WRITE_ALLOW_WRITE lock.
select get_lock("lock_bug45143_wait", 0);

--echo # Switch to connection 'con_bug45143_1'.
connection con_bug45143_1;
--echo # Sending:
--send insert into t1 values (get_lock("lock_bug45143_wait", 100));

--echo # Switch to connection 'con_bug45143_2'.
connection con_bug45143_2;
--echo # Wait until the above INSERT takes TL_WRITE_ALLOW_WRITE lock on 't1'
--echo # and then gets blocked on user lock 'lock_bug45143_wait'.
let $wait_condition= select count(*)= 1 from information_schema.processlist
                       where state= 'User lock' and
                             info='insert into t1 values (get_lock("lock_bug45143_wait", 100))';
--source include/wait_condition.inc
--echo # Ensure that upcoming SELECT waits after acquiring TL_WRITE_ALLOW_WRITE
--echo # lock for the first instance of 't1'.
set debug_sync='thr_multi_lock_after_thr_lock SIGNAL parked WAIT_FOR go';
--echo # Sending:
--send select count(*) > 0 from t1 as a, t1 as b for update;

--echo # Switch to connection 'con_bug45143_3'.
connection con_bug45143_3;
--echo # Wait until the above SELECT ... FOR UPDATE is blocked after
--echo # acquiring lock for the the first instance of 't1'.
set debug_sync= 'now WAIT_FOR parked';
--echo # Send LOCK TABLE statement which will try to get TL_WRITE lock on 't1':
--send lock table v1 write;

--echo # Switch to connection 'default'.
connection default;
--echo # Wait until this LOCK TABLES statement starts waiting for table lock.
let $wait_condition= select count(*)= 1 from information_schema.processlist
                       where state= 'Table lock' and
                             info='lock table v1 write';
--source include/wait_condition.inc
--echo # Allow SELECT ... FOR UPDATE to resume.
--echo # Since it already has TL_WRITE_ALLOW_WRITE lock on the first instance
--echo # of 't1' it should be able to get lock on the second instance without
--echo # waiting, even although there is another thread which has such lock
--echo # on this table and also there is a thread waiting for a TL_WRITE on it.
set debug_sync= 'now SIGNAL go';

--echo # Switch to connection 'con_bug45143_2'.
connection con_bug45143_2;
--echo # Reap SELECT ... FOR UPDATE
--reap

--echo # Switch to connection 'default'.
connection default;
--echo # Resume execution of the INSERT statement.
select release_lock("lock_bug45143_wait");

--echo # Switch to connection 'con_bug45143_1'.
connection con_bug45143_1;
--echo # Reap INSERT statement.
--reap

--echo # Switch to connection 'con_bug45143_3'.
connection con_bug45143_3;
--echo # Reap LOCK TABLES statement.
--reap
unlock tables;

--echo # Switch to connection 'default'.
connection default;
--echo # Do clean-up.
disconnect con_bug45143_1;
disconnect con_bug45143_2;
disconnect con_bug45143_3;
set debug_sync= 'RESET';
set @@global.general_log= @old_general_log;
drop view v1;
drop table t1;


# Check that all connections opened by test cases in this file are really
# gone so execution of other tests won't be affected by their presence.
--source include/wait_until_count_sessions.inc