Fix for bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH
TABLES <list> WITH READ LOCK are incompatible".

The problem was that FLUSH TABLES <list> WITH READ LOCK,
when issued while another connection held the global read
lock acquired by FLUSH TABLES WITH READ LOCK, was blocked
and had to wait until the global read lock was released.

This issue stemmed from the fact that the implementation
of FLUSH TABLES <list> WITH READ LOCK acquired X metadata
locks on the tables to be flushed. Since acquiring these
locks requires taking the global IX lock, the statement
was incompatible with the global read lock.
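
For illustration, the reported scenario boils down to the
following two sessions (t1 stands for any existing table):

  -- Session 1: acquire the global read lock.
  FLUSH TABLES WITH READ LOCK;

  -- Session 2: before this patch, the statement below was
  -- blocked until session 1 released the global read lock,
  -- because the X metadata locks it requested implied the
  -- global IX lock.
  FLUSH TABLES t1 WITH READ LOCK;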

This patch addresses the problem by using the SNW metadata
lock type for the tables to be flushed by FLUSH TABLES
<list> WITH READ LOCK. It is OK to acquire these locks
without the global IX lock as long as we don't try to
upgrade them. Since SNW locks allow concurrent statements
to use the same table, FLUSH TABLES <list> WITH READ LOCK
now has to wait, after acquiring the metadata locks, until
the old versions of the tables to be flushed go away.
Since such waiting can lead to deadlocks, the MDL deadlock
detector was extended to take waits for flushes into
account and resolve such deadlocks.
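
A simplified version of one deadlock scenario covered by
the new mdl_sync.test cases below:

  -- Session 1: acquires an SR lock on t1 and opens it,
  -- then pauses before opening t2.
  SELECT * FROM t1 WHERE i IN (SELECT j FROM t2 FOR UPDATE);

  -- Session 2: acquires SNW locks on t1 and t2, then waits
  -- for session 1 to close the old version of t1.
  FLUSH TABLES t1, t2 WITH READ LOCK;

  -- When session 1 resumes, it requests the SW lock needed
  -- for the FOR UPDATE part, which conflicts with session
  -- 2's SNW lock: a deadlock spanning the MDL and TDC
  -- subsystems. The extended detector resolves it by
  -- backing off the SELECT, letting the FLUSH complete.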

As a bonus, the code in open_tables() which was responsible
for waiting for old versions of tables to go away was
refactored. Now when we encounter an old version of a
table in open_table() we don't back off and wait for all
old versions to go away; instead we wait for this
particular table to be flushed (see the sketch below).
This approach, supported by deadlock detection, should
reduce the number of scenarios in which FLUSH TABLES
aborts concurrent multi-statement transactions.
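
For example, a rough, timing-dependent illustration (t1
and t2 are hypothetical tables):

  -- Session 1:
  BEGIN;
  SELECT * FROM t1;

  -- Session 2: marks all open tables as old.
  FLUSH TABLES;

  -- Session 1: encounters the old generation of t1;
  -- previously this could fail with ER_LOCK_DEADLOCK, now
  -- the statement backs off, reopens the tables and
  -- continues.
  SELECT * FROM t1, t2;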

Note that an active FLUSH TABLES <list> WITH READ LOCK
still blocks a concurrent FLUSH TABLES WITH READ LOCK
statement, as the former keeps the tables open and thus
prevents the latter from flushing them.
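
In other words (this is what the new test in flush.test
checks):

  -- Session 1:
  FLUSH TABLES WITH READ LOCK;

  -- Session 2: with this patch the statements below
  -- succeed immediately.
  FLUSH TABLE t1 WITH READ LOCK;
  SELECT * FROM t1;
  UNLOCK TABLES;

  -- In the opposite order, while a session keeps t1 open
  -- with FLUSH TABLE t1 WITH READ LOCK, a concurrent FLUSH
  -- TABLES WITH READ LOCK cannot flush t1 and has to wait.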

mysql-test/include/handler.inc:
  Adjusted the test case after the status which is set when
  FLUSH TABLES waits for tables to be flushed was changed
  from "Flushing tables" to "Waiting for table".
mysql-test/r/flush.result:
  Added a test which checks that "flush tables <list> with
  read lock" is compatible with an active "flush tables
  with read lock" but not vice versa. This test also covers
  bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES
  <list> WITH READ LOCK are incompatible".
mysql-test/r/mdl_sync.result:
  Added scenarios in which waiting for a table to be
  flushed causes deadlocks to the coverage of the MDL
  deadlock detector.
mysql-test/suite/perfschema/r/dml_setup_instruments.result:
  Adjusted test results after the removal of the
  COND_refresh condition variable.
mysql-test/suite/perfschema/r/server_init.result:
  Adjusted the test and its results after the removal of
  the COND_refresh condition variable.
mysql-test/suite/perfschema/t/server_init.test:
  Adjusted the test and its results after the removal of
  the COND_refresh condition variable.
mysql-test/t/flush.test:
  Added a test which checks that "flush tables <list> with
  read lock" is compatible with an active "flush tables
  with read lock" but not vice versa. This test also covers
  bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES
  <list> WITH READ LOCK are incompatible".
mysql-test/t/kill.test:
  Adjusted the test case after the status which is set when
  FLUSH TABLES waits for tables to be flushed was changed
  from "Flushing tables" to "Waiting for table".
mysql-test/t/lock_multi.test:
  Adjusted the test case after the status which is set when
  FLUSH TABLES waits for tables to be flushed was changed
  from "Flushing tables" to "Waiting for table".
mysql-test/t/mdl_sync.test:
  Added scenarios in which waiting for a table to be
  flushed causes deadlocks to the coverage of the MDL
  deadlock detector.
sql/ha_ndbcluster.cc:
  Adjusted code after adding one more parameter to the
  close_cached_tables() call - a timeout for waiting for
  tables to be flushed.
sql/ha_ndbcluster_binlog.cc:
  Adjusted code after adding one more parameter to the
  close_cached_tables() call - a timeout for waiting for
  tables to be flushed.
sql/lock.cc:
  Removed the COND_refresh condition variable. See the
  comment for sql_base.cc for details.
sql/mdl.cc:
  Now the MDL deadlock detector takes into account
  information about waits for table flushes when searching
  for deadlocks. To implement this change:
  - The declarations of enum_deadlock_weight and
    Deadlock_detection_visitor were moved to the mdl.h
    header to make them available to the code in table.cc
    which implements the deadlock detector's traversal
    through the edges of the waiters graph which represent
    waits for flushes.
  - Since an MDL_context may now wait not only for a
    metadata lock but also for a table to be flushed, an
    abstract Wait_for_edge class was introduced. Its
    descendants MDL_ticket and Flush_ticket encapsulate
    the specifics of inspecting the waiters graph when
    following an edge which represents a wait of the
    particular type.
  
  We no longer require the global IX metadata lock when
  acquiring SNW or SNRW locks. Such a lock is needed only
  when metadata locks of these types are upgraded to X
  locks. This allows FLUSH TABLES <list> WITH READ LOCK to
  use SNW locks and keeps the statement compatible with
  the global read lock.
sql/mdl.h:
  Now the MDL deadlock detector takes into account
  information about waits for table flushes when searching
  for deadlocks. To implement this change:
  - The declarations of enum_deadlock_weight and
    Deadlock_detection_visitor were moved to the mdl.h
    header to make them available to the code in table.cc
    which implements the deadlock detector's traversal
    through the edges of the waiters graph which represent
    waits for flushes.
  - Since an MDL_context may now wait not only for a
    metadata lock but also for a table to be flushed, an
    abstract Wait_for_edge class was introduced. Its
    descendants MDL_ticket and Flush_ticket encapsulate
    the specifics of inspecting the waiters graph when
    following an edge which represents a wait of the
    particular type.
  - Deadlock_detection_visitor now has an
    m_table_shares_visited member which makes it possible
    to support recursive locking of LOCK_open. This is
    required when the deadlock detector inspects a waiters
    graph which contains several edges representing waits
    for flushes, or needs to traverse such an edge more
    than once.
sql/mysqld.cc:
  Removed the COND_refresh condition variable. See the
  comment for sql_base.cc for details.
sql/mysqld.h:
  Removed the COND_refresh condition variable. See the
  comment for sql_base.cc for details.
sql/sql_base.cc:
  Changed the approach to how threads wait for a table to
  be flushed. Now a thread which wants to wait for an old
  table to go away subscribes for notification by adding a
  Flush_ticket to the table's share and waits using the
  MDL_context::m_wait object. Once the table gets flushed
  (i.e. all tables for it are closed and the table share
  is ready to be destroyed) all such waiters are notified
  individually.
  Thanks to this change the MDL deadlock detector can take
  such waits into account.
  
  To implement this/as a result of this change:
  - tdc_wait_for_old_versions() was replaced with
    tdc_wait_for_old_version(), which waits for an
    individual old share to go away and which is called by
    open_table() after finding out that the share is
    outdated. We no longer need to perform a back-off
    before such waiting thanks to the fact that the
    deadlock detector now sees such waits.
  - As a result Open_table_ctx::m_mdl_requests became
    unnecessary and was removed. We no longer allocate
    copies of MDL_request objects on the MEM_ROOT when the
    MYSQL_OPEN_FORCE_SHARED/SHARED_HIGH_PRIO flags are in
    effect.
  - close_cached_tables() and tdc_wait_for_old_version()
    share the code which implements waiting for a share to
    be flushed - both use the
    TABLE_SHARE::wait_until_flushed() method. Thanks to
    this, close_cached_tables() supports timeouts and has
    an extra parameter for them.
  - The Open_table_context::OT_MDL_CONFLICT enum element
    was renamed to OT_CONFLICT as it is now also used in
    cases when a back-off is required to resolve a
    deadlock caused by waiting for a flush rather than for
    a metadata lock.
  - In cases when we discover that the current connection
    tries to open tables from different generations we now
    simply back off and restart the process of opening
    tables. To support this the
    Open_table_context::OT_REOPEN_TABLES enum element was
    added.
  - The COND_refresh condition variable became unnecessary
    and was removed.
  - mysql_notify_thread_having_shared_lock() no longer
    wakes up connections waiting for a flush, as all such
    connections can be woken up by the deadlock detector
    if necessary.
sql/sql_base.h:
  - close_cached_tables() now has one more parameter - a
    timeout for waiting for tables to be flushed.
  - The Open_table_context::OT_MDL_CONFLICT enum element
    was renamed to OT_CONFLICT as it is now also used in
    cases when a back-off is required to resolve a
    deadlock caused by waiting for a flush rather than for
    a metadata lock.
    Added a new OT_REOPEN_TABLES enum element to be used
    in cases when we need to restart the open tables
    process even in the middle of a transaction.
  - Open_table_ctx::m_mdl_requests became unnecessary and
    was removed.
sql/sql_class.h:
  Added an assert ensuring that we won't use the LOCK_open
  mutex with THD::enter_cond(). Otherwise deadlocks can
  arise in the MDL deadlock detector.
sql/sql_parse.cc:
  Changed FLUSH TABLES <list> WITH READ LOCK to take SNW
  metadata locks instead of X locks on the tables to be
  flushed. Since we no longer require the global IX lock
  to be taken when SNW locks are acquired, this makes the
  statement compatible with FLUSH TABLES WITH READ LOCK.
  Since SNW locks allow other connections to keep the
  table open, FLUSH TABLES <list> WITH READ LOCK now has
  to wait during open_tables() for the old versions to go
  away. Such waits can lead to deadlocks, which will be
  detected by the MDL deadlock detector, which now takes
  waits for table flushes into account.
  
  Also adjusted code after adding one more parameter to
  the close_cached_tables() call - a timeout for waiting
  for tables to be flushed.
sql/sql_yacc.yy:
  FLUSH TABLES <list> WITH READ LOCK now needs only SNW
  metadata locks on tables.
sql/sys_vars.cc:
  Adjusted code after adding one more parameter to the
  close_cached_tables() call - a timeout for waiting for
  tables to be flushed.
sql/table.cc:
  Implemented the new approach to how threads wait for a
  table to be flushed. Now a thread which wants to wait
  for an old table to go away subscribes for notification
  by adding a Flush_ticket to the table's share and waits
  using the MDL_context::m_wait object. Once the table
  gets flushed (i.e. all tables for it are closed and the
  table share is ready to be destroyed) all such waiters
  are notified individually. This change makes such waits
  visible to the MDL deadlock detector.
  To do this:
  
  - Added a list of waiters/Flush_tickets to the
    TABLE_SHARE class.
  - Changed free_table_share() to postpone freeing of the
    share's memory until the last waiter has gone away and
    to wake up subscribed waiters.
  - Added the TABLE_SHARE::wait_until_flushed() method
    which implements subscription to the list of waiters
    for the table to be flushed and waiting for this
    event.
  
  Implemented an interface which exposes waits for flushes
  to the MDL deadlock detector:
  
  - Introduced the Flush_ticket class, a descendant of the
    Wait_for_edge class.
  - Added the TABLE_SHARE::find_deadlock() method which
    allows the deadlock detector to find out which
    contexts are still using the old version of the table
    in question (i.e. which contexts the owner of a
    Flush_ticket waits for).
sql/table.h:
  In order to support the new strategy of waiting for
  table flushes (see the comment for table.cc for details)
  added a list of waiters/Flush_tickets to the TABLE_SHARE
  class.
  
  Implemented an interface which exposes waits for flushes
  to the MDL deadlock detector:
  - Introduced the Flush_ticket class, a descendant of the
    Wait_for_edge class.
  - Added the TABLE_SHARE::find_deadlock() method which
    allows the deadlock detector to find out which
    contexts are still using the old version of the table
    in question (i.e. which contexts the owner of a
    Flush_ticket waits for).
Author: Dmitry Lenev
Date: 2010-07-27 17:34:58 +04:00
commit 00496b7acd
parent 36290c0923
25 changed files with 1038 additions and 297 deletions

mysql-test/include/handler.inc

@ -523,7 +523,7 @@ connection waiter;
--echo connection: waiter
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = "Flushing tables";
where state = "Waiting for table";
--source include/wait_condition.inc
connection default;
--echo connection: default

mysql-test/r/flush.result

@ -205,6 +205,20 @@ a
insert into t2 (a) values (3);
# --> connection default;
unlock tables;
#
# Check that "flush tables <list> with read lock" is
# compatible with active "flush tables with read lock".
# Vice versa is not true as tables read-locked by
# "flush tables <list> with read lock" can't be flushed.
flush tables with read lock;
# --> connection con1;
flush table t1 with read lock;
select * from t1;
a
1
unlock tables;
# --> connection default;
unlock tables;
# --> connection con1
drop table t1, t2, t3;
#

mysql-test/r/mdl_sync.result

@ -2034,6 +2034,157 @@ set debug_sync='now SIGNAL go2';
# Switching to connection 'default'.
# Reaping ALTER. It should succeed and not produce ER_LOCK_DEADLOCK.
drop table t1;
#
# Now, test for situation in which deadlock involves waiting not
# only in MDL subsystem but also for TDC. Such deadlocks should be
# successfully detected. If possible they should be resolved without
# resorting to ER_LOCK_DEADLOCK error.
#
create table t1(i int);
create table t2(j int);
#
# First, let us check how we handle simple scenario involving
# waits in MDL and TDC.
#
set debug_sync= 'RESET';
# Switching to connection 'deadlock_con1'.
# Start statement which will acquire SR metadata lock on t1, open it
# and then will stop, before trying to acquire SW lock and opening t2.
set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
# Sending:
select * from t1 where i in (select j from t2 for update);
# Switching to connection 'deadlock_con2'.
# Wait till the above SELECT stops.
set debug_sync='now WAIT_FOR parked';
# The below FLUSH TABLES WITH READ LOCK should acquire
# SNW locks on t1 and t2 and wait till SELECT closes t1.
# Sending:
flush tables t1, t2 with read lock;
# Switching to connection 'deadlock_con3'.
# Wait until FLUSH TABLES WITH READ LOCK starts waiting
# for SELECT to close t1.
# Resume SELECT, so it tries to acquire SW lock on t1 and blocks,
# creating a deadlock. This deadlock should be detected and resolved
# by backing-off SELECT. As result FLUSH TABLES WITH READ LOCK should
# be able to finish.
set debug_sync='now SIGNAL go';
# Switching to connection 'deadlock_con2'.
# Reap FLUSH TABLES WITH READ LOCK.
unlock tables;
# Switching to connection 'deadlock_con1'.
# Reap SELECT.
i
#
# The same scenario with a slightly different order of events
# which emphasizes that setting correct deadlock detector weights
# for flush waits is important.
#
set debug_sync= 'RESET';
# Switching to connection 'deadlock_con2'.
set debug_sync='flush_tables_with_read_lock_after_acquire_locks SIGNAL parked WAIT_FOR go';
# The below FLUSH TABLES WITH READ LOCK should acquire
# SNW locks on t1 and t2 and wait on debug sync point.
# Sending:
flush tables t1, t2 with read lock;
# Switching to connection 'deadlock_con1'.
# Wait till FLUSH TABLE WITH READ LOCK stops.
set debug_sync='now WAIT_FOR parked';
# Start statement which will acquire SR metadata lock on t1, open
# it and then will block while trying to acquire SW lock on t2.
# Sending:
select * from t1 where i in (select j from t2 for update);
# Switching to connection 'deadlock_con3'.
# Wait till the above SELECT blocks.
# Resume FLUSH TABLES, so it tries to flush t1 creating a deadlock.
# This deadlock should be detected and resolved by backing-off SELECT.
# As result FLUSH TABLES WITH READ LOCK should be able to finish.
set debug_sync='now SIGNAL go';
# Switching to connection 'deadlock_con2'.
# Reap FLUSH TABLES WITH READ LOCK.
unlock tables;
# Switching to connection 'deadlock_con1'.
# Reap SELECT.
i
#
# Now more complex scenario involving two connections
# waiting for MDL and one for TDC.
#
set debug_sync= 'RESET';
# Switching to connection 'deadlock_con1'.
# Start statement which will acquire SR metadata lock on t2, open it
# and then will stop, before trying to acquire SR lock and opening t1.
set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
# Sending:
select * from t2, t1;
# Switching to connection 'deadlock_con2'.
# Wait till the above SELECT stops.
set debug_sync='now WAIT_FOR parked';
# The below FLUSH TABLES WITH READ LOCK should acquire
# SNW locks on t2 and wait till SELECT closes t2.
# Sending:
flush tables t2 with read lock;
# Switching to connection 'deadlock_con3'.
# Wait until FLUSH TABLES WITH READ LOCK starts waiting
# for SELECT to close t2.
# The below DROP TABLES should acquire X lock on t1 and start
# waiting for X lock on t2.
# Sending:
drop tables t1, t2;
# Switching to connection 'default'.
# Wait until DROP TABLES starts waiting for X lock on t2.
# Resume SELECT, so it tries to acquire SR lock on t1 and blocks,
# creating a deadlock. This deadlock should be detected and resolved
# by backing-off SELECT. As result FLUSH TABLES WITH READ LOCK should
# be able to finish.
set debug_sync='now SIGNAL go';
# Switching to connection 'deadlock_con2'.
# Reap FLUSH TABLES WITH READ LOCK.
# Unblock DROP TABLES.
unlock tables;
# Switching to connection 'deadlock_con3'.
# Reap DROP TABLES.
# Switching to connection 'deadlock_con1'.
# Reap SELECT. It should emit error about missing table.
ERROR 42S02: Table 'test.t2' doesn't exist
# Switching to connection 'default'.
set debug_sync= 'RESET';
#
# Test for scenario in which FLUSH TABLES <list> WITH READ LOCK
# has been erroneously releasing metadata locks.
#
drop tables if exists t1, t2;
set debug_sync= 'RESET';
create table t1(i int);
create table t2(j int);
# Switching to connection 'con2'.
set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
# The below FLUSH TABLES <list> WITH READ LOCK should acquire
# SNW locks on t1 and t2, open table t1 and wait on debug sync
# point.
# Sending:
flush tables t1, t2 with read lock;
# Switching to connection 'con1'.
# Wait till FLUSH TABLES <list> WITH READ LOCK stops.
set debug_sync='now WAIT_FOR parked';
# Start statement which will flush all tables and thus invalidate
# table t1 open by FLUSH TABLES <list> WITH READ LOCK.
# Sending:
flush tables;
# Switching to connection 'default'.
# Wait till the above FLUSH TABLES blocks.
# Resume FLUSH TABLES <list> WITH READ LOCK, so it tries to open t2
# discovers that its t1 is obsolete and tries to reopen all tables.
# Such reopen should not cause releasing of SNW metadata locks
# which will result in assertion failures.
set debug_sync='now SIGNAL go';
# Switching to connection 'con2'.
# Reap FLUSH TABLES <list> WITH READ LOCK.
unlock tables;
# Switching to connection 'con1'.
# Reap FLUSH TABLES.
# Clean-up.
# Switching to connection 'default'.
drop tables t1, t2;
set debug_sync= 'RESET';
#
# Test for bug #46748 "Assertion in MDL_context::wait_for_locks()

mysql-test/suite/perfschema/r/dml_setup_instruments.result

@ -40,12 +40,12 @@ wait/synch/cond/sql/COND_flush_thread_cache YES YES
wait/synch/cond/sql/COND_global_read_lock YES YES
wait/synch/cond/sql/COND_manager YES YES
wait/synch/cond/sql/COND_queue_state YES YES
wait/synch/cond/sql/COND_refresh YES YES
wait/synch/cond/sql/COND_rpl_status YES YES
wait/synch/cond/sql/COND_server_started YES YES
wait/synch/cond/sql/COND_thread_cache YES YES
wait/synch/cond/sql/COND_thread_count YES YES
wait/synch/cond/sql/Delayed_insert::cond YES YES
wait/synch/cond/sql/Delayed_insert::cond_client YES YES
select * from performance_schema.SETUP_INSTRUMENTS
where name='Wait';
select * from performance_schema.SETUP_INSTRUMENTS

mysql-test/suite/perfschema/r/server_init.result

@ -184,10 +184,6 @@ where name like "wait/synch/cond/sql/COND_server_started";
count(name)
1
select count(name) from COND_INSTANCES
where name like "wait/synch/cond/sql/COND_refresh";
count(name)
1
select count(name) from COND_INSTANCES
where name like "wait/synch/cond/sql/COND_thread_count";
count(name)
1

mysql-test/suite/perfschema/t/server_init.test

@ -209,9 +209,6 @@ select count(name) from RWLOCK_INSTANCES
select count(name) from COND_INSTANCES
where name like "wait/synch/cond/sql/COND_server_started";
select count(name) from COND_INSTANCES
where name like "wait/synch/cond/sql/COND_refresh";
select count(name) from COND_INSTANCES
where name like "wait/synch/cond/sql/COND_thread_count";

mysql-test/t/flush.test

@ -318,6 +318,20 @@ insert into t2 (a) values (3);
--echo # --> connection default;
connection default;
unlock tables;
--echo #
--echo # Check that "flush tables <list> with read lock" is
--echo # compatible with active "flush tables with read lock".
--echo # Vice versa is not true as tables read-locked by
--echo # "flush tables <list> with read lock" can't be flushed.
flush tables with read lock;
--echo # --> connection con1;
connection con1;
flush table t1 with read lock;
select * from t1;
unlock tables;
--echo # --> connection default;
connection default;
unlock tables;
--echo # --> connection con1
connection con1;
disconnect con1;

mysql-test/t/kill.test

@ -536,7 +536,7 @@ connection ddl;
connection dml;
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = "Flushing tables" and
where state = "Waiting for table" and
info = "flush tables";
--source include/wait_condition.inc
--send select * from t1

mysql-test/t/lock_multi.test

@ -982,7 +982,7 @@ connection con3;
connection con2;
let $wait_condition=
SELECT COUNT(*) = 1 FROM information_schema.processlist
WHERE state = "Flushing tables" AND info = "FLUSH TABLES";
WHERE state = "Waiting for table" AND info = "FLUSH TABLES";
--source include/wait_condition.inc
--error ER_LOCK_WAIT_TIMEOUT
SELECT * FROM t1;

mysql-test/t/mdl_sync.test

@ -2829,6 +2829,187 @@ connection default;
drop table t1;
--echo #
--echo # Now, test for situation in which deadlock involves waiting not
--echo # only in MDL subsystem but also for TDC. Such deadlocks should be
--echo # successfully detected. If possible they should be resolved without
--echo # resorting to ER_LOCK_DEADLOCK error.
--echo #
create table t1(i int);
create table t2(j int);
--echo #
--echo # First, let us check how we handle simple scenario involving
--echo # waits in MDL and TDC.
--echo #
set debug_sync= 'RESET';
--echo # Switching to connection 'deadlock_con1'.
connection deadlock_con1;
--echo # Start statement which will acquire SR metadata lock on t1, open it
--echo # and then will stop, before trying to acquire SW lock and opening t2.
set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
--echo # Sending:
--send select * from t1 where i in (select j from t2 for update)
--echo # Switching to connection 'deadlock_con2'.
connection deadlock_con2;
--echo # Wait till the above SELECT stops.
set debug_sync='now WAIT_FOR parked';
--echo # The below FLUSH TABLES WITH READ LOCK should acquire
--echo # SNW locks on t1 and t2 and wait till SELECT closes t1.
--echo # Sending:
--send flush tables t1, t2 with read lock
--echo # Switching to connection 'deadlock_con3'.
connection deadlock_con3;
--echo # Wait until FLUSH TABLES WITH READ LOCK starts waiting
--echo # for SELECT to close t1.
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = "Waiting for table" and info = "flush tables t1, t2 with read lock";
--source include/wait_condition.inc
--echo # Resume SELECT, so it tries to acquire SW lock on t1 and blocks,
--echo # creating a deadlock. This deadlock should be detected and resolved
--echo # by backing-off SELECT. As result FLUSH TABLES WITH READ LOCK should
--echo # be able to finish.
set debug_sync='now SIGNAL go';
--echo # Switching to connection 'deadlock_con2'.
connection deadlock_con2;
--echo # Reap FLUSH TABLES WITH READ LOCK.
--reap
unlock tables;
--echo # Switching to connection 'deadlock_con1'.
connection deadlock_con1;
--echo # Reap SELECT.
--reap
--echo #
--echo # The same scenario with a slightly different order of events
--echo # which emphasizes that setting correct deadlock detector weights
--echo # for flush waits is important.
--echo #
set debug_sync= 'RESET';
--echo # Switching to connection 'deadlock_con2'.
connection deadlock_con2;
set debug_sync='flush_tables_with_read_lock_after_acquire_locks SIGNAL parked WAIT_FOR go';
--echo # The below FLUSH TABLES WITH READ LOCK should acquire
--echo # SNW locks on t1 and t2 and wait on debug sync point.
--echo # Sending:
--send flush tables t1, t2 with read lock
--echo # Switching to connection 'deadlock_con1'.
connection deadlock_con1;
--echo # Wait till FLUSH TABLE WITH READ LOCK stops.
set debug_sync='now WAIT_FOR parked';
--echo # Start statement which will acquire SR metadata lock on t1, open
--echo # it and then will block while trying to acquire SW lock on t2.
--echo # Sending:
--send select * from t1 where i in (select j from t2 for update)
--echo # Switching to connection 'deadlock_con3'.
connection deadlock_con3;
--echo # Wait till the above SELECT blocks.
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = "Waiting for table" and
info = "select * from t1 where i in (select j from t2 for update)";
--source include/wait_condition.inc
--echo # Resume FLUSH TABLES, so it tries to flush t1 creating a deadlock.
--echo # This deadlock should be detected and resolved by backing-off SELECT.
--echo # As result FLUSH TABLES WITH READ LOCK should be able to finish.
set debug_sync='now SIGNAL go';
--echo # Switching to connection 'deadlock_con2'.
connection deadlock_con2;
--echo # Reap FLUSH TABLES WITH READ LOCK.
--reap
unlock tables;
--echo # Switching to connection 'deadlock_con1'.
connection deadlock_con1;
--echo # Reap SELECT.
--reap
--echo #
--echo # Now more complex scenario involving two connections
--echo # waiting for MDL and one for TDC.
--echo #
set debug_sync= 'RESET';
--echo # Switching to connection 'deadlock_con1'.
connection deadlock_con1;
--echo # Start statement which will acquire SR metadata lock on t2, open it
--echo # and then will stop, before trying to acquire SR lock and opening t1.
set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
--echo # Sending:
--send select * from t2, t1
--echo # Switching to connection 'deadlock_con2'.
connection deadlock_con2;
--echo # Wait till the above SELECT stops.
set debug_sync='now WAIT_FOR parked';
--echo # The below FLUSH TABLES WITH READ LOCK should acquire
--echo # SNW locks on t2 and wait till SELECT closes t2.
--echo # Sending:
--send flush tables t2 with read lock
--echo # Switching to connection 'deadlock_con3'.
connection deadlock_con3;
--echo # Wait until FLUSH TABLES WITH READ LOCK starts waiting
--echo # for SELECT to close t2.
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = "Waiting for table" and info = "flush tables t2 with read lock";
--source include/wait_condition.inc
--echo # The below DROP TABLES should acquire X lock on t1 and start
--echo # waiting for X lock on t2.
--echo # Sending:
--send drop tables t1, t2
--echo # Switching to connection 'default'.
connection default;
--echo # Wait until DROP TABLES starts waiting for X lock on t2.
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = "Waiting for table" and info = "drop tables t1, t2";
--source include/wait_condition.inc
--echo # Resume SELECT, so it tries to acquire SR lock on t1 and blocks,
--echo # creating a deadlock. This deadlock should be detected and resolved
--echo # by backing-off SELECT. As result FLUSH TABLES WITH READ LOCK should
--echo # be able to finish.
set debug_sync='now SIGNAL go';
--echo # Switching to connection 'deadlock_con2'.
connection deadlock_con2;
--echo # Reap FLUSH TABLES WITH READ LOCK.
--reap
--echo # Unblock DROP TABLES.
unlock tables;
--echo # Switching to connection 'deadlock_con3'.
connection deadlock_con3;
--echo # Reap DROP TABLES.
--reap
--echo # Switching to connection 'deadlock_con1'.
connection deadlock_con1;
--echo # Reap SELECT. It should emit error about missing table.
--error ER_NO_SUCH_TABLE
--reap
--echo # Switching to connection 'default'.
connection default;
set debug_sync= 'RESET';
disconnect deadlock_con1;
@ -2836,6 +3017,75 @@ disconnect deadlock_con2;
disconnect deadlock_con3;
--echo #
--echo # Test for scenario in which FLUSH TABLES <list> WITH READ LOCK
--echo # has been erroneously releasing metadata locks.
--echo #
connect(con1,localhost,root,,);
connect(con2,localhost,root,,);
connection default;
--disable_warnings
drop tables if exists t1, t2;
--enable_warnings
set debug_sync= 'RESET';
create table t1(i int);
create table t2(j int);
--echo # Switching to connection 'con2'.
connection con2;
set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
--echo # The below FLUSH TABLES <list> WITH READ LOCK should acquire
--echo # SNW locks on t1 and t2, open table t1 and wait on debug sync
--echo # point.
--echo # Sending:
--send flush tables t1, t2 with read lock
--echo # Switching to connection 'con1'.
connection con1;
--echo # Wait till FLUSH TABLES <list> WITH READ LOCK stops.
set debug_sync='now WAIT_FOR parked';
--echo # Start statement which will flush all tables and thus invalidate
--echo # table t1 open by FLUSH TABLES <list> WITH READ LOCK.
--echo # Sending:
--send flush tables
--echo # Switching to connection 'default'.
connection default;
--echo # Wait till the above FLUSH TABLES blocks.
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = "Waiting for table" and
info = "flush tables";
--source include/wait_condition.inc
--echo # Resume FLUSH TABLES <list> WITH READ LOCK, so it tries to open t2
--echo # discovers that its t1 is obsolete and tries to reopen all tables.
--echo # Such reopen should not cause releasing of SNW metadata locks
--echo # which will result in assertion failures.
set debug_sync='now SIGNAL go';
--echo # Switching to connection 'con2'.
connection con2;
--echo # Reap FLUSH TABLES <list> WITH READ LOCK.
--reap
unlock tables;
--echo # Switching to connection 'con1'.
connection con1;
--echo # Reap FLUSH TABLES.
--reap
--echo # Clean-up.
--echo # Switching to connection 'default'.
connection default;
drop tables t1, t2;
set debug_sync= 'RESET';
disconnect con1;
disconnect con2;
--echo #
--echo # Test for bug #46748 "Assertion in MDL_context::wait_for_locks()
--echo # on INSERT + CREATE TRIGGER".

sql/ha_ndbcluster.cc

@ -679,7 +679,7 @@ int ha_ndbcluster::ndb_err(NdbTransaction *trans)
bzero((char*) &table_list,sizeof(table_list));
table_list.db= m_dbname;
table_list.alias= table_list.table_name= m_tabname;
close_cached_tables(thd, &table_list, FALSE, FALSE);
close_cached_tables(thd, &table_list, FALSE, FALSE, LONG_TIMEOUT);
break;
}
default:
@ -8452,7 +8452,7 @@ int handle_trailing_share(NDB_SHARE *share)
table_list.db= share->db;
table_list.alias= table_list.table_name= share->table_name;
mysql_mutex_assert_owner(&LOCK_open);
close_cached_tables(thd, &table_list, TRUE, FALSE);
close_cached_tables(thd, &table_list, TRUE, FALSE, LONG_TIMEOUT);
mysql_mutex_lock(&ndbcluster_mutex);
/* ndb_share reference temporary free */

sql/ha_ndbcluster_binlog.cc

@ -937,7 +937,7 @@ int ndbcluster_setup_binlog_table_shares(THD *thd)
ndb_binlog_tables_inited= TRUE;
if (opt_ndb_extra_logging)
sql_print_information("NDB Binlog: ndb tables writable");
close_cached_tables(NULL, NULL, TRUE, FALSE);
close_cached_tables(NULL, NULL, TRUE, FALSE, LONG_TIMEOUT);
mysql_mutex_unlock(&LOCK_open);
/* Signal injector thread that all is setup */
mysql_cond_signal(&injector_cond);
@ -1751,7 +1751,7 @@ ndb_handle_schema_change(THD *thd, Ndb *ndb, NdbEventOperation *pOp,
bzero((char*) &table_list,sizeof(table_list));
table_list.db= (char *)dbname;
table_list.alias= table_list.table_name= (char *)tabname;
close_cached_tables(thd, &table_list, TRUE, FALSE);
close_cached_tables(thd, &table_list, TRUE, FALSE, LONG_TIMEOUT);
if ((error= ndbcluster_binlog_open_table(thd, share,
table_share, table, 1)))
@ -1857,7 +1857,7 @@ ndb_handle_schema_change(THD *thd, Ndb *ndb, NdbEventOperation *pOp,
bzero((char*) &table_list,sizeof(table_list));
table_list.db= (char *)dbname;
table_list.alias= table_list.table_name= (char *)tabname;
close_cached_tables(thd, &table_list, FALSE, FALSE);
close_cached_tables(thd, &table_list, FALSE, FALSE, LONG_TIMEOUT);
/* ndb_share reference create free */
DBUG_PRINT("NDB_SHARE", ("%s create free use_count: %u",
share->key, share->use_count));
@ -1978,7 +1978,7 @@ ndb_binlog_thread_handle_schema_event(THD *thd, Ndb *ndb,
bzero((char*) &table_list,sizeof(table_list));
table_list.db= schema->db;
table_list.alias= table_list.table_name= schema->name;
close_cached_tables(thd, &table_list, FALSE, FALSE);
close_cached_tables(thd, &table_list, FALSE, FALSE, LONG_TIMEOUT);
}
/* ndb_share reference temporary free */
if (share)
@ -2095,7 +2095,7 @@ ndb_binlog_thread_handle_schema_event(THD *thd, Ndb *ndb,
mysql_mutex_unlock(&ndb_schema_share_mutex);
/* end protect ndb_schema_share */
close_cached_tables(NULL, NULL, FALSE, FALSE);
close_cached_tables(NULL, NULL, FALSE, FALSE, LONG_TIMEOUT);
// fall through
case NDBEVENT::TE_ALTER:
ndb_handle_schema_change(thd, ndb, pOp, tmp_share);
@ -2252,7 +2252,7 @@ ndb_binlog_thread_handle_schema_event_post_epoch(THD *thd,
bzero((char*) &table_list,sizeof(table_list));
table_list.db= schema->db;
table_list.alias= table_list.table_name= schema->name;
close_cached_tables(thd, &table_list, FALSE, FALSE);
close_cached_tables(thd, &table_list, FALSE, FALSE, LONG_TIMEOUT);
}
if (schema_type != SOT_ALTER_TABLE)
break;

sql/lock.cc

@ -1298,27 +1298,19 @@ bool Global_read_lock::make_global_read_lock_block_commit(THD *thd)
/**
Broadcast COND_refresh and COND_global_read_lock.
Broadcast COND_global_read_lock.
Due to a bug in a threading library it could happen that a signal
did not reach its target. A condition for this was that the same
condition variable was used with different mutexes in
mysql_cond_wait(). Some time ago we changed LOCK_open to
LOCK_global_read_lock in global read lock handling. So COND_refresh
was used with LOCK_open and LOCK_global_read_lock.
We did now also change from COND_refresh to COND_global_read_lock
in global read lock handling. But now it is necessary to signal
both conditions at the same time.
@note
When signalling COND_global_read_lock within the global read lock
handling, it is not necessary to also signal COND_refresh.
TODO/FIXME: Dmitry thinks that we broadcast on COND_global_read_lock
when old instance of table is closed to avoid races
between incrementing refresh_version and
wait_if_global_read_lock(thd, TRUE, FALSE) call.
Once global read lock implementation starts using MDL
infrastructure this will became unnecessary and should
be removed.
*/
void broadcast_refresh(void)
{
mysql_cond_broadcast(&COND_refresh);
mysql_cond_broadcast(&COND_global_read_lock);
}

sql/mdl.cc

@ -98,70 +98,6 @@ private:
};
enum enum_deadlock_weight
{
MDL_DEADLOCK_WEIGHT_DML= 0,
MDL_DEADLOCK_WEIGHT_DDL= 100
};
/**
A context of the recursive traversal through all contexts
in all sessions in search for deadlock.
*/
class Deadlock_detection_visitor
{
public:
Deadlock_detection_visitor(MDL_context *start_node_arg)
: m_start_node(start_node_arg),
m_victim(NULL),
m_current_search_depth(0)
{}
bool enter_node(MDL_context * /* unused */);
void leave_node(MDL_context * /* unused */);
bool inspect_edge(MDL_context *dest);
MDL_context *get_victim() const { return m_victim; }
/**
Change the deadlock victim to a new one if it has lower deadlock
weight.
*/
MDL_context *opt_change_victim_to(MDL_context *new_victim);
private:
/**
The context which has initiated the search. There
can be multiple searches happening in parallel at the same time.
*/
MDL_context *m_start_node;
/** If a deadlock is found, the context that identifies the victim. */
MDL_context *m_victim;
/** Set to the 0 at start. Increased whenever
we descend into another MDL context (aka traverse to the next
wait-for graph node). When MAX_SEARCH_DEPTH is reached, we
assume that a deadlock is found, even if we have not found a
loop.
*/
uint m_current_search_depth;
/**
Maximum depth for deadlock searches. After this depth is
achieved we will unconditionally declare that there is a
deadlock.
@note This depth should be small enough to avoid stack
being exhausted by recursive search algorithm.
TODO: Find out what is the optimal value for this parameter.
Current value is safe, but probably sub-optimal,
as there is an anecdotal evidence that real-life
deadlocks are even shorter typically.
*/
static const uint MAX_SEARCH_DEPTH= 32;
};
/**
Enter a node of a wait-for graph. After
a node is entered, inspect_edge() will be called
@ -876,7 +812,7 @@ void MDL_ticket::destroy(MDL_ticket *ticket)
uint MDL_ticket::get_deadlock_weight() const
{
return (m_lock->key.mdl_namespace() == MDL_key::GLOBAL ||
m_type > MDL_SHARED_NO_WRITE ?
m_type >= MDL_SHARED_NO_WRITE ?
MDL_DEADLOCK_WEIGHT_DDL : MDL_DEADLOCK_WEIGHT_DML);
}
@ -1528,9 +1464,8 @@ MDL_context::try_acquire_lock_impl(MDL_request *mdl_request,
MDL_ticket *ticket;
bool is_transactional;
DBUG_ASSERT(mdl_request->type < MDL_SHARED_NO_WRITE ||
(is_lock_owner(MDL_key::GLOBAL, "", "",
MDL_INTENTION_EXCLUSIVE)));
DBUG_ASSERT(mdl_request->type != MDL_EXCLUSIVE ||
is_lock_owner(MDL_key::GLOBAL, "", "", MDL_INTENTION_EXCLUSIVE));
DBUG_ASSERT(mdl_request->ticket == NULL);
/* Don't take chances in production. */
@ -2087,6 +2022,21 @@ end:
}
/**
Traverse portion of wait-for graph which is reachable through edge
represented by this ticket in search for deadlocks.
@retval TRUE A deadlock is found. A victim is remembered
by the visitor.
@retval FALSE
*/
bool MDL_ticket::find_deadlock(Deadlock_detection_visitor *dvisitor)
{
return m_lock->find_deadlock(this, dvisitor);
}
/**
Recursively traverse the wait-for graph of MDL contexts
in search for deadlocks.
@ -2105,7 +2055,7 @@ bool MDL_context::find_deadlock(Deadlock_detection_visitor *dvisitor)
if (m_waiting_for)
{
result= m_waiting_for->m_lock->find_deadlock(m_waiting_for, dvisitor);
result= m_waiting_for->find_deadlock(dvisitor);
if (result)
m_unlock_ctx= dvisitor->opt_change_victim_to(this);
}

sql/mdl.h

@ -34,7 +34,6 @@ class THD;
class MDL_context;
class MDL_lock;
class MDL_ticket;
class Deadlock_detection_visitor;
/**
Type of metadata lock request.
@ -360,6 +359,96 @@ public:
typedef void (*mdl_cached_object_release_hook)(void *);
enum enum_deadlock_weight
{
MDL_DEADLOCK_WEIGHT_DML= 0,
MDL_DEADLOCK_WEIGHT_DDL= 100
};
/**
A context of the recursive traversal through all contexts
in all sessions in search for deadlock.
*/
class Deadlock_detection_visitor
{
public:
Deadlock_detection_visitor(MDL_context *start_node_arg)
: m_start_node(start_node_arg),
m_victim(NULL),
m_current_search_depth(0),
m_table_shares_visited(0)
{}
bool enter_node(MDL_context * /* unused */);
void leave_node(MDL_context * /* unused */);
bool inspect_edge(MDL_context *dest);
MDL_context *get_victim() const { return m_victim; }
/**
Change the deadlock victim to a new one if it has lower deadlock
weight.
*/
MDL_context *opt_change_victim_to(MDL_context *new_victim);
private:
/**
The context which has initiated the search. There
can be multiple searches happening in parallel at the same time.
*/
MDL_context *m_start_node;
/** If a deadlock is found, the context that identifies the victim. */
MDL_context *m_victim;
/** Set to the 0 at start. Increased whenever
we descend into another MDL context (aka traverse to the next
wait-for graph node). When MAX_SEARCH_DEPTH is reached, we
assume that a deadlock is found, even if we have not found a
loop.
*/
uint m_current_search_depth;
/**
Maximum depth for deadlock searches. After this depth is
achieved we will unconditionally declare that there is a
deadlock.
@note This depth should be small enough to avoid stack
being exhausted by recursive search algorithm.
TODO: Find out what is the optimal value for this parameter.
Current value is safe, but probably sub-optimal,
as there is an anecdotal evidence that real-life
deadlocks are even shorter typically.
*/
static const uint MAX_SEARCH_DEPTH= 32;
public:
/**
Number of TABLE_SHARE objects visited by deadlock detector so far.
Used by TABLE_SHARE::find_deadlock() method to implement recursive
locking for LOCK_open mutex.
*/
uint m_table_shares_visited;
};
/**
Abstract class representing edge in waiters graph to be
traversed by deadlock detection algorithm.
*/
class Wait_for_edge
{
public:
virtual ~Wait_for_edge() {};
virtual bool find_deadlock(Deadlock_detection_visitor *dvisitor) = 0;
virtual uint get_deadlock_weight() const = 0;
};
/**
A granted metadata lock.
@ -380,7 +469,7 @@ typedef void (*mdl_cached_object_release_hook)(void *);
threads/contexts.
*/
class MDL_ticket
class MDL_ticket : public Wait_for_edge
{
public:
/**
@ -414,6 +503,7 @@ public:
bool is_incompatible_when_granted(enum_mdl_type type) const;
bool is_incompatible_when_waiting(enum_mdl_type type) const;
bool find_deadlock(Deadlock_detection_visitor *dvisitor);
/* A helper used to determine which lock request should be aborted. */
uint get_deadlock_weight() const;
private:
@ -680,7 +770,7 @@ private:
by inspecting waiting queues, but we'd very much like it to be
readily available to the wait-for graph iterator.
*/
MDL_ticket *m_waiting_for;
Wait_for_edge *m_waiting_for;
private:
MDL_ticket *find_ticket(MDL_request *mdl_req,
bool *is_transactional);
@ -688,10 +778,11 @@ private:
bool try_acquire_lock_impl(MDL_request *mdl_request,
MDL_ticket **out_ticket);
public:
void find_deadlock();
/** Inform the deadlock detector there is an edge in the wait-for graph. */
void will_wait_for(MDL_ticket *pending_ticket)
void will_wait_for(Wait_for_edge *pending_ticket)
{
mysql_prlock_wrlock(&m_LOCK_waiting_for);
m_waiting_for= pending_ticket;

sql/mysqld.cc

@ -634,7 +634,7 @@ mysql_mutex_t LOCK_des_key_file;
mysql_rwlock_t LOCK_grant, LOCK_sys_init_connect, LOCK_sys_init_slave;
mysql_rwlock_t LOCK_system_variables_hash;
mysql_cond_t COND_thread_count;
mysql_cond_t COND_refresh, COND_global_read_lock;
mysql_cond_t COND_global_read_lock;
pthread_t signal_thread;
pthread_attr_t connection_attrib;
mysql_mutex_t LOCK_server_started;
@ -1573,7 +1573,6 @@ static void clean_up_mutexes()
mysql_mutex_destroy(&LOCK_prepared_stmt_count);
mysql_mutex_destroy(&LOCK_error_messages);
mysql_cond_destroy(&COND_thread_count);
mysql_cond_destroy(&COND_refresh);
mysql_cond_destroy(&COND_global_read_lock);
mysql_cond_destroy(&COND_thread_cache);
mysql_cond_destroy(&COND_flush_thread_cache);
@ -3564,7 +3563,6 @@ static int init_thread_environment()
mysql_rwlock_init(key_rwlock_LOCK_sys_init_slave, &LOCK_sys_init_slave);
mysql_rwlock_init(key_rwlock_LOCK_grant, &LOCK_grant);
mysql_cond_init(key_COND_thread_count, &COND_thread_count, NULL);
mysql_cond_init(key_COND_refresh, &COND_refresh, NULL);
mysql_cond_init(key_COND_global_read_lock, &COND_global_read_lock, NULL);
mysql_cond_init(key_COND_thread_cache, &COND_thread_cache, NULL);
mysql_cond_init(key_COND_flush_thread_cache, &COND_flush_thread_cache, NULL);
@ -7786,7 +7784,7 @@ PSI_cond_key key_PAGE_cond, key_COND_active, key_COND_pool;
PSI_cond_key key_BINLOG_COND_prep_xids, key_BINLOG_update_cond,
key_COND_cache_status_changed, key_COND_global_read_lock, key_COND_manager,
key_COND_refresh, key_COND_rpl_status, key_COND_server_started,
key_COND_rpl_status, key_COND_server_started,
key_delayed_insert_cond, key_delayed_insert_cond_client,
key_item_func_sleep_cond, key_master_info_data_cond,
key_master_info_start_cond, key_master_info_stop_cond,
@ -7810,7 +7808,6 @@ static PSI_cond_info all_server_conds[]=
{ &key_COND_cache_status_changed, "Query_cache::COND_cache_status_changed", 0},
{ &key_COND_global_read_lock, "COND_global_read_lock", PSI_FLAG_GLOBAL},
{ &key_COND_manager, "COND_manager", PSI_FLAG_GLOBAL},
{ &key_COND_refresh, "COND_refresh", PSI_FLAG_GLOBAL},
{ &key_COND_rpl_status, "COND_rpl_status", PSI_FLAG_GLOBAL},
{ &key_COND_server_started, "COND_server_started", PSI_FLAG_GLOBAL},
{ &key_delayed_insert_cond, "Delayed_insert::cond", 0},

sql/mysqld.h

@ -255,7 +255,7 @@ extern PSI_cond_key key_PAGE_cond, key_COND_active, key_COND_pool;
extern PSI_cond_key key_BINLOG_COND_prep_xids, key_BINLOG_update_cond,
key_COND_cache_status_changed, key_COND_global_read_lock, key_COND_manager,
key_COND_refresh, key_COND_rpl_status, key_COND_server_started,
key_COND_rpl_status, key_COND_server_started,
key_delayed_insert_cond, key_delayed_insert_cond_client,
key_item_func_sleep_cond, key_master_info_data_cond,
key_master_info_start_cond, key_master_info_stop_cond,
@ -339,7 +339,7 @@ extern mysql_cond_t COND_server_started;
extern mysql_rwlock_t LOCK_grant, LOCK_sys_init_connect, LOCK_sys_init_slave;
extern mysql_rwlock_t LOCK_system_variables_hash;
extern mysql_cond_t COND_thread_count;
extern mysql_cond_t COND_refresh, COND_manager;
extern mysql_cond_t COND_manager;
extern mysql_cond_t COND_global_read_lock;
extern int32 thread_running;
extern my_atomic_rwlock_t thread_running_lock;

sql/sql_base.cc

@ -146,9 +146,6 @@ static bool check_and_update_table_version(THD *thd, TABLE_LIST *tables,
static bool open_table_entry_fini(THD *thd, TABLE_SHARE *share, TABLE *entry);
static bool auto_repair_table(THD *thd, TABLE_LIST *table_list);
static void free_cache_entry(TABLE *entry);
static bool tdc_wait_for_old_versions(THD *thd,
MDL_request_list *mdl_requests,
ulong timeout);
static bool
has_write_table_with_auto_increment(TABLE_LIST *tables);
@ -315,7 +312,7 @@ void table_def_start_shutdown(void)
{
mysql_mutex_lock(&LOCK_open);
/* Free all cached but unused TABLEs and TABLE_SHAREs first. */
close_cached_tables(NULL, NULL, TRUE, FALSE);
close_cached_tables(NULL, NULL, TRUE, FALSE, LONG_TIMEOUT);
/*
Ensure that TABLE and TABLE_SHARE objects which are created for
tables that are open during process of plugins' shutdown are
@ -928,6 +925,7 @@ static void kill_delayed_threads_for_table(TABLE_SHARE *share)
@param tables List of tables to remove from the cache
@param have_lock If LOCK_open is locked
@param wait_for_refresh Wait for a impending flush
@param timeout Timeout for waiting for flush to be completed.
@note THD can be NULL, but then wait_for_refresh must be FALSE
and tables must be NULL.
@ -941,10 +939,11 @@ static void kill_delayed_threads_for_table(TABLE_SHARE *share)
*/
bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
bool wait_for_refresh)
bool wait_for_refresh, ulong timeout)
{
bool result= FALSE;
bool found= TRUE;
struct timespec abstime;
DBUG_ENTER("close_cached_tables");
DBUG_ASSERT(thd || (!wait_for_refresh && !tables));
@ -952,7 +951,16 @@ bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
mysql_mutex_lock(&LOCK_open);
if (!tables)
{
refresh_version++; // Force close of open tables
/*
Force close of all open tables.
Note that code in TABLE_SHARE::wait_until_flushed() assumes that
incrementing of refresh_version and removal of unused tables and
shares from TDC happens atomically under protection of LOCK_open,
or putting it another way that TDC does not contain old shares
which don't have any tables used.
*/
refresh_version++;
DBUG_PRINT("tcache", ("incremented global refresh_version to: %lu",
refresh_version));
kill_delayed_threads();
@ -995,6 +1003,8 @@ bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
/* Code below assume that LOCK_open is released. */
DBUG_ASSERT(!have_lock);
set_timespec(abstime, timeout);
if (thd->locked_tables_mode)
{
/*
@ -1034,6 +1044,7 @@ bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
while (found && ! thd->killed)
{
TABLE_SHARE *share;
found= FALSE;
/*
To a self-deadlock or deadlocks with other FLUSH threads
@ -1044,13 +1055,11 @@ bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
mysql_mutex_lock(&LOCK_open);
thd->enter_cond(&COND_refresh, &LOCK_open, "Flushing tables");
if (!tables)
{
for (uint idx=0 ; idx < table_def_cache.records ; idx++)
{
TABLE_SHARE *share=(TABLE_SHARE*) my_hash_element(&table_def_cache,
share= (TABLE_SHARE*) my_hash_element(&table_def_cache,
idx);
if (share->needs_reopen())
{
@ -1063,7 +1072,7 @@ bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
{
for (TABLE_LIST *table= tables; table; table= table->next_local)
{
TABLE_SHARE *share= get_cached_table_share(table->db, table->table_name);
share= get_cached_table_share(table->db, table->table_name);
if (share && share->needs_reopen())
{
found= TRUE;
@ -1074,11 +1083,17 @@ bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
if (found)
{
DBUG_PRINT("signal", ("Waiting for COND_refresh"));
mysql_cond_wait(&COND_refresh, &LOCK_open);
/* The below method will unlock LOCK_open and frees share's memory. */
if (share->wait_until_flushed(&thd->mdl_context, &abstime,
MDL_DEADLOCK_WEIGHT_DDL))
{
mysql_mutex_unlock(&LOCK_open);
result= TRUE;
goto err_with_reopen;
}
}
thd->exit_cond(NULL);
mysql_mutex_unlock(&LOCK_open);
}
err_with_reopen:
@ -1149,7 +1164,7 @@ bool close_cached_connection_tables(THD *thd, bool if_wait_for_refresh,
}
if (tables)
result= close_cached_tables(thd, tables, TRUE, FALSE);
result= close_cached_tables(thd, tables, TRUE, FALSE, LONG_TIMEOUT);
if (!have_lock)
mysql_mutex_unlock(&LOCK_open);
@ -2347,7 +2362,7 @@ bool MDL_deadlock_handler::handle_condition(THD *,
{
/* Disable the handler to avoid infinite recursion. */
m_is_active= TRUE;
(void) m_ot_ctx->request_backoff_action(Open_table_context::OT_MDL_CONFLICT,
(void) m_ot_ctx->request_backoff_action(Open_table_context::OT_CONFLICT,
NULL);
m_is_active= FALSE;
/*
@ -2394,6 +2409,8 @@ open_table_get_mdl_lock(THD *thd, Open_table_context *ot_ctx,
uint flags,
MDL_ticket **mdl_ticket)
{
MDL_request mdl_request_shared;
if (flags & (MYSQL_OPEN_FORCE_SHARED_MDL |
MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL))
{
@ -2419,16 +2436,12 @@ open_table_get_mdl_lock(THD *thd, Open_table_context *ot_ctx,
DBUG_ASSERT(!(flags & MYSQL_OPEN_FORCE_SHARED_MDL) ||
!(flags & MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL));
mdl_request= new (thd->mem_root) MDL_request(mdl_request);
if (mdl_request == NULL)
return TRUE;
mdl_request->set_type((flags & MYSQL_OPEN_FORCE_SHARED_MDL) ?
MDL_SHARED : MDL_SHARED_HIGH_PRIO);
mdl_request_shared.init(&mdl_request->key,
(flags & MYSQL_OPEN_FORCE_SHARED_MDL) ?
MDL_SHARED : MDL_SHARED_HIGH_PRIO);
mdl_request= &mdl_request_shared;
}
ot_ctx->add_request(mdl_request);
if (flags & MYSQL_OPEN_FAIL_ON_MDL_CONFLICT)
{
/*
@ -2491,6 +2504,38 @@ open_table_get_mdl_lock(THD *thd, Open_table_context *ot_ctx,
}
/**
Check if table's share requires flush and if yes wait until it
will be flushed.
@param thd Thread context.
@param table_list Table which share should be checked.
@param timeout Timeout for waiting.
@param deadlock_weight Weight of this wait for deadlock detector.
@retval FALSE - Success. Share is up to date or has been flushed.
@retval TRUE - Error (OOM, thread was killed, wait resulted in
deadlock or timeout).
*/
static bool tdc_wait_for_old_version(THD *thd, TABLE_LIST *table_list,
ulong timeout, uint deadlock_weight)
{
TABLE_SHARE *share;
if ((share= get_cached_table_share(table_list->db,
table_list->table_name)) &&
share->needs_reopen())
{
struct timespec abstime;
set_timespec(abstime, timeout);
return share->wait_until_flushed(&thd->mdl_context, &abstime,
deadlock_weight);
}
return FALSE;
}
/*
Open a table.
@ -2580,8 +2625,8 @@ bool open_table(THD *thd, TABLE_LIST *table_list, MEM_ROOT *mem_root,
if (thd->open_tables && thd->open_tables->s->version != refresh_version)
{
(void) ot_ctx->request_backoff_action(Open_table_context::OT_WAIT_TDC,
NULL);
(void)ot_ctx->request_backoff_action(Open_table_context::OT_REOPEN_TABLES,
NULL);
DBUG_RETURN(TRUE);
}
}
@ -2794,6 +2839,8 @@ bool open_table(THD *thd, TABLE_LIST *table_list, MEM_ROOT *mem_root,
mysql_mutex_lock(&LOCK_open);
retry_share:
if (!(share= get_table_share_with_create(thd, table_list, key,
key_length, OPEN_VIEW,
&error,
@ -2849,31 +2896,50 @@ bool open_table(THD *thd, TABLE_LIST *table_list, MEM_ROOT *mem_root,
if (table_list->i_s_requested_object & OPEN_VIEW_ONLY)
goto err_unlock;
/*
If the version changes while we're opening the tables,
we have to back off, close all the tables opened-so-far,
and try to reopen them. Note: refresh_version is currently
changed only during FLUSH TABLES.
*/
if (share->needs_reopen() ||
(thd->open_tables && thd->open_tables->s->version != share->version))
if (!(flags & MYSQL_OPEN_IGNORE_FLUSH))
{
if (!(flags & MYSQL_OPEN_IGNORE_FLUSH))
if (share->needs_reopen())
{
/*
We already have an MDL lock. But we have encountered an old
version of table in the table definition cache which is possible
when someone changes the table version directly in the cache
without acquiring a metadata lock (e.g. this can happen during
"rolling" FLUSH TABLE(S)).
Note, that to avoid a "busywait" in this case, we have to wait
separately in the caller for old table versions to go away
(see tdc_wait_for_old_versions()).
*/
/*
We already have an MDL lock. But we have encountered an old
version of table in the table definition cache which is possible
when someone changes the table version directly in the cache
without acquiring a metadata lock (e.g. this can happen during
"rolling" FLUSH TABLE(S)).
Release our reference to share, wait until old version of
share goes away and then try to get new version of table share.
*/
MDL_deadlock_handler mdl_deadlock_handler(ot_ctx);
bool wait_result;
release_table_share(share);
thd->push_internal_handler(&mdl_deadlock_handler);
wait_result= tdc_wait_for_old_version(thd, table_list,
ot_ctx->get_timeout(),
mdl_ticket->get_deadlock_weight());
thd->pop_internal_handler();
if (wait_result)
{
mysql_mutex_unlock(&LOCK_open);
DBUG_RETURN(TRUE);
}
goto retry_share;
}
if (thd->open_tables && thd->open_tables->s->version != share->version)
{
/*
If the version changes while we're opening the tables,
we have to back off, close all the tables opened-so-far,
and try to reopen them. Note: refresh_version is currently
changed only during FLUSH TABLES.
*/
release_table_share(share);
mysql_mutex_unlock(&LOCK_open);
(void) ot_ctx->request_backoff_action(Open_table_context::OT_WAIT_TDC,
NULL);
(void)ot_ctx->request_backoff_action(Open_table_context::OT_REOPEN_TABLES,
NULL);
DBUG_RETURN(TRUE);
}
}
@ -3831,7 +3897,7 @@ request_backoff_action(enum_open_table_action action_arg,
Since there is no way to detect such a deadlock, we prevent
it by reporting an error.
*/
if (m_has_locks)
if (action_arg != OT_REOPEN_TABLES && m_has_locks)
{
my_error(ER_LOCK_DEADLOCK, MYF(0));
return TRUE;
@ -3877,11 +3943,9 @@ recover_from_failed_open(THD *thd)
/* Execute the action. */
switch (m_action)
{
case OT_MDL_CONFLICT:
case OT_CONFLICT:
break;
case OT_WAIT_TDC:
result= tdc_wait_for_old_versions(thd, &m_mdl_requests, get_timeout());
DBUG_ASSERT(thd->mysys_var->current_mutex == NULL);
case OT_REOPEN_TABLES:
break;
case OT_DISCOVER:
{
@ -3921,8 +3985,6 @@ recover_from_failed_open(THD *thd)
default:
DBUG_ASSERT(0);
}
/* Remove all old requests, they will be re-added. */
m_mdl_requests.empty();
/*
Reset the pointers to conflicting MDL request and the
TABLE_LIST element, set when we need auto-discovery or repair,
@ -4043,8 +4105,6 @@ open_and_process_routine(THD *thd, Query_tables_list *prelocking_ctx,
if (rt != (Sroutine_hash_entry*)prelocking_ctx->sroutines_list.first ||
mdl_type != MDL_key::PROCEDURE)
{
ot_ctx->add_request(&rt->mdl_request);
/*
Since we acquire only shared lock on routines we don't
need to care about global intention exclusive locks.
@ -4721,6 +4781,8 @@ restart:
}
goto err;
}
DEBUG_SYNC(thd, "open_tables_after_open_and_process_table");
}
/*
@ -8595,17 +8657,6 @@ bool mysql_notify_thread_having_shared_lock(THD *thd, THD *in_use,
}
mysql_mutex_unlock(&in_use->LOCK_thd_data);
}
/*
Wake up threads waiting in tdc_wait_for_old_versions().
Normally such threads would already get blocked
in MDL subsystem, when trying to acquire a shared lock.
But in case a thread has an open HANDLER statement,
(and thus already grabbed a metadata lock), it gets
blocked only too late -- at the table cache level.
Starting from 5.5, this could also easily happen in
a multi-statement transaction.
*/
broadcast_refresh();
return signalled;
}
@ -8680,6 +8731,13 @@ void tdc_remove_table(THD *thd, enum_tdc_remove_table_type remove_type,
/*
Set the share's version to zero in order to ensure that it gets
automatically deleted once it is no longer referenced.

Note that the code in TABLE_SHARE::wait_until_flushed() assumes
that marking the share as old and the removal of its unused tables
and of the share itself from the TDC happen atomically, under the
protection of LOCK_open; in other words, that the TDC never
contains old shares which have no used TABLE instances.
*/
share->version= 0;
@@ -8692,84 +8750,6 @@ void tdc_remove_table(THD *thd, enum_tdc_remove_table_type remove_type,
}
-/**
-  Wait until there are no old versions of tables in the table
-  definition cache for the metadata locks that we try to acquire.
-
-  @param thd      Thread context
-  @param context  Metadata locking context with locks.
-  @param timeout  Seconds to wait before reporting ER_LOCK_WAIT_TIMEOUT.
-*/
-
-static bool
-tdc_wait_for_old_versions(THD *thd, MDL_request_list *mdl_requests,
-                          ulong timeout)
-{
-  TABLE_SHARE *share;
-  const char *old_msg;
-  MDL_request *mdl_request;
-  struct timespec abstime;
-  set_timespec(abstime, timeout);
-  int wait_result= 0;
-
-  while (!thd->killed)
-  {
-    /*
-      We have to get rid of HANDLERs which are open by this thread
-      and have old TABLE versions. Otherwise we might get a deadlock
-      in situation when we are waiting for an old TABLE object which
-      corresponds to a HANDLER open by another session. And this
-      other session waits for our HANDLER object to get closed.
-
-      TODO: We should also investigate in which situations we have
-      to broadcast on COND_refresh because of this.
-    */
-    mysql_ha_flush(thd);
-
-    mysql_mutex_lock(&LOCK_open);
-
-    MDL_request_list::Iterator it(*mdl_requests);
-    while ((mdl_request= it++))
-    {
-      /* Skip requests on non-TDC objects. */
-      if (mdl_request->key.mdl_namespace() != MDL_key::TABLE)
-        continue;
-
-      if ((share= get_cached_table_share(mdl_request->key.db_name(),
-                                         mdl_request->key.name())) &&
-          share->needs_reopen())
-        break;
-    }
-    if (!mdl_request)
-    {
-      /*
-        Reset wait_result here in case this was the final check
-        after getting a timeout from mysql_cond_timedwait().
-      */
-      wait_result= 0;
-      mysql_mutex_unlock(&LOCK_open);
-      break;
-    }
-    if (wait_result == ETIMEDOUT || wait_result == ETIME)
-    {
-      /*
-        Test for timeout here instead of right after mysql_cond_timedwait().
-        This allows for a final iteration and a final check before reporting
-        ER_LOCK_WAIT_TIMEOUT.
-      */
-      mysql_mutex_unlock(&LOCK_open);
-      my_error(ER_LOCK_WAIT_TIMEOUT, MYF(0));
-      break;
-    }
-    old_msg= thd->enter_cond(&COND_refresh, &LOCK_open, "Waiting for table");
-    wait_result= mysql_cond_timedwait(&COND_refresh, &LOCK_open, &abstime);
-    /* LOCK_open mutex is unlocked by THD::exit_cond() as side-effect. */
-    thd->exit_cond(old_msg);
-  }
-  return thd->killed || wait_result == ETIMEDOUT || wait_result == ETIME;
-}
int setup_ftfuncs(SELECT_LEX *select_lex)
{
List_iterator<Item_func_match> li(*(select_lex->ftfunc_list)),


@@ -250,7 +250,7 @@ TABLE *open_performance_schema_table(THD *thd, TABLE_LIST *one_table,
void close_performance_schema_table(THD *thd, Open_tables_state *backup);
bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
-                         bool wait_for_refresh);
+                         bool wait_for_refresh, ulong timeout);
bool close_cached_connection_tables(THD *thd, bool wait_for_refresh,
LEX_STRING *connect_string,
bool have_lock = FALSE);
@@ -454,8 +454,8 @@ public:
enum enum_open_table_action
{
OT_NO_ACTION= 0,
-    OT_MDL_CONFLICT,
-    OT_WAIT_TDC,
+    OT_CONFLICT,
+    OT_REOPEN_TABLES,
OT_DISCOVER,
OT_REPAIR
};
@@ -465,9 +465,6 @@ public:
bool request_backoff_action(enum_open_table_action action_arg,
TABLE_LIST *table);
-  void add_request(MDL_request *request)
-  { m_mdl_requests.push_front(request); }
bool can_recover_from_failed_open() const
{ return m_action != OT_NO_ACTION; }
@@ -489,8 +486,6 @@ public:
uint get_flags() const { return m_flags; }
private:
-  /** List of requests for all locks taken so far. Used for waiting on locks. */
-  MDL_request_list m_mdl_requests;
/**
For OT_DISCOVER and OT_REPAIR actions, the table list element for
the table which definition should be re-discovered or which


@@ -2302,6 +2302,12 @@ public:
{
const char* old_msg = proc_info;
mysql_mutex_assert_owner(mutex);
/*
This method should not be called with the LOCK_open mutex as its
argument: otherwise deadlocks can arise in the MDL deadlock detector.
@sa TABLE_SHARE::find_deadlock().
*/
DBUG_ASSERT(mutex != &LOCK_open);
mysys_var->current_mutex = mutex;
mysys_var->current_cond = cond;
proc_info = msg;
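
(Side note, illustration only: the discipline this assert enforces can be
sketched outside the server. checked_wait and cache_lock are invented names;
cache_lock plays the role of LOCK_open. The point is that no thread may
sleep on a condition variable using the one mutex the deadlock detector
itself must acquire, so the detector can never block behind a sleeping
waiter.)

#include <cassert>
#include <condition_variable>
#include <mutex>

std::mutex cache_lock;  // the mutex the deadlock detector acquires

template <class Pred>
void checked_wait(std::condition_variable &cond, std::mutex &m, Pred done)
{
  // Same idea as DBUG_ASSERT(mutex != &LOCK_open) above.
  assert(&m != &cache_lock);
  std::unique_lock<std::mutex> guard(m);
  cond.wait(guard, done);
}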


@@ -1756,6 +1756,7 @@ static bool flush_tables_with_read_lock(THD *thd, TABLE_LIST *all_tables)
{
Lock_tables_prelocking_strategy lock_tables_prelocking_strategy;
TABLE_LIST *table_list;
MDL_request_list mdl_requests;
/*
This is called from SQLCOM_FLUSH, the transaction has
@@ -1774,23 +1775,27 @@ static bool flush_tables_with_read_lock(THD *thd, TABLE_LIST *all_tables)
}
  /*
-    @todo: Since lock_table_names() acquires a global IX
-    lock, this actually waits for a GRL in another connection.
-    We are thus introducing an incompatibility.
-    Do nothing for now, since not taking a global IX violates
-    current internal MDL asserts, fix after discussing with
-    Dmitry.
+    Acquire SNW locks on the tables to be flushed. We can't use
+    lock_table_names() here, as that call would also acquire global IX
+    and database-scope IX locks on the tables, which would make this
+    statement incompatible with FLUSH TABLES WITH READ LOCK.
  */
-  if (lock_table_names(thd, all_tables, 0, thd->variables.lock_wait_timeout,
-                       MYSQL_OPEN_SKIP_TEMPORARY))
+  for (table_list= all_tables; table_list;
+       table_list= table_list->next_global)
+    mdl_requests.push_front(&table_list->mdl_request);
+
+  if (thd->mdl_context.acquire_locks(&mdl_requests,
+                                     thd->variables.lock_wait_timeout))
    goto error;
+
+  DEBUG_SYNC(thd,"flush_tables_with_read_lock_after_acquire_locks");
for (table_list= all_tables; table_list;
table_list= table_list->next_global)
{
-    /* Remove the table from cache. */
+    /* Request removal of table from cache. */
    mysql_mutex_lock(&LOCK_open);
-    tdc_remove_table(thd, TDC_RT_REMOVE_ALL,
+    tdc_remove_table(thd, TDC_RT_REMOVE_UNUSED,
                     table_list->db,
                     table_list->table_name);
    mysql_mutex_unlock(&LOCK_open);
@@ -1800,6 +1805,11 @@ static bool flush_tables_with_read_lock(THD *thd, TABLE_LIST *all_tables)
table_list->open_type= OT_BASE_ONLY; /* Ignore temporary tables. */
}
  /*
    Before opening and locking tables, the call below also waits for old
    shares to go away, so it is important that we do not pass the
    MYSQL_LOCK_IGNORE_FLUSH flag to it.
  */
if (open_and_lock_tables(thd, all_tables, FALSE,
MYSQL_OPEN_HAS_MDL_LOCK,
&lock_tables_prelocking_strategy) ||
@@ -1810,17 +1820,11 @@ static bool flush_tables_with_read_lock(THD *thd, TABLE_LIST *all_tables)
thd->variables.option_bits|= OPTION_TABLE_LOCK;
/*
-    Downgrade the exclusive locks.
-    Use MDL_SHARED_NO_WRITE as the intended
-    post effect of this call is identical
-    to LOCK TABLES <...> READ, and we didn't use
-    thd->in_lock_talbes and thd->sql_command= SQLCOM_LOCK_TABLES
-    hacks to enter the LTM.
-    @todo: release the global IX lock here!!!
+    We don't downgrade MDL_SHARED_NO_WRITE here, as the intended
+    post effect of this call is identical to LOCK TABLES <...> READ,
+    and we didn't use the thd->in_lock_tables and
+    thd->sql_command= SQLCOM_LOCK_TABLES hacks to enter the LTM.
  */
-  for (table_list= all_tables; table_list;
-       table_list= table_list->next_global)
-    table_list->mdl_request.ticket->downgrade_exclusive_lock(MDL_SHARED_NO_WRITE);
return FALSE;
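
(Side note, illustration only: why SNW is the right lock type here. The
sketch below encodes an abridged four-type subset of the object-lock
compatibility matrix documented in mdl.cc; consult mdl.cc for the
authoritative matrix. SNW lets concurrent readers keep working, which is
what makes FLUSH TABLES <list> WITH READ LOCK compatible with an active
global read lock, while still keeping writers out.)

#include <cassert>

enum mdl_type { SR, SW, SNW, X, MDL_TYPE_END };

// compatible[granted][requested]
static const bool compatible[MDL_TYPE_END][MDL_TYPE_END]= {
  /* granted SR  */ {true,  true,  true,  false},
  /* granted SW  */ {true,  true,  false, false},
  /* granted SNW */ {true,  false, false, false},
  /* granted X   */ {false, false, false, false},
};

int main()
{
  assert(compatible[SNW][SR]);   // readers keep working during the flush
  assert(!compatible[SNW][SW]);  // writers have to wait
  assert(!compatible[SNW][SNW]); // two flushes of the same table serialize
  assert(!compatible[X][SR]);    // the old X lock blocked even readers
  return 0;
}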
@@ -6854,8 +6858,8 @@ bool reload_acl_and_cache(THD *thd, ulong options, TABLE_LIST *tables,
tmp_write_to_binlog= 0;
if (thd->global_read_lock.lock_global_read_lock(thd))
return 1; // Killed
-      if (close_cached_tables(thd, tables, FALSE, (options & REFRESH_FAST) ?
-                              FALSE : TRUE))
+      if (close_cached_tables(thd, tables, FALSE, ((options & REFRESH_FAST) ?
+                              FALSE : TRUE), thd->variables.lock_wait_timeout))
result= 1;
if (thd->global_read_lock.make_global_read_lock_block_commit(thd)) // Killed
@@ -6894,8 +6898,10 @@ bool reload_acl_and_cache(THD *thd, ulong options, TABLE_LIST *tables,
}
}
-    if (close_cached_tables(thd, tables, FALSE, (options & REFRESH_FAST) ?
-                            FALSE : TRUE))
+    if (close_cached_tables(thd, tables, FALSE, ((options & REFRESH_FAST) ?
+                            FALSE : TRUE),
+                            (thd ? thd->variables.lock_wait_timeout :
+                             LONG_TIMEOUT)))
result= 1;
}
my_dbopt_cleanup();


@@ -11202,9 +11202,8 @@ opt_with_read_lock:
{
TABLE_LIST *tables= Lex->query_tables;
Lex->type|= REFRESH_READ_LOCK;
-            /* We acquire an X lock currently and then downgrade. */
            for (; tables; tables= tables->next_global)
-              tables->mdl_request.set_type(MDL_EXCLUSIVE);
+              tables->mdl_request.set_type(MDL_SHARED_NO_WRITE);
}
;


@@ -1488,7 +1488,8 @@ static bool fix_read_only(sys_var *self, THD *thd, enum_var_type type)
can cause to wait on a read lock, it's required for the client application
to unlock everything, and acceptable for the server to wait on all locks.
*/
-  if ((result= close_cached_tables(thd, NULL, FALSE, TRUE)))
+  if ((result= close_cached_tables(thd, NULL, FALSE, TRUE,
+                                   thd->variables.lock_wait_timeout)))
goto end_with_read_lock;
if ((result= thd->global_read_lock.make_global_read_lock_block_commit(thd)))


@@ -34,6 +34,7 @@
#include <m_ctype.h>
#include "my_md5.h"
#include "sql_select.h"
#include "mdl.h" // Deadlock_detection_visitor
/* INFORMATION_SCHEMA name */
LEX_STRING INFORMATION_SCHEMA_NAME= {C_STRING_WITH_LEN("information_schema")};
@@ -325,6 +326,7 @@ TABLE_SHARE *alloc_table_share(TABLE_LIST *table_list, char *key,
share->used_tables.empty();
share->free_tables.empty();
share->m_flush_tickets.empty();
memcpy((char*) &share->mem_root, (char*) &mem_root, sizeof(mem_root));
mysql_mutex_init(key_TABLE_SHARE_LOCK_ha_data,
@@ -389,6 +391,7 @@ void init_tmp_table_share(THD *thd, TABLE_SHARE *share, const char *key,
share->used_tables.empty();
share->free_tables.empty();
share->m_flush_tickets.empty();
DBUG_VOID_RETURN;
}
@@ -432,9 +435,40 @@ void free_table_share(TABLE_SHARE *share)
key_info->flags= 0;
}
}
-  /* We must copy mem_root from share because share is allocated through it */
-  memcpy((char*) &mem_root, (char*) &share->mem_root, sizeof(mem_root));
-  free_root(&mem_root, MYF(0));                 // Free's share
if (share->m_flush_tickets.is_empty())
{
/*
There are no threads waiting for this share to be flushed. So
we can immediately release memory associated with it. We must
copy mem_root from share because share is allocated through it.
*/
memcpy((char*) &mem_root, (char*) &share->mem_root, sizeof(mem_root));
    free_root(&mem_root, MYF(0));               // Frees the share
}
else
{
/*
If there are threads waiting for this share to be flushed, we don't
free the share's memory here. Instead we notify the waiting threads
and delegate freeing the share's memory to them.
At this point a) all resources except the memory associated with the
share have already been released and b) the share has already been
removed from the table definition cache. So it is OK to proceed
without waiting for these threads to finish their work.
*/
Flush_ticket_list::Iterator it(share->m_flush_tickets);
Flush_ticket *ticket;
/*
To avoid problems caused by concurrently woken-up threads modifying
the flush ticket list, we must hold LOCK_open here.
*/
mysql_mutex_assert_owner(&LOCK_open);
while ((ticket= it++))
(void) ticket->get_ctx()->m_wait.set_status(MDL_wait::GRANTED);
}
DBUG_VOID_RETURN;
}
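
(Side note, illustration only: the "delegate destruction to waiters" pattern
above, reduced to its core. Evictable, destroy and wait_until_flushed are
invented names, and the sketch is simplified: the real code signals each
waiter's own MDL_wait object rather than a condition variable embedded in
the share, and waiters can also leave early on timeout or kill, freeing the
memory only when they are the last ones out and nothing references the
share.)

#include <condition_variable>
#include <mutex>

// All methods must be called with the same external mutex held (the
// role LOCK_open plays for TABLE_SHARE).
struct Evictable {
  std::condition_variable gone;  // wakes threads waiting for the flush
  int waiters= 0;
  bool dead= false;

  // Last reference dropped: free now, or hand the job to the waiters.
  void destroy() {
    dead= true;
    if (waiters == 0)
      delete this;               // nobody is waiting, free immediately
    else
      gone.notify_all();         // the last waiter to leave frees us
  }

  void wait_until_flushed(std::unique_lock<std::mutex> &guard) {
    ++waiters;
    gone.wait(guard, [this] { return dead; });
    if (--waiters == 0)
      delete this;               // destruction was delegated to us
  }
};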
@@ -2996,6 +3030,223 @@ Table_check_intact::check(TABLE *table, const TABLE_FIELD_DEF *table_def)
}
/**
  Traverse the portion of the wait-for graph which is reachable through
  the edge represented by this flush ticket, in search for deadlocks.

  @retval TRUE  A deadlock is found. A victim is remembered
                by the visitor.
  @retval FALSE No deadlocks.
*/
bool Flush_ticket::find_deadlock(Deadlock_detection_visitor *dvisitor)
{
return m_share->find_deadlock(this, dvisitor);
}
uint Flush_ticket::get_deadlock_weight() const
{
return m_deadlock_weight;
}
/**
  Traverse the portion of the wait-for graph which is reachable through
  this table share, in search for deadlocks.

  @param waiting_ticket  Ticket representing the wait for this share.
  @param dvisitor        Deadlock detection visitor.

  @retval TRUE  A deadlock is found. A victim is remembered
                by the visitor.
  @retval FALSE No deadlocks.
*/
bool TABLE_SHARE::find_deadlock(Flush_ticket *waiting_ticket,
Deadlock_detection_visitor *dvisitor)
{
TABLE *table;
MDL_context *src_ctx= waiting_ticket->get_ctx();
bool result= TRUE;
/*
    To protect the used_tables list from being concurrently modified while
    we are iterating through it, we acquire LOCK_open. This should not
    introduce deadlocks in the deadlock detector, because we support
    recursive acquisition of this mutex and because we won't try to acquire
    LOCK_open while holding a write-lock on MDL_lock::m_rwlock.

    Here is the more elaborate proof:

    0) Let us assume that there is a deadlock.
    1) The wait graph for this deadlock (the one which reflects waits for
       system synchronization primitives, and not the one inspected by the
       MDL deadlock detector) should contain a loop including both LOCK_open
       and some of the MDL synchronization primitives. Otherwise the
       deadlock would have already existed before we introduced the
       acquisition of LOCK_open in the MDL deadlock detector.
    2) Also, in this graph the edge going out of the LOCK_open node should
       go to one of the MDL synchronization primitives. A different
       situation would mean that we have some non-MDL synchronization
       primitive besides LOCK_open under which we try to acquire an MDL
       lock, which is not the case.
    3) Moreover, the edge coming from LOCK_open should go to an
       MDL_lock::m_rwlock object and correspond to a request for a
       read-lock. It can't be a request for the rwlock in MDL_context or
       the mutex in an MDL_wait object, because those are terminal (i.e. a
       thread having them locked in exclusive mode won't wait for any other
       resource). It can't be a request for a write-lock on
       MDL_lock::m_rwlock either, as this would mean that we try to acquire
       a metadata lock under LOCK_open (which is not the case).
    4) Since MDL_lock::m_rwlock is an rwlock which prefers readers, the
       only situation in which it can be waited for is when some thread has
       it write-locked.
    5) TODO/FIXME:
       - Either prove that a thread having MDL_lock::m_rwlock write-locked
         won't wait for LOCK_open directly or indirectly (see
         notify_shared_lock()).
       - Or change the code to hold only a read-lock on MDL_lock::m_rwlock
         during notify_shared_lock(), and thus make MDL_lock::m_rwlock
         terminal when write-locked.
*/
if (! (dvisitor->m_table_shares_visited++))
mysql_mutex_lock(&LOCK_open);
I_P_List_iterator <TABLE, TABLE_share> tables_it(used_tables);
  /*
    Not strictly necessary: if the wait already has an outcome,
    no deadlock needs to be detected through it.
  */
if (src_ctx->m_wait.get_status() != MDL_wait::EMPTY)
{
result= FALSE;
goto end;
}
if (dvisitor->enter_node(src_ctx))
goto end;
while ((table= tables_it++))
{
if (dvisitor->inspect_edge(&table->in_use->mdl_context))
{
goto end_leave_node;
}
}
tables_it.rewind();
while ((table= tables_it++))
{
if (table->in_use->mdl_context.find_deadlock(dvisitor))
{
goto end_leave_node;
}
}
result= FALSE;
end_leave_node:
dvisitor->leave_node(src_ctx);
end:
if (! (--dvisitor->m_table_shares_visited))
mysql_mutex_unlock(&LOCK_open);
return result;
}
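
(Side note, illustration only: the overall shape of the traversal above,
with invented names and a visited set standing in for the depth bound and
per-node locking the real detector uses. The two phases mirror the two
loops above: check direct edges first, then recurse.)

#include <unordered_set>
#include <vector>

struct Context {
  std::vector<Context*> waits_for;  // outgoing wait-for edges
};

struct Visitor {
  Context *start;                        // context whose wait we validate
  std::unordered_set<Context*> visited;  // guard against re-inspection
};

static bool find_deadlock(Context *ctx, Visitor *vis)
{
  if (!vis->visited.insert(ctx).second)
    return false;                        // already inspected this node
  for (Context *peer : ctx->waits_for)   // direct edges first...
    if (peer == vis->start)
      return true;                       // cycle back to the start
  for (Context *peer : ctx->waits_for)   // ...then recurse deeper
    if (find_deadlock(peer, vis))
      return true;
  return false;
}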
/**
Wait until old version of table share is removed from TDC.
@param mdl_context MDL context for thread which is going to wait.
@param abstime Timeout for waiting as absolute time value.
@param deadlock_weight Weight of this wait for deadlock detector.
  @note This method assumes that its caller owns the LOCK_open mutex.
        The mutex will be unlocked temporarily during execution.
@retval FALSE - Success.
@retval TRUE - Error (OOM, deadlock, timeout, etc...).
*/
bool TABLE_SHARE::wait_until_flushed(MDL_context *mdl_context,
struct timespec *abstime,
uint deadlock_weight)
{
Flush_ticket *ticket;
MDL_wait::enum_wait_status wait_status;
mysql_mutex_assert_owner(&LOCK_open);
/*
    We should enter this method only when the share's version is not
    up to date and the share is referenced. Otherwise there is no
    guarantee that our thread will be woken up from the wait.
*/
DBUG_ASSERT(version != refresh_version && ref_count != 0);
if (! (ticket= new Flush_ticket(mdl_context, this, deadlock_weight)))
{
mysql_mutex_unlock(&LOCK_open);
return TRUE;
}
m_flush_tickets.push_front(ticket);
mdl_context->m_wait.reset_status();
mysql_mutex_unlock(&LOCK_open);
mdl_context->will_wait_for(ticket);
mdl_context->find_deadlock();
wait_status= mdl_context->m_wait.timed_wait(mdl_context->get_thd(),
abstime, TRUE);
mdl_context->done_waiting_for();
mysql_mutex_lock(&LOCK_open);
m_flush_tickets.remove(ticket);
/*
    If our thread was the last one waiting for the table share to be
    flushed, we can finish the destruction of the share object by
    releasing its memory (the share object was allocated on the share's
    own MEM_ROOT).
    When our wait was aborted due to a KILL statement, a deadlock or a
    timeout, the share might still be referenced, so we don't free its
    memory in that case. Note that we can't rely on checking wait_status
    to determine this condition, as, for example, a timeout can happen
    even when there are no references to the table share left, in which
    case its memory should be released.
*/
if (m_flush_tickets.is_empty() && ! ref_count)
{
MEM_ROOT mem_root_copy;
memcpy((char*) &mem_root_copy, (char*) &mem_root, sizeof(mem_root));
free_root(&mem_root_copy, MYF(0));
}
delete ticket;
switch (wait_status)
{
case MDL_wait::GRANTED:
return FALSE;
case MDL_wait::VICTIM:
my_error(ER_LOCK_DEADLOCK, MYF(0));
return TRUE;
case MDL_wait::TIMEOUT:
my_error(ER_LOCK_WAIT_TIMEOUT, MYF(0));
return TRUE;
case MDL_wait::KILLED:
return TRUE;
default:
DBUG_ASSERT(0);
return TRUE;
}
}
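
(Side note, illustration only: the shape of a wait that ends in one of
several statuses, as MDL_wait::timed_wait() is used above. Wait, set_status
and timed_wait here are invented stand-ins; the real MDL_wait also lets the
deadlock detector abort a wait by setting VICTIM. The first thread to set a
status wins; the waiter times out if nobody does.)

#include <chrono>
#include <condition_variable>
#include <mutex>

enum class WaitStatus { EMPTY, GRANTED, VICTIM, TIMEOUT, KILLED };

struct Wait {
  std::mutex lock;
  std::condition_variable cond;
  WaitStatus status= WaitStatus::EMPTY;

  bool set_status(WaitStatus s) {        // first setter wins
    std::lock_guard<std::mutex> g(lock);
    if (status != WaitStatus::EMPTY) return false;
    status= s;
    cond.notify_one();
    return true;
  }

  WaitStatus timed_wait(std::chrono::steady_clock::time_point deadline) {
    std::unique_lock<std::mutex> g(lock);
    if (!cond.wait_until(g, deadline,
                         [this] { return status != WaitStatus::EMPTY; }))
      status= WaitStatus::TIMEOUT;       // nobody granted or killed us
    return status;
  }
};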
/*
Create Item_field for each column in the table.


@@ -45,6 +45,7 @@ class ACL_internal_schema_access;
class ACL_internal_table_access;
struct TABLE_LIST;
class Field;
class Deadlock_detection_visitor;
/*
Used to identify NESTED_JOIN structures within a join (applicable only to
@@ -508,6 +509,45 @@ public:
};
/**
  Class representing the fact that some thread waits for a table
  share to be flushed. Used to represent information about such
  waits in the MDL deadlock detector.
*/
class Flush_ticket : public Wait_for_edge
{
MDL_context *m_ctx;
TABLE_SHARE *m_share;
uint m_deadlock_weight;
public:
Flush_ticket(MDL_context *ctx_arg, TABLE_SHARE *share_arg,
uint deadlock_weight_arg)
: m_ctx(ctx_arg), m_share(share_arg),
m_deadlock_weight(deadlock_weight_arg)
{}
MDL_context *get_ctx() const { return m_ctx; }
bool find_deadlock(Deadlock_detection_visitor *dvisitor);
uint get_deadlock_weight() const;
/**
Pointers for participating in the list of waiters for table share.
*/
Flush_ticket *next_in_share;
Flush_ticket **prev_in_share;
};
typedef I_P_List <Flush_ticket,
I_P_List_adapter<Flush_ticket,
&Flush_ticket::next_in_share,
&Flush_ticket::prev_in_share> >
Flush_ticket_list;
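
(Side note, illustration only: the intrusive-list idea behind I_P_List and
the next_in_share/prev_in_share members above, with invented names. Each
element embeds its own links, so a waiter can be added or removed in O(1)
and without any allocation while LOCK_open is held; prev is the address of
the pointer that points at us, which makes removing the first element need
no special case.)

struct Ticket {
  Ticket *next= nullptr;   // next element in the share's list
  Ticket **prev= nullptr;  // address of the pointer pointing at us
};

struct TicketList {
  Ticket *first= nullptr;

  void push_front(Ticket *t) {
    t->next= first;
    t->prev= &first;
    if (first)
      first->prev= &t->next;
    first= t;
  }

  void remove(Ticket *t) {  // O(1), no traversal needed
    *t->prev= t->next;
    if (t->next)
      t->next->prev= t->prev;
  }
};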
/*
This structure is shared between different table objects. There is one
instance of table share per one table in the database.
@@ -662,6 +702,11 @@ struct TABLE_SHARE
/** Instrumentation for this table share. */
PSI_table_share *m_psi;
/**
List of tickets representing threads waiting for the share to be flushed.
*/
Flush_ticket_list m_flush_tickets;
/*
Set share's table cache key and update its db and table name appropriately.
@@ -837,6 +882,12 @@ struct TABLE_SHARE
return (tmp_table == SYSTEM_TMP_TABLE || is_view) ? 0 : table_map_id;
}
bool find_deadlock(Flush_ticket *waiting_ticket,
Deadlock_detection_visitor *dvisitor);
bool wait_until_flushed(MDL_context *mdl_context,
struct timespec *abstime,
uint deadlock_weight);
};