test case to give valgrind warnings.
The problem was that when comparing two MDL key buffers using
memcmp(), 1 was added to the buffer length. However, this was
no longer needed since the buffer length already included the
'\0' terminator.
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
MDL deadlock detector".
Deadlock could have occurred when workload containing mix
of DML, DDL and FLUSH TABLES statements affecting same
set of tables was executed in heavily concurrent environment.
This deadlock occurred when several connections tried to
perform deadlock detection in metadata locking subsystem.
The first connection started traversing wait-for graph,
encountered sub-graph representing wait for flush, acquired
LOCK_open and dived into sub-graph inspection. When it has
encounterd sub-graph corresponding to wait for metadata lock
and blocked while trying to acquire rd-lock on
MDL_lock::m_rwlock (*) protecting this subgraph, since some
other thread had wr-lock on it. When this wr-lock was released
it could have happened (if there was other pending wr-lock
against this rwlock) that rd-lock from the first connection
was left unsatisfied but at the same time new rd-lock request
from the second connection sneaked in and was satisfied (for
this to be possible second rd- request should come exactly
after wr-lock is released but before pending wr-lock manages
to grab rwlock, which is possible both on Linux and in our
own rwlock implementation). If this second connection
continued traversing wait-for graph and encountered sub-graph
representing wait for flush it tried to acquire LOCK_open
and thus deadlock was created.
This patch tries to workaround this problem but not allowing
deadlock detector to lock LOCK_open mutex if some other thread
doing deadlock detection already owns it and current search
depth is greater than 0. Instead deadlock is reported.
Other possible solutions are either known to have negative
effects on performance or require much more time for proper
implementation and testing.
No test case is provided as this bug is very hard to repeat
in MTR environment but is repeatable with the help of RQG
tests.
FLUSH TABLES <list> WITH READ LOCK are incompatible" to
be pushed as separate patch.
Replaced thread state name "Waiting for table", which was
used by threads waiting for a metadata lock or table flush,
with a set of names which better reflect types of resources
being waited for.
Also replaced "Table lock" thread state name, which was used
by threads waiting on thr_lock.c table level lock, with more
elaborate "Waiting for table level lock", to make it
more consistent with other thread state names.
Updated test cases and their results according to these
changes.
Fixed sys_vars.query_cache_wlock_invalidate_func test to not
to wait for timeout of wait_condition.inc script.
TABLES <list> WITH READ LOCK are incompatible".
The problem was that FLUSH TABLES <list> WITH READ LOCK
which was issued when other connection has acquired global
read lock using FLUSH TABLES WITH READ LOCK was blocked
and has to wait until global read lock is released.
This issue stemmed from the fact that FLUSH TABLES <list>
WITH READ LOCK implementation has acquired X metadata locks
on tables to be flushed. Since these locks required acquiring
of global IX lock this statement was incompatible with global
read lock.
This patch addresses problem by using SNW metadata type of
lock for tables to be flushed by FLUSH TABLES <list> WITH
READ LOCK. It is OK to acquire them without global IX lock
as long as we won't try to upgrade those locks. Since SNW
locks allow concurrent statements using same table FLUSH
TABLE <list> WITH READ LOCK now has to wait until old
versions of tables to be flushed go away after acquiring
metadata locks. Since such waiting can lead to deadlock
MDL deadlock detector was extended to take into account
waits for flush and resolve such deadlocks.
As a bonus code in open_tables() which was responsible for
waiting old versions of tables to go away was refactored.
Now when we encounter old version of table in open_table()
we don't back-off and wait for all old version to go away,
but instead wait for this particular table to be flushed.
Such approach supported by deadlock detection should reduce
number of scenarios in which FLUSH TABLES aborts concurrent
multi-statement transactions.
Note that active FLUSH TABLES <list> WITH READ LOCK still
blocks concurrent FLUSH TABLES WITH READ LOCK statement
as the former keeps tables open and thus prevents the
latter statement from doing flush.
The problem was that a statement could cause an assert if it was aborted by
KILL QUERY while it waited on a metadata lock. This assert checks that a
statement either sends OK or an error to the client. If the bug was triggered
on release builds, it caused OK to be sent to the client instead of
ER_QUERY_INTERRUPTED.
The root cause of the problem was that there are two separate ways to tell if a
statement is killed: thd->killed and mysys_var->abort. KILL QUERY causes both
to be set, thd->killed before mysys_var->abort. Also, both values are reset
at the end of statement execution. This means that it is possible for
KILL QUERY to first set thd->killed, then have the killed statement reset
both thd->killed and mysys_var->abort and finally have KILL QUERY set
mysys_var->abort. This means that the connection with the killed statement
will start executing the next statement with the two values out of sync - i.e.
thd->killed not set but mysys_var->abort set.
Since mysys_var->abort is used to check if a wait for a metadata lock should
be aborted, the next statement would immediately abort any such waiting.
When waiting is aborted, no OK message is sent and thd->killed is checked to
see if ER_QUERY_INTERRUPTED should be sent to the client. But since
the->killed had been reset, neither OK nor an error message was sent to the
client. This then triggered the assert.
This patch fixes the problem by changing the metadata lock waiting code to
check thd->killed.
No test case added as reproducing the assert is dependent on very exact timing
of two (or more) threads. The patch has been checked using RQG and the grammar
posted on the bug report.
WITH READ LOCK and FLUSH TABLES <list> WITH READ LOCK are
incompatible", which adds information about waits caused by
FLUSH TABLES statement to deadlock detector in MDL subsystem.
Remove API supporting caching of pointers to TABLE_SHARE
object in MDL subsystem and all code related to it.
The problem was that locking requirements of code
implementing this API conflicted with locking requirements
of code which adds information about waits caused by flushes
to deadlock detector in MDL subsystem (the former needed to
lock LOCK_open or its future equivalent while having
write-lock on MDL_lock's rwlock, and the latter needs to be
able to read-lock MDL_lock rwlock while owning LOCK_open or
its future equivalent).
Since caching of pointers to TABLE_SHARE objects in MDL
subsystem didn't bring expected performance benefits we
decided to remove caching API rather than try to come up
with some complex solution for this problem.
DATABASE with open HANDLER"
Remove LOCK_create_db, database name locks, and use metadata locks instead.
This exposes CREATE/DROP/ALTER DATABASE statements to the graph-based
deadlock detector in MDL, and paves the way for a safe, deadlock-free
implementation of RENAME DATABASE.
Database DDL statements will now take exclusive metadata locks on
the database name, while table/view/routine DDL statements take
intention exclusive locks on the database name. This prevents race
conditions between database DDL and table/view/routine DDL.
(e.g. DROP DATABASE with concurrent CREATE/ALTER/DROP TABLE)
By adding database name locks, this patch implements
WL#4450 "DDL locking: CREATE/DROP DATABASE must use database locks" and
WL#4985 "DDL locking: namespace/hierarchical locks".
The patch also changes code to use init_one_table() where appropriate.
The new lock_table_names() function requires TABLE_LIST::db_length to
be set correctly, and this is taken care of by init_one_table().
This patch also adds a simple template to help work with
the mysys HASH data structure.
Most of the patch was written by Konstantin Osipov.
subsystem. Fix a number of caveates that the previous
implementation suffered from, including unprotected
access to shared data and lax resource accounting
(share->ref_count) that could lead to deadlocks.
The new implementation still suffers from a number
of potential deadlocks in some edge cases, and this is
still not enabled by default. Especially since performance
testing has shown that it gives only marginable (not even
exceeding measuring accuracy) improvements.
@todo:
- Remove calls to close_cached_tables() with REFRESH_FAST,
and have_lock, because they break the MDL cache.
- rework FLUSH TABLES <list> to not use close_cached_tables()
- make sure that whenever we set TABLE_SHARE::version to
0 we free MDL cache references to it.
locks for DML statements and changes the way MDL locks
are acquired/granted in contended case.
Instead of backing-off when a lock conflict is encountered
and waiting for it to go away before restarting open_tables()
process we now wait for lock to be released without releasing
any previously acquired locks. If conflicting lock goes away
we resume opening tables. If waiting leads to a deadlock we
try to resolve it by backing-off and restarting open_tables()
immediately.
As result both waiting for possibility to acquire and
acquiring of a metadata lock now always happen within the
same MDL API call. This has allowed to make release of a lock
and granting it to the most appropriate pending request an
atomic operation.
Thanks to this it became possible to wake up during release
of lock only those waiters which requests can be satisfied
at the moment as well as wake up only one waiter in case
when granting its request would prevent all other requests
from being satisfied. This solves thundering herd problem
which occured in cases when we were releasing some lock and
woke up many waiters for SNRW or X locks (this was the issue
in bug#52289 "performance regression for MyISAM in sysbench
OLTP_RW test".
This also allowed to implement more fair (FIFO) scheduling
among waiters with the same priority.
It also opens the door for introducing new types of requests
for metadata locks such as low-prio SNRW lock which is
necessary in order to support LOCK TABLES LOW_PRIORITY WRITE.
Notice that after this sometimes can report ER_LOCK_DEADLOCK
error in cases in which it has not happened before.
Particularly we will always report this error if waiting for
conflicting lock has happened in the middle of transaction
and resulted in a deadlock. Before this patch the error was
not reported if deadlock could have been resolved by backing
off all metadata locks acquired by the current statement.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
Encapsulate the deadlock detection functionality into
a visitor class, and separate it from the wait-for graph
traversal code.
Use "Internal iterator" and "Visitor" patterns to
achieve the desired separation of responsibilities.
Add comments.
Fix various mismatches between function's language linkage. Any
particular function that is declared in C++ but should be callable
from C must have C linkage. Note that function types with different
linkages are also distinct. Thus, if a function type is declared in
C code, it will have C linkage (same if declared in a extern "C"
block).
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
The goal of this patch is to decouple type of metadata
lock acquired for table by open_tables() from type of
table-level lock to be acquired on it.
To achieve this we change approach to how we determine what
type of metadata lock should be acquired on table to be open.
Now instead of inferring it at open_tables() time from flags
and type of table-level lock we rely on that type of metadata
lock is properly set at parsing time and is not changed
further.
which was detected during the build of 5.5.3-m3.
This requires version 9 of IBM C++ to help.
More fixes will still be needed, it is very strict
(or rather: a bit picky?) about templates.
Before this fix, the performance schema instrumentation
in mdl.h / mdl.cc was incomplete, causing:
- build warnings,
- no data collection for the performance schema
This fix:
- added instrumentation helpers for the new preferred
reader read write lock, mysql_prlock_*
- implemented completely the performance schema
instrumentation of mdl.h / mdl.cc
on Windows".
On platforms where read-write lock implementation does not
prefer readers by default (Windows, Solaris) server might
have deadlocked while detecting MDL deadlock.
MDL deadlock detector relies on the fact that read-write
locks which are used in its implementation prefer readers
(see new comment for MDL_lock::m_rwlock for details).
So far MDL code assumed that default implementation of
read/write locks for the system has this property.
Indeed, this turned out ot be wrong, for example, for
Windows or Solaris. Thus MDL deadlock detector might have
deadlocked on these systems.
This fix simply adds portable implementation of read/write
lock which prefer readers and changes MDL code to use this
new type of synchronization primitive.
No test case is added as existing rqg_mdl_stability test can
serve as one.
TEMPORARY + HANDLER + LOCK + SP".
Server crashed when one:
1) Opened HANDLER or acquired global read lock
2) Then locked one or several temporary tables with
LOCK TABLES statement (but no base tables).
3) Then issued any statement causing commit (explicit
or implicit).
4) Issued statement which should have closed HANDLER
or released global read lock.
The problem was that when entering LOCK TABLES mode in the
scenario described above we incorrectly set transactional
MDL sentinel to zero. As result during commit all metadata
locks were released (including lock for open HANDLER or
global metadata shared lock). Indeed, attempt to release
metadata lock for the second time which happened during
HANLDER CLOSE or during release of GLR caused crash.
This patch fixes problem by changing MDL_context's
set_trans_sentinel() method to set sentinel to correct
value (it should point to the most recent ticket).
This patch introduces timeouts for metadata locks.
The timeout is specified in seconds using the new dynamic system
variable "lock_wait_timeout" which has both GLOBAL and SESSION
scopes. Allowed values range from 1 to 31536000 seconds (= 1 year).
The default value is 1 year.
The new server parameter "lock-wait-timeout" can be used to set
the default value parameter upon server startup.
"lock_wait_timeout" applies to all statements that use metadata locks.
These include DML and DDL operations on tables, views, stored procedures
and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH
READ LOCK and HANDLER statements.
The patch also changes thr_lock.c code (table data locks used by MyISAM
and other simplistic engines) to use the same system variable.
InnoDB row locks are unaffected.
One exception to the handling of the "lock_wait_timeout" variable
is delayed inserts. All delayed inserts are executed with a timeout
of 1 year regardless of the setting for the global variable. As the
connection issuing the delayed insert gets no notification of
delayed insert timeouts, we want to avoid unnecessary timeouts.
It's important to note that the timeout value is used for each lock
acquired and that one statement can take more than one lock.
A statement can therefore block for longer than the lock_wait_timeout
value before reporting a timeout error. When lock timeout occurs,
ER_LOCK_WAIT_TIMEOUT is reported.
Test case added to lock_multi.test.
type-of-operation-aware metadata locks and added a
wait-for graph based deadlock detector to the MDL
subsystem (this patch fixed bug #46272 "MySQL 5.4.4,
new MDL: unnecessary deadlock" and bug #37346
"innodb does not detect deadlock between update and
alter table").
Removed unused and redundant method.
failed in open_ltable()
The problem was too restrictive asserts that enforced that
open_ltable() was called without any active HANDLERs, LOCK TABLES
or global read locks.
However, this can happen in several cases when opening system
tables. The assert would, for example, be triggered when drop
function was called from a connection with active HANDLERs as
this would cause open_ltable() to be called for mysql.proc.
The assert could also be triggered when using table-based
general log (mysql.general_log).
This patch removes the asserts since they will be triggered in
several legitimate cases and because the asserts are no longer
relevant due to changes in how locks are released.
The patch also fixes set_needs_thr_lock_abort() that before
ignored its parameter and always set the member variable to TRUE.
Test case added to mdl_sync.test.
Thanks to Dmitry Lenev for help with this bug!
caused by patch which implemented new type-of-operation-aware
metadata locks and added a wait-for graph based deadlock
detector to the MDL subsystem (this patch fixed bug #46272
"MySQL 5.4.4, new MDL: unnecessary deadlock" and bug #37346
"innodb does not detect deadlock between update and alter
table").
Crashes were caused by a race in MDL_context::try_acquire_lock().
This method added MDL_ticket to the list of granted tickets and
released lock protecting list before setting MDL_ticket::m_lock.
Thus some other thread was able to see ticket without properly
set m_lock member for some short period of time. If this thread
called method involving this member during this period crash
happened.
This fix ensures that MDL_ticket::m_lock is set in all cases
when ticket is added to granted/pending lists in MDL_lock.
Add a wait-for graph based deadlock detector to the
MDL subsystem.
Fixes bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock" and
bug #37346 "innodb does not detect deadlock between update and
alter table".
The first bug manifested itself as an unwarranted abort of a
transaction with ER_LOCK_DEADLOCK error by a concurrent ALTER
statement, when this transaction tried to repeat use of a
table, which it has already used in a similar fashion before
ALTER started.
The second bug showed up as a deadlock between table-level
locks and InnoDB row locks, which was "detected" only after
innodb_lock_wait_timeout timeout.
A transaction would start using the table and modify a few
rows.
Then ALTER TABLE would come in, and start copying rows
into a temporary table. Eventually it would stumble on
the modified records and get blocked on a row lock.
The first transaction would try to do more updates, and get
blocked on thr_lock.c lock.
This situation of circular wait would only get resolved
by a timeout.
Both these bugs stemmed from inadequate solutions to the
problem of deadlocks occurring between different
locking subsystems.
In the first case we tried to avoid deadlocks between metadata
locking and table-level locking subsystems, when upgrading shared
metadata lock to exclusive one.
Transactions holding the shared lock on the table and waiting for
some table-level lock used to be aborted too aggressively.
We also allowed ALTER TABLE to start in presence of transactions
that modify the subject table. ALTER TABLE acquires
TL_WRITE_ALLOW_READ lock at start, and that block all writes
against the table (naturally, we don't want any writes to be lost
when switching the old and the new table). TL_WRITE_ALLOW_READ
lock, in turn, would block the started transaction on thr_lock.c
lock, should they do more updates. This, again, lead to the need
to abort such transactions.
The second bug occurred simply because we didn't have any
mechanism to detect deadlocks between the table-level locks
in thr_lock.c and row-level locks in InnoDB, other than
innodb_lock_wait_timeout.
This patch solves both these problems by moving lock conflicts
which are causing these deadlocks into the metadata locking
subsystem, thus making it possible to avoid or detect such
deadlocks inside MDL.
To do this we introduce new type-of-operation-aware metadata
locks, which allow MDL subsystem to know not only the fact that
transaction has used or is going to use some object but also what
kind of operation it has carried out or going to carry out on the
object.
This, along with the addition of a special kind of upgradable
metadata lock, allows ALTER TABLE to wait until all
transactions which has updated the table to go away.
This solves the second issue.
Another special type of upgradable metadata lock is acquired
by LOCK TABLE WRITE. This second lock type allows to solve the
first issue, since abortion of table-level locks in event of
DDL under LOCK TABLES becomes also unnecessary.
Below follows the list of incompatible changes introduced by
this patch:
- From now on, ALTER TABLE and CREATE/DROP TRIGGER SQL (i.e. those
statements that acquire TL_WRITE_ALLOW_READ lock)
wait for all transactions which has *updated* the table to
complete.
- From now on, LOCK TABLES ... WRITE, REPAIR/OPTIMIZE TABLE
(i.e. all statements which acquire TL_WRITE table-level lock) wait
for all transaction which *updated or read* from the table
to complete.
As a consequence, innodb_table_locks=0 option no longer applies
to LOCK TABLES ... WRITE.
- DROP DATABASE, DROP TABLE, RENAME TABLE no longer abort
statements or transactions which use tables being dropped or
renamed, and instead wait for these transactions to complete.
- Since LOCK TABLES WRITE now takes a special metadata lock,
not compatible with with reads or writes against the subject table
and transaction-wide, thr_lock.c deadlock avoidance algorithm
that used to ensure absence of deadlocks between LOCK TABLES
WRITE and other statements is no longer sufficient, even for
MyISAM. The wait-for graph based deadlock detector of MDL
subsystem may sometimes be necessary and is involved. This may
lead to ER_LOCK_DEADLOCK error produced for multi-statement
transactions even if these only use MyISAM:
session 1: session 2:
begin;
update t1 ... lock table t2 write, t1 write;
-- gets a lock on t2, blocks on t1
update t2 ...
(ER_LOCK_DEADLOCK)
- Finally, support of LOW_PRIORITY option for LOCK TABLES ... WRITE
was abandoned.
LOCK TABLE ... LOW_PRIORITY WRITE from now on has the same
priority as the usual LOCK TABLE ... WRITE.
SELECT HIGH PRIORITY no longer trumps LOCK TABLE ... WRITE in
the wait queue.
- We do not take upgradable metadata locks on implicitly
locked tables. So if one has, say, a view v1 that uses
table t1, and issues:
LOCK TABLE v1 WRITE;
FLUSH TABLE t1; -- (or just 'FLUSH TABLES'),
an error is produced.
In order to be able to perform DDL on a table under LOCK TABLES,
the table must be locked explicitly in the LOCK TABLES list.
condition variable per context instead of one mutex and one conditional
variable for the whole subsystem.
This should increase concurrency in this subsystem.
It also opens the way for further changes which are necessary to solve
such bugs as bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock"
and bug #37346 "innodb does not detect deadlock between update and alter
table".
Two other notable changes done by this patch:
- MDL subsystem no longer implicitly acquires global intention exclusive
metadata lock when per-object metadata lock is acquired. Now this has
to be done by explicit calls outside of MDL subsystem.
- Instead of using separate MDL_context for opening system tables/tables
for purposes of I_S we now create MDL savepoint in the main context
before opening tables and rollback to this savepoint after closing
them. This means that it is now possible to get ER_LOCK_DEADLOCK error
even not inside a transaction. This might happen in unlikely case when
one runs DDL on one of system tables while also running DDL on some
other tables. Cases when this ER_LOCK_DEADLOCK error is not justified
will be addressed by advanced deadlock detector for MDL subsystem which
we plan to implement.
This change is supposed to reduce number of ER_LOCK_DEADLOCK
errors which occur when multi-statement transaction encounters
conflicting metadata lock in cases when waiting is possible.
The idea is not to fail ER_LOCK_DEADLOCK error immediately when
we encounter conflicting metadata lock. Instead we release all
metadata locks acquired by current statement and start to wait
until conflicting lock go away. To avoid deadlocks we use simple
empiric which aborts waiting with ER_LOCK_DEADLOCK error if it
turns out that somebody is waiting for metadata locks owned by
this transaction.
This patch also fixes bug #46273 "MySQL 5.4.4 new MDL: Bug#989
is not fully fixed in case of ALTER".
The bug was that concurrent execution of UPDATE or MULTI-UPDATE
statement as a part of multi-statement transaction that already
has used table being updated and ALTER TABLE statement might have
resulted of loss of isolation between this transaction and ALTER
TABLE statement, which manifested itself as changes performed by
ALTER TABLE becoming visible in transaction and wrong binary log
order as a consequence.
This problem occurred when UPDATE or MULTI-UPDATE's wait in
mysql_lock_tables() call was aborted due to metadata lock
upgrade performed by concurrent ALTER TABLE. After such abort all
metadata locks held by transaction were released but transaction
silently continued to be executed as if nothing has happened.
We solve this problem by changing our code not to release all
locks in such case. Instead we release only locks which were
acquired by current statement and then try to reacquire them
by restarting open/lock tables process. We piggyback on simple
deadlock detector implementation since this change has to be
done anyway for it.
"HANDLER statements within a transaction might lead to deadlocks".
Introduce a notion of a sentinel to MDL_context. A sentinel
is a ticket that separates all tickets in the context into two
groups: before and after it. Currently we can have (and need) only
one designated sentinel -- it separates all locks taken by LOCK
TABLE or HANDLER statement, which must survive COMMIT and ROLLBACK
and all other locks, which must be released at COMMIT or ROLLBACK.
The tricky part is maintaining the sentinel up to date when
someone release its corresponding ticket. This can happen, e.g.
if someone issues DROP TABLE under LOCK TABLES (generally,
see all calls to release_all_locks_for_name()).
MDL_context::release_ticket() is modified to take care of it.
******
A fix and a test case for Bug#46224 "HANDLER statements within a
transaction might lead to deadlocks".
An attempt to mix HANDLER SQL statements, which are transaction-
agnostic, an open multi-statement transaction,
and DDL against the involved tables (in a concurrent connection)
could lead to a deadlock. The deadlock would occur when
HANDLER OPEN or HANDLER READ would have to wait on a conflicting
metadata lock. If the connection that issued HANDLER statement
also had other metadata locks (say, acquired in scope of a
transaction), a classical deadlock situation of mutual wait
could occur.
Incompatible change: entering LOCK TABLES mode automatically
closes all open HANDLERs in the current connection.
Incompatible change: previously an attempt to wait on a lock
in a connection that has an open HANDLER statement could wait
indefinitely/deadlock. After this patch, an error ER_LOCK_DEADLOCK
is produced.
The idea of the fix is to merge thd->handler_mdl_context
with the main mdl_context of the connection, used for transactional
locks. This makes deadlock detection possible, since all waits
with locks are "visible" and available to analysis in a single
MDL context of the connection.
Since HANDLER locks and transactional locks have a different life
cycle -- HANDLERs are explicitly open and closed, and so
are HANDLER locks, explicitly acquired and released, whereas
transactional locks "accumulate" till the end of a transaction
and are released only with COMMIT, ROLLBACK and ROLLBACK TO SAVEPOINT,
a concept of "sentinel" was introduced to MDL_context.
All locks, HANDLER and others, reside in the same linked list.
However, a selected element of the list separates locks with
different life cycle. HANDLER locks always reside at the
end of the list, after the sentinel. Transactional locks are
prepended to the beginning of the list, before the sentinel.
Thus, ROLLBACK, COMMIT or ROLLBACK TO SAVEPOINT, only
release those locks that reside before the sentinel. HANDLER locks
must be released explicitly as part of HANDLER CLOSE statement,
or an implicit close.
The same approach with sentinel
is also employed for LOCK TABLES locks. Since HANDLER and LOCK TABLES
statement has never worked together, the implementation is
made simple and only maintains one sentinel, which is used either
for HANDLER locks, or for LOCK TABLES locks.
------------------------------------------------------------
revno: 2617.68.25
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 18:26:50 +0400
message:
Follow-up for one of pre-requisite patches for fixing bug #30977
"Concurrent statement using stored function and DROP FUNCTION
breaks SBR".
Made enum_mdl_namespace enum part of MDL_key class and removed MDL_
prefix from the names of enum members. In order to do the latter
changed name of PROCEDURE symbol to PROCEDURE_SYM (otherwise macro
which was automatically generated for this symbol conflicted with
MDL_key::PROCEDURE enum member).
------------------------------------------------------------
revno: 2617.68.24
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 17:25:29 +0400
message:
Pre-requisite patch for fixing bug #30977 "Concurrent statement
using stored function and DROP FUNCTION breaks SBR".
Added MDL_request for stored routine as member to Sroutine_hash_entry
in order to be able perform metadata locking for stored routines in
future (Sroutine_hash_entry is an equivalent of TABLE_LIST class for
stored routines).
(WL#4284, follow up fixes).
Bug #47249 assert in MDL_global_lock::is_lock_type_compatible
This assert could be triggered if LOCK TABLES were used to lock
both a table and a view that used the same table. The table would have
to be first WRITE locked and then READ locked. So "LOCK TABLES v1
WRITE, t1 READ" would eventually trigger the assert, "LOCK TABLES
v1 READ, t1 WRITE" would not. The reason is that the ordering of locks
in the interal representation made a difference when executing
FLUSH TABLE on the table.
During FLUSH TABLE, a lock was upgraded to exclusive. If this lock
was of type MDL_SHARED and not MDL_SHARED_UPGRADABLE, an internal
counter in the MDL subsystem would get out of sync. This would happen
if the *last* mention of the table in LOCK TABLES was a READ lock.
The counter in question is the number exclusive locks (active or intention).
This is used to make sure a global metadata lock is only taken when the
counter is zero (= no conflicts). The counter is increased when a
MDL_EXCLUSIVE or MDL_SHARED_UPGRADABLE lock is taken, but not when
upgrade_shared_lock_to_exclusive() is used to upgrade directly
from MDL_SHARED to MDL_EXCLUSIVE.
This patch fixes the problem by searching for a TABLE instance locked
with MDL_SHARED_UPGRADABLE or MDL_EXCLUSIVE before calling
upgrade_shared_lock_to_exclusive(). The patch also adds an assert checking
that only MDL_SHARED_UPGRADABLE locks are upgraded to exclusive.
Test case added to lock_multi.test.
A pre-requisite patch for Bug#30977 "Concurrent statement using
stored function and DROP FUNCTION breaks SBR".
This patch changes the MDL API by introducing a namespace for
lock keys: MDL_TABLE for tables and views and MDL_PROCEDURE
for stored procedures and functions. The latter is needed for
the fix for Bug#30977.
----------------------------------------------------------
revno: 2617.69.20
committer: Konstantin Osipov <kostja@sun.com>
branch nick: 5.4-4284-1-assert
timestamp: Thu 2009-08-13 18:29:55 +0400
message:
WL#4284 "Transactional DDL locking"
A review fix.
Since WL#4284 implementation separated MDL_request and MDL_ticket,
MDL_request becamse a utility object necessary only to get a ticket.
Store it by-value in TABLE_LIST with the intent to merge
MDL_request::key with table_list->table_name and table_list->db
in future.
Change the MDL subsystem to not require MDL_requests to
stay around till close_thread_tables().
Remove the list of requests from the MDL context.
Requests for shared metadata locks acquired in open_tables()
are only used as a list in recover_from_failed_open_table_attempt(),
which calls mdl_context.wait_for_locks() for this list.
To keep such list for recover_from_failed_open_table_attempt(),
introduce a context class (Open_table_context), that collects
all requests.
A lot of minor cleanups and simplications that became possible
with this change.
----------------------------------------------------------
revno: 2617.23.20
committer: Konstantin Osipov <kostja@sun.com>
branch nick: mysql-6.0-runtime
timestamp: Wed 2009-03-04 16:31:31 +0300
message:
WL#4284 "Transactional DDL locking"
Review comments: "Objectify" the MDL API.
MDL_request and MDL_context still need manual construction and
destruction, since they are used in environment that is averse
to constructors/destructors.
----------------------------------------------------------
revno: 2617.23.19
committer: Konstantin Osipov <kostja@sun.com>
branch nick: mysql-6.0-runtime
timestamp: Tue 2009-03-03 01:20:44 +0300
message:
Metadata locking: realign comments. No semantical changes,
only enforce a bit of the coding style.
This is a review fix for WL#4284 "Transactional DDL locking".
------------------------------------------------------------
revno: 2617.23.18
committer: Davi Arnaut <Davi.Arnaut@Sun.COM>
branch nick: 4284-6.0
timestamp: Mon 2009-03-02 18:18:26 -0300
message:
Bug#989: If DROP TABLE while there's an active transaction, wrong binlog order
WL#4284: Transactional DDL locking
This is a prerequisite patch:
These changes are intended to split lock requests from granted
locks and to allow the memory and lifetime of granted locks to
be managed within the MDL subsystem. Furthermore, tickets can
now be shared and therefore are used to satisfy multiple lock
requests, but only shared locks can be recursive.
The problem is that the MDL subsystem morphs lock requests into
granted locks locks but does not manage the memory and lifetime
of lock requests, and hence, does not manage the memory of
granted locks either. This can be problematic because it puts the
burden of tracking references on the users of the subsystem and
it can't be easily done in transactional contexts where the locks
have to be kept around for the duration of a transaction.
Another issue is that recursive locks (when the context trying to
acquire a lock already holds a lock on the same object) requires
that each time the lock is granted, a unique lock request/granted
lock structure structure must be kept around until the lock is
released. This can lead to memory leaks in transactional contexts
as locks taken during the transaction should only be released at
the end of the transaction. This also leads to unnecessary wake
ups (broadcasts) in the MDL subsystem if the context still holds
a equivalent of the lock being released.
These issues are exacerbated due to the fact that WL#4284 low-level
design says that the implementation should "2) Store metadata locks
in transaction memory root, rather than statement memory root" but
this is not possible because a memory root, as implemented in mysys,
requires all objects allocated from it to be freed all at once.
This patch combines review input and significant code contributions
from Konstantin Osipov (kostja) and Dmitri Lenev (dlenev).
------------------------------------------------------------
revno: 2630.4.33
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w2
timestamp: Fri 2008-06-20 17:11:20 +0400
message:
WL#3726 "DDL locking for all metadata objects".
After-review fixes in progress.
Minimized dependency of mdl.cc on other modules (particularly
made it independant of mysql_priv.h) in order to be able
write unit tests for metadata locking subsystem.
------------------------------------------------------------
revno: 2630.4.24
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w2
timestamp: Fri 2008-06-06 14:28:58 +0400
message:
WL#3726 "DDL locking for all metadata objects".
After review fixes in progress.
Get rid of upgradability and priority attributes of
metadata lock requests by replacing them with two
new types of lock requests MDL_SHARED_UPGRADABLE and
MDL_SHARED_HIGH_PRIO correspondingly.
------------------------------------------------------------
revno: 2630.4.18
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w2
timestamp: Tue 2008-06-03 21:07:58 +0400
message:
WL#3726 "DDL locking for all metadata objects".
After review fixes in progress.
Now during upgrading/downgrading metadata locks we deal with
individual metadata lock requests rather than with all requests
for this object in the context. This makes API a bit more clear
and makes adjust_mdl_locks_upgradability() much nicer.
------------------------------------------------------------
revno: 2630.4.17
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w2
timestamp: Thu 2008-05-29 16:52:56 +0400
message:
WL#3726 "DDL locking for all metadata objects".
After review fixes in progress.
"The great correction of names".
Renamed MDL_LOCK and MDL_LOCK_DATA classes to make usage of
these names in metadata locking subsystem consistent with
other parts of server (i.e. thr_lock.cc). Now we MDL_LOCK_DATA
corresponds to request for a lock and MDL_LOCK to the lock
itself. Adjusted code in MDL subsystem and other places
using these classes accordingly.
Did similar thing for GLOBAL_MDL_LOCK_DATA class and also
changed name of its members to correspond to names of
MDL_LOCK_DATA members.
Finally got rid of usage of one letter variables in MDL
code since it makes code harder to search in (according
to reviewer).
Backport of:
------------------------------------------------------------
revno: 2630.4.1
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Fri 2008-05-23 17:54:03 +0400
message:
WL#3726 "DDL locking for all metadata objects".
After review fixes in progress.
------------------------------------------------------------
This is the first patch in series. It transforms the metadata
locking subsystem to use a dedicated module (mdl.h,cc). No
significant changes in the locking protocol.
The import passes the test suite with the exception of
deprecated/removed 6.0 features, and MERGE tables. The latter
are subject to a fix by WL#4144.
Unfortunately, the original changeset comments got lost in a merge,
thus this import has its own (largely insufficient) comments.
This patch fixes Bug#25144 "replication / binlog with view breaks".
Warning: this patch introduces an incompatible change:
Under LOCK TABLES, it's no longer possible to FLUSH a table that
was not locked for WRITE.
Under LOCK TABLES, it's no longer possible to DROP a table or
VIEW that was not locked for WRITE.
******
Backport of:
------------------------------------------------------------
revno: 2630.4.2
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sat 2008-05-24 14:03:45 +0400
message:
WL#3726 "DDL locking for all metadata objects".
After review fixes in progress.
******
Backport of:
------------------------------------------------------------
revno: 2630.4.3
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sat 2008-05-24 14:08:51 +0400
message:
WL#3726 "DDL locking for all metadata objects"
Fixed failing Windows builds by adding mdl.cc to the lists
of files needed to build server/libmysqld on Windows.
******
Backport of:
------------------------------------------------------------
revno: 2630.4.4
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sat 2008-05-24 21:57:58 +0400
message:
WL#3726 "DDL locking for all metadata objects".
Fix for assert failures in kill.test which occured when one
tried to kill ALTER TABLE statement on merge table while it
was waiting in wait_while_table_is_used() for other connections
to close this table.
These assert failures stemmed from the fact that cleanup code
in this case assumed that temporary table representing new
version of table was open with adding to THD::temporary_tables
list while code which were opening this temporary table wasn't
always fulfilling this.
This patch changes code that opens new version of table to
always do this linking in. It also streamlines cleanup process
for cases when error occurs while we have new version of table
open.
******
WL#3726 "DDL locking for all metadata objects"
Add libmysqld/mdl.cc to .bzrignore.
******
Backport of:
------------------------------------------------------------
revno: 2630.4.6
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sun 2008-05-25 00:33:22 +0400
message:
WL#3726 "DDL locking for all metadata objects".
Addition to the fix of assert failures in kill.test caused by
changes for this worklog.
Make sure we close the new table only once.