It was possibile for a user to create an interlocked state which may go on
for a significant period of time. There is a tight loop in the FTWRL code
path that tries to repeatedly acquire a read lock. As the weight of FTWRL
lock is the smallest among others, it's always selected by the deadlock
detector, but can never be killed.
Imaging the following sequence:
connection_0 connection_1
GET_LOCK("l1", 0);
LOCK TABLES t WRITE;
FLUSH TABLES WITH READ LOCK;
GET_LOCK("l1", 1000);
The GET_LOCK statement in connection_1 triggers the deadlock detector,
which tries to select the lock in FTWRL, since its weight is 0. However,
since a loop in Global_read_lock::lock_global_read_lock() tries to always
win, it tries to acquire lock again. Which invokes the deadlock detector,
and that cycle continues until GET_LOCK in connection_1 times out.
This patch resolves the live-locking by introducing a dynamic bonus to the
deadlock weight associated with every lock. Each lock gets a bonus weight
each time it's selected by the deadlock detector. In case of a live-lock
situation, those locks that cannot be killed, get additional weight each
iteration. Eventually their weight becomes so high that the deadlock
detector shifts its attention to other lock, until it find the one that
can be killed.
Some DML operations on tables having unique secondary keys cause scanning
in the secondary index, for instance to find potential unique key violations
in the seconday index. This scanning may involve GAP locking in the index.
As this locking happens also when applying replication events in high priority
applier threads, there is a probabality for lock conflicts between two wsrep
high priority threads.
This PR avoids lock conflicts of high priority wsrep threads, which do
secondary index scanning e.g. for duplicate key detection.
The actual fix is the patch in sql_class.cc:thd_need_ordering_with(), where
we allow relaxed GAP locking protocol between wsrep high priority threads.
wsrep high priority threads (replication appliers, replayers and TOI processors)
are ordered by the replication provider, and they will not need serializability
support gained by secondary index GAP locks.
PR contains also a mtr test, which exercises a scenario where two replication
applier threads have a false positive conflict in GAP of unique secondary index.
The conflicting local committing transaction has to replay, and the test verifies
also that the replaying phase will not conflict with the latter repllication applier.
Commit also contains new test scenario for galera.galera_UK_conflict.test,
where replayer starts applying after a slave applier thread, with later seqno,
has advanced to commit phase. The applier and replayer have false positive GAP
lock conflict on secondary unique index, and replayer should ignore this.
This test scenario caused crash with earlier version in this PR, and to fix this,
the secondary index uniquenes checking has been relaxed even further.
Now innodb trx_t structure has new member: bool wsrep_UK_scan, which is set to
true, when high priority thread is performing unique secondary index scanning.
The member trx_t::wsrep_UK_scan is defined inside WITH_WSREP directive, to make
it possible to prepare a MariaDB build where this additional trx_t member is
not present and is not used in the code base. trx->wsrep_UK_scan is set to true
only for the duration of function call for: lock_rec_lock() trx->wsrep_UK_scan
is used only in lock_rec_has_to_wait() function to relax the need to wait if
wsrep_UK_scan is set and conflicting transaction is also high priority.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The reason for the failure is that
thd->mdl_context.release_transactional_locks()
was called after commit & rollback even in cases where the current
transaction is still active.
For 10.2, 10.3 and 10.4 the fix is simple:
- Replace all calls to thd->mdl_context.release_transactional_locks() with
thd->release_transactional_locks(). The thd function will only call
the mdl_context function if there are no active transactional locks.
In 10.6 we will better fix where we will change the return value for
some trans_xxx() functions to indicate if transaction did close the
transaction or not. This will avoid the need of the indirect call.
Other things:
- trans_xa_commit() and trans_xa_rollback() will automatically
call release_transactional_locks() if the transaction is closed.
- We can't do that for the other functions as the caller of many of these
are doing additional work (like close_thread_tables) before calling
release_transactional_locks().
- Added missing abort_result_set() and missing DBUG_RETURN in
select_create::send_eof()
- Fixed wrong indentation in injector::transaction::commit()
MDL_lock::Ticket_list::remove_ticket(): reduce algoritmic
complexity from O(N) to O(1)
MDL_lock::Ticket_list::clear_bit_if_not_in_list(): removed
MDL_lock::Ticket_list::m_type_counters: a map of ticket type
to count. Initialization is memset(0) which takes time.
Reverted original patch (c2e0a0b).
For consistency with "LOCK TABLE <table_name> READ" and "FLUSH TABLES
WITH READ LOCK", which are forbidden under "BACKUP STAGE", forbid "FLUSH
TABLE <table_name> FOR EXPORT" and "FLUSH TABLE <table_name> WITH READ
LOCK" as well.
It'd allow consistent fixes for problems like MDEV-18643.
Fixes MDEV-18067, MDEV-18068 and MDEV-18069
The problem was that FLUSH TABLES table_name combined with UNLOCK TABLES
calls MDL_context::set_transaction_duration_for_all_locks(), which
changed backup_locks from MDL_EXPLICT to MDL_TRANSACTION.
Fixed by ensuring that set_transaction_duration_for_all_locks() doesn't
touch BACKUP locks.
Part of MDEV-5336 Implement LOCK FOR BACKUP
- Changed check of Global_only_lock to also include BACKUP lock.
- We store latest MDL_BACKUP_DDL lock in thd->mdl_backup_ticket to be able
to downgrade lock during copy_data_between_tables()
Part of MDEV-5336 Implement LOCK FOR BACKUP
- Added new locks to MDL_BACKUP for all stages of backup locks and
a new MDL lock needed for backup stages.
- Renamed MDL_BACKUP_STMT to MDL_BACKUP_DDL
- flush_tables() takes a new parameter that decides what should be flushed.
- InnoDB, Aria (transactional tables with checksums), Blackhole, Federated
and Federatedx tables are marked to be safe for online backup. We are
using MDL_BACKUP_TRANS_DML instead of MDL_BACKUP_DML locks for these
which allows any DML's to proceed for these tables during the whole
backup process until BACKUP STAGE COMMIT which will block the final
commit.
Part of MDEV-5336 Implement LOCK FOR BACKUP
FLUSH TABLE table_names have changed slighty as we are now opening
tables before taking the MDL lock. The difference is that FLUSH TABLE
table_name will now be blocked by a table that is waiting for FTWRL.
There should not be any new deadlocks as part of this change.
The end result is still better in most cases as FTWRL is now only
waiting for write statements to end, not for read only statements and
it's not flushing tables in use from the table cache.
Share will be needed to be able to determine if table supports online
backup. Appropriate metadata lock type in BACKUP namespace will be
acquired basing on this information.
Also made pending global read lock request to be preferred victim of MDL
deadlock detector. This allows us to hide some non-fatal deadlocks and
make FTWRL less likely to break concurrent queries.
Part of MDEV-5336 Implement LOCK FOR BACKUP
Originally both table metadata lock and global read lock protection
were acquired before getting TABLE from table cache. This will be
reordered in a future commit with MDL_BACKUP_XXX locks so that we
first take table metadata lock, then get TABLE from table cache, then
acquire analogue of global read lock.
This patch both simplifies FLUSH TABLES code, makes FLUSH TABLES to
lock less and also enables FLUSH TABLES code to be used with backup
locks.
The usage of FLUSH TABLES changes slightly:
- FLUSH TABLES without any arguments will now only close not used tables
and tables locked by the FLUSH TABLES connection. All not used table
shares will be closed.
Tables locked by the FLUSH TABLES connection will be reopened and
re-locked after all others has stoped using the table (as before).
If there was no locked tables, then FLUSH TABLES is instant and will
not cause any waits.
FLUSH TABLES will not wait for any in use table.
- FLUSH TABLES with a table list, will ensure that the tables are closed
before statement returns. The code is now only using MDL locks and not
table share versions, which simplices the code greatly. One visible
change is that the server will wait for the end of the transaction that
are using the tables. Before FLUSH TABLES only waited for the statements
to end.
Signed-off-by: Monty <monty@mariadb.org>
- CREATE PACKAGE [BODY] statements are now
entirely written to mysql.proc with type='PACKAGE' and type='PACKAGE BODY'.
- CREATE PACKAGE BODY now supports IF NOT EXISTS
- DROP PACKAGE BODY now supports IF EXISTS
- CREATE OR REPLACE PACKAGE [BODY] is now supported
- CREATE PACKAGE [BODY] now support the DEFINER clause:
CREATE DEFINER user@host PACKAGE pkg ... END;
CREATE DEFINER user@host PACKAGE BODY pkg ... END;
- CREATE PACKAGE [BODY] now supports SQL SECURITY and COMMENT clauses, e.g.:
CREATE PACKAGE p1 SQL SECURITY INVOKER COMMENT "comment" AS ... END;
- Package routines are now created from the package CREATE PACKAGE BODY
statement and don't produce individual records in mysql.proc.
- CREATE PACKAGE BODY now supports package-wide variables.
Package variables can be read and set inside package routines.
Package variables are stored in a separate sp_rcontext,
which is cached in THD on the first packate routine call.
- CREATE PACKAGE BODY now supports the initialization section.
- All public routines (i.e. declared in CREATE PACKAGE)
must have implementations in CREATE PACKAGE BODY
- Only public package routines are available outside of the package
- {CREATE|DROP} PACKAGE [BODY] now respects CREATE ROUTINE and ALTER ROUTINE
privileges
- "GRANT EXECUTE ON PACKAGE BODY pkg" is now supported
- SHOW CREATE PACKAGE [BODY] is now supported
- SHOW PACKAGE [BODY] STATUS is now supported
- CREATE and DROP for PACKAGE [BODY] now works for non-current databases
- mysqldump now supports packages
- "SHOW {PROCEDURE|FUNCTION) CODE pkg.routine" now works for package routines
- "SHOW PACKAGE BODY CODE pkg" now works (the package initialization section)
- A new package body level MDL was added
- Recursive calls for package procedures are now possible
- Routine forward declarations in CREATE PACKATE BODY are now supported.
- Package body variables now work as SP OUT parameters
- Package body variables now work as SELECT INTO targets
- Package body variables now support ROW, %ROWTYPE, %TYPE
Other things, mainly to get
create_mysqld_error_find_printf_error tool to work:
- Added protection to not include mysqld_error.h twice
- Include "unireg.h" instead of "mysqld_error.h" in server
- Added protection if ER_XX messages are already defined
- Removed wrong calls to my_error(ER_OUTOFMEMORY) as
my_malloc() and my_alloc will do this automatically
- Added missing %s to ER_DUP_QUERY_NAME
- Removed old and wrong calls to my_strerror() when using
MY_ERROR_ON_RENAME (wrong merge)
- Fixed deadlock error message from Galera. Before the extra
information given to ER_LOCK_DEADLOCK was missing because
ER_LOCK_DEADLOCK doesn't provide any extra information.
I kept #ifdef mysqld_error_find_printf_error_used in sql_acl.h
to make it easy to do this kind of check again in the future
If compiling a non DBUG binary with
-DDBUG_ASSERT_AS_PRINTF asserts will be
changed to printf + stack trace (of stack
trace are enabled).
- Changed #ifndef DBUG_OFF to
#ifdef DBUG_ASSERT_EXISTS
for those DBUG_OFF that was just used to enable
assert
- Assert checking that could greatly impact
performance where changed to DBUG_ASSERT_SLOW which
is not affected by DBUG_ASSERT_AS_PRINTF
- Added one extra option to my_print_stacktrace() to
get more silent in case of stack trace printing as
part of assert.