* reuse the loop in THD::abort_current_cond_wait, don't duplicate it
* find_thread_by_id should return whatever it has found, it's the
caller's task not to kill COM_DAEMON (if the caller's a killer)
and other minor changes
This reverts the server part of the commit 775fccea0
but keeps InnoDB part (which reverted MDEV-17092 5530a93f4).
So after this both MDEV-23536 and MDEV-17092 are reverted,
and the original bug is resurrected.
A race condition may occur between the execution of transaction commit,
and an execution of a KILL statement that would attempt to abort that
transaction.
MDEV-17092 worked around this race condition by modifying InnoDB code.
After that issue was closed, Sergey Vojtovich pointed out that this
race condition would better be fixed above the storage engine layer:
If you look carefully into the above, you can conclude that
thd->free_connection() can be called concurrently with
KILL/thd->awake(). Which is the bug. And it is partially fixed in
THD::~THD(), that is destructor waits for KILL completion:
Fix: Add necessary mutex operations to THD::free_connection()
and move WSREP specific code also there. This ensures that no
one is using THD while we do free_connection(). These mutexes
will also ensures that there can't be concurrent KILL/THD::awake().
innobase_kill_query
We can now remove usage of trx_sys_mutex introduced on MDEV-17092.
trx_t::free()
Poison trx->state and trx->mysql_thd
This patch is validated with an RQG run similar to the one that
reproduced MDEV-17092.
This bug could cause a crash when executing queries that used mutually
recursive CTEs with system variable big_tables set to 1. It happened due
to several bugs in the code that handled recursive table references
referred mutually recursive CTEs. For each recursive table reference a
temporary table is created that contains all rows generated for the
corresponding recursive CTE table on the previous step of recursion.
This temporary table should be created in the same way as the temporary
table created for a regular materialized derived table using the
method select_union::create_result_table(). In this case when the
temporary table is created it uses the select_union::TMP_TABLE_PARAM
structure as the parameter for the table construction. However the
code created the temporary table using just the function create_tmp_table()
and passed pointers to certain fields of the TMP_TABLE_PARAM structure
used for accumulation of rows of the recursive CTE table as parameters
for update. This was a mistake because now different temporary tables
cannot share some TMP_TABLE_PARAM fields in a general case. Besides,
depending on how mutually recursive CTE tables were defined and which
of them were referred in the executed query the select_union object
allocated for a recursive table reference could be allocated again after
the the temporary table had been created. In this case the TMP_TABLE_PARAM
object associated with the temporary table created for the recursive
table reference contained unassigned fields needed for execution when
Aria engine is employed as the engine for temporary tables.
This patch ensures that
- select_union object is created only once for any recursive table
reference
- any temporary table created for recursive CTEs uses its own
TMP_TABLE_PARAM structure
The patch also fixes a problem caused by incomplete cleanup of join tables
associated with recursive table references.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The reason for the failure is that
thd->mdl_context.release_transactional_locks()
was called after commit & rollback even in cases where the current
transaction is still active.
For 10.2, 10.3 and 10.4 the fix is simple:
- Replace all calls to thd->mdl_context.release_transactional_locks() with
thd->release_transactional_locks(). The thd function will only call
the mdl_context function if there are no active transactional locks.
In 10.6 we will better fix where we will change the return value for
some trans_xxx() functions to indicate if transaction did close the
transaction or not. This will avoid the need of the indirect call.
Other things:
- trans_xa_commit() and trans_xa_rollback() will automatically
call release_transactional_locks() if the transaction is closed.
- We can't do that for the other functions as the caller of many of these
are doing additional work (like close_thread_tables) before calling
release_transactional_locks().
- Added missing abort_result_set() and missing DBUG_RETURN in
select_create::send_eof()
- Fixed wrong indentation in injector::transaction::commit()
During graceful shutdowns, client connections are closed and
eventually and THD::awake() acquires LOCK_thd_data mutex which is
required later on in wsrep_thd_is_aborting(). Make sure LOCK_thd_data
is acquired, even if global wsrep_on is disabled.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The issue here was the system variable max_sort_length was being applied
to decimals and it was truncating the value for decimals to the number
of bytes set by max_sort_length.
This was leading to a buffer overflow as the values were written
to the buffer without truncation and then we moved the offset to
the number of bytes(set by max_sort_length), that are needed for comparison.
The fix is to not apply max_sort_length for fixed size types like INT,
DECIMALS and only apply max_sort_length for CHAR, VARCHARS, TEXT and
BLOBS.
MDEV-20945: BACKUP UNLOCK + FTWRL assertion failure | SIGSEGV in I_P_List
from MDL_context::release_lock on INSERT w/ BACKUP LOCK (on optimized
builds) | Assertion `ticket->m_duration == MDL_EXPLICIT' failed
BACKUP LOCK behavior is modified so it won't be used wrong:
- BACKUP LOCK should commit any active transactions.
- BACKUP LOCK should not be allowed in stored procedures.
- When BACKUP LOCK is active, don't allow any DDL's for that connection.
- FTWRL is forbidden on the same connection while BACKUP LOCK is active.
Reviewed-by: monty@mariadb.com
- Remove row_start/row_end from keys in fix_create_like();
- Disable manual adding of implicit row_start/row_end to indexes on
CREATE TABLE. INVISIBLE_SYSTEM fields are unoperable by user;
- Fix memory leak on allocation of Key_part_spec.
MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel
replication
Fixed by partly reverting MDEV-21953 to put back MDL_BACKUP_COMMIT locking
before log_and_order.
The original problem for MDEV-21953 was that while a thread was waiting in
for another threads to commit in 'log_and_order', it had the
MDL_BACKUP_COMMIT lock. The backup thread was waiting to get the
MDL_BACKUP_WAIT_COMMIT lock, which blocks all new MDL_BACKUP_COMMIT locks.
This causes a deadlock as the waited-for thread can never get past the
MDL_BACKUP_COMMIT lock in ha_commit_trans.
The main part of the bug fix is to release the MDL_BACKUP_COMMIT lock while
a thread is waiting for other 'previous' threads to commit. This ensures
that no transactional thread keeps MDL_BACKUP_COMMIT while waiting, which
ensures that there are no deadlocks anymore.
- Adding optional qualifiers to data types:
CREATE TABLE t1 (a schema.DATE);
Qualifiers now work only for three pre-defined schemas:
mariadb_schema
oracle_schema
maxdb_schema
These schemas are virtual (hard-coded) for now, but may turn into real
databases on disk in the future.
- mariadb_schema.TYPE now always resolves to a true MariaDB data
type TYPE without sql_mode specific translations.
- oracle_schema.DATE translates to MariaDB DATETIME.
- maxdb_schema.TIMESTAMP translates to MariaDB DATETIME.
- Fixing SHOW CREATE TABLE to use a qualifier for a data type TYPE
if the current sql_mode translates TYPE to something else.
The above changes fix the reported problem, so this script:
SET sql_mode=ORACLE;
CREATE TABLE t2 AS SELECT mariadb_date_column FROM t1;
is now replicated as:
SET sql_mode=ORACLE;
CREATE TABLE t2 (mariadb_date_column mariadb_schema.DATE);
and the slave can unambiguously treat DATE as the true MariaDB DATE
without ORACLE specific translation to DATETIME.
Similar,
SET sql_mode=MAXDB;
CREATE TABLE t2 AS SELECT mariadb_timestamp_column FROM t1;
is now replicated as:
SET sql_mode=MAXDB;
CREATE TABLE t2 (mariadb_timestamp_column mariadb_schema.TIMESTAMP);
so the slave treats TIMESTAMP as the true MariaDB TIMESTAMP
without MAXDB specific translation to DATETIME.
* Allocate items on thd->mem_root while refixing vcol exprs
* Make vcol tree changes register and roll them back after the statement is executed.
Explanation:
Due to collation implementation specifics an Item tree could change while fixing.
The tricky thing here is to make it on a proper arena.
It's usually not a problem when a field is deterministic, however, makes a pain vice-versa, during allocation allocating.
A non-deterministic field should be refixed on each statement, since it depends on the environment state.
Changing the tree will be temporary and therefore it should be reverted after the statement execution.
When high priority replication slave applier encounters lock conflict in innodb,
it will force the conflicting lock holder transaction (victim) to rollback.
This is a must in multi-master sychronous replication model to avoid cluster lock-up.
This high priority victim abort (aka "brute force" (BF) abort), is started
from innodb lock manager while holding the victim's transaction's (trx) mutex.
Depending on the execution state of the victim transaction, it may happen that the
BF abort will call for THD::awake() to wake up the victim transaction for the rollback.
Now, if BF abort requires THD::awake() to be called, then the applier thread executed
locking protocol of: victim trx mutex -> victim THD::LOCK_thd_data
If, at the same time another DBMS super user issues KILL command to abort the same victim,
it will execute locking protocol of: victim THD::LOCK_thd_data -> victim trx mutex.
These two locking protocol acquire mutexes in opposite order, hence unresolvable mutex locking
deadlock may occur.
The fix in this commit adds THD::wsrep_aborter flag to synchronize who can kill the victim
This flag is set both when BF is called for from innodb and by KILL command.
Either path of victim killing will bail out if victim's wsrep_killed is already
set to avoid mutex conflicts with the other aborter execution. THD::wsrep_aborter
records the aborter THD's ID. This is needed to preserve the right to kill
the victim from different locations for the same aborter thread.
It is also good error logging, to see who is reponsible for the abort.
A new test case was added in galera.galera_bf_kill_debug.test for scenario where
wsrep applier thread and manual KILL command try to kill same idle victim
For low sort_buffer_size, in the cost calculation of using the Unique object the elements in the tree were evaluated to 0, make sure to have atleast 1 element in the Unique tree.
Also for the function Unique::get allocate memory for atleast MERGEBUFF2+1 keys.
cannot use the current THD::mem_root, because it can be temporarily
reassigned to something with a very different life time
(e.g. to TABLE::mem_root or range optimizer mem_root).
MDEV-21398 Deadlock (server hang) or assertion failure in
Diagnostics_area::set_error_status upon ALTER under lock
This failure could only happen if one locked the same table
multiple times and then did an ALTER TABLE on the table.
Major change is to change all instances of
table->m_needs_reopen= true;
to
table->mark_table_for_reopen();
The main fix for the problem was to ensure that we mark all
instances of the table in the locked_table_list and when we
reopen the tables, we first close all tables before reopening
and locking them.
Other things:
- Don't call thd->locked_tables_list.reopen_tables if there
are no tables marked for reopen. (performance)