The problem is that dropping an FTS table and syncing the same table
can happen concurrently during the shutdown of the fts optimize thread.
fts_optimize_remove_table() is executed by a user thread that
performs DDL, and fts_optimize_wq may already have been removed if
fts_optimize_shutdown() has started executing. In that case,
fts_optimize_remove_table() does not remove the table from the queue,
which leads to the above scenario. Fix: while removing the table from
fts_optimize_wq, if the table cannot be removed, wait until
fts_optimize_thread shuts down.
Reviewed-by: Marko Mäkelä
This bug could manifest itself for a query whose WHERE condition contained
a top-level OR formula such that each disjunct contained a single-range
condition supported by the same index. One of these range conditions must
be fully covered by another range condition that is used later in the OR
formula. Additionally, at least one of these conditions must be ANDed with
a sargable range condition supported by a different index.
There were several attempts to fix related problems for OR conditions after
the backport of range optimizer code from MySQL (commit
0e19f3e36f). Unfortunately, the first of these
fixes contained a typo that remained unnoticed until recently. This typo led
to the rejection of valid range accesses. This patch fixes the typo.
The fix revealed two more bugs: one in a constructor for SEL_ARG,
the other in the function tree_or(). Both are fixed in this patch.
This regression for debug builds was introduced by
MDEV-23101 (commit 224c950462).
Due to MDEV-16664, the parameter
innodb_lock_schedule_algorithm=VATS
is not enabled by default.
The purpose of the added assertions was to enforce the invariant that
Galera replication cannot be enabled together with VATS due to MDEV-12837.
However, upon closer inspection, it is obvious that the variable 'lock'
may be assigned a null pointer if no match is found in the
previous->hash list.
lock_grant_and_move_on_page(), lock_grant_and_move_on_rec():
Assert !lock->trx->is_wsrep() only after ensuring that lock
is not a null pointer.
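The pattern can be illustrated with a minimal sketch. The struct and
function names below are hypothetical, simplified stand-ins and not the
actual InnoDB definitions; the only point is that the assertion must
tolerate a null result from the hash chain search.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for InnoDB's trx_t and lock_t. */
struct trx { int is_wsrep; };
struct lock {
    struct trx  *trx;
    unsigned     page_no;
    struct lock *hash;    /* next lock in the same hash chain */
};

/* Return the first lock after `previous` that is on the same page,
   or NULL if the chain contains no such lock. */
struct lock *next_lock_on_page(const struct lock *previous)
{
    struct lock *lock = previous->hash;

    while (lock != NULL && lock->page_no != previous->page_no)
        lock = lock->hash;

    /* Check the invariant only when a match was found. Evaluating
       lock->trx on a null pointer would crash a debug build. */
    assert(lock == NULL || !lock->trx->is_wsrep);
    return lock;
}
```

A caller has to handle the NULL return value as "no further lock on
this page" before touching any field of the result.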
1. rr record -h randomizes the number of processors, so disable the
THREAD_POOL_SIZE check.
2. Check kernel.perf_event_paranoid to give a user-friendly error message.
Remove from the Debian build:
* tokudb
* mroonga
* spider
* oqgraph
* embedded server
Add ccache to the Debian build.
Backport the 10.3 changes to autobake-deb
that make Travis faster.
Merge instructions:
Drop this commit on merge to 10.3
In MDEV-21452, SAFE_MUTEX flagged an ordering problem that involved
trx_t::mutex, LOCK_global_system_variables, and LOCK_commit_ordered
when running
./mtr --no-reorder \
binlog.binlog_checksum,mix binlog.binlog_commit_wait,mix
Because LOCK_commit_ordered is acquired by replication code before
innobase_commit_ordered() is invoked, and because LOCK_commit_ordered
should be below LOCK_global_system_variables in the global latching
order, it turns out that we must avoid acquiring
LOCK_global_system_variables in any low-level code.
It also turns out that lock_rec_lock() acquires lock_sys_t::mutex
and then carries on to call lock_rec_enqueue_waiting(), which may
invoke THDVAR() via thd_lock_wait_timeout(). This call is problematic
if THDVAR() had never been invoked in that thread earlier.
innobase_trx_init(): Let us invoke THDVAR() at the start of an InnoDB
transaction so that future invocations of THDVAR() will avoid
LOCK_global_system_variables acquisition on the same THD. Because
the first call to intern_sys_var_ptr() will initialize all session
variables by not passing the offset to sync_dynamic_session_variables(),
this will indeed make any future THDVAR() invocation mutex-free.
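The effect of the early invocation can be modelled with a small
sketch. Everything here is a hypothetical model: the struct, the
counter that stands in for acquiring LOCK_global_system_variables, and
the function names are assumptions made for illustration, not the
server's THDVAR() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of per-session variable storage. */
struct session {
    bool vars_initialized;   /* true once the session copy exists */
    long lock_wait_timeout;  /* session copy of the global value */
};

long     global_lock_wait_timeout = 50;
unsigned global_lock_acquisitions = 0;  /* models taking the global mutex */

/* Model of a THDVAR()-like read: the first call in a session copies
   the value under the global lock; later calls do not take the lock. */
long session_lock_wait_timeout(struct session *s)
{
    assert(s != NULL);
    if (!s->vars_initialized) {
        global_lock_acquisitions++;   /* stands in for lock + unlock */
        s->lock_wait_timeout = global_lock_wait_timeout;
        s->vars_initialized = true;
    }
    return s->lock_wait_timeout;
}

/* Model of transaction start: read the variable early, while no
   low-level mutex is held, so that later reads are lock-free. */
void transaction_start(struct session *s)
{
    (void) session_lock_wait_timeout(s);
}
```

In this model, any read that follows transaction_start() on the same
session leaves global_lock_acquisitions unchanged, which is the
property the fix relies on.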
There are some THDVAR() calls in other code (related to indexed virtual
columns, fulltext indexes, and DDL operations). No SAFE_MUTEX warning
was known for those, but there does not appear to be any replication
test coverage for indexed virtual columns or fulltext indexes. DDL should
be covered, and perhaps DDL code paths were already invoking THDVAR()
while not holding any InnoDB mutex.
Side note: MySQL should avoid this type of deadlock since
mysql/mysql-server@4d275c8995.
MariaDB never defined alloc_and_copy_thd_dynamic_variables(),
because we prefer to avoid overhead during connection creation.
An important part of the deadlock could be the current handling of
SET GLOBAL binlog_checksum=NONE; and similar assignments.
In binlog_checksum_update(), we would hold LOCK_global_system_variables
while potentially acquiring LOCK_commit_ordered in MYSQL_BIN_LOG::open().
Even if that code was changed later to release
LOCK_global_system_variables during the write to mysql_bin_log,
it could be a good idea for performance to avoid invoking the
expensive code path of THDVAR() while holding any InnoDB mutexes,
such as lock_sys.mutex in lock_rec_enqueue_waiting().
Thanks to Andrei Elkin for debugging the SAFE_MUTEX issue, and to
Sergei Golubchik for the suggestion to invoke THDVAR() early.
Changes to be committed:
modified: mysql-test/suite/sys_vars/r/wsrep_cluster_address_basic.result
modified: mysql-test/suite/sys_vars/t/wsrep_cluster_address_basic.test
The setting innodb_lock_schedule_algorithm=VATS that was introduced
in MDEV-11039 (commit 021212b525)
causes conflicting exclusive locks to be incorrectly granted to
two transactions. Specifically, in lock_rec_insert_by_trx_age()
the predicate !lock_rec_has_to_wait_in_queue(in_lock) would hold even
though an active transaction is already holding an exclusive lock.
This was observed between two DELETE statements on the same clustered
index record.
The HASH_DELETE invocation in lock_rec_enqueue_waiting() may be related.
Due to lack of progress in diagnosing the problem, we will deprecate the
option and issue a warning that using it may corrupt data. The unsafe
option was enabled between
commit 0c15d1a6ff (MariaDB 10.2.3)
and the parent of
commit 1cc1d0429d (MariaDB 10.2.17, 10.3.9).
Amend the check for unread client data in the threadpool.
THD::NET will have unread data if the client uses compression and
wraps multiple commands into a single compressed packet.
MariaDB C/C sends COM_STMT_RESET+COM_STMT_EXECUTE wrapped into
a single compressed packet when compression is on. Thus, before this
patch, using compression and prepared statements against a
threadpool-enabled server would result in a hang.
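A rough sketch of the amended check, under stated assumptions: the
struct and its two counters are hypothetical and only model the idea
that data can be pending either on the socket or in an already
uncompressed buffer. They are not the actual THD::NET fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a connection's input state. */
struct conn_input {
    size_t socket_bytes_pending;    /* not yet read from the socket */
    size_t buffered_bytes_pending;  /* read and uncompressed, but
                                       not yet consumed as a command */
};

/* A connection must not be put back to wait for socket activity while
   a second command from the same compressed packet is still buffered;
   otherwise that command would never be processed. */
bool has_unread_data(const struct conn_input *in)
{
    assert(in != NULL);
    return in->socket_bytes_pending > 0
        || in->buffered_bytes_pending > 0;
}
```

Checking only the socket side would report "nothing to read" in the
COM_STMT_RESET+COM_STMT_EXECUTE case, which matches the hang described
above.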
In fts_optimize_remove_table(), InnoDB tries to access the
fts_optimize_wq after shutting down the fts optimize thread.
This issue was caused by commit a41d429765.
The fix is to check the shutdown state of the fts optimize thread
before accessing fts_optimize_wq.
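The ordering of the two checks can be sketched as follows. The types
and names are hypothetical stand-ins, and the sketch omits the mutex
that real code would need around the shutdown flag and the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the optimize thread state. */
struct work_queue { int pending; };
struct optimizer {
    bool               shut_down;  /* set once the thread has exited */
    struct work_queue *wq;         /* freed and set to NULL at shutdown */
};

/* Queue a "remove table" request. Returns false, without touching the
   queue, if the optimize thread has already shut down. */
bool request_table_removal(struct optimizer *opt)
{
    assert(opt != NULL);
    if (opt->shut_down)
        return false;      /* wq may be NULL or freed: do not access */
    opt->wq->pending++;    /* safe: the queue still exists */
    return true;
}
```

Testing the shutdown state first means the queue pointer is never
dereferenced after it has been released.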
ibuf_merge_or_delete_for_page(): Do not attempt to invoke
ibuf_delete_recs() on a page of the change buffer itself.
The caller could already be holding ibuf->index->lock,
and an attempt to acquire it in S mode would hang the release server
or cause an assertion failure in rw_lock_s_lock_func() in a debug
server.
This problem was reproducible on 1 out of 2 runs of the following:
./mtr --no-reorder \
innodb.innodb-page_compression_default \
innodb.innodb-page_compression_snappy \
innodb.innodb-page_compression_zip \
innodb.innodb_wl6326_big innodb.xa_recovery
Analysis:
========
"mysqlbinlog -v" option will reconstruct row events and display them as
commented SQL statements. If this option is given twice, the output includes
comments to indicate column data types and some metadata.
`log_event_print_value` is the function responsible for printing values and
their types. This function doesn't handle the GEOMETRY type. Hence the above
error gets printed.
Fix:
===
Add support for GEOMETRY datatype.
Leave debian/additions/mysqlreport as #!/usr/bin/perl
Acknowledge that `env perl` is a hack; a complete fix
needs to determine the path at which perl is installed and insert
it into these scripts.
The usefulness of these scripts is questionable.
Passing a null pointer to a nonnull argument is not only undefined
behaviour, but it also grants the compiler the permission to optimize
away further checks whether the pointer is null. GCC -O2 at least
starting with version 8 may do that, potentially causing SIGSEGV.
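As an illustration, here is a minimal sketch of guarding such a call.
The choice of memcpy() is an assumption made for the example (its
pointer parameters are declared nonnull in common C libraries); the
commit message does not say which function was affected, and
copy_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n bytes from src to dst and return the number of bytes copied.
   When n is 0, memcpy() is not called at all, so a null src or dst is
   never passed to a parameter that is declared nonnull. */
size_t copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return 0;
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
    return n;
}
```

Without the early return, a call such as memcpy(NULL, NULL, 0) would
let the compiler assume that both pointers are non-null and remove
later null checks on them.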
Shifting a 16-bit type by 16 bits is undefined behaviour.
The result is at least 32 bits, so let us cast the shift operand
to a wider type before shifting.
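A minimal sketch of the cast, using a hypothetical helper combine16().
On platforms where int is 32 bits, a uint16_t operand is promoted to a
signed int, so shifting a value that has bit 15 set by 16 bits would
overflow; casting to uint32_t first keeps the shift well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 16-bit halves into one 32-bit value. The cast widens
   `hi` to an unsigned 32-bit type before the shift, so no bits are
   shifted into the sign bit of a promoted int.
   Example: assert(combine16(0x8000, 0) == 0x80000000u); */
uint32_t combine16(uint16_t hi, uint16_t lo)
{
    return ((uint32_t) hi << 16) | lo;
}
```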
For some reason, adding -fsanitize=undefined (cmake -DWITH_UBSAN=ON)
to the compilation flags will cause even more warnings to be emitted.
The warnings do look bogus, but the code can be simplified.
For some reason, adding -fsanitize=undefined (cmake -DWITH_UBSAN=ON)
to the compilation flags will cause even more warnings to be emitted.
The warning was a bogus one:
tests/mysql_client_test.c:8632:22: error: '%d' directive writing between
1 and 11 bytes into a region of size 9 [-Werror=format-overflow=]
8632 | sprintf(field, "c%d int", i);
| ^~
tests/mysql_client_test.c:8632:20: note: directive argument
in the range [-2147483648, 999]
The warning does not take into account that the lower bound of the
variable actually is 0. But we can help the compiler by using an
unsigned variable.
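A minimal sketch of the idea follows. The helper format_column() and
the use of snprintf() with a truncation check are assumptions added
for illustration; the change described above only concerns making the
variable unsigned.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a column definition such as "c7 int" into buf.
   Returns the length written, or -1 if the text did not fit.
   With an unsigned index and %u, no minus sign can be produced,
   which narrows the range the compiler has to consider. */
int format_column(char *buf, size_t size, unsigned i)
{
    int len = snprintf(buf, size, "c%u int", i);

    return (len < 0 || (size_t) len >= size) ? -1 : len;
}
```

For a 10-byte buffer, indexes up to 999 fit ("c999 int" is 8
characters plus the terminating NUL), and larger values are reported
as truncated instead of overflowing the buffer.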
It turns out that we must check for DISCARD TABLESPACE both
when the table is being rebuilt and when the AUTO_INCREMENT
value of the table is being added.
This was caught by the test innodb.alter_missing_tablespace.
Somehow I failed to run all tests. Sorry!
The statement ALTER TABLE...DISCARD TABLESPACE is problematic,
because its designed purpose is to break the referential integrity
of the data dictionary and make a table point to nowhere.
ha_innobase::commit_inplace_alter_table(): Check whether the
table has been discarded. (This is a bit late to check it, right
before committing the change.) Previously, we performed this check
only in a specific branch of the function commit_set_autoinc().
Note: We intentionally allow non-rebuilding ALTER TABLE even if
the tablespace has been discarded, to remain compatible with MySQL.
(See the various tests with "wl5522" in the name, such as
innodb.innodb-wl5522.)
The test case would crash starting with 10.3 only, but it does not hurt
to minimize the code and test difference between 10.2 and 10.3.