Issue:
------
The actual order of acquisition of the IBUF pessimistic insert mutex
(SYNC_IBUF_PESS_INSERT_MUTEX) and IBUF header page latch
(SYNC_IBUF_HEADER) w.r.t space latch (SYNC_FSP) differs from the order
defined in sync0types.h. It was not discovered earlier as the path to
ibuf_remove_free_page was not covered by the mtr test. Ideal order and
one defined in sync0types.h is as follows.
SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX -> SYNC_FSP
In ibuf_remove_free_page, we acquire space latch earlier and we have
the order as follows resulting in the assert with innodb_sync_debug=on.
SYNC_FSP -> SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX
Fix:
---
We do maintain this order in other places and there doesn't seem to be
any real issue here. To reduce impact in GA versions, we avoid doing
extensive changes in mutex ordering to match the current
SYNC_IBUF_PESS_INSERT_MUTEX order. Instead we relax the ordering check
for IBUF pessimistic insert mutex using SYNC_NO_ORDER_CHECK.
Previous solution, that would entirely switch timer off, turned out
to be deadlock prone.
This patch fixed previous attempt to switch between long/short interval
periods in MDEV-24295. Now, initial state of the timer is fixed (it is ON).
Also, avoid switching timer to longer periods if there is any activity in
the pool.
Test was waiting INSERT-clause to make rollback but
wait_condition was too tight. State could be
Freeing items or Rollback. Fixed wait_condition
to expect one of them.
create_partitioning_metadata() should only mark transaction r/w
if it actually did anything (that is, the table is partitioned).
otherwise it's a no-op, called even for temporary tables and
it shouldn't do anything at all
Synopsis: If SELECT returned answer from Query Cache it is not really executed.
The reason for firing of assertion
DBUG_ASSERT((mem_root->flags & ROOT_FLAG_READ_ONLY) == 0);
is that in case the query_cache is on and the same query run by different
stored routines the following use case can take place:
First, lets say that bodies of routines used by the test case are the same
and contains the only query 'SELECT * FROM t1';
call p1() -- a result set is stored in query cache for further use.
call p2() -- the same query is run against the table t1, that result in
not running the actual query but using its cached result.
On finishing execution of this routine, its memory root is
marked for read only since every SP instruction that this
routine contains has been executed.
INSERT INT t1 VALUE (1); -- force following invalidation of query cache
call p2() -- query the table t1 will result in assertion failure since its
execution would require allocation on the memory root that
has been already marked as read only memory root
The root cause of firing the assertion is that memory root of the stored
routine 'p2' was marked as read only although actual execution of the query
contained inside hadn't been performed.
To fix the issue, mark a SP instruction as not yet run in case its execution
doesn't result in real query processing and a result set got from query cache
instead.
Note that, this issue relates server built in debug mode AND with the protect
statement memory root feature turned on. It doesn't affect server built
in release mode.
This way, if manager thread somehow starts and stops again quickly before
main thread wakes up to check if it started correctly, we will not hang.
Patch suggested by Monty as follow-up to
7f498fbab8
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
matched_rec::rec_buf[], matched_rec::bufp: Remove.
matched_rec::block: Make this a pointer to something that
is allocated by buf_block_alloc(). In this way, the only
case where buf_block_t is constructed outside buf_pool
is ALTER TABLE...IMPORT TABLESPACE.
rtr_info::heap: Remove. This was only used for allocating matched_rec,
which now is smaller.
mtr_t::memmove(): Simplify some code to avoid GCC 9.4.0 -Wconversion
in the 10.6 branch as a result of these changes.
Reviewed by: Debarun Banerjee
It was updated for 10.6+ in MDEV-7317. Because a lower version spider
node may connect to a higher version data node, we need to change this
for 10.4 and 10.5 as well.
network timeouts might be rather large and feedback plugin
waits forever for the sender thread to exit.
an alternative could've been to use GNU-specific pthread_timedjoin_np(),
where _np mean "not portable".
followup for 81f75ca83a
improve over take 2. It's technically possible, though unlikely,
to see THD after it already reset the info to NULL, but has not
changed the command to COM_SLEEP yet (see THD::mark_connection_idle()).
Let's wait for "Sleep", not for NULL.
- Add additional MTRs for more coverage on invalid options
- Updating a few error messages to be more informative
- Use the exit code from handle_options() when there is an error processing
user options
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
When passing in an invalid value (e.g. incorrect data type) for a variable, the
server startup will fail with misleading error messages.
The behavior **before** this change:
For server options:
- The error message will indicate that the argument is being adjusted to a valid value
- Server startup still fails
For plugin options:
- The error message will indicate that the argument is being adjusted to a valid value
- The plugin is still disabled
- Server startup fails with a message that it does not recognize the plugin option
The behavior **after** this change:
For server options:
- Output that an invalid argument was provided
- Exit server startup
For plugin options:
- Output that an invalid argument was provided
- Disable the plugin
- Attempt to continue server startup
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
In 69a4d6ae, an MTR test was added to verify that we handled multiple invalid
options. However, the logic to perform this test relied on a non-trivial regex
to filter out the noise in the logs.
Instead, we now just simply search for what we expect to be in the logs.
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
In the case if some unique key fields are nullable, there can be
several records with the same key fields in unique index with at least
one key field equal to NULL, as NULL != NULL.
When transaction is resumed after waiting on the record with at least one
key field equal to NULL, and stored in persistent cursor record is
deleted, persistent cursor can be restored to the record with all key
fields equal to the stored ones, but with at least one field equal to
NULL. And such record is wrongly treated as a record with the same unique
key as stored in persistent cursor record one, what is wrong as
NULL != NULL.
The fix is to check if at least one unique field is NULL in restored
persistent cursor position, and, if so, then don't treat the record as
one with the same unique key as in the stored record key.
dict_index_t::nulls_equal was removed, as it was initially developed for
never existed in MariaDB "intrinsic tables", and there is no code, which
would set it to "true".
Reviewed by Marko Mäkelä.
The test failure in rpl.rpl_domain_id_filter_restart is caused by
MDEV-33887. That is, the test uses master_pos_wait() (called
indirectly by sync_slave_with_master) to try and wait for the
replica to catch up to the master. However, the waited on
transaction is ignored by the configured
CHANGE MASTER TO IGNORE_DOMAIN_IDS=()
As MDEV-33887 reports, due to the IO thread updating the binlog
coordinates and the SQL thread updating the GTID state, if the
replica is stopped in-between these updates, the replica state will
be inconsistent. That is, the test expects that the GTID state will
be updated, so upon restart, the replica will be up-to-date.
However, if the replica is stopped before the SQL thread updates its
GTID state, then upon restart, the replica will fetch the previously
ignored event, which is no longer ignored upon restart, and execute
it. This leads to the sporadic extra row in t2.
This patch changes master_pos_wait() to use master_gtid_wait() to
ensure the replica state is consistent with the master state.
When navigating through the existing links in the README, it is not
immediately obvious where to go to find instructions in building
and testing the source code. Since the README is often the first
thing people see when looking at a repository, this information
should be front and centre so that newcomers to the project can
get setup as quickly as possible.
The condition checked the value of the leftmost byte before checking if
at least one byte is still available in the buffer.
Changing the order in the condition: check for a byte availability before
checking the byte value.
increase the MASTER_CONNECT_RETRY time under valgrind,
otherwise the slave gives up retrying before the master is ready
also, cosmetic cleanup of rpl_semi_sync_master_shutdown.test
The crash at running mysqlbinlog on a SEQUENCE containing binlog file
was caused MDEV-29621 fixes that did not check which of the slave
or binlog applier executes a block introduced there.
The block is meaningful only for the parallel slave applier, so
it's safe to fix this bug with identified the actual applier and
skipping the block when it's the mysqlbinlog one.
TIME-alike string and numeric arguments to TIMEDIFF()
can get additional fractional seconds during the supported
TIME range adjustment in get_time().
For example, during TIMEDIFF('839:00:00','00:00:00') evaluation
in Item_func_timediff::get_date(), the call for args[0]->get_time()
returns MYSQL_TIME '838:59:59.999999'.
Item_func_timediff::get_date() did not handle these extra digits
and returned a MYSQL_TIME result with fractional digits outside
of Item_func_timediff::decimals. This mismatch could further be
caught by a DBUG_ASSERT() in various other pieces of the code,
leading to a crash.
Fix:
In case if get_time() returned MYSQL_TIMESTAMP_TIME,
let's truncate all extra digits using my_time_trunc(&l_time,decimals).
This guarantees that the rest of the code returns a MYSQL_TIME
with second_part not conflicting with Item_func_timediff::decimals.
In commit d74d95961a (MDEV-18543)
there was an error that would cause the hidden metadata record
to be deleted, and therefore cause the table to appear corrupted
when it is reloaded into the data dictionary cache.
PageConverter::update_records(): Do not delete the metadata record,
but do validate it.
RecIterator::open(): Make the API more similar to 10.6, to simplify
merges.
When the system variables @@debug_dbug was assigned to
some expression, Sys_debug_dbug::do_check() did not properly
convert the value from the expression character set to utf8.
So the value was erroneously re-interpretted as utf8 without
conversion. In case of a tricky expression character set
(e.g. utf16le), this led to unexpected results.
Fix:
Re-using Sys_var_charptr::do_string_check() in Sys_debug_dbug::do_check().
Same as MDEV-29579. For some reason, libodbc does not clean up
properly if unloaded too early with the dlclose() of spider. So we add
UNIQUE symbols to spider so the spider does not reload in dlclose().
This change, however, uncovers some hidden problems in the spider
codebase, for which we move the initialisation of some spider global
variables into the initialisation of spider itself.
Spider has some global variables. Their initialisation should be done
in the initialisation of spider itself, otherwise, if spider were
re-initialised without these symbol being unloaded, the values could
be inconsistent and causing issues.
One such issue is caused by the variables
spider_mon_table_cache_version and spider_mon_table_cache_version_req.
They are used for resetting the spider monitoring table cache and have
initial values of 0 and 1 respectively. We have that always
spider_mon_table_cache_version_req >= spider_mon_table_cache_version,
and when the relation is strict, the cache is reset,
spider_mon_table_cache_version is brought to be equal to
spider_mon_table_cache_version_req, and the cache is searched for
matching table_name, db_name and link_idx. If the relation is equal,
no reset would happen and the cache would be searched directly.
When spider is re-inited without resetting the values of
spider_mon_table_cache_version and spider_mon_table_cache_version_req
that were set to be equal in the previous cache reset action, the
cache was emptied in the previous spider deinit, which would result in
HA_ERR_KEY_NOT_FOUND unexpectedly.
An alternative way to fix this issue would be to call the spider udf
spider_flush_mon_cache_table(), which increments
spider_mon_table_cache_version_req thus making sure the inequality is
strict. However, there's no reason for spider to initialise these
global variables on dlopen(), rather than on spider init, which is
cleaner and "purer".
To reproduce this issue, simply revert the changes involving the two
variables and then run:
mtr --no-reorder spider.ha{,_part}
The signal handler thread can use various different runtime
resources when processing a SIGHUP (e.g. master-info information)
due to calling into reload_acl_and_cache(). Currently, the shutdown
process waits for the termination of the signal thread after
performing cleanup. However, this could cause resources actively
used by the signal handler to be freed while reload_acl_and_cache()
is processing.
The specific resource that caused MDEV-30260 is a race condition for
the hostname_cache, such that mysqld would delete it in
clean_up()::hostname_cache_free(), before the signal handler would
use it in reload_acl_and_cache()::hostname_cache_refresh().
Another similar resource is the active_mi/master_info_index. There
was a race between its deletion by the main thread in end_slave(),
and their usage by the Signal Handler as a part of
Master_info_index::flush_all_relay_logs.read(active_mi) in
reload_acl_and_cache().
This patch fixes these race conditions by relocating where server
shutdown waits for the signal handler to die until after
server-level threads have been killed (i.e., as a last step of
close_connections()). With respect to the hostname_cache, active_mi
and master_info_cache, this ensures that they cannot be destroyed
while the signal handler is still active, and potentially using
them.
Additionally:
1) This requires that Events memory is still in place for SIGHUP
handling's mysql_print_status(). So event deinitialization is moved
into clean_up(), but the event scheduler still needs to be stopped
in close_connections() at the same spot.
2) The function kill_server_thread is no longer used, so it is
deleted
3) The timeout to wait for the death of the signal thread was not
consistent with the comment. The comment mentioned up to 10 seconds,
whereas it was actually 0.01s. The code has been fixed to wait up to
10 seconds.
4) A warning has been added if the signal handler thread fails to
exit in time.
5) Added pthread_join() to end of wait_for_signal_thread_to_end()
if it hadn't ended in 10s with a warning. Note this also removes
the pthread_detached attribute from the signal_thread to allow
for the pthread_join().
Reviewed By:
===========
Vladislav Vaintroub <wlad@mariadb.com>
Andrei Elkin <andrei.elkin@mariadb.com>
What's happening:
1. Query_cache::insert() locks the QC and verifies that it's enabled
2. parallel thread tries to disable it. trylock fails (QC is locked)
so the status becomes DISABLE_REQUEST
3. Query_cache::insert() calls Query_cache::write_result_data()
which allocates a new block and unlocks the QC.
4. Query_cache::unlock() notices there are no more QC users and a
pending DISABLE_REQUEST so it disables the QC and frees all the
memory, including the new block that was just allocated
5. Query_cache::write_result_data() proceeds to write into the freed block
Fix: change m_cache_status under a mutex.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
These patches:
# commit 74891ed257
#
# MDEV-11514, MDEV-11497, MDEV-11554, MDEV-11555 - IN and CASE type aggregation problems
# commit 53499cd1ea
#
# MDEV-31303 Key not used when IN clause has both signed and usigned values
earlier fixed MDEV-18898.
Adding only an MTR case.
modified: mysql-test/main/func_in.result
modified: mysql-test/main/func_in.test
Problem was too tight condition on ha_commit_trans to not
allow non transactional storage engines participate 2pc
in Galera case. This is required because transaction
using e.g. procedures might read mysql.proc table inside
a trasaction and these tables use at the moment Aria
storage engine that does not support 2pc.
Fixed by allowing read only transactions to storage
engines that do not support two phase commit to participate
2pc transaction. These will be committed later separately.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Before patch, maintenance timer will tick every 0.4 seconds.
After this patch, timer will tick every 0.4 seconds when necessary(
there are delayed thread creation), switching off completely after 20
seconds of being idle.
Only locked will participate in the query in this case. Chances are
that not-locked partitions were not opened, which is the cause of the
crash in the added test case spider/bugfix.mdev_33731 without this
patch.