MySQL 5.6 implemented WL#344, which is about a MASTER_DELAY option to CHANGE
MASTER. But as part of this worklog, the format of the realy-log.info file was
changed. The new format is not understood by earlier versions, and nor by
MariaDB 10.0, so changing server to those versions would cause the slave to
abort with an error due to reading incorrect data out of relay-log.info.
Fix this by backporting from the WL#344 patch just the code that understands
the new relay-log.info format. We still write out the old format, and none of
the MASTER_DELAY feature is backported with this commit.
replication causing replication to fail.
Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.
The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.
With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.
Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.
Also some misc. fixes found during development and testing:
- Remove thd_rpl_is_parallel(), it is not used or needed.
- Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
worker thread is killed to resolve a deadlock with fixed commit
ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
can be ignored if the query otherwise completed successfully, and this
could cause the deadlock kill to be lost, so that the deadlock was not
correctly resolved.
- Fix random test failure due to missing wait_for_binlog_checkpoint.inc.
- Make sure that deadlock or other temporary errors during parallel
replication are not printed to the the error log; there were some places
around the replication code with extra error logging. These conditions can
occur occasionally and are handled automatically without breaking
replication, so they should not pollute the error log.
- Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
the end of a transaction, to be able to detect and resolve deadlocks due to
commit ordering. But this value was also used as a flag to mark whether
record_gtid() had been called, by being set to zero, losing the value. Now,
introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
valid for the entire duration of the transaction.
- Fix one place where the code to handle ignored errors called reset_killed()
unconditionally, even if no error was caught that should be ignored. This
could cause loss of a deadlock kill signal, breaking deadlock detection and
resolution.
- Fix a couple of missing mysql_reset_thd_for_next_command(). This could
cause a prior error condition to remain for the next event executed,
causing assertions about errors already being set and possibly giving
incorrect error handling for following event executions.
- Fix code that cleared thd->rgi_slave in the parallel replication worker
threads after each event execution; this caused the deadlock detection and
handling code to not be able to correctly process the associated
transactions as belonging to replication worker threads.
- Remove useless error code in slave_background_kill_request().
- Fix bug where wfc->wakeup_error was not cleared at
wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
error condition to wrongly propagate to a later wait_for_prior_commit(),
causing spurious ER_PRIOR_COMMIT_FAILED errors.
- Do not put the binlog background thread into the processlist. It causes
too many result differences in mtr, but also it probably is not useful
for users to pollute the process list with a system thread that does not
really perform any user-visible tasks...
replication causing replication to fail.
In parallel replication, we run transactions from the master in parallel, but
force them to commit in the same order they did on the master. If we force T1
to commit before T2, but T2 holds eg. a row lock that is needed by T1, we get
a deadlock when T2 waits until T1 has committed.
Usually, we do not run T1 and T2 in parallel if there is a chance that they
can have conflicting locks like this, but there are certain edge cases where
it can occasionally happen (eg. MDEV-5914, MDEV-5941, MDEV-6020). The bug was
that this would cause replication to hang, eventually getting a lock timeout
and causing the slave to stop with error.
With this patch, InnoDB will report back to the upper layer whenever a
transactions T1 is about to do a lock wait on T2. If T1 and T2 are parallel
replication transactions, and T2 needs to commit later than T1, we can thus
detect the deadlock; we then kill T2, setting a flag that causes it to catch
the kill and convert it to a deadlock error; this error will then cause T2 to
roll back and release its locks (so that T1 can commit), and later T2 will be
re-tried and eventually also committed.
The kill happens asynchroneously in a slave background thread; this is
necessary, as the reporting from InnoDB about lock waits happen deep inside
the locking code, at a point where it is not possible to directly call
THD::awake() due to mutexes held.
Deadlock is assumed to be (very) rarely occuring, so this patch tries to
minimise the performance impact on the normal case where no deadlocks occur,
rather than optimise the handling of the occasional deadlock.
Also fix transaction retry due to deadlock when it happens after a transaction
already signalled to later transactions that it started to commit. In this
case we need to undo this signalling (and later redo it when we commit again
during retry), so following transactions will not start too early.
Also add a missing thd->send_kill_message() that got triggered during testing
(this corrects an incorrect fix for MySQL Bug#58933).
Handle retry of event groups that span multiple relay log files.
- If retry reaches the end of one relay log file, move on to the next.
- Handle refcounting of relay log files, and avoid purging relay log
files until all event groups have completed that might have needed
them for transaction retry.
Start implementing that an event group can be re-tried in parallel replication
if it fails with a temporary error (like deadlock).
Patch is very incomplete, just some very basic retry works.
Stuff still missing (not complete list):
- Handle moving to the next relay log file, if event group to be retried
spans multiple relay log files.
- Handle refcounting of relay log files, to ensure that we do not purge a
relay log file and then later attempt to re-execute events out of it.
- Handle description_event_for_exec - we need to save this somehow for the
possible retry - and use the correct one in case it differs between relay
logs.
- Do another retry attempt in case the first retry also fails.
- Limit the max number of retries.
- Lots of testing will be needed for the various edge cases.
The previous patch for this bug was unfortunately completely wrong.
The purpose of cached_charset is to remember which character set we
have installed currently in the THD, so that in the common case where
charset does not change between queries, we do not need to update it
in the THD. Thus, it is important that the cached_charset field is
tightly coupled to the THD for which it handles caching.
Thus the right place to put cached_charset seems to be in the THD.
This patch introduces a field THD:system_thread_info where such info
in general can be placed without further inflating the THD with unused
data for other threads (THD is already far too big as it is). It then
moves the cached_charset into this slot for the SQL driver thread and
for the parallel replication worker threads.
The THD::rpl_filter field is also moved inside system_thread_info, to
keep the size of THD unchanged. Moving further fields in to reduce the
size of THD is a separate task, filed as MDEV-6164.
Replication caches the character sets used in a query, to be able to quickly
reuse them for the next query in the common case of them not having changed.
In parallel replication, this caching needs to be per-worker-thread. The
code was not modified to handle this correctly, so the caching in one worker
could cause another worker to run a query using the wrong character set,
causing replication corruption.
When a transaction fails in parallel replication, it should signal the error
to any following transactions doing wait_for_prior_commit() on it. But the
code for this was incorrect, and would not correctly remember a prior error
when sending the signal. This caused corruption when slave stopped due to an
error.
Fix by remembering the error code when we first get an error, and passing the
saved error code to wakeup_subsequent_commits().
Thanks to nanyi607rao who reported this bug on
maria-developers@lists.launchpad.net and analysed the root cause.
Before, the arrival of same GTID twice in multi-source replication
would cause double-apply or in gtid strict mode an error.
Keep the behaviour, but add an option --gtid-ignore-duplicates which
allows to correctly handle duplicates, ignoring all but the first.
This relies on the user ensuring correct configuration so that
sequence numbers are strictly increasing within each replication
domain; then duplicates can be detected simply by comparing the
sequence numbers against what is already applied.
Only one master connection (but possibly multiple parallel worker
threads within that connection) is allowed to apply events within
one replication domain at a time; any other connection that
receives a GTID in the same domain either discards it (if it is
already applied) or waits for the other connection to not have
any events to apply.
Intermediate patch, as proof-of-concept for testing. The main limitation
is that currently it is only implemented for parallel replication,
@@slave_parallel_threads > 0.
With parallel replication, there can be any number of events queued on
in-memory lists in the worker threads.
For normal STOP SLAVE, we want to skip executing any remaining events on those
lists and stop as quickly as possible.
However, for START SLAVE UNTIL, when the UNTIL position is reached in the SQL
driver thread, we must _not_ stop until all already queued events for the
workers have been executed - otherwise we would stop too early, before the
actual UNTIL position had been completely reached.
The code did not handle UNTIL correctly, stopping too early due to not
executing the queued events to completion. Fix this, and also implement that
an explicit STOP SLAVE in the middle (when the SQL driver thread has reached
the UNTIL position but the workers have not) _will_ cause an immediate stop.
Clean up and improve the parallel implementation code, mainly related to
scheduling of work to threads and handling of stop and errors.
Fix a lot of bugs in various corner cases that could lead to crashes or
corruption.
Fix that a single replication domain could easily grab all worker threads and
stall all other domains; now a configuration variable
--slave-domain-parallel-threads allows to limit the number of
workers.
Allow next event group to start as soon as previous group begins the commit
phase (as opposed to when it ends it); this allows multiple event groups on
the slave to participate in group commit, even when no other opportunities for
parallelism are available.
Various fixes:
- Fix some races in the rpl.rpl_parallel test case.
- Fix an old incorrect assertion in Log_event iocache read.
- Fix repeated malloc/free of wait_for_commit and rpl_group_info objects.
- Simplify wait_for_commit wakeup logic.
- Fix one case in queue_for_group_commit() where killing one thread would
fail to correctly signal the error to the next, causing loss of the
transaction after slave restart.
- Fix leaking of pthreads (and their allocated stack) due to missing
PTHREAD_CREATE_DETACHED attribute.
- Fix how one batch of group-committed transactions wait for the previous
batch before starting to execute themselves. The old code had a very
complex scheduling where the first transaction was handled differently,
with subtle bugs in corner cases. Now each event group is always scheduled
for a new worker (in a round-robin fashion amongst available workers).
Keep a count of how many transactions have started to commit, and wait for
that counter to reach the appropriate value.
- Fix slave stop to wait for all workers to actually complete processing;
before, the wait was for update of last_committed_sub_id, which happens a
bit earlier, and could leave worker threads potentially accessing bits of
the replication state that is no longer valid after slave stop.
- Fix a couple of places where the test suite would kill a thread waiting
inside enter_cond() in connection with debug_sync; debug_sync + kill can
crash in rare cases due to a race with mysys_var_current_mutex in this
case.
- Fix some corner cases where we had enter_cond() but no exit_cond().
- Fix that we could get failure in wait_for_prior_commit() but forget to flag
the error with my_error().
- Fix slave stop (both for normal stop and stop due to error). Now, at stop
we pick a specific safe point (in terms of event groups executed) and make
sure that all event groups before that point are executed to completion,
and that no event group after start executing; this ensures a safe place to
restart replication, even for non-transactional stuff/DDL. In error stop,
make sure that all prior event groups are allowed to execute to completion,
and that any later event groups that have started are rolled back, if
possible. The old code could leave eg. T1 and T3 committed but T2 not, or
it could even leave half a transaction not rolled back in some random
worker, which would cause big problems when that worker was later reused
after slave restart.
- Fix the accounting of amount of events queued for one worker. Before, the
amount was reduced immediately as soon as the events were dequeued (which
happens all at once); this allowed twice the amount of events to be queued
in memory for each single worker, which is not what users would expect.
- Fix that an error set during execution of one event was sometimes not
cleared before executing the next, causing problems with the error
reporting.
- Fix incorrect handling of thd->killed in worker threads.
MDEV-4984: Implement MASTER_GTID_WAIT() and @@LAST_GTID.
MDEV-4726: Race in mysql-test/suite/rpl/t/rpl_gtid_stop_start.test
MDEV-5636: Deadlock in RESET MASTER
MASTER_GTID_WAIT() is similar to MASTER_POS_WAIT(), but works with a
GTID position rather than an old-style filename/offset.
@@LAST_GTID gives the GTID assigned to the last transaction written
into the binlog.
Together, the two can be used by applications to obtain the GTID of
an update on the master, and then do a MASTER_GTID_WAIT() for that
position on any read slave where it is important to get results that
are caught up with the master at least to the point of the update.
The implementation of MASTER_GTID_WAIT() is implemented in a way
that tries to minimise the performance impact on the SQL threads,
even in the presense of many waiters on single GTID positions (as
from @@LAST_GTID).
- CREATE TABLE is by default executed on the slave as CREATE OR REPLACE
- DROP TABLE is by default executed on the slave as DROP TABLE IF NOT EXISTS
This means that a slave will by default continue even if we try to create
a table that existed on the slave (the table will be deleted and re-created) or
if we try to drop a table that didn't exist on the slave.
This should be safe as instead of having the slave stop because of an inconsistency between
master and slave, it will fix the inconsistency.
Those that would prefer to get a stopped slave instead for the above cases can set slave_ddl_exec_mode to STRICT.
- Ensure that a CREATE OR REPLACE TABLE which dropped a table is replicated
- DROP TABLE that generated an error on master is handled as an identical DROP TABLE on the slave (IF NOT EXISTS is not added in this case)
- Added slave_ddl_exec_mode variable to decide how DDL's are replicated
New logic for handling BEGIN GTID ... COMMIT from the binary log:
- When we find a BEGIN GTID, we start a transaction and set OPTION_GTID_BEGIN
- When we find COMMIT, we reset OPTION_GTID_BEGIN and execute the normal COMMIT code.
- While OPTION_GTID_BEGIN is set:
- We don't generate implict commits before or after statements
- All tables are regarded as transactional tables in the binary log (to ensure things are executed exactly as on the master)
- We reset OPTION_GTID_BEGIN also on rollback
This will help ensuring that we don't get any sporadic commits (and thus new GTID's) on the slave and will help keep the GTID's between master and slave in sync.
mysql-test/extra/rpl_tests/rpl_log.test:
Added testing of mode slave_ddl_exec_mode=STRICT
mysql-test/r/mysqld--help.result:
New help messages
mysql-test/suite/rpl/r/create_or_replace_mix.result:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_row.result:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_statement.result:
Testing replication of create or replace
mysql-test/suite/rpl/r/rpl_gtid_startpos.result:
Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/r/rpl_row_log.result:
Updated result
mysql-test/suite/rpl/r/rpl_row_log_innodb.result:
Updated result
mysql-test/suite/rpl/r/rpl_row_show_relaylog_events.result:
Updated result
mysql-test/suite/rpl/r/rpl_stm_log.result:
Updated result
mysql-test/suite/rpl/r/rpl_stm_mix_show_relaylog_events.result:
Updated result
mysql-test/suite/rpl/r/rpl_temp_table_mix_row.result:
Updated result
mysql-test/suite/rpl/t/create_or_replace.inc:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.cnf:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.test:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.cnf:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.test:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.cnf:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.test:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/rpl_gtid_startpos.test:
Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/t/rpl_stm_log.test:
Removed some lines
mysql-test/suite/sys_vars/r/slave_ddl_exec_mode_basic.result:
Testing of slave_ddl_exec_mode
mysql-test/suite/sys_vars/t/slave_ddl_exec_mode_basic.test:
Testing of slave_ddl_exec_mode
sql/handler.cc:
Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
This is to ensure that statments are not commited too early if non transactional tables are used.
sql/log.cc:
Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
Also treat 'direct' log events as transactional (to get them logged as they where on the master)
sql/log_event.cc:
Ensure that the new error from DROP TABLE when trying to drop a view is treated same as the old one.
Store error code that slave expects in THD.
Set OPTION_GTID_BEGIN if we find a BEGIN.
Reset OPTION_GTID_BEGIN if we find a COMMIT.
sql/mysqld.cc:
Added slave_ddl_exec_mode_options
sql/mysqld.h:
Added slave_ddl_exec_mode_options
sql/rpl_gtid.cc:
Reset OPTION_GTID_BEGIN if we record a gtid (safety)
sql/sql_class.cc:
Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
sql/sql_class.h:
Added to THD: log_current_statement and slave_expected_error
sql/sql_insert.cc:
Ensure that CREATE OR REPLACE is logged if table was deleted.
Don't do implicit commit for CREATE if we are under OPTION_GTID_BEGIN
sql/sql_parse.cc:
Change CREATE TABLE -> CREATE OR REPLACE TABLE for slaves
Change DROP TABLE -> DROP TABLE IF EXISTS for slaves
CREATE TABLE doesn't force implicit commit in case of OPTION_GTID_BEGIN
Don't do commits before or after any statement if OPTION_GTID_BEGIN was set.
sql/sql_priv.h:
Added OPTION_GTID_BEGIN
sql/sql_show.cc:
Enhanced store_create_info() to also be able to handle CREATE OR REPLACE
sql/sql_show.h:
Updated prototype
sql/sql_table.cc:
Ensure that CREATE OR REPLACE is logged if table was deleted.
sql/sys_vars.cc:
Added slave_ddl_exec_mode
sql/transaction.cc:
Added warning if we got a GTID under OPTION_GTID_BEGIN
The problem was a race between the SQL driver thread and the worker threads.
The SQL driver thread would set rli->last_master_timestamp to zero to
mark that it has caught up with the master, while the worker threads would
set it to the timestamp of the executed event. This can happen out-of-order
in parallel replication, causing the "caught up" status to be overwritten
and Seconds_Behind_Master to wrongly grow when the slave is idle.
To fix, introduce a separate flag rli->sql_thread_caught_up to mark that the
SQL driver thread is caught up. This avoids issues with worker threads
overwriting the SQL driver thread status. In parallel replication, we then
make SHOW SLAVE STATUS check in addition that all worker threads are idle
before showing Seconds_Behind_Master as 0 due to slave idle.
Add another test case.
Fix bug found: When one thread aborts, another thread might
wrongly advance the relay log position too far, causing the
fauling transaction to be skipped and thus lost upon
restarting the SQL thread after the failure.
The merge is still missing a few hunks related to temporary tables and
InnoDB log file size. The associated code did not seem to exist in
10.0, so the merge of that needs more work. Until this is fixed, there
are a number of test failures as a result.
Fix a couple of issues in MDEV-4506, Parallel replication:
- Missing mysql_cond_signal(), which could cause hangs.
- Fix incorrect update of old-style replication position.
- Change assertion to error handling (can trigger on manipulated/
corrupt binlog).
MDEV-5189: Error handling in parallel replication.
Fix error handling in parallel worker threads when a query fails:
- Report the error to the error log.
- Return the error back, and set rli->abort_slave.
- Stop executing more events after the error.
Do not update relay-log.info and master.info on disk after every event
when using GTID mode:
- relay-log.info and master.info are not crash-safe, and are not used
when slave restarts in GTID mode (slave connects with GTID position
instead and immediately rewrites the file with the new, correct
information found).
- When using GTID and parallel replication, the position in
relay-log.info is misleading at best and simply wrong at worst.
- When using parallel replication, the fact that every single
transaction needs to do a write() syscall to the same file is
likely to become a serious bottleneck.
The files are still written at normal slave stop.
In non-GTID mode, the files are written as normal (this is needed to
be able to restart after slave crash, even if such restart is then not
crash-safe, no change).
Fix some more parts of old-style position updates.
Now we save in rgi some coordinates for master log and relay log, so
that in do_update_pos() we can use the right set of coordinates with
the right events.
The Rotate_log_event::do_update_pos() is fixed in the parallel case
to not directly update relay-log.info (as Rotate event runs directly
in the driver SQL thread, ahead of actual event execution). Instead,
group_master_log_file is updated as part of do_update_pos() in each
event execution.
In the parallel case, position updates happen in parallel without
any ordering, but taking care that position is not updated backwards.
Since position update happens only after event execution this leads
to the right result.
Also fix an access-after-free introduced in an earlier commit.
Fix some part of update of old-style coordinates in parallel replication:
- Ignore XtraDB request for old-style coordinates, not meaningful for
parallel replication (must use GTID to get crash-safe parallel slave).
- Only update relay log coordinates forward, not backwards, to ensure
that parallel threads do not conflict with each other.
- Move future_event_relay_log_pos to rgi.
-row_stmt_start_timestamp
-last_event_start_time
-long_find_row_note
-trans_retries
Added slave_executed_entries_lock to protect rli->executed_entries
Added primitives for thread safe 64 bit increment
Update rli->executed_entries when event has executed, not when event has been sent to sql execution thread
sql/log_event.cc:
row_stmt_start and long_find_row_note is now in rpl_group_info
sql/mysqld.cc:
Added slave_executed_entries_lock to protect rli->executed_entries
sql/mysqld.h:
Added slave_executed_entries_lock to protect rli->executed_entries
Added primitives for thread safe 64 bit increment
sql/rpl_parallel.cc:
Update rli->executed_entries when event has executed, not when event has been sent to sql execution thread
sql/rpl_rli.cc:
Moved row_stmt_start_timestamp, last_event_start_time and long_find_row_note from Relay_log_info to rpl_group_info
sql/rpl_rli.h:
Moved trans_retries, row_stmt_start_timestamp, last_event_start_time and long_find_row_note from Relay_log_info to rpl_group_info
sql/slave.cc:
Use rgi for trans_retries and last_event_start_time
Update rli->executed_entries when event has executed, not when event has been sent to sql execution thread
Reset trans_retries when object is created
- Made slaves temporary table multi-thread slave safe by adding mutex around save_temporary_table usage.
- rli->save_temporary_tables is the active list of all used temporary tables
- This is copied to THD->temporary_tables when temporary tables are opened and updated when temporary tables are closed
- Added THD->lock_temporary_tables() and THD->unlock_temporary_tables() to simplify this.
- Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code.
- Added is_part_of_group() to mark functions that are part of the next function. This replaces setting IN_STMT when events are executed.
- Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code.
- If slave_skip_counter is set run things in single threaded mode. This simplifies code for skipping events.
- Updating state of relay log (IN_STMT and IN_TRANSACTION) is moved to one single function: update_state_of_relay_log()
We can't use OPTION_BEGIN to check for the state anymore as the sql_driver and sql execution threads may be different.
Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts
is_in_group() is now independent of state of executed transaction.
- Reset thd->transaction.all.modified_non_trans_table() if we did set it for single table row events.
This was mainly for keeping the flag as documented.
- Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it.
- Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
- Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
- Changed some functions to take rpl_group_info instead of Relay_log_info to make them multi-slave safe and to simplify usage
- do_shall_skip()
- continue_group()
- sql_slave_killed()
- next_event()
- Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure.
- set_thd_in_use_temporary_tables() removed as in_use is set on usage
- Added information to thd_proc_info() which thread is waiting for slave mutex to exit.
- In open_table() reuse code from find_temporary_table()
Other things:
- More DBUG statements
- Fixed the rpl_incident.test can be run with --debug
- More comments
- Disabled not used function rpl_connect_master()
mysql-test/suite/perfschema/r/all_instances.result:
Moved sleep_lock and sleep_cond to rpl_group_info
mysql-test/suite/rpl/r/rpl_incident.result:
Updated result
mysql-test/suite/rpl/t/rpl_incident-master.opt:
Not needed anymore
mysql-test/suite/rpl/t/rpl_incident.test:
Fixed that test can be run with --debug
sql/handler.cc:
More DBUG_PRINT
sql/log.cc:
More comments
sql/log_event.cc:
Added DBUG statements
do_shall_skip(), continue_group() now takes rpl_group_info param
Use is_begin(), is_commit() and is_rollback() functions instead of inspecting query string
We don't have set slaves temporary tables 'in_use' as this is now done when tables are opened.
Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
Use IN_TRANSACTION flag to test state of relay log.
In rows_event_stmt_cleanup() reset thd->transaction.all.modified_non_trans_table if we had set this before.
sql/log_event.h:
do_shall_skip(), continue_group() now takes rpl_group_info param
Added is_part_of_group() to mark events that are part of the next event. This replaces setting IN_STMT when events are executed.
Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code.
sql/log_event_old.cc:
Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
do_shall_skip(), continue_group() now takes rpl_group_info param
sql/log_event_old.h:
Added is_part_of_group() to mark events that are part of the next event.
do_shall_skip(), continue_group() now takes rpl_group_info param
sql/mysqld.cc:
Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it.
Relay_log_info::sleep_lock -> Rpl_group_info::sleep_lock
Relay_log_info::sleep_cond -> Rpl_group_info::sleep_cond
sql/mysqld.h:
Updated types and names
sql/rpl_gtid.cc:
More DBUG
sql/rpl_parallel.cc:
Updated TODO section
Set thd for event that is execution
Use new is_begin(), is_commit() and is_rollback() functions.
More comments
sql/rpl_rli.cc:
sql_thd -> sql_driver_thd
Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts.
Reset table->in_use for temporary tables as the table may have been used by another THD.
Use IN_TRANSACTION instead of OPTION_BEGIN to check state of relay log.
Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
sql/rpl_rli.h:
Changed relay log state flags to bit masks instead of bit positions (most other code we have uses bit masks)
Added IN_TRANSACTION to mark if we are in a BEGIN ... COMMIT section.
save_temporary_tables is now thread safe
Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code
is_in_group() is now independent of state of executed transaction.
sql/slave.cc:
Simplifed arguments to io_salve_killed(), sql_slave_killed() and check_io_slave_killed(); No reason to supply THD as this is part of the given structure.
set_thd_in_use_temporary_tables() removed as in_use is set on usage in sql_base.cc
sql_thd -> sql_driver_thd
More DBUG
Added update_state_of_relay_log() which will calculate the IN_STMT and IN_TRANSACTION state of the relay log after the current element is executed.
If slave_skip_counter is set run things in single threaded mode.
Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure.
Added information to thd_proc_info() which thread is waiting for slave mutex to exit.
Disabled not used function rpl_connect_master()
Updated argument to next_event()
sql/sql_base.cc:
Added mutex around usage of slave's temporary tables. The active list is always kept up to date in sql->rgi_slave->save_temporary_tables.
Clear thd->temporary_tables after query (safety)
More DBUG
When using temporary table, set table->in_use to current thd as the THD may be different for slave threads.
Some code is ifdef:ed with REMOVE_AFTER_MERGE_WITH_10 as the given code in 10.0 is not yet in this tree.
In open_table() reuse code from find_temporary_table()
sql/sql_binlog.cc:
rli->sql_thd -> rli->sql_driver_thd
Remove duplicate setting of rgi->rli
sql/sql_class.cc:
Added helper functions rgi_lock_temporary_tables() and rgi_unlock_temporary_tables()
Would have been nicer to have these inline, but there was no easy way to do that
sql/sql_class.h:
Added functions to protect slaves temporary tables
sql/sql_parse.cc:
Added DBUG_PRINT
sql/transaction.cc:
Added comment
Implement @@gtid_binlog_state. This is the internal state of the binlog
(most recent GTID logged for every domain_id and server_id). This allows
to save the state before RESET MASTER and restore it afterwards.
STATUS OF ROLLBACKED TRANSACTION" and bug #17054007 - "TRANSACTION
IS NOT FULLY ROLLED BACK IN CASE OF INNODB DEADLOCK".
The problem in the first bug report was that although deadlock involving
metadata locks was reported using the same error code and message as InnoDB
deadlock it didn't rollback transaction like the latter. This caused
confusion to users as in some cases after ER_LOCK_DEADLOCK transaction
could have been restarted immediately and in some cases rollback was
required.
The problem in the second bug report was that although InnoDB deadlock
caused transaction rollback in all storage engines it didn't cause release
of metadata locks. So concurrent DDL on the tables used in transaction was
blocked until implicit or explicit COMMIT or ROLLBACK was issued in the
connection which got InnoDB deadlock.
The former issue has stemmed from the fact that when support for detection
and reporting metadata locks deadlocks was added we erroneously assumed
that InnoDB doesn't rollback transaction on deadlock but only last statement
(while this is what happens on InnoDB lock timeout actually) and so didn't
implement rollback of transactions on MDL deadlocks.
The latter issue was caused by the fact that rollback of transaction due
to deadlock is carried out by setting THD::transaction_rollback_request
flag at the point where deadlock is detected and performing rollback
inside of trans_rollback_stmt() call when this flag is set. And
trans_rollback_stmt() is not aware of MDL locks, so no MDL locks are
released.
This patch solves these two problems in the following way:
- In case when MDL deadlock is detect transaction rollback is requested
by setting THD::transaction_rollback_request flag.
- Code performing rollback of transaction if THD::transaction_rollback_request
is moved out from trans_rollback_stmt(). Now we handle rollback request
on the same level as we call trans_rollback_stmt() and release statement/
transaction MDL locks.
STATUS OF ROLLBACKED TRANSACTION" and bug #17054007 - "TRANSACTION
IS NOT FULLY ROLLED BACK IN CASE OF INNODB DEADLOCK".
The problem in the first bug report was that although deadlock involving
metadata locks was reported using the same error code and message as InnoDB
deadlock it didn't rollback transaction like the latter. This caused
confusion to users as in some cases after ER_LOCK_DEADLOCK transaction
could have been restarted immediately and in some cases rollback was
required.
The problem in the second bug report was that although InnoDB deadlock
caused transaction rollback in all storage engines it didn't cause release
of metadata locks. So concurrent DDL on the tables used in transaction was
blocked until implicit or explicit COMMIT or ROLLBACK was issued in the
connection which got InnoDB deadlock.
The former issue has stemmed from the fact that when support for detection
and reporting metadata locks deadlocks was added we erroneously assumed
that InnoDB doesn't rollback transaction on deadlock but only last statement
(while this is what happens on InnoDB lock timeout actually) and so didn't
implement rollback of transactions on MDL deadlocks.
The latter issue was caused by the fact that rollback of transaction due
to deadlock is carried out by setting THD::transaction_rollback_request
flag at the point where deadlock is detected and performing rollback
inside of trans_rollback_stmt() call when this flag is set. And
trans_rollback_stmt() is not aware of MDL locks, so no MDL locks are
released.
This patch solves these two problems in the following way:
- In case when MDL deadlock is detect transaction rollback is requested
by setting THD::transaction_rollback_request flag.
- Code performing rollback of transaction if THD::transaction_rollback_request
is moved out from trans_rollback_stmt(). Now we handle rollback request
on the same level as we call trans_rollback_stmt() and release statement/
transaction MDL locks.
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
Fix a bunch of issues found with locking, ordering, and non-thread-safe stuff
in Relay_log_info.
Now able to do a simple benchmark, showing 4.5 times speedup for applying a
binlog with 10000 REPLACE statements.
Fix some bugs around waiting for worker threads to end during SQL slave stop.
Free Log_event after parallel execution (still needs to be made thread-safe by
using rpl_group_info rather than rli).
Hook in the wait-for-prior-commit logic (not really tested yet).
Clean up some resource maintenance around rpl_group_info (may still be some
smaller issues there though).
Add a ToDo list at the top of rpl_parallel.cc
When we load the slave state from the mysql.gtid_slave_pos at server start, we
need to load all the rows into the in-memory hash, not just the most recent
one in each replication domain. Otherwise we accumulate cruft in the form of
old rows each time the server restarts.
There were several cases where the slave GTID position was not loaded
correctly before being used. This caused various failures such as
corrupting the position at slave start and empty values of
@@gtid_slave_pos and @@gtid_current_pos.
Fixed by adding more checks for loaded position, and by always loading
the position at server startup.
When @@GLOBAL.gtid_strict_mode=1, then certain operations result
in error that would otherwise result in out-of-order binlog files
between servers.
GTID sequence numbers are now allocated independently per domain;
this results in less/no holes in GTID sequences, increasing the
likelyhood that diverging binlogs will be caught by the slave when
GTID strict mode is enabled.
Implement START SLAVE UNTIL master_gtid_pos = "<GTID position>".
Add test cases, including a test showing how to use this to promote
a new master among a set of slaves.
- Calls to cleanup_load_tmpdir() could delete temporary files for another master connection
- Concurrent LOAD DATA commands from two master connections could use the same file name
Other bug fixes:
- Enlarge buffer for connection names with 'special characters' one can't store in filenames
Optimization:
- Don't do 'lower case' of connection names. We can use cmp_connection_name, where we already have the connection name in lower case.
mysql-test/suite/multi_source/load_data.result:
Test case for MDEV-4352
mysql-test/suite/multi_source/load_data.test:
Test case for MDEV-4352
sql/log_event.cc:
Fixed: MDEV-4352
- Calls to cleanup_load_tmpdir() could delete temporary files for another master connection
- Concurrent LOAD DATA commands from two master connections could use the same file name
The fix was to add the connection name (if one exists) to all slave temporary files used by LOAD DATA
sql/rpl_mi.cc:
Enlarge buffer for connection names with 'special characters' one can't store in filenames
Use mi->cmp_connection_name for connection file names.
sql/rpl_rli.cc:
Use mi->cmp_connection_name for connection file names.
sql/slave.cc:
Removed not needed empty line
sql/sql_const.h:
Added MAX_FILENAME_MBWIDTH to be able to calculate buffer length for connection_names stored in file names
sql/sql_repl.cc:
Use mi->cmp_connection_name for connection file names.
sql/keycaches.cc:
Added free_all_rpl_filters() to be able to free all filters at cleanup
sql/keycaches.h:
Added prototype
sql/rpl_rli.cc:
Fixed compiler warning
sql/slave.cc:
Free all rpl_filters at cleanup
sql/sp.cc:
Fixed compiler warning when not all struct elements was initialized
sql/sql_acl.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_events_waits.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_events_waits_summary.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_ews_global_by_event_name.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_file_instances.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_file_summary.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_performance_timers.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_setup_consumers.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_setup_instruments.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_setup_timers.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_sync_instances.cc:
Fixed compiler warning when not all struct elements was initialized
storage/perfschema/table_threads.cc:
Fixed compiler warning when not all struct elements was initialized
storage/xtradb/os/os0file.c:
Fixed compiler warning when not all struct elements was initialized
Users can set different repplication filter rules for each replication connection, in my.cnf or command line.
But the rules set online will not record in master.info, it means if users restart MySQL, these rules will lose.
So if users wantn't their replication filter rules lose, they should write the rules in my.cnf.
Users can set rules by 2 ways:
1. Online SET command, "SET connection_name.replication_filter_settings = rules;".
2. In my.cnf, "connection_name.replication_filter_settings = rules".
If no connection_name in my.cnf, this rule will apply for ALL replication connection.
If no connetion_name in SET statement, this rull will apply for default_connection_name.
Add tests crashing the slave in the middle of replication and checking that
replication picks-up again on restart in a crash-safe way.
Fix silly code that causes crash by inserting uninitialised data into a hash.
Fix things so that a master can switch with MASTER_GTID_POS=AUTO to a slave
that was previously running with log_slave_updates=0, by looking into the
slave replication state on the master when the slave requests something not
present in the binlog.
Be a bit more strict about what position the slave can ask for, to avoid some
easy-to-hit misconfiguration errors.
Start over with seq_no counter when RESET MASTER.
- Fix that slave GTID state was updated from the wrong place in the code,
causing random crashing and other misery.
- Fix updates to mysql.rpl_slave_state to not go to binlog (this would cause
duplicate key errors on the slave and is generally the wrong thing to do).
- Added --verbose to BUILD scripts to get make to write out compile commands.
- Detect if AM_EXTRA_MAKEFLAGS=VERBOSE=1 was used with build scripts.
- Don't write warnings about replication variables when doing bootstrap.
- Fixed that mysql_cond_wait() and mysql_cond_timedwait() will report original source file in case of errors.
- Ignore some compiler warnings
BUILD/FINISH.sh:
Detect if AM_EXTRA_MAKEFLAGS=VERBOSE=1 or --verbose was used
BUILD/SETUP.sh:
Added --verbose to print out the full compile lines
Updated help message
client/mysqltest.cc:
Fixed that one can use 'replace' with cat_file
cmake/configure.pl:
If --verbose is used, get make to write out compile commands
debian/dist/Debian/rules:
Added $AM_EXTRA_MAKEFLAGS to get VERBOSE=1 on buildbot builds
debian/dist/Ubuntu/rules:
Added $AM_EXTRA_MAKEFLAGS to get VERBOSE=1 on buildbot builds
include/my_pthread.h:
Made set_timespec_time_nsec() more portable.
include/mysql/psi/mysql_thread.h:
Fixed that mysql_cond_wait() and mysql_cond_timedwait() will report original source file in case of errors.
mysql-test/suite/innodb/r/auto_increment_dup.result:
Fixed wrong DBUG_SYNC
mysql-test/suite/innodb/t/auto_increment_dup.test:
Fixed wrong DBUG_SYNC
mysql-test/suite/perfschema/include/upgrade_check.inc:
Make test more portable for changes in *.sql files
mysql-test/suite/perfschema/r/pfs_upgrade.result:
Updated test results
mysql-test/valgrind.supp:
Ignore running Aria checkpoint thread
scripts/mysqlaccess.sh:
Changed reference of bugs database
Ensure that also client-server group is read.
sql/handler.cc:
Added missing syncpoint
sql/mysqld.cc:
Don't write warnings about replication variables when doing bootstrap
sql/mysqld.h:
Don't write warnings about replication variables when doing bootstrap
sql/rpl_rli.cc:
Don't write warnings about replication variables when doing bootstrap
sql/sql_insert.cc:
Don't mask SERVER_SHUTDOWN in insert_delayed
This is done to be able to distingush between shutdown and interrupt errors
support-files/compiler_warnings.supp:
Ignore some compiler warnings in xtradb,innobase, oqgraph, yassl, string3.h
Make the commit checkpoint inside InnoDB be asynchroneous.
Implement a background thread in binlog to do the writing and flushing of
binlog checkpoint events to disk.
MDEV-26: Global transaction id, partial commit
Change server_id to be a session variable.
User with SUPER can set it to binlog with different server_id.
Implement backward-compatible ::server_id mirror for plugins.
This allows us to avoid calculating variables (including those involving mutex) that doesn't match the given
wildcard in SHOW STATUS LIKE '...'
Removed all references to active_mi that could cause problems for multi-source replication.
Added START|STOP ALL SLAVES
Added SHOW ALL SLAVES STATUS
include/mysql/plugin.h:
Added SHOW_SIMPLE_FUNC
include/mysql/plugin_audit.h.pp:
Updated .pp file
include/mysql/plugin_auth.h.pp:
Updated .pp file
include/mysql/plugin_ftparser.h.pp:
Updated .pp file
mysql-test/suite/multi_source/info_logs.result:
New columns in SHOW ALL SLAVES STATUS
mysql-test/suite/multi_source/info_logs.test:
Test new syntax
mysql-test/suite/multi_source/simple.result:
New columns in SHOW ALL SLAVES STATUS
mysql-test/suite/multi_source/simple.test:
test new syntax
mysql-test/suite/multi_source/syntax.result:
Updated result
mysql-test/suite/multi_source/syntax.test:
Test new syntax
mysql-test/suite/rpl/r/rpl_skip_replication.result:
Updated result
plugin/semisync/semisync_master_plugin.cc:
SHOW_FUNC -> SHOW_SIMPLE_FUNC
sql/item_create.cc:
Simplify code
sql/lex.h:
Added SLAVES keyword
sql/log.cc:
Constant -> define
sql/log_event.cc:
Added comment
sql/mysqld.cc:
SHOW_FUNC -> SHOW_SIMPLE_FUNC
Made slave_retried_trans, slave_received_heartbeats and heartbeat_period multi-source safe
Clear variable denied_connections and slave_retried_transactions on startup
sql/mysqld.h:
Added slave_retried_transactions
sql/rpl_mi.cc:
create_signed_file_name -> create_logfile_name_with_suffix
Added start_all_slaves() and stop_all_slaves()
sql/rpl_mi.h:
Added prototypes
sql/rpl_rli.cc:
create_signed_file_name -> create_logfile_name_with_suffix
added executed_entries
sql/rpl_rli.h:
Added executed_entries
sql/share/errmsg-utf8.txt:
More and better error messages
sql/slave.cc:
Added more fields to SHOW ALL SLAVES STATUS
Added slave_retried_transactions
Changed constants -> defines
sql/sql_class.h:
Added comment
sql/sql_insert.cc:
active_mi.rli -> thd->rli_slave
sql/sql_lex.h:
Added SQLCOM_SLAVE_ALL_START & SQLCOM_SLAVE_ALL_STOP
sql/sql_load.cc:
active_mi.rli -> thd->rli_slave
sql/sql_parse.cc:
Added START|STOP ALL SLAVES
sql/sql_prepare.cc:
Added SQLCOM_SLAVE_ALL_START & SQLCOM_SLAVE_ALL_STOP
sql/sql_reload.cc:
Made REFRESH RELAY LOG multi-source safe
sql/sql_repl.cc:
create_signed_file_name -> create_logfile_name_with_suffix
Don't send my_ok() from start_slave() or stop_slave() so that we can call it for all master connections
sql/sql_show.cc:
Compare wild cards early for all variables
sql/sql_yacc.yy:
Added START|STOP ALL SLAVES
Added SHOW ALL SLAVES STATUS
sql/sys_vars.cc:
Made replicate_events_marked_for_skip,slave_net_timeout and rpl_filter multi-source safe.
sql/sys_vars.h:
Simplify Sys_var_rpl_filter
Changed names of multi-source log files so that original suffixes are kept.
include/my_sys.h:
Added fn_ext2(), which returns pointer to last '.' in file name
mysql-test/extra/rpl_tests/rpl_max_relay_size.test:
Updated test
mysql-test/suite/multi_source/info_logs-master.opt:
Test with strange file names
mysql-test/suite/multi_source/info_logs.result:
Updated results
mysql-test/suite/multi_source/info_logs.test:
Changed to test with complex names to be able to verify the filename generator code
mysql-test/suite/multi_source/relaylog_events.result:
Updated results
mysql-test/suite/multi_source/reset_slave.result:
Updated results
mysql-test/suite/multi_source/skip_counter.result:
Updated results
mysql-test/suite/multi_source/skip_counter.test:
Added testing of max_relay_log_size
mysql-test/suite/rpl/r/rpl_row_max_relay_size.result:
Updated results
mysql-test/suite/rpl/r/rpl_stm_max_relay_size.result:
Updated results
mysql-test/suite/sys_vars/r/max_relay_log_size_basic.result:
Updated results
mysql-test/suite/sys_vars/t/max_relay_log_size_basic.test:
Updated results
mysys/mf_fn_ext.c:
Added fn_ext2(), which returns pointer to last '.' in file name
sql/log.cc:
Removed some wrong casts
sql/log.h:
Updated comment to reflect new code
sql/log_event.cc:
Updated DBUG_PRINT
sql/mysqld.cc:
Added that max_relay_log_size copies it's values from max_binlog_size
sql/mysqld.h:
Removed max_relay_log_size
sql/rpl_mi.cc:
Changed names of multi-source log files so that original suffixes are kept.
sql/rpl_mi.h:
Updated prototype
sql/rpl_rli.cc:
Updated comment to reflect new code
Made max_relay_log_size depending on master connection.
sql/rpl_rli.h:
Made max_relay_log_size depending on master connection.
sql/set_var.h:
Made option global so that one can check and change min & max values (sorry Sergei)
sql/sql_class.h:
Made max_relay_log_size depending on master connection.
sql/sql_repl.cc:
Updated calls to create_signed_file_name()
sql/sys_vars.cc:
Made max_relay_log_size depending on master connection.
Made old code more reusable
sql/sys_vars.h:
Changed Sys_var_multi_source_uint to ulong to be able to handle max_relay_log_size
Made old code more reusable
- Added parameter to reset_logs() so that one can specify if new logs should be created.
mysql-test/include/setup_fake_relay_log.inc:
There is no orphan relay log files anymore
mysql-test/mysql-test-run.pl:
Added multi_source to test suite
mysql-test/suite/multi_source/info_logs.result:
New test
mysql-test/suite/multi_source/info_logs.test:
New test
mysql-test/suite/multi_source/my.cnf:
Added log-warnings to get more information to the log files
mysql-test/suite/multi_source/relaylog_events.result:
Added cleanup
mysql-test/suite/multi_source/relaylog_events.test:
Added cleanup
mysql-test/suite/multi_source/reset_slave.result:
Updated results after improved RESET SLAVE
mysql-test/suite/multi_source/simple.result:
Updated results after improved RESET SLAVE
mysql-test/suite/multi_source/simple.test:
Syncronize positions before show full slave status
mysql-test/suite/rpl/r/rpl_row_show_relaylog_events.result:
Updated results after improved RESET SLAVE (we now use less relay log files)
mysql-test/suite/rpl/r/rpl_stm_mix_show_relaylog_events.result:
Updated results after improved RESET SLAVE (we now use less relay log files)
sql/log.cc:
Added parameter to reset_logs() so that one can specify if new logs should be created.
sql/log.h:
Added parameter to reset_logs()
sql/rpl_mi.cc:
Create Master_info_index::index_file_names once at init
More DBUG_PRINT
Give error if Master_info_index::check_duplicate_master_info fails
sql/rpl_rli.cc:
If we do a full reset, don't create any new relay log files.
sql/share/errmsg-utf8.txt:
Improved error message if connection exists
sql/sql_parse.cc:
Fixed memory leak
sql/sql_repl.cc:
check_duplicate_master_info() now generates an error
Added parameter to reset_logs()
Documentation of the feature can be found at: http://kb.askmonty.org/en/multi-source-replication/
This code is based on code from Taobao, developed by Plinux
BUILD/SETUP.sh:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
client/mysqltest.cc:
Added support for error names starting with 'W'
Added connection_name support to --sync_with_master
cmake/maintainer.cmake:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
mysql-test/r/mysqltest.result:
Updated results
mysql-test/r/parser.result:
Updated results
mysql-test/suite/multi_source/my.cnf:
Setup of multi-master tests
mysql-test/suite/multi_source/simple.result:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/simple.test:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/syntax.result:
Test of multi-source syntax
mysql-test/suite/multi_source/syntax.test:
Test of multi-source syntax
mysql-test/suite/rpl/r/rpl_rotate_logs.result:
Updated results because of new error messages
mysql-test/t/parser.test:
Updated test as master_pos_wait() now takes more arguments than before
sql/event_scheduler.cc:
No reason to initialize slave_thread (it's guaranteed to be zero here)
sql/item_create.cc:
Added connection_name argument to master_pos_wait()
Simplified code
sql/item_func.cc:
Added connection_name argument to master_pos_wait()
sql/item_func.h:
Added connection_name argument to master_pos_wait()
sql/log.cc:
Added tag "Master 'connection_name'" to slave errors that has a connection name.
sql/mysqld.cc:
Added variable mysqld_server_initialized so that other functions can test if server is fully initialized.
Free all slave data in one place (fewer ifdef's)
Removed not needed call to close_active_mi()
Initialize slaves() later in startup to ensure that everthing is really initialized when slaves start.
Made status variable slave_running multi-source safe
sql/mysqld.h:
Added mysqld_server_initialized
sql/rpl_mi.cc:
Store connection name and cmp_connection_name (only used for show full slave status) in Master_info
Added code for Master_info_index, which handles storage of multi-master information
Don't write the empty "" connection_name to multi-master.info file. This is handled by the original code.
sql/rpl_mi.h:
Added connection_name and Master_info_index
sql/rpl_rli.cc:
Added connection_name to relay log files.
sql/rpl_rli.h:
Fixed type of slave_skip_counter as we now access it directly in sys_vars.cc, so it must be uint
sql/share/errmsg-utf8.txt:
Added new error messages needed for multi-source
Added multi-source name to error ER_MASTER_INFO and WARN_NO_MASTER_INFO
sql/slave.cc:
Moved things a bit around to make it easier to handle error conditions.
Create a global master_info_index and add the "" connection to it
Ensure that new Master_info doesn't fail.
Don't call terminate_slave_threads(active_mi..) on end_slave() as this is now done automaticly when deleting master_info_index.
Delete not needed function close_active_mi(). One can achive same thing by calling end_slave().
Added support for SHOW FULL SLAVE STATUS (show status for all master connections with connection_name as first column)
sql/slave.h:
Added new prototypes
sql/sql_base.cc:
More DBUG_PRINT
sql/sql_class.cc:
Reset thd->connection_name and thd-->default_master_connection
sql/sql_class.h:
Added thd->connection_name and thd-->default_master_connection
Added slave_skip_count to variables to make changing the @@sql_slave_skip_count variable thread safe
sql/sql_const.h:
Added MAX_CONNECTION_NAME
sql/sql_lex.cc:
Reset 'lex->verbose' (to simplify some sql_yacc.yy code)
sql/sql_lex.h:
Added connection_name
sql/sql_parse.cc:
Added support for connection_name to all SLAVE commands.
- Instead of using active_mi, we now get the current Master_info from master_info_index.
- Create new replication threads with CHANGE MASTER
- Added support for show_all_master_info()
sql/sql_reload.cc:
Made reset/full slave use master_info_index->get_master_info() instead of active_mi.
If one uses 'RESET SLAVE "connection_name" all' the connection is removed from master_info_index.
sql/sql_repl.cc:
sql_slave_skip_counter is moved to thd->variables to make it thread safe and fix some bugs with it
Add connection name to relay log files.
Added connection name to errors.
Added some logging for multi-master if log_warnings > 1
stop_slave():
- Don't check if thd is set. It's guaranteed to always be set.
change_master():
- Check for duplicate connection names in change_master()
- Check for wrong arguments first in file (to simplify error handling)
- Register new connections in master_info_index
sql/sql_yacc.yy:
Added optional connection_name to a all relevant master/slave commands
sql/strfunc.cc:
my_global.h shoud always be included first.
sql/sys_vars.cc:
Added variable default_master_connection
Made variable sql_slave_skip_counter multi-source safe
sql/sys_vars.h:
Added Sys_var_session_lexstring (needed for default_master_connection)
Added Sys_var_multi_source_uint (needed for sql_slave_skip_counter).
and small collateral changes
mysql-test/lib/My/Test.pm:
somehow with "print" we get truncated writes sometimes
mysql-test/suite/perfschema/r/digest_table_full.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/dml_handler.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/information_schema.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/nesting.result:
this differs, because we don't rewrite general log queries, and multi-statement
packets are logged as a one entry. this result file is identical to what mysql-5.6.5
produces with the --log-raw option.
mysql-test/suite/perfschema/r/relaylog.result:
MariaDB modifies the binlog index file directly, while MySQL 5.6 has a feature "crash-safe binlog index" and modifies a special "crash-safe" shadow copy of the index file and then moves it over. That's why this test shows "NONE" index file writes in MySQL and "MANY" in MariaDB.
mysql-test/suite/perfschema/r/server_init.result:
MariaDB initializes the "manager" resources from the "manager" thread, and starts this thread only when --flush-time is not 0. MySQL 5.6 initializes "manager" resources unconditionally on server startup.
mysql-test/suite/perfschema/r/stage_mdl_global.result:
this differs, because MariaDB disables query cache when query_cache_size=0. MySQL does not
do that, and this causes useless mutex locks and waits.
mysql-test/suite/perfschema/r/statement_digest.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_consumers.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_long_query.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/rpl/r/rpl_mixed_drop_create_temp_table.result:
will be updated to match 5.6 when alfranio.correia@oracle.com-20110512172919-c1b5kmum4h52g0ni and anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y are merged
mysql-test/suite/rpl/r/rpl_non_direct_mixed_mixing_engines.result:
will be updated to match 5.6 when anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y is merged
Introduce a new storage engine API method commit_checkpoint_request().
This is used to replace the fsync() at the end of every storage engine
commit with a single fsync() when a binlog is rotated.
Binlog rotation is now done during group commit instead of being
delayed until unlog(), removing some server stall and avoiding an
expensive lock/unlock of LOCK_log inside unlog().
BUG#11761686 insert_id event is not filtered.
Two issues are covered.
INSERT into autoincrement field which is not the first part in the composed primary key
is unsafe by autoincrement logging design. The case is specific to MyISAM engine
because Innodb does not allow such table definition.
However no warnings and row-format logging in the MIXED mode was done, and
that is fixed.
Int-, Rand-, User-var log-events were not filtered along with their parent
query that made possible them to screw up execution context of the following
query.
Fixed with deferring their execution until the parent query.
******
Bug#11754117
Post review fixes.
mysql-test/suite/rpl/r/rpl_auto_increment_bug45679.result:
a new result file is added.
mysql-test/suite/rpl/r/rpl_filter_tables_not_exist.result:
results updated.
mysql-test/suite/rpl/t/rpl_auto_increment_bug45679.test:
regression test for BUG#11754117-45670 is added.
mysql-test/suite/rpl/t/rpl_filter_tables_not_exist.test:
regression test for filtering issue of BUG#11754117 - 45670 is added.
sql/log_event.cc:
Logics are added for deferring and executing events associated
with the Query event.
sql/log_event.h:
Interface to deferred events batch execution is added.
sql/rpl_rli.cc:
initialization for new RLI members is added.
sql/rpl_rli.h:
New members to RLI are added to facilitate deferred events gathering
and execution control;
two general character RLI cleanup methods are constructed.
sql/rpl_utility.cc:
Deferred_log_events methods are difined.
sql/rpl_utility.h:
A new class Deferred_log_events is defined to implement
IRU events gathering, execution and cleanup.
sql/slave.cc:
Necessary changes to initialize `rli->deferred_events' and prevent
deferred event deletion in the main read-exec branch.
sql/sql_base.cc:
A new safe-check function for multi-part pk with auto-increment is defined
and deployed in lock_tables().
sql/sql_class.cc:
Initialization for a new member and replication cleanups are added
to THD class.
sql/sql_class.h:
THD class receives a new member to hold a specific execution
context for slave applier.
sql/sql_parse.cc:
Execution of the deferred event in started prior to its parent query.
BUG#11761686 insert_id event is not filtered.
Two issues are covered.
INSERT into autoincrement field which is not the first part in the composed primary key
is unsafe by autoincrement logging design. The case is specific to MyISAM engine
because Innodb does not allow such table definition.
However no warnings and row-format logging in the MIXED mode was done, and
that is fixed.
Int-, Rand-, User-var log-events were not filtered along with their parent
query that made possible them to screw up execution context of the following
query.
Fixed with deferring their execution until the parent query.
******
Bug#11754117
Post review fixes.
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5
PROBLEM: After WL 4144, when using MyISAM Merge tables, the routine
open_and_lock_tables will append to the list of tables to lock, the
base tables that make up the MERGE table. This has two side-effects in
replication:
1. On the master side, we log additional table maps for the base
tables, since they appear in the list of locked tables, even
though we don't really use them at the slave.
2. On the slave side, when opening a MERGE table while applying a
ROW event, additional tables are appended to the list of tables
to lock.
Side-effect #1 is not harmful. It's just that when using MyISAM Merge
tables a few table maps more may be logged.
Side-effect #2, is harmful, because the list rli->tables_to_lock is an
extended structure from TABLE_LIST in which the extra fields are
filled from the table maps that are processed. Since
open_and_lock_tables appends tables to the list after all table map
events have been processed we end up with entries without
replication/table map data on them. Thus when trying to access that
info for these extra tables, the server will crash.
SOLUTION: We fix side-effect #2 by making sure that we access the
replication part of the structure for those in the list that were
accounted for when processing the correspondent table map events. All
in all, we never go beyond rli->tables_to_lock_count.
We also deploy an assertion when clearing rli->tables_to_lock, making
sure that the base tables are not in the list anymore (were closed in
close_thread_tables).
PROBLEM: After WL 4144, when using MyISAM Merge tables, the routine
open_and_lock_tables will append to the list of tables to lock, the
base tables that make up the MERGE table. This has two side-effects in
replication:
1. On the master side, we log additional table maps for the base
tables, since they appear in the list of locked tables, even
though we don't really use them at the slave.
2. On the slave side, when opening a MERGE table while applying a
ROW event, additional tables are appended to the list of tables
to lock.
Side-effect #1 is not harmful. It's just that when using MyISAM Merge
tables a few table maps more may be logged.
Side-effect #2, is harmful, because the list rli->tables_to_lock is an
extended structure from TABLE_LIST in which the extra fields are
filled from the table maps that are processed. Since
open_and_lock_tables appends tables to the list after all table map
events have been processed we end up with entries without
replication/table map data on them. Thus when trying to access that
info for these extra tables, the server will crash.
SOLUTION: We fix side-effect #2 by making sure that we access the
replication part of the structure for those in the list that were
accounted for when processing the correspondent table map events. All
in all, we never go beyond rli->tables_to_lock_count.
We also deploy an assertion when clearing rli->tables_to_lock, making
sure that the base tables are not in the list anymore (were closed in
close_thread_tables).
Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
mysql-test/suite/rpl/r/rpl_start_stop_slave.result:
The result file associated with the test added.
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
A test to check the new functionality.
sql/rpl_mi.cc:
The constructor using the new mutex and condition variables for the master_info.
sql/rpl_mi.h:
The condition variable and mutex have been added for the master_info.
sql/rpl_rli.cc:
The constructor using the new mutex and condition variables for the realy_log_info.
sql/rpl_rli.h:
The condition variable and mutex have been added for the relay_log_info.
sql/slave.cc:
Use a timed wait on a condition variable to implement a interruptible sleep.
The wait is registered with the THD object so that the thread will be woken
up if killed.
Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
SCAN/CPU) => SLAVE FAILURE
When a statement containing a large amount of ROWs to be applied on
the slave, and the slave's table does not contain a PK, it can take a
considerable amount of time to find and change all the rows that are
to be changed.
The proper slave enhancement will be implemented in WL 5597. However,
in this bug we are making it clear to the user what the problem is, by
printing a message to the error log if the execution time, for a given
statement in RBR, takes more than LONG_FIND_ROW_THRESHOLD (set to 60
seconds). This shall help the DBA to diagnose what's happening when
facing a slave server that is quiet for no apparent reason...
The note is only printed to the error log if log_warnings is set to be
greater than 1.
sql/log_event.cc:
Core of the patch.
In Rows_log_event::do_apply_event, sets STMT start
timestamp.
In Rows_log_event::find_row, if a PK is not used, then the start
timestamp is checked to see if the time spent on this STMT is enough
to justify the printing of a note (only if it was not printed before).
sql/log_event.h:
Defining LONG_FIND_ROW_THRESHOLD.
sql/rpl_rli.cc:
Resets long_find_row state in rli->context_cleanup().
sql/rpl_rli.h:
Two new rli properties that are necessary to control when to
emit a note in the error log: 1) timestamp that states when the
ROW statement started; 2) flag indicating whether the note has
been emitted for the current statement or not. Also deployed
accessors.
SCAN/CPU) => SLAVE FAILURE
When a statement containing a large amount of ROWs to be applied on
the slave, and the slave's table does not contain a PK, it can take a
considerable amount of time to find and change all the rows that are
to be changed.
The proper slave enhancement will be implemented in WL 5597. However,
in this bug we are making it clear to the user what the problem is, by
printing a message to the error log if the execution time, for a given
statement in RBR, takes more than LONG_FIND_ROW_THRESHOLD (set to 60
seconds). This shall help the DBA to diagnose what's happening when
facing a slave server that is quiet for no apparent reason...
The note is only printed to the error log if log_warnings is set to be
greater than 1.
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
******
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
******
small cleanup
A lot of small fixes and new test cases.
client/mysqlbinlog.cc:
Cast removed
client/mysqltest.cc:
Added missing DBUG_RETURN
include/my_pthread.h:
set_timespec_time_nsec() now only takes one argument
mysql-test/t/date_formats.test:
Remove --disable_ps_protocl as now also ps supports microseconds
mysys/my_uuid.c:
Changed to use my_interval_timer() instead of my_getsystime()
mysys/waiting_threads.c:
Changed to use my_hrtime()
sql/field.h:
Added bool special_const_compare() for fields that may convert values before compare (like year)
sql/field_conv.cc:
Added test to get optimal copying of identical temporal values.
sql/item.cc:
Return that item_int is equal if it's positive, even if unsigned flag is different.
Fixed Item_cache_str::save_in_field() to have identical null check as other similar functions
Added proper NULL check to Item_cache_int::save_in_field()
sql/item_cmpfunc.cc:
Don't call convert_constant_item() if there is nothing that is worth converting.
Simplified test when years should be converted
sql/item_sum.cc:
Mark cache values in Item_sum_hybrid as not constants to ensure they are not replaced by other cache values in compare_datetime()
sql/item_timefunc.cc:
Changed sec_to_time() to take a my_decimal argument to ensure we don't loose any sub seconds.
Added Item_temporal_func::get_time() (This simplifies some things)
sql/mysql_priv.h:
Added Lazy_string_decimal()
sql/mysqld.cc:
Added my_decimal constants max_seconds_for_time_type, time_second_part_factor
sql/table.cc:
Changed expr_arena to be of type CONVENTIONAL_EXECUTION to ensure that we don't loose any items that are created by fix_fields()
sql/tztime.cc:
TIME_to_gmt_sec() now sets *in_dst_time_gap in case of errors
This is needed to be able to detect if timestamp is 0
storage/maria/lockman.c:
Changed from my_getsystime() to set_timespec_time_nsec()
storage/maria/ma_loghandler.c:
Changed from my_getsystime() to my_hrtime()
storage/maria/ma_recovery.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/maria/unittest/trnman-t.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/xtradb/handler/ha_innodb.cc:
Added support for new time,datetime and timestamp
unittest/mysys/thr_template.c:
my_getsystime() -> my_interval_timer()
unittest/mysys/waiting_threads-t.c:
my_getsystime() -> my_interval_timer()
In RBR and in case of converting blob fields, the space allocated
while unpacking into the conversion field was not freed after
copying from it into the real field.
We fix this by freeing the conversion field when the conversion
table is not needed anymore (on close_tables_to_lock).
In RBR and in case of converting blob fields, the space allocated
while unpacking into the conversion field was not freed after
copying from it into the real field.
We fix this by freeing the conversion field when the conversion
table is not needed anymore (on close_tables_to_lock).
Changed test suite to use --log-basename (to get the code tested)
Added --sync-sys=1 to test suite to speed it up.
Better error messages if something goes wrong with mysql_install_db
mysql-test/Makefile.am:
Removed not existing directory
mysql-test/lib/My/ConfigFactory.pm:
Use log-basename
We had to also set 'log_error' as some test was explicitely using the old name
Added 'sync-sys=1' to speed up test suite
mysql-test/r/variables-notembedded.result:
Updated test results (variable relay_log is now set)
mysql-test/suite/binlog/t/binlog_delete_and_flush_index-master.opt:
Force specific names for some log files.
mysql-test/suite/binlog/t/binlog_index-master.opt:
Force specific names for some log files.
mysql-test/suite/binlog/t/binlog_stm_unsafe_warning-master.opt:
Force specific names for some log files.
mysql-test/suite/binlog/t/binlog_stm_unsafe_warning.test:
Better error message if something goes wrong
mysql-test/suite/rpl/r/rpl_flushlog_loop.result:
Updated results
mysql-test/suite/rpl/rpl_1slave_base.cnf:
Use --log-basename
scripts/mysql_install_db.sh:
More information to --help
Write url to knowledge base if something goes wrong
Fail at once if we can't create a database directory (no reason to continue and write a screenful of not related text)
scripts/mysqld_safe.sh:
Also allow one to use --data for --datadir (common shortening)
Added support for --log-basename
Fail at once if we can't create a log directory
Fixed bug where we used a pid file name without '.pid' extension
sql/log.cc:
Create a log file name trough my_once_alloc() (To get it automaticly freed at exit)
sql/mysql_priv.h:
Added new prototype
sql/mysqld.cc:
Added support for --log-basename
Better help for a lot of log-filename related variables.
sql/rpl_rli.cc:
Write information that one can use --log-basename
sql/set_var.cc:
Add log_basename as a readonly variable
Before this fix, all the performance schema instrumentation for both the binary log
and the relay log would use the following instruments:
- wait/io/file/sql/binlog
- wait/io/file/sql/binlog_index
- wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
This instrumentation is too general and can be more specific.
With this fix, the binlog instrumentation is identical,
and the relay log instrumentation is changed to:
- wait/io/file/sql/relaylog
- wait/io/file/sql/relaylog_index
- wait/synch/mutex/sql/MYSQL_RELAY_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
With this change, the performance instrumentation for the binary log and the relay log,
which share the same structure but have different uses, is more detailed.
This is especially important for hosts in the middle of a replication chain,
that are both masters (binlog) and slaves (relaylog).
Before this fix, all the performance schema instrumentation for both the binary log
and the relay log would use the following instruments:
- wait/io/file/sql/binlog
- wait/io/file/sql/binlog_index
- wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
This instrumentation is too general and can be more specific.
With this fix, the binlog instrumentation is identical,
and the relay log instrumentation is changed to:
- wait/io/file/sql/relaylog
- wait/io/file/sql/relaylog_index
- wait/synch/mutex/sql/MYSQL_RELAY_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
With this change, the performance instrumentation for the binary log and the relay log,
which share the same structure but have different uses, is more detailed.
This is especially important for hosts in the middle of a replication chain,
that are both masters (binlog) and slaves (relaylog).
and collateral changes.
* introduce my_hrtime_t, my_timediff_t, and conversion macros
* inroduce TIME_RESULT, but it can only be returned from Item::cmp_type(),
never from Item::result_type()
* pack_time/unpack_time function for "packed" representation of
MYSQL_TIME in a longlong that can be compared
* ADDTIME()/SUBTIME()/+- INTERVAL now work with TIME values
* numbers aren't quoted in EXPLAIN EXTENDED
* new column I_S.COLUMNS.DATETIME_PRECISION
* date/time values are compares to anything as date/time, not as strings or numbers.
* old timestamp(X) is no longer supported
* MYSQL_TIME to string conversion functions take precision as an argument
* unified the warnings from Field_timestamp/datetime/time/date/newdate store methods
* Field_timestamp_hires, Field_datetime_hires, Field_time_hires
* Field_temporal
* Lazy_string class to pass a value (string, number, time) polymorphically down the stack
* make_truncated_value_warning and Field::set_datetime_warning use Lazy_string as an argument, removed char*/int/double variants
* removed Field::can_be_compared_as_longlong(). Use Field::cmp_type() == INT_RESULT instead
* introduced Item::cmp_result() instead of Item::is_datetime() and Item::result_as_longlong()
* in many cases date/time types are treated like other types, not as special cases
* greatly simplified Arg_comparator (regarding date/time/year code)
* SEC_TO_TIME is real function, not integer.
* microsecond precision in NOW, CURTIME, etc
* Item_temporal. All items derived from it only provide get_date, but no val* methods
* replication of NOW(6)
* Protocol::store(time) now takes the precision as an argument
* @@TIMESTAMP is a double
client/mysqlbinlog.cc:
remove unneded casts
include/my_sys.h:
introduce my_hrtime_t, my_timediff_t, and conversion macros
include/my_time.h:
pack_time/unpack_time, etc.
convenience functions to work with MYSQL_TIME::second_part
libmysql/libmysql.c:
str_to_time() is gone. str_to_datetime() does it now.
my_TIME_to_str() takes the precision as an argument
mysql-test/include/ps_conv.inc:
time is not equal to datetime anymore
mysql-test/r/distinct.result:
a test for an old MySQL bug
mysql-test/r/explain.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/func_default.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/func_sapdb.result:
when decimals=NOT_FIXED_DEC it means "not fixed" indeed
mysql-test/r/func_test.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/func_time.result:
ADDTIME()/SUBTIME()/+- INTERVAL now work with TIME values
mysql-test/r/having.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/information_schema.result:
new column I_S.COLUMNS.DATETIME_PRECISION
mysql-test/r/join_outer.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/metadata.result:
TIMESTAMP no longer has zerofill flag
mysql-test/r/range.result:
invalid datetime is not compared with as a string
mysql-test/r/select.result:
NO_ZERO_IN_DATE, etc only affect storage - according to the manual
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/subselect.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/sysdate_is_now.result:
when decimals=NOT_FIXED_DEC it means "not fixed" indeed
mysql-test/r/type_blob.result:
TIMESTAMP(N) is not deprecated
mysql-test/r/type_timestamp.result:
old TIMESTAMP(X) semantics is not supported anymore
mysql-test/r/union.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/r/varbinary.result:
numbers aren't quoted in EXPLAIN EXTENDED
mysql-test/t/distinct.test:
test for an old MySQL bug
mysql-test/t/func_time.test:
+- INTERVAL now works with TIME values
mysql-test/t/select.test:
typo
mysql-test/t/subselect.test:
only one error per statement, please
mysql-test/t/system_mysql_db_fix40123.test:
old timestamp(X) is no longer supported
mysql-test/t/system_mysql_db_fix50030.test:
old timestamp(X) is no longer supported
mysql-test/t/system_mysql_db_fix50117.test:
old timestamp(X) is no longer supported
mysql-test/t/type_blob.test:
old timestamp(X) is no longer supported
mysql-test/t/type_timestamp.test:
old timestamp(X) is no longer supported
mysys/my_getsystime.c:
functions to get the time with microsecond precision
mysys/my_init.c:
move the my_getsystime.c initialization code to my_getsystime.c
mysys/my_static.c:
no need to make these variables extern
mysys/my_static.h:
no need to make these variables extern
scripts/mysql_system_tables.sql:
old timestamp(X) is no longer supported
scripts/mysql_system_tables_fix.sql:
old timestamp(X) is no longer supported
scripts/mysqlhotcopy.sh:
old timestamp(X) is no longer supported
sql-common/my_time.c:
* call str_to_time from str_to_datetime, as appropriate
* date/time to string conversions take precision as an argument
* number_to_time()
* TIME_to_double()
* pack_time() and unpack_time()
sql/event_data_objects.cc:
cast is not needed
my_datetime_to_str() takes precision as an argument
sql/event_db_repository.cc:
avoid dangerous downcast (because the pointer is
not always Field_timestamp, see events_1.test)
sql/event_queue.cc:
avoid silly double-work for cond_wait
(having an endpoint of wait, subtract the current time to get the timeout,
and use set_timespec() macro to fill in struct timespec, by adding the current
time to the timeout)
sql/field.cc:
* remove virtual Field::get_time(), everyone should use only Field::get_date()
* remove lots of #ifdef WORDS_BIGENDIAN
* unified the warnings from Field_timestamp/datetime/time/date/newdate store methods
* Field_timestamp_hires, Field_datetime_hires, Field_time_hires
* Field_temporal
* make_truncated_value_warning and Field::set_datetime_warning use Lazy_string as an argument, removed char*/int/double variants
sql/field.h:
* remove virtual Field::get_time(), everyone should use only Field::get_date()
* remove lots of #ifdef WORDS_BIGENDIAN
* unified the warnings from Field_timestamp/datetime/time/date/newdate store methods
* Field_timestamp_hires, Field_datetime_hires, Field_time_hires
* Field_temporal
* make_truncated_value_warning and Field::set_datetime_warning use Lazy_string as an argument, removed char*/int/double variants
* removed Field::can_be_compared_as_longlong(). Use Field::cmp_type() == INT_RESULT instead
sql/filesort.cc:
TIME_RESULT, cmp_time()
sql/item.cc:
* numbers aren't quoted in EXPLAIN EXTENDED
* Item::cmp_result() instead of Item::is_datetime() and Item::result_as_longlong()
* virtual Item::get_time() is gone
* Item_param::field_type() is set correctly
* Item_datetime, for a datetime constant
* time to anything is compared as a time
* Item_cache::print() prints the value is available
* bug fixed in Item_cache_int::val_str()
sql/item.h:
* Item::print_value(), to be used from Item_xxx::print() when needed
* Item::cmp_result() instead of Item::is_datetime() and Item::result_as_longlong()
* virtual Item::get_time() is gone
* Item_datetime, for a datetime constant
* better default for cast_to_int_type()
* Item_cache objects now *always* have the field_type() set
sql/item_cmpfunc.cc:
* get_year_value, get_time_value are gone. get_datetime_value does it all
* get_value_a_func, get_value_b_func are gone
* can_compare_as_dates() is gone too, TIME_RESULT is used instead
* cmp_type() instead or result_type() when doing a comparison
* compare_datetime and compate_e_datetime in the comparator_matrix, is_nulls_eq is gone
* Item::cmp_result() instead of Item::is_datetime() and Item::result_as_longlong()
sql/item_cmpfunc.h:
greatly simplified Arg_comparator
sql/item_create.cc:
* fix a bug in error messages in CAST
sql/item_func.cc:
Item::cmp_result() instead of Item::is_datetime() and Item::result_as_longlong()
mention all possibitiles in switch over Item_result values, or use default:
sql/item_row.h:
overwrite the default cmp_type() for Item_row,
as no MYSQL_TYPE_xxx value corresponds to ROW_RESULT
sql/item_timefunc.cc:
rewrite make_datetime to support precision argument
SEC_TO_TIME is real function, not integer.
many functions that returned temporal values had duplicate code in val_* methods,
some of them did not have get_date() which resulted in unnecessary date->str->date conversions.
Now they all are derived from Item_temporal_func and *only* provide get_date, not val* methods.
many fixes to set decimals (datetime precision) correctly.
sql/item_timefunc.h:
SEC_TO_TIME is real function, not integer.
many functions that returned temporal values had duplicate code in val_* methods,
some of them did not have get_date() which resulted in unnecessary date->str->date conversions.
Now they all are derived from Item_temporal_func and *only* provide get_date, not val* methods.
many fixes to set decimals (datetime precision) correctly.
sql/log_event.cc:
replication of NOW(6)
sql/log_event.h:
replication of NOW(6)
sql/mysql_priv.h:
Lazy_string class to pass a value (string, number, time) polymorphically down the stack.
make_truncated_value_warning() that uses it.
sql/mysqld.cc:
datetime in Arg_comparator::comparator_matrix
sql/opt_range.cc:
cleanup: don't disable warnings before calling save_in_field_no_warnings()
sql/protocol.cc:
Protocol::store(time) now takes the precision as an argument
sql/protocol.h:
Protocol::store(time) now takes the precision as an argument
sql/rpl_rli.cc:
small cleanup
sql/set_var.cc:
SET TIMESTAMP=double
sql/set_var.h:
@@TIMESTAMP is a double
sql/share/errmsg.txt:
precision and scale are unsigned
sql/slave.cc:
replication of NOW(6)
sql/sp_head.cc:
cleanup
sql/sql_class.cc:
support for NOW(6)
sql/sql_class.h:
support for NOW(6)
sql/sql_insert.cc:
support for NOW(6)
sql/sql_select.cc:
use item->cmp_type().
move a comment where it belongs
sql/sql_show.cc:
new column I_S.COLUMNS.DATETIME_PRECISION
sql/sql_yacc.yy:
TIME(X), DATETIME(X), cast, NOW(X), CURTIME(X), etc
sql/time.cc:
fix date_add_interval() to support MYSQL_TIMESTAMP_TIME argument
storage/myisam/ha_myisam.cc:
TIMESTAMP no longer carries ZEROFIELD flag, still we keep MYI file compatible.
strings/my_vsnprintf.c:
warnings
tests/mysql_client_test.c:
old timestamp(X) does not work anymore
datetime is no longer equal to time
This patch adds options to annotate the binlog (and the mysqlbinlog
output) with the original SQL query for queries that are logged
using row-based replication.
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
mysql-test/include/check_ftwrl_compatible.inc:
Added helper script which allows to check that a statement is
compatible with FLUSH TABLES WITH READ LOCK.
mysql-test/include/check_ftwrl_incompatible.inc:
Added helper script which allows to check that a statement is
incompatible with FLUSH TABLES WITH READ LOCK.
mysql-test/include/handler.inc:
Adjusted test case to the fact that now DROP TABLE closes
open HANDLERs for the table to be dropped before checking
if there active FTWRL in this connection.
mysql-test/include/wait_show_condition.inc:
Fixed small error in the timeout message. The correct name
of variable used as parameter for this script is "$condition"
and not "$wait_condition".
mysql-test/r/delayed.result:
Added test coverage for scenario which triggered assert in
metadata locking subsystem.
mysql-test/r/events_2.result:
Updated test results after prohibiting event DDL operations
under LOCK TABLES.
mysql-test/r/flush.result:
Added test coverage for bug #57006 "Deadlock between HANDLER
and FLUSH TABLES WITH READ LOCK".
mysql-test/r/flush_read_lock.result:
Added test coverage for various aspects of FLUSH TABLES WITH
READ LOCK functionality.
mysql-test/r/flush_read_lock_kill.result:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Use new
debug_sync point. Do not disable concurrent inserts as now
InnoDB we always use InnoDB table.
mysql-test/r/handler_innodb.result:
Adjusted test case to the fact that now DROP TABLE closes
open HANDLERs for the table to be dropped before checking
if there active FTWRL in this connection.
mysql-test/r/handler_myisam.result:
Adjusted test case to the fact that now DROP TABLE closes
open HANDLERs for the table to be dropped before checking
if there active FTWRL in this connection.
mysql-test/r/mdl_sync.result:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Replaced
usage of GRL-specific debug_sync's with appropriate sync
points in MDL subsystem.
mysql-test/suite/perfschema/r/dml_setup_instruments.result:
Updated test results after removing global
COND_global_read_lock condition variable.
mysql-test/suite/perfschema/r/func_file_io.result:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/r/func_mutex.result:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/r/global_read_lock.result:
Adjusted test case to take into account that new GRL
implementation is based on MDL.
mysql-test/suite/perfschema/r/server_init.result:
Adjusted test case after replacing custom global read
lock implementation with one based on MDL and replacing
LOCK_event_metadata mutex with metadata lock.
mysql-test/suite/perfschema/t/func_file_io.test:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/t/func_mutex.test:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/t/global_read_lock.test:
Adjusted test case to take into account that new GRL
implementation is based on MDL.
mysql-test/suite/perfschema/t/server_init.test:
Adjusted test case after replacing custom global read
lock implementation with one based on MDL and replacing
LOCK_event_metadata mutex with metadata lock.
mysql-test/suite/rpl/r/rpl_tmp_table_and_DDL.result:
Updated test results after prohibiting event DDL under
LOCK TABLES.
mysql-test/t/delayed.test:
Added test coverage for scenario which triggered assert in
metadata locking subsystem.
mysql-test/t/events_2.test:
Updated test case after prohibiting event DDL operations
under LOCK TABLES.
mysql-test/t/flush.test:
Added test coverage for bug #57006 "Deadlock between HANDLER
and FLUSH TABLES WITH READ LOCK".
mysql-test/t/flush_block_commit.test:
Adjusted test case after changing thread state name which
is used when COMMIT waits for FLUSH TABLES WITH READ LOCK
from "Waiting for release of readlock" to "Waiting for commit
lock".
mysql-test/t/flush_block_commit_notembedded.test:
Adjusted test case after changing thread state name which is
used when DML waits for FLUSH TABLES WITH READ LOCK. Now we
use "Waiting for global read lock" in this case.
mysql-test/t/flush_read_lock.test:
Added test coverage for various aspects of FLUSH TABLES WITH
READ LOCK functionality.
mysql-test/t/flush_read_lock_kill-master.opt:
We no longer need to use make_global_read_lock_block_commit_loop
debug tag in this test. Instead we rely on an appropriate
debug_sync point in MDL code.
mysql-test/t/flush_read_lock_kill.test:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Use new
debug_sync point. Do not disable concurrent inserts as now
InnoDB we always use InnoDB table.
mysql-test/t/lock_multi.test:
Adjusted test case after changing thread state names which
are used when DML or DDL waits for FLUSH TABLES WITH READ
LOCK to "Waiting for global read lock".
mysql-test/t/mdl_sync.test:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Replaced
usage of GRL-specific debug_sync's with appropriate sync
points in MDL subsystem. Updated thread state names which
are used when DDL waits for FTWRL.
mysql-test/t/trigger_notembedded.test:
Adjusted test case after changing thread state names which
are used when DML or DDL waits for FLUSH TABLES WITH READ
LOCK to "Waiting for global read lock".
sql/event_data_objects.cc:
Removed Event_queue_element::status/last_executed_changed
members and Event_queue_element::update_timing_fields()
method. We no longer use this class for updating mysql.events
once event is chosen for execution. Accesses to instances of
this class in scheduler thread require protection by
Event_queue::LOCK_event_queue mutex and we try to avoid
updating table while holding this lock.
sql/event_data_objects.h:
Removed Event_queue_element::status/last_executed_changed
members and Event_queue_element::update_timing_fields()
method. We no longer use this class for updating mysql.events
once event is chosen for execution. Accesses to instances of
this class in scheduler thread require protection by
Event_queue::LOCK_event_queue mutex and we try to avoid
updating table while holding this lock.
sql/event_db_repository.cc:
- Changed Event_db_repository methods to not release all
metadata locks once they are done updating mysql.events
table. This allows to keep metadata lock protecting
against GRL and lock protecting particular event around
until corresponding DDL statement is written to the binary
log.
- Removed logic for conditional update of "status" and
"last_executed" fields from update_timing_fields_for_event()
method. In the only case when this method is called now
"last_executed" is always modified and tracking change
of "status" is too much hassle.
sql/event_db_repository.h:
Removed logic for conditional update of "status" and
"last_executed" fields from Event_db_repository::
update_timing_fields_for_event() method.
In the only case when this method is called now "last_executed"
is always modified and tracking change of "status" field is
too much hassle.
sql/event_queue.cc:
Changed event scheduler code not to update mysql.events
table while holding Event_queue::LOCK_event_queue mutex.
Doing so led to a deadlock with a new GRL implementation.
This deadlock didn't occur with old implementation due to
fact that code acquiring protection against GRL ignored
pending GRL requests (which lead to GRL starvation).
One of goals of new implementation is to disallow GRL
starvation and so we have to solve problem with this
deadlock in a different way.
sql/events.cc:
Changed methods of Events class to acquire protection
against GRL while perfoming DDL statement and keep it
until statement is written to the binary log.
Unfortunately this step together with new GRL implementation
exposed deadlock involving Events::LOCK_event_metadata
and GRL. To solve it Events::LOCK_event_metadata mutex was
replaced with a metadata lock on event. As a side-effect
events DDL has to be prohibited under LOCK TABLES even in
cases when mysql.events table was explicitly locked for
write.
sql/events.h:
Replaced Events::LOCK_event_metadata mutex with a metadata
lock on event.
sql/ha_ndbcluster.cc:
Updated code after replacing custom global read lock
implementation with one based on MDL. Since MDL subsystem
should now be able to detect deadlocks involving metadata
locks and GRL there is no need for special handling of
active GRL.
sql/handler.cc:
Replaced custom implementation of global read lock with
one based on metadata locks. Consequently when doing
commit instead of calling method of Global_read_lock
class to acquire protection against GRL we simply acquire
IX in COMMIT namespace.
sql/lock.cc:
Replaced custom implementation of global read lock with
one based on metadata locks. This step allows to expose
wait for GRL to deadlock detector of MDL subsystem and
thus succesfully resolve deadlocks similar to one behind
bug #57006 "Deadlock between HANDLER and FLUSH TABLES
WITH READ LOCK". It also solves problem with GRL starvation
described in bug #54673 "It takes too long to get readlock
for 'FLUSH TABLES WITH READ LOCK'" since metadata locks used
by GRL give preference to FTWRL statement instead of DML
statements (if needed in future this can be changed to
fair scheduling).
Similar to old implementation of acquisition of GRL is
two-step. During the first step we block all concurrent
DML and DDL statements by acquiring global S metadata lock
(each DML and DDL statement acquires global IX lock for
its duration). During the second step we block commits by
acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
To support this change:
- Global_read_lock::lock/unlock_global_read_lock and
make_global_read_lock_block_commit methods were changed
accordingly.
- Global_read_lock::wait_if_global_read_lock() and
start_waiting_global_read_lock() methods were dropped.
It is now responsibility of code acquiring metadata locks
opening tables to acquire protection against GRL by
explicitly taking global IX lock with statement duration.
- Global variables, mutex and condition variable used by
old implementation was removed.
- lock_routine_name() was changed to use statement duration for
its global IX lock. It was also renamed to lock_object_name()
as it now also used to take metadata locks on events.
- Global_read_lock::set_explicit_lock_duration() was added which
allows not to release locks used for GRL when leaving prelocked
mode.
sql/lock.h:
- Renamed lock_routine_name() to lock_object_name() and changed
its signature to allow its usage for events.
- Removed broadcast_refresh() function. It is no longer needed
with new GRL implementation.
sql/log_event.cc:
Release metadata locks with statement duration at the end
of processing legacy event for LOAD DATA. This ensures that
replication thread processing such event properly releases
its protection against global read lock.
sql/mdl.cc:
Changed MDL subsystem to support new MDL-based implementation
of global read lock.
Added COMMIT and EVENTS namespaces for metadata locks. Changed
thread state name for GLOBAL namespace to "Waiting for global
read lock".
Optimized MDL_map::find_or_insert() method to avoid taking
m_mutex mutex when looking up MDL_lock objects for GLOBAL
or COMMIT namespaces. We keep pre-created MDL_lock objects
for these namespaces around and simply return pointers to
these global objects when needed.
Changed MDL_lock/MDL_scoped_lock to properly handle
notification of insert delayed handler threads when FTWRL
takes global S lock.
Introduced concept of lock duration. In addition to locks with
transaction duration which work in the way which is similar to
how locks worked before (i.e. they are released at the end of
transaction), locks with statement and explicit duration were
introduced.
Locks with statement duration are automatically released at the
end of statement. Locks with explicit duration require explicit
release and obsolete concept of transactional sentinel.
* Changed MDL_request and MDL_ticket classes to support notion
of duration.
* Changed MDL_context to keep locks with different duration in
different lists. Changed code handling ticket list to take
this into account.
* Changed methods responsible for releasing locks to take into
account duration of tickets. Particularly public
MDL_context::release_lock() method now only can release
tickets with explicit duration (there is still internal
method which allows to specify duration). To release locks
with statement or transaction duration one have to use
release_statement/transactional_locks() methods.
* Concept of savepoint for MDL subsystem now has to take into
account locks with statement duration. Consequently
MDL_savepoint class was introduced and methods working with
savepoints were updated accordingly.
* Added methods which allow to set duration for one or all
locks in the context.
sql/mdl.h:
Changed MDL subsystem to support new MDL-based implementation
of global read lock.
Added COMMIT and EVENTS namespaces for metadata locks.
Introduced concept of lock duration. In addition to locks with
transaction duration which work in the way which is similar to
how locks worked before (i.e. they are released at the end of
transaction), locks with statement and explicit duration were
introduced.
Locks with statement duration are automatically released at the
end of statement. Locks with explicit duration require explicit
release and obsolete concept of transactional sentinel.
* Changed MDL_request and MDL_ticket classes to support notion
of duration.
* Changed MDL_context to keep locks with different duration in
different lists. Changed code handling ticket list to take
this into account.
* Changed methods responsible for releasing locks to take into
account duration of tickets. Particularly public
MDL_context::release_lock() method now only can release
tickets with explicit duration (there is still internal
method which allows to specify duration). To release locks
with statement or transaction duration one have to use
release_statement/transactional_locks() methods.
* Concept of savepoint for MDL subsystem now has to take into
account locks with statement duration. Consequently
MDL_savepoint class was introduced and methods working with
savepoints were updated accordingly.
* Added methods which allow to set duration for one or all
locks in the context.
sql/mysqld.cc:
Removed global mutex and condition variables which were used
by old implementation of GRL.
Also we no longer need to initialize Events::LOCK_event_metadata
mutex as it was replaced with metadata locks on events.
sql/mysqld.h:
Removed global variable, mutex and condition variables which
were used by old implementation of GRL.
sql/rpl_rli.cc:
When slave thread closes tables which were open for handling
of RBR events ensure that it releases global IX lock which
was acquired as protection against GRL.
sql/sp.cc:
Adjusted code to the new signature of lock_object/routine_name(),
to the fact that one now needs specify duration of lock when
initializing MDL_request and to the fact that savepoints for MDL
subsystem are now represented by MDL_savepoint class.
sql/sp_head.cc:
Ensure that statements in stored procedures release statement
metadata locks and thus release their protectiong against GRL
in proper moment in time.
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/sql_admin.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/sql_base.cc:
- Implemented support for new approach to acquiring protection
against global read lock. We no longer acquire such protection
explicitly on the basis of statement flags. Instead we always
rely on code which is responsible for acquiring metadata locks
on object to be changed acquiring this protection. This is
achieved by acquiring global IX metadata lock with statement
duration. Code doing this also responsible for checking that
current connection has no active GRL by calling an
Global_read_lock::can_acquire_protection() method.
Changed code in open_table() and lock_table_names()
accordingly.
Note that as result of this change DDL and DML on temporary
tables is always compatible with GRL (before it was
incompatible in some cases and compatible in other cases).
- To speed-up code acquiring protection against GRL introduced
m_has_protection_against_grl member in Open_table_context
class. It indicates that protection was already acquired
sometime during open_tables() execution and new attempts
can be skipped.
- Thanks to new GRL implementation calls to broadcast_refresh()
became unnecessary and were removed.
- Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request and to the fact that
savepoints for MDL subsystem are now represented by
MDL_savepoint class.
sql/sql_base.h:
Adjusted code to the fact that savepoints for MDL subsystem are
now represented by MDL_savepoint class.
Also introduced Open_table_context::m_has_protection_against_grl
member which allows to avoid acquiring protection against GRL
while opening tables if such protection was already acquired.
sql/sql_class.cc:
Changed THD::leave_locked_tables_mode() after transactional
sentinel for metadata locks was obsoleted by introduction of
locks with explicit duration.
sql/sql_class.h:
- Adjusted code to the fact that savepoints for MDL subsystem
are now represented by MDL_savepoint class.
- Changed Global_read_lock class according to changes in
global read lock implementation:
* wait_if_global_read_lock and start_waiting_global_read_lock
are now gone. Instead code needing protection against GRL
has to acquire global IX metadata lock with statement
duration itself. To help it new can_acquire_protection()
was introduced. Also as result of the above change
m_protection_count member is gone too.
* Added m_mdl_blocks_commits_lock member to store metadata
lock blocking commits.
* Adjusted code to the fact that concept of transactional
sentinel was obsoleted by concept of lock duration.
- Removed CF_PROTECT_AGAINST_GRL flag as it is no longer
necessary. New GRL implementation acquires protection
against global read lock automagically when statement
acquires metadata locks on tables or other objects it
is going to change.
sql/sql_db.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/sql_handler.cc:
Removed call to broadcast_refresh() function. It is no longer
needed with new GRL implementation.
Adjusted code after introducing duration concept for metadata
locks. Particularly to the fact transactional sentinel was
replaced with explicit duration.
sql/sql_handler.h:
Renamed mysql_ha_move_tickets_after_trans_sentinel() to
mysql_ha_set_explicit_lock_duration() after transactional
sentinel was obsoleted by locks with explicit duration.
sql/sql_insert.cc:
Adjusted code handling delaying inserts after switching to
new GRL implementation. Now connection thread initiating
delayed insert has to acquire global IX lock in addition
to metadata lock on table being inserted into. This IX lock
protects against GRL and similarly to SW lock on table being
inserted into has to be passed to handler thread in order to
avoid deadlocks.
sql/sql_lex.cc:
LEX::protect_against_global_read_lock member is no longer
necessary since protection against GRL is automatically
taken by code acquiring metadata locks/opening tables.
sql/sql_lex.h:
LEX::protect_against_global_read_lock member is no longer
necessary since protection against GRL is automatically
taken by code acquiring metadata locks/opening tables.
sql/sql_parse.cc:
- Implemented support for new approach to acquiring protection
against global read lock. We no longer acquire such protection
explicitly on the basis of statement flags. Instead we always
rely on code which is responsible for acquiring metadata locks
on object to be changed acquiring this protection. This is
achieved by acquiring global IX metadata lock with statement
duration. This lock is automatically released at the end of
statement execution.
- Changed implementation of CREATE/DROP PROCEDURE/FUNCTION not
to release metadata locks and thus protection against of GRL
in the middle of statement execution.
- Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request and to the fact that
savepoints for MDL subsystem are now represented by
MDL_savepoint class.
sql/sql_prepare.cc:
Adjusted code to the to the fact that savepoints for MDL
subsystem are now represented by MDL_savepoint class.
sql/sql_rename.cc:
With new GRL implementation there is no need to explicitly
acquire protection against GRL before renaming tables.
This happens automatically in code which acquires metadata
locks on tables being renamed.
sql/sql_show.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request and to the fact that
savepoints for MDL subsystem are now represented by
MDL_savepoint class.
sql/sql_table.cc:
- With new GRL implementation there is no need to explicitly
acquire protection against GRL before dropping tables.
This happens automatically in code which acquires metadata
locks on tables being dropped.
- Changed mysql_alter_table() not to release lock on new table
name explicitly and to rely on automatic release of locks
at the end of statement instead. This was necessary since
now MDL_context::release_lock() is supported only for locks
for explicit duration.
sql/sql_trigger.cc:
With new GRL implementation there is no need to explicitly
acquire protection against GRL before changing table triggers.
This happens automatically in code which acquires metadata
locks on tables which triggers are to be changed.
sql/sql_update.cc:
Fix bug exposed by GRL testing. During prepare phase acquire
only S metadata locks instead of SW locks to keep prepare of
multi-UPDATE compatible with concurrent LOCK TABLES WRITE
and global read lock.
sql/sql_view.cc:
With new GRL implementation there is no need to explicitly
acquire protection against GRL before creating view.
This happens automatically in code which acquires metadata
lock on view to be created.
sql/sql_yacc.yy:
LEX::protect_against_global_read_lock member is no longer
necessary since protection against GRL is automatically
taken by code acquiring metadata locks/opening tables.
sql/table.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/table.h:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/transaction.cc:
Replaced custom implementation of global read lock with
one based on metadata locks. Consequently when doing
commit instead of calling method of Global_read_lock
class to acquire protection against GRL we simply acquire
IX in COMMIT namespace.
Also adjusted code to the fact that MDL savepoint is now
represented by MDL_savepoint class.
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.