When rolling back and retrying a transaction in parallel replication, don't
release the domain ownership (for --gtid-ignore-duplicates) as part of the
rollback. Otherwise another master connection could grab the ownership and
double-apply the transaction in parallel with the retry.
Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
AKA rpl.rpl_parallel, binlog_encryption.rpl_parallel fails in
buildbot with timeout in include
A replication parallel worker thread can deadlock with another
connection running SHOW SLAVE STATUS. That is, if the replication
worker thread is in do_gco_wait() and is killed, it will already
hold the LOCK_parallel_entry, and during error reporting, try to
grab the err_lock. SHOW SLAVE STATUS, however, grabs these locks in
reverse order. It will initially grab the err_lock, and then try to
grab LOCK_parallel_entry. This leads to a deadlock when both threads
have grabbed their first lock without the second.
This patch implements the MDEV-31894 proposed fix to optimize the
workers_idle() check to compare the last in-use relay log’s
queued_count==dequeued_count for idleness. This removes the need for
workers_idle() to grab LOCK_parallel_entry, as these values are
atomically updated.
Huge thanks to Kristian Nielsen for diagnosing the problem!
Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Andrei Elkin <andrei.elkin@mariadb.com>
The immediate bug was caused by a failure to recognize a correct
position to stop the slave applier run in optimistic parallel mode.
There were the following set of issues that the analysis unveil.
1 incorrect estimate for the event binlog position passed to
is_until_satisfied
2 wait for workers to complete by the driver thread did not account non-group events
that could be left unprocessed and thus to mix up the last executed
binlog group's file and position:
the file remained old and the position related to the new rotated file
3 incorrect 'slave reached file:pos' by the parallel slave report in the error log
4 relay log UNTIL missed out the parallel slave branch in
is_until_satisfied.
The patch addresses all of them to simplify logics of log change
notification in either the master and relay-log until case.
P.1 is addressed with passing the event into is_until_satisfied()
for proper analisis by the function.
P.2 is fixed by changes in handle_queued_pos_update().
P.4 required removing relay-log change notification by workers.
Instead the driver thread updates the notion of the current relay-log
fully itself with aid of introduced
bool Relay_log_info::until_relay_log_names_defer.
An extra print out of the requested until file:pos is arranged
with --log-warning=3.
This patch changes how old rows in mysql.gtid_slave_pos* tables are deleted.
Instead of doing it as part of every replicated transaction in
record_gtid(), it is done periodically (every @@gtid_cleanup_batch_size
transaction) in the slave background thread.
This removes the deletion step from the replication process in SQL or worker
threads, which could speed up replication with many small transactions. It
also decreases contention on the global mutex LOCK_slave_state. And it
simplifies the logic, eg. when a replicated transaction fails after having
deleted old rows.
With this patch, the deletion of old GTID rows happens asynchroneously and
slightly non-deterministic. Thus the number of old rows in
mysql.gtid_slave_pos can temporarily exceed @@gtid_cleanup_batch_size. But
all old rows will be deleted eventually after sufficiently many new GTIDs
have been replicated.
This would happen especially in optimistic parallel replication, where there
is a good chance that a transaction will be rolled back (due to conflicts)
after it has executed record_gtid(). If the transaction did any deletions of
old rows as part of record_gtid(), those deletions will be undone as well.
And the code did not properly ensure that the deletions would be re-tried.
This patch makes record_gtid() remember the list of deletions done as part
of a transaction. Then in rpl_slave_state::update() when the changes have
been committed, we discard the list. However, in case of error and rollback,
in cleanup_context() we will instead put the list back into
rpl_global_gtid_slave_state so that the deletions will be re-tried later.
Probably fixes part of the cause of MDEV-12147 as well.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
- Use microsecond_interval_timer() to calculate time for applying row events.
(Faster execution)
- Removed return value for set_row_stmt_start_timestamp()
as it was never used.
Intermediate commit.
Keep track of which mysql.gtid_slave_posXXX tables are available for each
engine, by searching for all tables in the mysql schema with names that
start with "gtid_slave_pos".
The list is computed at server start when the GTID position is loaded, and
it is re-computed on every START SLAVE command. This way, the DBA can
manually add a table for a new engine, and it will be automatically picked
up on next START SLAVE, so a full server restart is not needed.
The list is not yet actually used in the code.
Annotate_rows_log_event again. When a new annotate event comes,
the server applies it first (which backs up thd->query_string),
then frees the old annotate event, if any. Normally there isn't.
But with sub-statements (e.g. triggers) new annotate event comes
before the first one is freed, so the second event backs up
thd->query_string that was set by the first annotate event. Then
the first event is freed, together with its query string. And then
the second event restores thd->query_string to this freed memory.
Fix: free old annotate event before applying the new one.
Description:
============
If you have a relay log index file that has ended up with
some relay log files that do not exists, then RESET SLAVE
ALL is not enough to get back to a clean state.
Analysis:
=========
In the bug scenario slave server is in stopped state and
some of the relay logs got deleted but the relay log index
file is not updated.
During slave server restart replication initialization fails
as some of the required relay logs are missing. User
executes RESET SLAVE/RESET SLAVE ALL command to start a
clean slave. As per the documentation RESET SLAVE command
clears the master info and relay log info repositories,
deletes all the relay log files, and starts a new relay log
file. But in a scenario where the slave server's
Relay_log_info object is not initialized slave will not
purge the existing relay logs. Hence the index file still
remains in a bad state. Users will not be able to start
the slave unless these files are cleared.
Fix:
===
RESET SLAVE/RESET SLAVE ALL commands should do the cleanup
even in a scenario where Relay_log_info object
initialization failed.
Backported a flag named 'error_on_rli_init_info' which is
required to identify slave's Relay_log_info object
initialization failure. This flag exists in MySQL-5.6
onwards as part of BUG#14021292 fix.
During RESET SLAVE/RESET SLAVE ALL execution this flag
indicates the Relay_log_info initialization failure.
In such a case open the relay log index/relay log files
and do the required clean up.
Merge feature into 10.2 from feature branch.
Delayed replication adds an option
CHANGE MASTER TO master_delay=<seconds>
Replication will then delay applying events with that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.
Feature is ported from MySQL source tree.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
The original MySQL patch left some refactoring todo's, possibly
because of known conflicts with other parallel development (like
info-repository feature perhaps).
This patch fixes those todos/refactorings.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Initial merge of delayed replication from MySQL git.
The code from the initial push into MySQL is merged, and the
associated test case passes. A number of tasks are still pending:
1. Check full test suite run for any regressions or .result file updates.
2. Extend the feature to also work for parallel replication.
3. There are some todo-comments about future refactoring left from
MySQL, these should be located and merged on top.
4. There are some later related MySQL commits, these should be checked
and merged. These include:
e134b9362ba0b750d6ac1b444780019622d14aa5
b38f0f7857c073edfcc0a64675b7f7ede04be00f
fd2b210383358fe7697f201e19ac9779879ba72a
afc397376ec50e96b2918ee64e48baf4dda0d37d
5. The testcase from MySQL relies heavily on sleep and timing for
testing, and seems likely to sporadically fail on heavily loaded test
servers in buildbot or distro build farms.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
When a deadlock kill is detected inside the storage engine, the kill
is not done immediately, to avoid calling back into the storage engine
kill_query method with various lock subsystem mutexes held. Instead the
kill is queued and done later by a slave background thread.
This patch in preparation for fixing TokuDB optimistic parallel
replication, as well as for removing locking hacks in InnoDB/XtraDB in
10.2.
Signed-off-by: Kristian Nielsen <knielsen at knielsen-hq.org>
mysqld maintains a list of TABLE objects for all temporary
tables created within a session in THD. Here each table is
represented by a TABLE object.
A query referencing a particular temporary table for more
than once, however, failed with ER_CANT_REOPEN_TABLE error
because a TABLE_SHARE was allocate together with the TABLE,
so temporary tables always had only one TABLE per TABLE_SHARE.
This patch lift this restriction by separating TABLE and
TABLE_SHARE objects and storing TABLE_SHAREs for temporary
tables in a list in THD, and TABLEs in a list within their
respective TABLE_SHAREs.
- Avoid some realloc() during startup
- Ensure that file_key_management_plugin frees it's memory early, even if
it's linked statically.
- Fixed compiler warnings from unused variables and missing destructors
- Fixed wrong indentation
This includes fixing all utilities to not have any memory leaks,
as safemalloc warnings stopped tests from passing on MacOSX.
- Ensure that all clients takes character-set-dir, as the
libmysqlclient library will use it.
- mysql-test-run now passes character-set-dir to all external clients.
- Changed dynstr_free() so that it can be called twice (made freeing code easier)
- Changed rpl_global_gtid_slave_state to be allocated dynamicly as it
includes a mutex that needs to be initizlied/destroyed before my_end() is called.
- Removed rpl_slave_state::init() and rpl_slave_stage::deinit() as
their job are better handling by constructor and delete.
- Print alias instead of table_name in check_duplicate_key as
table_name may have been converted to lower case.
Other things:
- Fixed a case in time_to_datetime_with_warn() where we where
using && instead of & in tests
Before, the Seconds_behind_master was updated already when an event
was queued for a worker thread to execute later. This might lead users
to interpret a low value as the slave being almost up to date with the
master, while in reality there might still be lots and lots of events
still queued up waiting to be applied by the slave.
See https://lists.launchpad.net/maria-developers/msg08958.html for
more detailed discussions.
commit f1abd015dc
Author: Andrei Elkin <aelkin@mysql.com>
Date: Thu Nov 12 17:10:19 2009 +0200
Bug #47210 first execution of "start slave until" stops too early