MASTER_GTID_WAIT() is similar to MASTER_POS_WAIT(), but works with a
GTID position rather than an old-style filename/offset.
@@LAST_GTID gives the GTID assigned to the last transaction written
into the binlog.
Together, the two can be used by applications to obtain the GTID of
an update on the master, and then do a MASTER_GTID_WAIT() for that
position on any read slave where it is important to get results that
are caught up with the master at least to the point of the update.
The implementation of MASTER_GTID_WAIT() is implemented in a way
that tries to minimise the performance impact on the SQL threads,
even in the presense of many waiters on single GTID positions (as
from @@LAST_GTID).
There were some places where insufficient locking between
parallel threads could cause invalid memory accesses and
possibly other grief.
This patch adds the missing locking, and moves the locking
into the struct rpl_binlog_state methods to make it easier
to see that proper locking is in place everywhere.
Problem:
When log_warnings is greater than 1, master prints binlog
dump thread information in mysqld.1.err file.
The information contains slave server id, binlog file and
binlog position. The slave server id is uint32 and the print
format was wrongly specifified (%d instead of %u).
Hence a server id which is more than 2 billion is getting
printed with a negative value.
Eg: Start binlog_dump to slave_server(-1340259414),
pos(mysql-bin.001663, 325187493)
Fix: Changed the uint32 format to %u.
Implement @@gtid_binlog_state. This is the internal state of the binlog
(most recent GTID logged for every domain_id and server_id). This allows
to save the state before RESET MASTER and restore it afterwards.
The main bug here was the following situation:
Suppose we set up a completely new master2 as an extra multi-master to an
existing slave that already has a different master1 for domain_id=0. When the
slave tries to connect to master2, master2 will not have anything that slave
requests in domain_id=0, but that is fine as master2 is supposedly meant to
serve eg. domain_id=1. (This is MDEV-4485).
But suppose that master2 then actually starts sending events from
domain_id=0. In this case, the fix for MDEV-4485 was incomplete, and the code
would fail to give the error that the position requested by the slave in
domain_id=0 was missing from the binlogs of master2. This could lead to lost
events or completely wrong replication.
The patch for this bug fixes this issue.
In addition, it cleans up the code a bit, getting rid of the fake_gtid_hash in
the code. And the error message when slave and master have diverged due to
alternate future is clarified, as requested in the bug description.
When a new master is provisioned that does not have any old binlogs,
the @@gtid_slave_pos is used to know where in the GTID history the
provisioning happened. A slave is allowed to connect at the point of
this value of @@gtid_slave_pos, even if that GTID is not in the
binlogs on the new master.
The code to handle this case when the binlog on the newly provisioned
master is completely empty was just wrong (couple of typos). Clearly it
had never been tested ... :-/
When a new master is provisioned that does not have any old binlogs,
the @@gtid_slave_pos is used to know where in the GTID history the
provisioning happened. A slave is allowed to connect at the point of
this value of @@gtid_slave_pos, even if that GTID is not in the
binlogs on the new master.
But --gtid-strict-mode did not correctly handle this case. When strict
mode was enabled, an attempt to connect at the position would cause an
error about holes in the binlog, which is not correct.
This patch adds a hash of GTIDs that need to be treated specially by
GTID strict mode to deal correctly with this case.
Now whenever we reach the GTID point requested from the slave (when using GTID
position to connect), we send a fake Gtid_list event. This event is used by
the slave to know the current old-style position for MASTER_POS_WAIT(), and
later the similar binlog position for MASTER_GTID_WAIT().
Without this fake event, if the slave is already fully up-to-date with the
master, there may be no events sent at the given position for an indeterminate
time.
There was some old code that cleared the position in CHANGE MASTER,
it was forgotten to be removed.
In addition, add code that saves/restores the old-style position
when we nuke the old relay logs as part of GTID slave start.
Normally we will not use these, but it could be useful in case
the GTID connect fails and user wants to go back to the old-style
coordinates.
Fix problems related to reconnect. When we need to reconnect (ie. explict
stop/start of just the IO thread by user, or automatic reconnect due to
loosing network connection with the master), it is a bit complex to correctly
resume at the right point without causing duplicate or missing events in the
relay log. The previous code had multiple problems in this regard.
With this patch, the problem is solved as follows. The IO thread keeps track
(in memory) of which GTID was last queued to the relay log. If it needs to
reconnect, it resumes at that GTID position. It also counts number of events
received within the last, possibly partial, event group, and skips the same
number of events after a reconnect, so that events already enqueued before the
reconnect are not duplicated.
(There is no need to keep any persistent state; whenever we restart slave
threads after both of them being stopped (such as after server restart), we
erase the relay logs and start over from the last GTID applied by SQL thread.
But while the SQL thread is running, this patch is needed to get correct relay
log).
The idea in the code was to protect the user that tries to connect a slave
to a master with completely different domains than what was intended. If
none of the domains in the start position are present at all in the master
binlog, we gave an error.
However, this is a stupid idea. Because when a slave connects to a master
to start replication from the very start of binlogs - such as when setting
up new master->slave servers from scratch - there will be just this
situation, the requested slave position is empty for all the domains in the
master's binlog.
So the code that gives this error is wrong, and the solution is simply to
remove it.
When @@GLOBAL.gtid_strict_mode=1, then certain operations result
in error that would otherwise result in out-of-order binlog files
between servers.
GTID sequence numbers are now allocated independently per domain;
this results in less/no holes in GTID sequences, increasing the
likelyhood that diverging binlogs will be caught by the slave when
GTID strict mode is enabled.
Change of user interface to be more logical and more in line with expectations
to work similar to old-style replication.
User can now explicitly choose in CHANGE MASTER whether binlog position is
taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos=
slave_pos) when slave connects to master.
@@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can
be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and
@@gtid_current_pos (a combination of the two, most recent GTID within each
domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match.
This fixes MDEV-4474.
There was missing a check for THD::killed after THD::enter_cond(). This could
cause the binlog dump thread to miss the kill signal during server shutdown
and hang until it was force-closed.
Also fix a race in a test case that occasionally fails in Buildbot.
Implement START SLAVE UNTIL master_gtid_pos = "<GTID position>".
Add test cases, including a test showing how to use this to promote
a new master among a set of slaves.
Suppose binlog file X has in its Gtid_list_event: 0-1-3,0-2-5, and suppose the
slave requests to start replicating after 0-1-3.
In this case the bug was that master would start sending events from the start
of X. This is wrong, because 0-2-4 and 0-2-5 are contained in X-1, and are
needed by the slave. So these events were lost.
On the other hand, if the slave requested 0-2-5, then it _is_ correct to start
sending from the beginning of binlog file X, because 0-2-5 is the last GTID
logged in earlier binlogs. The difference is that 0-2-5 is the last of the
GTIDs in the Gtid_list_event. The problem was that the code did not check that
the matched GTID was the last one in the list.
Fixed by checking if the gtid requested by slave that matches a gtid in the
Gtid_list_event is the last event for that domain in the list. If not, go back
to a prior binlog to ensure all needed events are sent to slave.
mysql-test/include/show_events.inc:
Backport --let $binlog_file=LAST, used by MDEV-4473 test case.
- Calls to cleanup_load_tmpdir() could delete temporary files for another master connection
- Concurrent LOAD DATA commands from two master connections could use the same file name
Other bug fixes:
- Enlarge buffer for connection names with 'special characters' one can't store in filenames
Optimization:
- Don't do 'lower case' of connection names. We can use cmp_connection_name, where we already have the connection name in lower case.
mysql-test/suite/multi_source/load_data.result:
Test case for MDEV-4352
mysql-test/suite/multi_source/load_data.test:
Test case for MDEV-4352
sql/log_event.cc:
Fixed: MDEV-4352
- Calls to cleanup_load_tmpdir() could delete temporary files for another master connection
- Concurrent LOAD DATA commands from two master connections could use the same file name
The fix was to add the connection name (if one exists) to all slave temporary files used by LOAD DATA
sql/rpl_mi.cc:
Enlarge buffer for connection names with 'special characters' one can't store in filenames
Use mi->cmp_connection_name for connection file names.
sql/rpl_rli.cc:
Use mi->cmp_connection_name for connection file names.
sql/slave.cc:
Removed not needed empty line
sql/sql_const.h:
Added MAX_FILENAME_MBWIDTH to be able to calculate buffer length for connection_names stored in file names
sql/sql_repl.cc:
Use mi->cmp_connection_name for connection file names.
When the slave connects, the master skips binlog event groups
until it reaches the position requested by the slave. To
identify event groups, it needs to detect COMMIT events. But
this detection did not correctly handle binlog checksums, so
could incorrectly skip extra groups due to not detecting the
end of an event group.
Since event types can be >=128 and are read from a (possibly signed) char
pointer, we need to cast to unsigned char before extending to int, or we will
get an incorrect negative number. This was done in the main code path already,
but there is a rare case where we check for new events first without a lock
and then again with the lock. If the second check succeeds because a new event
turns up at just the right time, then we took a code path that was missing the
correct unsigned char cast, leading to incorrect handling of events for old
slave servers and possibly other grief.
(This was found from a sporadic failure in Buildbot of test case
rpl_mariadb_slave_capability).
The slave dump thread running on the master only checked thd->killed whenever
it reached the end of a binlog file, not between events. This could
unnecessarily delay server shutdown.
This was found by code inspection while tracking down some occasional "forcing
close of thread..." errors in Buildbot. Hopefully this will fix the failures,
but the fix is correct in any case.
Also increase the wait during server shutdown, 2 seconds is a bit tight in
case of heavy I/O stall, and it seems better to delay shutdown a bit than
force-kill threads unnecessarily.
Also fix some races in test cases that restart the mysqld server. The .expect
file should be changed with --append_file, --remove_file + --write_file
creates a short window where mysqld can error out due to .expect file missing.
Merge of 10.0-mdev26 feature tree into 10.0-base.
Global transaction ID is prepended to each event group in the binlog.
Slave connect can request to start from GTID position instead of specifying
file name/offset of master binlog. This facilitates easy switch to a new
master.
Slave GTID state is stored in a table mysql.rpl_slave_state, which can be
InnoDB to get crash-safe slave state.
GTID includes a replication domain ID, allowing to keep track of distinct
positions for each of multiple masters.
Replace CHANGE MASTER TO ... master_gtid_pos='xxx' with a new system
variable @@global.gtid_pos.
This is more logical; @@gtid_pos is global, not per-master, and it is not
affected by RESET SLAVE.
Also rename master_gtid_pos=AUTO to master_use_gtid=1, which again is more
logical.
Move combining slave and gtid binlog state into a separate function.
Make SHOW ALL SLAVES STATUS use the same function, so it shows the
same value used by slave connect.
Add a test case.
NO ERRORS REPORTED
Problem:
=======
Errors from my_b_fill are ignored. MYSQL_BIN_LOG::write_cache
code assumes that 0 returned from my_b_fill always means
end-of-cache, but that is incorrect. It can result in error
and the error is ignored. Other callers of my_b_fill don't
check for error: my_b_copy_to_file, maybe my_b_gets.
Fix:
===
An error handler is already present to check the "cache"
error that is reported during "MYSQL_BIN_LOG::write_cache"
call. Hence error handlers are added for "my_b_copy_to_file"
and "my_b_gets".
During my_b_fill() function call, when the cache read fails
info->error= -1 is set. Hence a check for "info->error"
is added for the above to callers upon their return.
mysys/mf_iocache2.c:
Added a check for "cache->error" and simulation of cache read failure
mysys/my_read.c:
Simulation of read failure
sql/log_event.cc:
Added debug simulation
sql/sql_repl.cc:
Added a check for cache error
Implement test case rpl_gtid_stop_start.test to test normal stop and restart
of master and slave mysqld servers.
Fix a couple bugs found with the test:
- When InnoDB is disabled (no XA), the binlog state was not read when master
mysqld starts.
- Remove old code that puts a bogus D-S-0 into the initial binlog state, it
is not correct in current design.
- Fix memory leak in gtid_find_binlog_file().
When slave requested to start at some GTID, and that GTID was the very
last event (within its replication domain) in some binlog file, we did
not allow the binlog dump thread on the master to start from the
beginning of a following binlog file. This is a problem, since the
binlog file containing the GTID is likely to be purged if the
replication domain is unused for long.
With this fix, if the Gtid list event at the start of a binlog file
contains exactly the GTID requested by the slave, we allow to start
the binlog dump thread from this file, taking care not to skip any
events from that domain in the file.
Fix MDEV-4329. When user does CHANGE MASTER TO
MASTER_GTID_POS='<explicit GTID state>', we check that this state
does not conflict with the binlog. But the code forgot to give an
error in the case where a domain was completely missing from the
requested position (eg. MASTER_GTID_POS='').
Fix MDEV-4278: Slave does not check that master understands GTID.
Now the slave will abort with a suitable error if an attempt is made to connect
with GTID to a master that does not support GTID.
Fix MDEV-4275 - I/O thread restart duplicates events in the relay log.
The first time we connect to master after CHANGE MASTER or restart, we connect
from the GTID position. But then subsequent reconnects or IO thread restarts
reconnect with the old-style file/offset binlog pos from where it left off at
last disconnect. This is necessary to avoid duplicate events in the relay
logs, as there is nothing that synchronises the SQL thread update of GTID
state (multiple threads in case of multi-source) with IO thread reconnects.
Test cases.
Some small cleanups and fixes.
Fix things so that a master can switch with MASTER_GTID_POS=AUTO to a slave
that was previously running with log_slave_updates=0, by looking into the
slave replication state on the master when the slave requests something not
present in the binlog.
Be a bit more strict about what position the slave can ask for, to avoid some
easy-to-hit misconfiguration errors.
Start over with seq_no counter when RESET MASTER.
Fix that CHANGE MASTER ... MASTER_GTID_POS="" works to start from the very
beginning of the binary log (with test case).
Fix that not finding the requested GTID position in master binlog results in
fatal error, not endless connect retry.
Remove the two-component form of GTID with implicit domain_id=0, as it
is likely to cause more confusion than help.
Give a better error for CHANGE MASTER ... MASTER_GTID_POS='gtid,gitd,...'
when two specified GTIDs have conflicting domain_id.