If the mysql.gtid_slave_pos table is not available, we cannot load nor update
the current GTID position persistently. This can happen eg. after an upgrade,
before mysql_upgrade_db is run, or if the table is InnoDB and the server is
restarted without the InnoDB storage engine enabled.
Before, replication always failed to start if the table was unavailable. With
this patch, we try to continue with old-style replication, after suitable
complaints in the error log. In strict mode, or if slave is configured to use
GTID, slave still refuses to start.
Fix problems related to reconnect. When we need to reconnect (ie. explict
stop/start of just the IO thread by user, or automatic reconnect due to
loosing network connection with the master), it is a bit complex to correctly
resume at the right point without causing duplicate or missing events in the
relay log. The previous code had multiple problems in this regard.
With this patch, the problem is solved as follows. The IO thread keeps track
(in memory) of which GTID was last queued to the relay log. If it needs to
reconnect, it resumes at that GTID position. It also counts number of events
received within the last, possibly partial, event group, and skips the same
number of events after a reconnect, so that events already enqueued before the
reconnect are not duplicated.
(There is no need to keep any persistent state; whenever we restart slave
threads after both of them being stopped (such as after server restart), we
erase the relay logs and start over from the last GTID applied by SQL thread.
But while the SQL thread is running, this patch is needed to get correct relay
log).
When @@GLOBAL.gtid_strict_mode=1, then certain operations result
in error that would otherwise result in out-of-order binlog files
between servers.
GTID sequence numbers are now allocated independently per domain;
this results in less/no holes in GTID sequences, increasing the
likelyhood that diverging binlogs will be caught by the slave when
GTID strict mode is enabled.
Change of user interface to be more logical and more in line with expectations
to work similar to old-style replication.
User can now explicitly choose in CHANGE MASTER whether binlog position is
taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos=
slave_pos) when slave connects to master.
@@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can
be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and
@@gtid_current_pos (a combination of the two, most recent GTID within each
domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match.
This fixes MDEV-4474.
Implement START SLAVE UNTIL master_gtid_pos = "<GTID position>".
Add test cases, including a test showing how to use this to promote
a new master among a set of slaves.
Solaris fixes:
- Fixed that wait_timeout_func and wait_timeout tests works on solaris
- We have to compile without NO_ALARM on Solaris as Solaris doesn't support timeouts on sockets with setsockopt(.. SO_RCVTIMEO).
- Fixed that compile-solaris-amd64-debug works (before that we got a wrong ELF class: ELFCLASS64 on linkage)
- Added missing sync_with_master
Other bug fixes:
- Free memory for rpl_global_gtid_binlog_state before exit() to avoid 'accessing uninitalized mutex' error.
BUILD/FINISH.sh:
Fixed issues on Solaris with ksh
BUILD/compile-solaris-amd64-debug:
Added missing -m64 flag
configure.cmake:
We have to compile without NO_ALARM on Solaris as Solaris doesn't support timeouts on sockets with setsockopt(.. SO_RCVTIMEO)
mysql-test/suite/rpl/t/rpl_gtid_mdev4473.test:
- Added missing sync_with_master (fix by knielsen)
sql-common/client.c:
Added () to get rid of compiler warning
sql/item_strfunc.cc:
Fixed compiler warning
sql/log.cc:
Free memory for static variable rpl_global_gtid_binlog_state before exit()
- If we are compiling with safemalloc, we would try to call sf_free() for some members after sf_terminate() was called, which would result of trying to access the uninitalized mutex 'sf_mutex'
sql/multi_range_read.cc:
Fixed compiler warnings of converting double to ulong.
sql/opt_range.cc:
Fixed compiler warnings of converting double to ulong or uint
- Better to have all variables that can be number of rows as 'ha_rows'
sql/rpl_gtid.cc:
Added rpl_binlog_state::free() to be able to free memory for static objects before exit()
sql/rpl_gtid.h:
Added rpl_binlog_state::free() to be able to free memory for static objects before exit()
sql/set_var.cc:
Fixed compiler warning
sql/sql_join_cache.cc:
Fixed compiler warnings of converting double to uint
sql/sql_show.cc:
Added cast to get rid of compiler warning
sql/sql_statistics.cc:
Remove code that didn't do anything.
(store_record() with record[0] is a no-op)
storage/xtradb/os/os0file.c:
Added __attribute__ ((unused))
support-files/compiler_warnings.supp:
Ignore warnings from atomic_add_64_nv
(was not able to fix this with a cast as the macro is a bit different between systems)
vio/viosocket.c:
Added more DBUG_PRINT
Add tests crashing the slave in the middle of replication and checking that
replication picks-up again on restart in a crash-safe way.
Fix silly code that causes crash by inserting uninitialised data into a hash.
Implement test case rpl_gtid_stop_start.test to test normal stop and restart
of master and slave mysqld servers.
Fix a couple bugs found with the test:
- When InnoDB is disabled (no XA), the binlog state was not read when master
mysqld starts.
- Remove old code that puts a bogus D-S-0 into the initial binlog state, it
is not correct in current design.
- Fix memory leak in gtid_find_binlog_file().
Fix things so that a master can switch with MASTER_GTID_POS=AUTO to a slave
that was previously running with log_slave_updates=0, by looking into the
slave replication state on the master when the slave requests something not
present in the binlog.
Be a bit more strict about what position the slave can ask for, to avoid some
easy-to-hit misconfiguration errors.
Start over with seq_no counter when RESET MASTER.