The current semi-sync binlog fail-over recovery process uses
rpl_semi_sync_slave_enabled==TRUE as its condition to truncate a
primary server’s binlog, as it is anticipating the server to re-join
a replication topology as a replica. However, for servers configured
with both rpl_semi_sync_master_enabled=1 and
rpl_semi_sync_slave_enabled=1, if a primary is just re-started (i.e.
retaining its role as master), it can truncate its binlog to drop
transactions which its replica(s) has already received and executed.
If this happens, when the replica reconnects, its gtid_slave_pos can
be ahead of the recovered primary’s gtid_binlog_pos, resulting in an
error state where the replica’s state is ahead of the primary’s.
This patch changes the condition for semi-sync recovery to truncate
the binlog to instead use the configuration variable
--init-rpl-role, when set to SLAVE. This allows for both
rpl_semi_sync_master_enabled and rpl_semi_sync_slave_enabled to be
set for a primary that is restarted, and no transactions will be
lost, so long as --init-rpl-role is not set to SLAVE.
Reviewed By:
============
Sergei Golubchik <serg@mariadb.com>
that is not in binlog.
Post-crash recovery of --rpl-semi-sync-slave-enabled server
failed to recognize a transaction in-doubt that needed rolled back.
A prepared-but-not-in-binlog transaction gets committed instead
to possibly create inconsistency with a master (e.g the way it was observed
in the bug report).
The semisync recovery is corrected now with initializing binlog coordinates
of any transaction in-doubt to the maximum offset which is
unreachable.
In effect when a prepared transaction that is not found in binlog
it will be decided to rollback because it's guaranteed to reside
in a truncated tail area of binlog.
Mtr tests are reinforced to cover the described scenario.
1. sync_binlog=1 is needed to ensure that on flush binlog is available on
disk.
2. Insert on 'connection master2' should wait for 'master1_ready' from
'connection master1'.
Added sync_binlog=1 to all tests that got added as part of MDEV-21117 fix
@mysql-test/suite/binlog/t/binlog_truncate_multi_log_unsafe.test
@mysql-test/suite/binlog/t/binlog_truncate_multi_engine.test
@mysql-test/suite/binlog/t/binlog_truncate_active_log.test
@mysql-test/suite/binlog/t/binlog_truncate_multi_log.test
Problem:
=======
When the semisync master is crashed and restarted as slave it could
recover transactions that former slaves may never have seen.
A known method existed to clear out all prepared transactions
with --tc-heuristic-recover=rollback does not care to adjust
binlog accordingly.
Fix:
===
The binlog-based recovery is made to concern of the slave semisync role of
post-crash restarted server.
No changes in behavior is done to the "normal" binloggging server
and the semisync master.
When the restarted server is configured with
--rpl-semi-sync-slave-enabled=1
the refined recovery attempts to roll back prepared transactions
and truncate binlog accordingly.
In case of a partially committed (that is committed at least
in one of the engine participants) such transaction gets committed.
It's guaranteed no (partially as well) committed transactions
exist beyond the truncate position.
In case there exists a non-transactional replication event
(being in a way a committed transaction) past the
computed truncate position the recovery ends with an error.
As after master crash and failover to slave, the demoted-to-slave
ex-master must be ready to face and accept its own (generated by)
events, without generally necessary --replicate-same-server-id.
So the acceptance conditions are relaxed for the semisync slave
to accept own events without that option.
While gtid_strict_mode ON ensures no duplicate transaction can be
(re-)executed the master_use_gtid=none slave has to be
configured with --replicate-same-server-id.
*NOTE* for reviewers.
This patch does not handle the user XA which is done
in next git commit.