mariadb/mysql-test/suite/binlog/t/binlog_truncate_active_log.inc

connect(master1,localhost,root,,);
connect(master2,localhost,root,,);
connect(master3,localhost,root,,);
connect(master4,localhost,root,,);

--connection default

# First to commit few transactions
INSERT INTO t  VALUES (10);
INSERT INTO tm VALUES (10);

--connection master1
# Hold insert after write to binlog and before "run_commit_ordered" in engine
SET DEBUG_SYNC= "commit_before_get_LOCK_commit_ordered SIGNAL master1_ready WAIT_FOR signal_never_arrives";
--send_eval $query1

--connection master2
SET DEBUG_SYNC= "now WAIT_FOR master1_ready";
SET DEBUG_SYNC= "commit_before_get_LOCK_after_binlog_sync SIGNAL master2_ready";
--send_eval $query2

--connection master3
SET DEBUG_SYNC= "now WAIT_FOR master2_ready";
SELECT @@global.gtid_binlog_pos as 'Before the crash';

--connection master4
# Simulate prepared & not-logged trx; it will never recover.
SET DEBUG_SYNC= "ha_commit_trans_before_log_and_order SIGNAL master4_ready WAIT_FOR signal_never_arrives";
--send INSERT INTO t4  VALUES (13)

--connection master3
SET DEBUG_SYNC= "now WAIT_FOR master4_ready";
SELECT @@global.gtid_binlog_pos as 'Before the crash and never logged trx';

--connection default
--source include/kill_mysqld.inc
--disconnect master1
--disconnect master2
--disconnect master3
--disconnect master4

#
# Server restart
#
--let $restart_parameters= --init-rpl-role=SLAVE --sync-binlog=1 --log-warnings=3
--source include/start_mysqld.inc

# Check error log for a successful truncate message.
--let $log_error_ = $MYSQLTEST_VARDIR/log/mysqld.1.err

--let SEARCH_FILE=$log_error_
--let SEARCH_PATTERN=Successfully truncated.*to remove transactions starting from GTID $truncate_gtid_pos

--source include/search_pattern_in_file.inc

--echo Pre-crash binlog file content:
--let $binlog_file= query_get_value(show binary logs, Log_name, $binlog_file_index)
--source include/show_binlog_events.inc

SELECT @@global.gtid_binlog_pos as 'After the crash';
--echo "One row should be present in table 't'"
SELECT * FROM t;
--echo "No row should be present in table 't4'"
SELECT * FROM t4;

# prepare binlog file index for the next test
--inc $binlog_file_index

# Local cleanup
DELETE FROM t;
MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method existed to clear out all prepared transactions with --tc-heuristic-recover=rollback does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made to concern of the slave semisync role of post-crash restarted server. No changes in behavior is done to the "normal" binloggging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) such transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. As after master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit. 2020-04-09 17:15:45 +02:00			`connect(master1,localhost,root,,);`
			`connect(master2,localhost,root,,);`
			`connect(master3,localhost,root,,);`
MDEV-28461 semisync-slave server recovery fails to rollback prepared transaction that is not in binlog. Post-crash recovery of --rpl-semi-sync-slave-enabled server failed to recognize a transaction in-doubt that needed rolled back. A prepared-but-not-in-binlog transaction gets committed instead to possibly create inconsistency with a master (e.g the way it was observed in the bug report). The semisync recovery is corrected now with initializing binlog coordinates of any transaction in-doubt to the maximum offset which is unreachable. In effect when a prepared transaction that is not found in binlog it will be decided to rollback because it's guaranteed to reside in a truncated tail area of binlog. Mtr tests are reinforced to cover the described scenario. 2022-05-11 12:12:48 +02:00			`connect(master4,localhost,root,,);`
MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method existed to clear out all prepared transactions with --tc-heuristic-recover=rollback does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made to concern of the slave semisync role of post-crash restarted server. No changes in behavior is done to the "normal" binloggging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) such transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. As after master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit. 2020-04-09 17:15:45 +02:00
			`--connection default`

			`# First to commit few transactions`
			`INSERT INTO t VALUES (10);`
			`INSERT INTO tm VALUES (10);`

			`--connection master1`
			`# Hold insert after write to binlog and before "run_commit_ordered" in engine`
			`SET DEBUG_SYNC= "commit_before_get_LOCK_commit_ordered SIGNAL master1_ready WAIT_FOR signal_never_arrives";`
			`--send_eval $query1`

			`--connection master2`
			`SET DEBUG_SYNC= "now WAIT_FOR master1_ready";`
			`SET DEBUG_SYNC= "commit_before_get_LOCK_after_binlog_sync SIGNAL master2_ready";`
			`--send_eval $query2`

			`--connection master3`
			`SET DEBUG_SYNC= "now WAIT_FOR master2_ready";`
			`SELECT @@global.gtid_binlog_pos as 'Before the crash';`

MDEV-28461 semisync-slave server recovery fails to rollback prepared transaction that is not in binlog. Post-crash recovery of --rpl-semi-sync-slave-enabled server failed to recognize a transaction in-doubt that needed rolled back. A prepared-but-not-in-binlog transaction gets committed instead to possibly create inconsistency with a master (e.g the way it was observed in the bug report). The semisync recovery is corrected now with initializing binlog coordinates of any transaction in-doubt to the maximum offset which is unreachable. In effect when a prepared transaction that is not found in binlog it will be decided to rollback because it's guaranteed to reside in a truncated tail area of binlog. Mtr tests are reinforced to cover the described scenario. 2022-05-11 12:12:48 +02:00			`--connection master4`
			`# Simulate prepared & not-logged trx; it will never recover.`
			`SET DEBUG_SYNC= "ha_commit_trans_before_log_and_order SIGNAL master4_ready WAIT_FOR signal_never_arrives";`
			`--send INSERT INTO t4 VALUES (13)`

			`--connection master3`
			`SET DEBUG_SYNC= "now WAIT_FOR master4_ready";`
			`SELECT @@global.gtid_binlog_pos as 'Before the crash and never logged trx';`

MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method existed to clear out all prepared transactions with --tc-heuristic-recover=rollback does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made to concern of the slave semisync role of post-crash restarted server. No changes in behavior is done to the "normal" binloggging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) such transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. As after master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit. 2020-04-09 17:15:45 +02:00			`--connection default`
			`--source include/kill_mysqld.inc`
			`--disconnect master1`
			`--disconnect master2`
			`--disconnect master3`
MDEV-28461 semisync-slave server recovery fails to rollback prepared transaction that is not in binlog. Post-crash recovery of --rpl-semi-sync-slave-enabled server failed to recognize a transaction in-doubt that needed rolled back. A prepared-but-not-in-binlog transaction gets committed instead to possibly create inconsistency with a master (e.g the way it was observed in the bug report). The semisync recovery is corrected now with initializing binlog coordinates of any transaction in-doubt to the maximum offset which is unreachable. In effect when a prepared transaction that is not found in binlog it will be decided to rollback because it's guaranteed to reside in a truncated tail area of binlog. Mtr tests are reinforced to cover the described scenario. 2022-05-11 12:12:48 +02:00			`--disconnect master4`
MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method existed to clear out all prepared transactions with --tc-heuristic-recover=rollback does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made to concern of the slave semisync role of post-crash restarted server. No changes in behavior is done to the "normal" binloggging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) such transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. As after master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit. 2020-04-09 17:15:45 +02:00
			`#`
			`# Server restart`
			`#`
MDEV-33465: an option to enable semisync recovery The current semi-sync binlog fail-over recovery process uses rpl_semi_sync_slave_enabled==TRUE as its condition to truncate a primary server’s binlog, as it is anticipating the server to re-join a replication topology as a replica. However, for servers configured with both rpl_semi_sync_master_enabled=1 and rpl_semi_sync_slave_enabled=1, if a primary is just re-started (i.e. retaining its role as master), it can truncate its binlog to drop transactions which its replica(s) has already received and executed. If this happens, when the replica reconnects, its gtid_slave_pos can be ahead of the recovered primary’s gtid_binlog_pos, resulting in an error state where the replica’s state is ahead of the primary’s. This patch changes the condition for semi-sync recovery to truncate the binlog to instead use the configuration variable --init-rpl-role, when set to SLAVE. This allows for both rpl_semi_sync_master_enabled and rpl_semi_sync_slave_enabled to be set for a primary that is restarted, and no transactions will be lost, so long as --init-rpl-role is not set to SLAVE. Reviewed By: ============ Sergei Golubchik <serg@mariadb.com> 2024-02-29 19:51:09 +01:00			`--let $restart_parameters= --init-rpl-role=SLAVE --sync-binlog=1 --log-warnings=3`
MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method existed to clear out all prepared transactions with --tc-heuristic-recover=rollback does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made to concern of the slave semisync role of post-crash restarted server. No changes in behavior is done to the "normal" binloggging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) such transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. As after master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit. 2020-04-09 17:15:45 +02:00			`--source include/start_mysqld.inc`

			`# Check error log for a successful truncate message.`
			`--let $log_error_ = $MYSQLTEST_VARDIR/log/mysqld.1.err`

			`--let SEARCH_FILE=$log_error_`
			`--let SEARCH_PATTERN=Successfully truncated.*to remove transactions starting from GTID $truncate_gtid_pos`
MDEV-25962: binlog.binlog_truncate_multi_log_unsafe test fails in buildbot 1. sync_binlog=1 is needed to ensure that on flush binlog is available on disk. 2. Insert on 'connection master2' should wait for 'master1_ready' from 'connection master1'. Added sync_binlog=1 to all tests that got added as part of MDEV-21117 fix @mysql-test/suite/binlog/t/binlog_truncate_multi_log_unsafe.test @mysql-test/suite/binlog/t/binlog_truncate_multi_engine.test @mysql-test/suite/binlog/t/binlog_truncate_active_log.test @mysql-test/suite/binlog/t/binlog_truncate_multi_log.test 2021-08-13 10:31:26 +02:00
MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method existed to clear out all prepared transactions with --tc-heuristic-recover=rollback does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made to concern of the slave semisync role of post-crash restarted server. No changes in behavior is done to the "normal" binloggging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) such transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. As after master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit. 2020-04-09 17:15:45 +02:00			`--source include/search_pattern_in_file.inc`

			`--echo Pre-crash binlog file content:`
			`--let $binlog_file= query_get_value(show binary logs, Log_name, $binlog_file_index)`
			`--source include/show_binlog_events.inc`

			`SELECT @@global.gtid_binlog_pos as 'After the crash';`
			`--echo "One row should be present in table 't'"`
			`SELECT * FROM t;`
MDEV-28461 semisync-slave server recovery fails to rollback prepared transaction that is not in binlog. Post-crash recovery of --rpl-semi-sync-slave-enabled server failed to recognize a transaction in-doubt that needed rolled back. A prepared-but-not-in-binlog transaction gets committed instead to possibly create inconsistency with a master (e.g the way it was observed in the bug report). The semisync recovery is corrected now with initializing binlog coordinates of any transaction in-doubt to the maximum offset which is unreachable. In effect when a prepared transaction that is not found in binlog it will be decided to rollback because it's guaranteed to reside in a truncated tail area of binlog. Mtr tests are reinforced to cover the described scenario. 2022-05-11 12:12:48 +02:00			`--echo "No row should be present in table 't4'"`
			`SELECT * FROM t4;`
MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method existed to clear out all prepared transactions with --tc-heuristic-recover=rollback does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made to concern of the slave semisync role of post-crash restarted server. No changes in behavior is done to the "normal" binloggging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) such transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. As after master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit. 2020-04-09 17:15:45 +02:00
			`# prepare binlog file index for the next test`
			`--inc $binlog_file_index`

			`# Local cleanup`
			`DELETE FROM t;`