mariadb/mysql-test/suite
Brandon Nesterenko a394fc0270 MDEV-29981: Replica stops with "Found invalid event in binary log"
Replication can stop in error if a Heartbeat log event is sent to a
replica during rotation. There are two bugs at play:

  1. Prior to MDEV-30128 (added in 11.0), there is a bug when checking
     legacy events. When the replica rotates its relay logs, it
     initializes its Format_description_log_event with binlog version 3
     (this is hard-coded). So immediately after rotation (and until a
     new Format_description_log_event with binlog version 4 is sent
     from the master), the IO thread expects binlog version 3 (i.e. it
     will call queue_old_event() for incoming events). This invalidates
     any event sent with an event type higher than 14. In theory, we
     wouldn't expect any events to be sent between a rotate and the
     next format description log event, but if enough time passes
     between them, the primary will generate and send a Heartbeat event
     (of type 27). In such a case, the slave will see the heartbeat
     event of type 27, see that it is higher than 14, and raise an
     error mentioning 'Found invalid event in binary log', with the
     expected log coordinates of the new log (which are optimistically
     populated from the Rotate log event, not the new event).

  2. In all versions of MariaDB (including 11.0+), there is a bug when
     checking the state of a Heartbeat log event, in that it doesn't
     consider a rotated binary log. The check is meant to ensure that the
     heartbeat provided by the master (i.e. the state of the master) is
     greater than or equal to the state of the slave. In other words,
     it checks that the slave isn't ahead of the master. However, if
     the filename provided by the master heartbeat event is different
     than the filename saved for the slave's state, the check always
     fails. This is broken, because when the master rotates its logs,
     the new binary log file will have a different filename (i.e. an
     incremented index counter suffix). For example, if the master
     rotates its binary logs from master-bin.000002 to
     master-bin.000003, master-bin.000003 is ahead of
     master-bin.000002, but the slave will see a difference between the
     filenames and fail the check.

To fix the first problem, this patch disallows passing a heartbeat
event into queue_old_event (which is the source of the error, as it
tries to parse a heartbeat log event). This function (queue_old_event)
was removed with MDEV-30128, so bypassing it for heartbeat events is
inconsequential (the same bypass already exists for
Format_description_log_events, which are not supported in old binlog
file versions). Note that backporting all of MDEV-30128 was also
considered, but this targeted fix is less risky for GA releases.
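
As a rough illustration of the first fix, the dispatch decision could be sketched as below. This is not the server's code: `choose_queue_path`, `QueuePath` and the constant names are hypothetical, and the type code 15 for the format description event is an assumption (the text above only states 27 for Heartbeat and 14 as the legacy upper bound).

```cpp
#include <cassert>
#include <cstdint>

// Event type codes. 27 (Heartbeat) and 14 (highest type accepted by the
// legacy path) come from the commit message; 15 for the format description
// event is an assumption made for this sketch.
constexpr std::uint8_t kMaxLegacyEventType = 14;
constexpr std::uint8_t kFormatDescriptionEvent = 15;
constexpr std::uint8_t kHeartbeatEvent = 27;

// Hypothetical result type: which path the IO thread would take.
enum class QueuePath { Legacy, Modern, Invalid };

// Hypothetical helper modelling the decision, not the server's function.
QueuePath choose_queue_path(int binlog_version, std::uint8_t event_type) {
  // Events that must never reach the legacy parser.
  const bool bypass_legacy = event_type == kFormatDescriptionEvent ||
                             event_type == kHeartbeatEvent;
  if (binlog_version < 4 && !bypass_legacy) {
    // The legacy path (queue_old_event in the text above) rejects any type
    // above 14, which surfaces as "Found invalid event in binary log".
    return event_type > kMaxLegacyEventType ? QueuePath::Invalid
                                            : QueuePath::Legacy;
  }
  return QueuePath::Modern;
}

// Usage: the state right after relay log rotation (version hard-coded to 3).
inline void demo_queue_path() {
  // With the bypass, a heartbeat no longer hits the legacy parser.
  assert(choose_queue_path(3, kHeartbeatEvent) == QueuePath::Modern);
  // Other high-numbered types are still rejected by the legacy path.
  assert(choose_queue_path(3, 20) == QueuePath::Invalid);
}
```

The point of the sketch is only that the heartbeat is exempted before the version-3 branch is taken, in the same way the format description event is.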

To fix the second problem, we simply ignore heartbeat events on the
slave if the filenames don't match. This is because during rotation,
it can appear that the slave is ahead of the master, which breaks the
validity of the check (i.e. the check exists to ensure the slave is
not ahead of the master).
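
The intended behaviour of the heartbeat check after the second fix can be sketched as follows. This is a minimal model under stated assumptions, not the patch itself: `LogCoordinates`, `HeartbeatVerdict` and `check_heartbeat` are hypothetical names, and positions are compared only when both sides refer to the same binary log file.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical pair of binary log coordinates (file name and offset).
struct LogCoordinates {
  std::string file;
  std::uint64_t pos;
};

// Hypothetical outcome of validating a heartbeat on the replica.
enum class HeartbeatVerdict { Accept, Ignore, Error };

// Hypothetical helper: validate a heartbeat against the replica's state.
HeartbeatVerdict check_heartbeat(const LogCoordinates& replica,
                                 const LogCoordinates& heartbeat) {
  // Different file names mean a rotation is in flight; positions in
  // different files are not comparable, so the heartbeat is ignored.
  if (heartbeat.file != replica.file) {
    return HeartbeatVerdict::Ignore;
  }
  // Same file: the master must be at or beyond the replica's position.
  if (heartbeat.pos < replica.pos) {
    return HeartbeatVerdict::Error;
  }
  return HeartbeatVerdict::Accept;
}

// Usage: the rotation example from the text above.
inline void demo_check_heartbeat() {
  const LogCoordinates replica{"master-bin.000002", 1024};
  const LogCoordinates after_rotate{"master-bin.000003", 4};
  // Before the fix this case failed the check; here it is simply ignored.
  assert(check_heartbeat(replica, after_rotate) == HeartbeatVerdict::Ignore);
}
```

Ignoring the event rather than comparing index suffixes keeps the check conservative: it never rejects a heartbeat purely because of a rotation, and it still catches a replica that is ahead within the same file.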

Additionally, note that this patch restores a heartbeat check that was
incorrectly removed in 780db8e252.

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
2025-08-22 15:04:02 -06:00
archive Merge branch '10.6' into '10.11' 2025-04-16 03:34:40 +02:00
atomic Merge branch '10.6' into 10.11 2025-04-26 10:47:03 +02:00
binlog cleanup: select ... into tests 2025-07-17 09:18:18 +02:00
binlog_encryption cleanup: select ... into tests 2025-07-17 09:18:18 +02:00
client
compat Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
csv
encryption MDEV-37083 fixup for PLUGIN_PERFSCHEMA=NO 2025-08-07 14:48:01 +03:00
engines MDEV-29001 DROP DEFAULT makes SHOW CREATE non-idempotent 2025-07-17 09:18:18 +02:00
federated cleanup: select ... into tests 2025-07-17 09:18:18 +02:00
funcs_1 cleanup: select ... into tests 2025-07-17 09:18:18 +02:00
funcs_2 Merge 10.5 into 10.6 2025-03-26 17:09:57 +02:00
galera Merge branch '10.6' into '10.11' 2025-08-14 22:10:45 +02:00
galera_3nodes Merge branch '10.6' into '10.11' 2025-08-14 22:10:45 +02:00
galera_3nodes_sr galera mtr tests: synchronization between branches and editions 2025-04-02 04:50:11 +02:00
galera_sr galera tests: synchronization between versions and editions 2025-08-14 17:04:40 +02:00
gcol cleanup: select ... into tests 2025-07-17 09:18:18 +02:00
handler Merge branch '10.5' into 10.6 2024-12-17 11:06:09 +11:00
heap Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
innodb MDEV-37296 ALTER TABLE allows adding unique hash key with duplicate values 2025-08-11 13:29:32 +05:30
innodb_fts Merge branch '10.6' into 10.11 2025-07-28 18:06:31 +02:00
innodb_gis Merge branch '10.6' into 10.11 2025-06-04 14:09:23 +02:00
innodb_i_s
innodb_zip Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
jp
json MDEV-35614: JSON_UNQUOTE doesn't work with emojis 2025-04-19 08:55:05 +10:00
large_tests
maria Merge branch '10.6' into 10.11 2025-06-04 14:09:23 +02:00
mariabackup MDEV-36159 mariabackup failed after upgrade 2025-08-20 15:30:49 +03:00
mtr/t Remove dates from all rdiff files 2025-01-05 16:40:11 +02:00
mtr2
multi_source MDEV-7611: create multi_source.mariadb-dump_slave 2025-07-10 18:31:36 -06:00
optimizer_unfixed_bugs
parts MDEV-37328 Assertion failure in make_empty_rec upon CONVERT PARTITION 2025-07-28 18:06:11 +02:00
perfschema Merge branch '10.6' into 10.11 2025-06-04 14:09:23 +02:00
perfschema_stress
period Merge branch '10.6' into 10.11 2025-07-28 18:06:31 +02:00
plugins MDEV-30190 Password check plugin prevents changing grants for CURRENT_USER 2025-07-17 09:18:18 +02:00
roles Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
rpl MDEV-29981: Replica stops with "Found invalid event in binary log" 2025-08-22 15:04:02 -06:00
s3 Merge branch '10.6' into 10.11 2025-06-04 14:09:23 +02:00
sql_sequence Merge branch '10.6' into 10.11 2025-07-28 18:06:31 +02:00
storage_engine
stress MDEV-34453 Trying to read 16384 bytes at 70368744161280 outside the bounds of the file: ./ibdata1 2024-09-20 20:26:43 +05:30
sys_vars MDEV-36385 Fix slave_parallel_threads_basic test in view-protocol mode 2025-08-01 09:10:56 +10:00
sysschema MDEV-37083: Fixed type mismatch in sys views 2025-07-25 17:02:59 +05:30
unit
vcol cleanup: select ... into tests 2025-07-17 09:18:18 +02:00
versioning MDEV-15990 versioning: don't allow changes in the past 2025-08-04 17:44:05 +02:00
wsrep galera mtr tests: post-merge correction for variables_debug.result 2025-08-14 04:17:56 +02:00