mariadb/mysql-test/suite/innodb/t/max_trx_no_recovery.test
Vlad Lesin 9c6017474f MDEV-36845 InnoDB: Failing assertion: tail.trx_no <= last_trx_no
The scenario of the bug is the following. Before killing the server some
transaction A starts undo log writing in some undo segment U of rseg R.
It writes its trx_id into the undo log header. Then new trx_id is assigned
to transaction B, but undo log hasn't been started yet. Then transaction
A commits and writes trx_no into its undo log header. Transaction B
starts writing undo log into the undo segment U. So we have the
following undo logs in the undo segments U:

... undo log 1...
... undo log 2...
      ...
undo log A, trx_id: L, trx_no: M, ...
undo log B, trx_id: N, trx_no: 0, ...

Where L < N < M.

Then server is killed.

On recovery the maximum trx_no is extracted from each rseg, and the
maximum trx_no among all rsegs plus one is considered as a new value
for server-wide transaction id/no counter.

For each undo segment of each rseg we read the last undo log header. If
the last undo log is committed, then we read trx_no from the header,
otherwise we treat trx_id as trx_no. The maximum trx_no from all undo
log segments of the current rseg is treated as the maximum trx_no of the
rseg.

For the above case the undo log of transaction B is not committed and
its trx_no is 0. So we read trx_id and treat it as trx_no. But M < N. If
U is the last modified undo segment in rseg R, and trx_(id/no) N is the
maximum trx_no among all rsegs, then there can be the case when after
recovery some transaction with trx_no_C, such as N < trx_no_C <= M, is
committed.

During a purging we store trx_no of the last parsed undo log of a
committed transaction in purge_sys.tail.trx_no. So if the last parsed
undo log is the undo log of transaction A(transaction B was rolled back
on recovery and its undo log was also removed from the undo segment U),
then purse_sys.tail.trx_no = M. Than if some other transaction C with
trx_no_C <= M is being committed and purged, we hit
"tail.trx_no <= last_trx_no" assertion failure in
purge_sys_t::choose_next_log(), because purge queue is min-heap of
(trx_no, trx_sys.rseg_array index) pairs, where the key is trx_no, and it
must not be that trx_no of the last parsed undo log of a committed
transaction is greater than the last trx_no of the rseg at the top of
the queue.

The fix is to read the trx_no of the previous to last undo log in undo
segment, if the last undo log in that undo segment is not committed, and
set trx_no=max(trx_id of the last undo log, trx_no of the previous to
last undo log) during recovery.

We can do this because we need to extract the maximum
value of trx_no or trx_id of the undo log segment, and the maximum value
is either trx_id of the last undo log or trx_no of the previous to
last undo log, because undo segment can be assigned only to the one
transaction at time, and undo logs in the undo segment are ordered by
trx_id.

Reviewed by Marko Mäkelä.
2025-11-19 10:47:07 +03:00

135 lines
No EOL
4.6 KiB
Text

--source include/have_innodb.inc
# The embedded server tests do not support restarting.
--source include/not_embedded.inc
--source include/have_debug.inc
--source include/have_debug_sync.inc
--let transactions_count=10
--let perl_inc_file=$MYSQL_TMP_DIR/max_trx_no_recovery.inc
--let $i=$transactions_count
# We need several tables to avoid blocking on records locking and pages latching
while($i) {
--dec $i
if ($i == 0) {
--eval CREATE TABLE t_1_$i(a INT primary key) Engine=InnoDB STATS_PERSISTENT=0
--eval INSERT INTO t_1_$i SET a = 10
}
--eval CREATE TABLE t_2_$i(a INT primary key) Engine=InnoDB STATS_PERSISTENT=0
--eval INSERT INTO t_2_$i SET a = 10
}
--source include/wait_all_purged.inc
--connect (prevent_purge,localhost,root,,)
START TRANSACTION WITH CONSISTENT SNAPSHOT;
# Let all the following redo logs will be pushed in the same rseg
SET GLOBAL DEBUG_DBUG ="+d,assign_same_rseg";
--connection default
# Add the first undo log to the history of trx_sys.rseg_array[0] to fordbid
# "if (auto len= flst_get_len(TRX_RSEG + TRX_RSEG_HISTORY + ...)" branch of
# "trx_rseg_mem_restore()" to update rseg->needs_purge. Let trx_id and trx_no
# of the following transaction is L and M correspondingly.
UPDATE t_2_1 SET a=150;
--connect (con_1_0, localhost,root,,)
BEGIN;
# After the following transaction the undo segment will contain the following
# undo logs:
# sn: 0, trx_id: L, trx_no: M
# sn: 1, trx_id: N=M+1, trx_no: 0
UPDATE t_1_0 SET a=15;
--connect (con_2_0, localhost,root,,)
BEGIN;
# Stop the following "UPDATE" just before the undo log creating.
SET DEBUG_SYNC="before_undo_log_trx_id_write SIGNAL before_0 WAIT_FOR cont_0";
SET DEBUG_SYNC="after_undo_log_trx_id_write SIGNAL after_0";
--send UPDATE t_2_0 SET a=15
--connection con_1_0
SET DEBUG_SYNC="now WAIT_FOR before_0";
# The following trick with perl code, which generates mysqltest code, is applied
# to bypass the restriction of mysqltest in substitution variable values in
# connection names.
--perl END_OF_FILE
use strict;
use warnings;
my $out_file = $ENV{'perl_inc_file'};
my $transactions_count = $ENV{'transactions_count'};
open (my $output, '>', $out_file) or die $!;
my $i = 1;
while ($i < $transactions_count) {
print $output
"--connect (con_2_$i, localhost,root,,)\n".
"BEGIN;\n".
"SET DEBUG_SYNC=\"before_undo_log_trx_id_write SIGNAL before_$i WAIT_FOR cont_$i\";\n".
"--send UPDATE t_2_$i SET a=15\n".
"--connection default\n".
"SET DEBUG_SYNC=\"now WAIT_FOR before_$i\";\n";
++$i;
}
close $output;
END_OF_FILE
# The generated code starts rw transactions just to increase
# trx_sys.m_max_trx_id by the number of started transactions S.
--source $perl_inc_file
--remove_file $perl_inc_file
--connection con_1_0
# After the following COMMIT and debug sync points the undo segment will contain
# the following undo logs:
# sn: 0, trx_id: L, trx_no: M
# sn: 1, trx_id: N=M+1, trx_no: N+S
# sn: 2, trx_id: N+1, trx_no: 0
COMMIT;
SET DEBUG_SYNC="now SIGNAL cont_0";
SET DEBUG_SYNC="now WAIT_FOR after_0";
--source include/kill_and_restart_mysqld.inc
# The trx_no of the last undo log is 0 and it's in active state, that's why
# the rseg->needs_purge is set to trx_id+1 = N+1+1 in
# trx_undo_mem_create_at_db_start(). The trx_sys.m_max_trx_id is set to maximum
# value of rseg->needs_purge among all rsegs + 1, i.e. for our case it's
# N+1+1+1.
#
# After uncommitted transactions rolling back on recovery, the undo
# segment will contain the following undo logs:
# sn: 0, trx_id: L, trx_no: M
# sn: 1, trx_id: N=M+1, trx_no: N+S
# and
# trx_sys.m_max_trx_id =
# N + 3(see above)
# + 1(row_merge_drop_temp_indexes())
# + 1(active transaction rollback)
# So trx_sys.m_max_trx_id = N + 5.
#
# The trx_no of the last undo log in the above undo segment is N+S. In the case
# if S > 5, the trx_no of one of the following transactions will be greater than
# N + S, i.e. the trx_no of the last undo log in the undo segment.
#
# That is treated as error in purge thread, because purge queue is
# min-heap of (trx_no, trx_sys.rseg_array index) pairs, where the key is trx_no,
# and it must not be that trx_no of the last parsed undo log of a committed
# transaction is greater than the last trx_no of the rseg at the top of
# the queue.
INSERT INTO t_2_1 SET a = 20;
INSERT INTO t_2_1 SET a = 30;
INSERT INTO t_2_1 SET a = 40;
INSERT INTO t_2_1 SET a = 50;
INSERT INTO t_2_1 SET a = 60;
# Assertion ut_a(tail.trx_no <= last_trx_no) failures in
# purge_sys_t::choose_next_log if the bug is not fixed.
SET GLOBAL innodb_max_purge_lag_wait=0;
--let $i=$transactions_count
while($i) {
--dec $i
if ($i == 0) {
--eval DROP TABLE t_1_$i
}
--eval DROP TABLE t_2_$i
}