mariadb/mysql-test/suite/galera/t/galera_bf_lock_wait.test
Marko Mäkelä ddd7d5d8e3 MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure
Under unknown circumstances, the SQL layer may wrongly disregard an
invocation of thd_mark_transaction_to_rollback() when an InnoDB
transaction had been aborted (rolled back) due to one of the following errors:
* HA_ERR_LOCK_DEADLOCK
* HA_ERR_RECORD_CHANGED (if innodb_snapshot_isolation=ON)
* HA_ERR_LOCK_WAIT_TIMEOUT (if innodb_rollback_on_timeout=ON)

Such an error used to cause a crash of InnoDB during transaction commit.
These changes aim to catch and report the error earlier, so that this
crash is avoided and the original root cause can be found and fixed
more easily later.

The idea of this fix is from Michael 'Monty' Widenius.

HA_ERR_ROLLBACK: A new error code that will be translated into
ER_ROLLBACK_ONLY, signalling that the current transaction
has been aborted and the only allowed action is ROLLBACK.

trx_t::state: Add TRX_STATE_ABORTED that is like
TRX_STATE_NOT_STARTED, but noting that the transaction had been
rolled back and aborted.

trx_t::is_started(): Replaces trx_is_started().

ha_innobase: Check the transaction state in various places.
Simplify the logic around SAVEPOINT.

ha_innobase::is_valid_trx(): Replaces ha_innobase::is_read_only().

The InnoDB logic around transaction savepoints, commit, and rollback
was unnecessarily complex and might have contributed to this
inconsistency. So, we are simplifying that logic as well.

trx_savept_t: Replace with const undo_no_t*. When we roll back to
a savepoint, all we need to know is the number of undo log records
that must survive.

trx_named_savept_t, DB_NO_SAVEPOINT: Remove. We can store undo_no_t
directly in the space allocated at innobase_hton->savepoint_offset.
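
The savepoint simplification described above can be illustrated with a
minimal sketch. It is not InnoDB code: ToyTrx, its members, and the
undo_no_t alias below are hypothetical stand-ins, and the undo log is
modelled as a plain vector. It only shows why a savepoint needs nothing
more than the count of undo log records that must survive.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for InnoDB's undo_no_t; not the real definition.
using undo_no_t = std::uint64_t;

// Toy transaction (an assumption for illustration only): the undo log is
// modelled as a vector with one entry per modification.
struct ToyTrx {
  std::vector<std::string> undo_log;

  // A savepoint is just the number of undo records written so far.
  undo_no_t savepoint() const { return undo_log.size(); }

  // Record one modification in the undo log.
  void modify(const std::string &change) { undo_log.push_back(change); }

  // Roll back to a savepoint by keeping the first *savept undo records,
  // or roll back the whole transaction when savept is null.
  void rollback(const undo_no_t *savept = nullptr) {
    const undo_no_t keep = savept ? *savept : 0;
    assert(keep <= undo_log.size());
    while (undo_log.size() > keep) undo_log.pop_back();
  }
};
```

In this sketch, calling rollback(&sp) with a value obtained earlier from
savepoint() discards only the later changes, while rollback() with no
argument discards everything.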

fts_trx_create(): Do not copy previous savepoints.

fts_savepoint_rollback(): If a savepoint was not found, roll back
everything after the default savepoint of fts_trx_create().
The test innodb_fts.savepoint is extended to cover this code.

Reviewed by: Vladislav Lesin
Tested by: Matthias Leich
2024-12-12 18:02:00 +02:00


--source include/galera_cluster.inc
--source include/have_innodb.inc
--source include/big_test.inc
--connection node_2
call mtr.add_suppression("InnoDB: Transaction was aborted due to ");
call mtr.add_suppression("WSREP: Trying to continue unpaused monitor");
--connection node_1
call mtr.add_suppression("InnoDB: Transaction was aborted due to ");
call mtr.add_suppression("WSREP: Trying to continue unpaused monitor");
CREATE TABLE t1 ENGINE=InnoDB select 1 as a, 1 as b union select 2, 2;
ALTER TABLE t1 add primary key(a);
DELIMITER |;
CREATE PROCEDURE p1(repeat_count INT)
BEGIN
DECLARE current_num int;
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION rollback;
SET current_num = 0;
WHILE current_num < repeat_count DO
start transaction;
update t1 set b=connection_id() where a=1;
commit;
SET current_num = current_num + 1;
END WHILE;
END|
DELIMITER ;|
--connection node_2
--let $wait_condition = SELECT COUNT(*) = 1 FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_TYPE = 'PROCEDURE' AND ROUTINE_NAME = 'p1'
--source include/wait_condition.inc
# Start four concurrent writers, two per node, that all update the same row.
--connect node_1_p1, 127.0.0.1, root, , test, $NODE_MYPORT_1
SET SESSION wsrep_sync_wait=0;
send call p1(1000);
--connect node_1_p2, 127.0.0.1, root, , test, $NODE_MYPORT_1
SET SESSION wsrep_sync_wait=0;
send call p1(1000);
--connect node_2_p1, 127.0.0.1, root, , test, $NODE_MYPORT_2
SET SESSION wsrep_sync_wait=0;
send call p1(1000);
--connect node_2_p2, 127.0.0.1, root, , test, $NODE_MYPORT_2
SET SESSION wsrep_sync_wait=0;
send call p1(1000);
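# While the writers run, periodically check that neither node's error log
# contains a 'BF lock wait long' message.
--connection node_1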
let $counter=10;
let $sleep_period=10;
echo checking error log for 'BF lock wait long' message for $counter times every $sleep_period seconds ...;
--let assert_text= BF lock wait long
--let assert_select= BF lock wait long
--let assert_count= 0
--let assert_only_after= CURRENT_TEST: galera.galera_bf_lock_wait
while($counter > 0)
{
--disable_query_log
--disable_result_log
eval do sleep($sleep_period);
--enable_query_log
--enable_result_log
--let assert_file= $MYSQLTEST_VARDIR/log/mysqld.1.err
--source include/assert_grep.inc
--let assert_file= $MYSQLTEST_VARDIR/log/mysqld.2.err
--source include/assert_grep.inc
dec $counter;
}
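# Collect the results of the four calls. Each may succeed or fail with
# error 1213 (ER_LOCK_DEADLOCK).
--connection node_1_p1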
--error 0,1213
--reap
--connection node_1_p2
--error 0,1213
--reap
--connection node_2_p1
--error 0,1213
--reap
--connection node_2_p2
--error 0,1213
--reap
--connection node_1
drop table t1;
drop procedure p1;
--disconnect node_1_p1
--disconnect node_1_p2
--disconnect node_2_p1
--disconnect node_2_p2