mariadb/mysql-test/suite/galera/t/MDEV-22232.test
Jan Lindström e40277d29b MDEV-38218 : Galera test failure on galera_bf_abort_flush_for_export
Problem was in wsrep_handle_mdl_conflict function was comparing
thd->lex->sql_command variable for granted MDL-lock.

There is two possible schedules:

    (1) FLUSH TABLES ... FOR EXPORT that will take MDL-lock (granted_thd).
        INSERT from other node is conflicting operation (request_thd)
        and sees MDL-conflict. Because granted_thd has not executed anything
        else thd->lex->sql_command == SQLCOM_FLUSH and this case was
        correctly handled in wsrep_handle_mdl_conflict i.e. INSERT needs
        to wait.

    (2) FLUSH TABLES ... FOR EXPORT that will take MDL-lock (granted_thd).
        SET SESSION wsrep_sync_wait=0; (granted_thd)
        INSERT from other node is conflicting operation (request_thd)

        However, thd->lex->sql_command is not stored to taken MDL-lock. Now
        as granted_thd is executing SET thd->lex->sql_command != SQLCOM_FLUSH
        and INSERT that is BF will abort it and that means also FTFE is
        killed and MDL-lock relesed. This is incorrect as FTFE has written
        file on filesystem and it can't be really killed.

In this fix wsrep_handle_mdl_conflict is refactored not to use
thd->lex->sql_command as a variable used for decisions. Instead
connection state can be determined also via THD members. E.g.:

    * wsrep_thd_is_toi() || wsrep_thd_is_applying - ongoing TOI or applier
    * wsrep_thd_is_BF - thread is brute force
    * wsrep_thd_is_SR - thread is streaming replication thread
    * thd->current_backup_stage != BACKUP_FINISHED - there's ongoing BACKUP
    * thd->global_read_lock.is_acquired() - ongoing FTWRL
    * thd->locked_tables_mode == LTM_LOCK_TABLES - ongoing FTFE or LOCK TABLES
2026-01-20 10:23:44 +02:00

66 lines
1.8 KiB
Text

#
# MDEV-22232: CTAS execution crashes during replay.
#
# There were multiple problems and two failing scenarios with empty result set
# and with non-empty result set:
# - CTAS didn't add shared keys for selected tables
# - Security context wasn't set on the replayer thread
# - CTAS was retried after failure - now retry disabled
--source include/galera_cluster.inc
--source include/have_debug_sync.inc
--source include/have_debug.inc
--connect con1,127.0.0.1,root,,test,$NODE_MYPORT_1
# Scenario 1
--echo --- CTAS with empty result set ---
CREATE TABLE t1 (a INT) ENGINE=InnoDB;
# Run CTAS until the resulting table gets created,
# then it gets BF aborted by other DDL.
SET DEBUG_SYNC = 'create_table_select_before_create SIGNAL may_run WAIT_FOR bf_abort';
--send
CREATE TABLE t2 SELECT * FROM t1;
# Wait for CTAS to reach the table create point,
# start executing other DDL and BF abort CTAS.
--connection node_1
SET DEBUG_SYNC = 'now WAIT_FOR may_run';
TRUNCATE TABLE t1;
--connection con1
# CTAS gets BF aborted.
--error ER_QUERY_INTERRUPTED, ER_LOCK_DEADLOCK
--reap
# Cleanup
SET DEBUG_SYNC = 'RESET';
# Scenario 2
--echo --- CTAS with non-empty result set ---
INSERT INTO t1 VALUES (10), (20), (30);
# Run CTAS until the resulting table gets created,
# then it gets BF aborted by other DDL.
SET DEBUG_SYNC = 'create_table_select_before_create SIGNAL may_run WAIT_FOR bf_abort';
--send
CREATE TABLE t2 SELECT * FROM t1;
# Wait for CTAS to reach the table create point,
# start executing other DDL and BF abort CTAS.
--connection node_1
SET DEBUG_SYNC = 'now WAIT_FOR may_run';
TRUNCATE TABLE t1;
--connection con1
# CTAS gets BF aborted.
--error ER_QUERY_INTERRUPTED, ER_LOCK_DEADLOCK
--reap
# Cleanup
SET DEBUG_SYNC = 'RESET';
DROP TABLE t1;
--disconnect con1
--source include/galera_end.inc