mariadb/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect.test
Marko Mäkelä 3cef4f8f0f MDEV-515 Reduce InnoDB undo logging for insert into empty table
We implement an idea that was suggested by Michael 'Monty' Widenius
in October 2017: When InnoDB is inserting into an empty table or partition,
we can write a single undo log record TRX_UNDO_EMPTY, which will cause
ROLLBACK to clear the table.

For this to work, the insert into an empty table or partition must be
covered by an exclusive table lock. The lock will be held until the
transaction has been committed or rolled back, or until the INSERT
operation has been rolled back (and the table is empty again), in which
case it is released in lock_table_x_unlock().

Clustered index records that are covered by the TRX_UNDO_EMPTY record
will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
be distinguished from what MDEV-12288 leaves behind after purging the
history of row-logged operations.

Concurrent non-locking reads must be adjusted: If the read view was
created before the INSERT into an empty table, then we must continue
to imagine that the table is empty, and not try to read any records.
If the read view was created after the INSERT was committed, then
all records must be visible normally. To implement this, we introduce
the field dict_table_t::bulk_trx_id.
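
The visibility rule above can be sketched as follows. This is a simplified,
hypothetical model and not the InnoDB code: SimpleReadView, SimpleTable and
table_appears_empty() are names invented for illustration, and the read view
is reduced to two limits plus a list of the transactions that were active
when the view was created.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// Hypothetical, simplified stand-in for a read view (not InnoDB's ReadView).
struct SimpleReadView {
  trx_id_t up_limit_id;          // ids below this committed before the view
  trx_id_t low_limit_id;         // ids at or above this started after the view
  std::vector<trx_id_t> active;  // ids still active when the view was created

  // Are the changes of transaction `id` visible in this view?
  bool changes_visible(trx_id_t id) const {
    assert(up_limit_id <= low_limit_id);
    if (id < up_limit_id) return true;
    if (id >= low_limit_id) return false;
    return std::find(active.begin(), active.end(), id) == active.end();
  }
};

// Hypothetical stand-in for the part of dict_table_t that matters here.
struct SimpleTable {
  trx_id_t bulk_trx_id = 0;  // 0: no insert into an empty table was recorded
};

// A reader must treat the table as empty if the transaction that filled
// the empty table is not visible in the reader's view.
bool table_appears_empty(const SimpleReadView& view, const SimpleTable& table) {
  return table.bulk_trx_id != 0 && !view.changes_visible(table.bulk_trx_id);
}
```

In this model, a view created before transaction 12 inserted into the empty
table keeps seeing an empty table, while a view created after that
transaction committed reads the records normally.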

This special handling only applies to the very first INSERT statement
of a transaction for the empty table or partition. If a subsequent
statement in the transaction is modifying the initially empty table again,
we must enable row-level undo logging, so that we will be able to
roll back to the start of the statement in case of an error (such as
duplicate key).

INSERT IGNORE will continue to use row-level logging and locking, because
it requires the ability to roll back only the latest row. Since the undo
log that we write for an insert into an empty table only allows us to roll
back the entire statement, we cannot support INSERT IGNORE there. We
introduce a handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate
to storage engines that INSERT IGNORE is being executed.

In many test cases, we add an extra record to the table, so that during
the 'interesting' part of the test, row-level locking and logging will
be used.

Replicas will continue to use row-level logging and locking until
MDEV-24622 has been addressed. Likewise, this optimization will be
disabled in Galera cluster until MDEV-24623 enables it.
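
The conditions listed so far can be summarized in a small decision function.
This is only a sketch under stated assumptions: InsertContext, UndoMode and
choose_undo_mode() are hypothetical names that do not exist in the server,
and the real checks are spread over several functions.

```cpp
// Which kind of undo logging an INSERT would use (hypothetical enum).
enum class UndoMode {
  TableLevelEmpty,  // a single TRX_UNDO_EMPTY record covers the statement
  RowLevel          // one undo log record per inserted row
};

// All fields are assumptions made for illustration.
struct InsertContext {
  bool table_is_empty;          // the table or partition has no records
  bool first_insert_statement;  // first INSERT of the transaction on it
  bool insert_ignore;           // the statement is INSERT IGNORE
  bool is_replica;              // applied by replication (see MDEV-24622)
  bool is_galera;               // running in Galera cluster (see MDEV-24623)
};

UndoMode choose_undo_mode(const InsertContext& c) {
  // Table-level undo logging applies only to the first INSERT into an
  // empty table, and only outside the cases that still need row-level undo.
  if (c.table_is_empty && c.first_insert_statement && !c.insert_ignore &&
      !c.is_replica && !c.is_galera) {
    return UndoMode::TableLevelEmpty;
  }
  return UndoMode::RowLevel;
}
```

For example, a plain first INSERT into an empty table on a primary server
maps to TableLevelEmpty, while the same statement with IGNORE, on a replica,
or into a table that already holds a row maps to RowLevel.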

dict_table_t::bulk_trx_id: The latest active or committed transaction
that initiated an insert into an empty table or partition.
Protected by exclusive table lock and a clustered index leaf page latch.

ins_node_t::bulk_insert: Whether bulk insert was initiated.

trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
Unlike earlier, this collection will cover also temporary tables.

trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
is_bulk_insert(), was_bulk_insert().

trx_undo_report_row_operation(): Before accessing any undo log pages,
invoke trx->mod_tables.emplace() in order to determine whether undo
logging was disabled, or whether this is the first INSERT and we are
supposed to write a TRX_UNDO_EMPTY record.

row_ins_clust_index_entry_low(): If we are inserting into an empty
clustered index leaf page, set the ins_node_t::bulk_insert flag for
the subsequent trx_undo_report_row_operation() call.

lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
Remove the redundant parameter 'flags' that can be checked in the caller.

btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().

trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
the next statement will not be covered by table-level undo logging.

ReadView::changes_visible(trx_id_t) const: New accessor for the case
where the trx_id_t is not read from a potentially corrupted index page
but directly from the memory. In this case, we can skip a sanity check.

row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(),
row_sel_try_search_shortcut_for_mysql(),
row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.

row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().

lock_sec_rec_cons_read_sees(): Replaced with lower-level code.

btr_root_page_init(): Refactored from btr_create().

dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
for the ROLLBACK of an INSERT operation.

ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
into an empty table.

This is joint work with Thirunarayanan Balathandayuthapani,
who created a working prototype.
Thanks to Matthias Leich for extensive testing.
2021-01-25 18:41:27 +02:00


# BUG #12161 XA recovery and client disconnection
# The test verifies that
# a. disconnection does not lose a prepared transaction,
#    so it can be committed from another connection
# b. the prepared transaction is logged
# c. interleaved prepared transactions are correctly applied on the slave.
#
# Both replication formats are checked through explicit
# settings of @@binlog_format in the test.
#
--source include/have_innodb.inc
--source include/have_binlog_format_mixed.inc
#
# A prepared XA transaction does not become available to another connection
# until the connection that prepared it, whether it disconnected on its own
# or was killed, has completed the necessary part of its cleanup.
# Selecting from performance_schema.threads provides a way to detect that.
#
--source include/have_perfschema.inc
--source include/master-slave.inc
--connection master
call mtr.add_suppression("Found 2 prepared XA transactions");
CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
CREATE DATABASE d1;
CREATE DATABASE d2;
CREATE TABLE d1.t (a INT) ENGINE=innodb;
CREATE TABLE d2.t (a INT) ENGINE=innodb;
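# MDEV-515 takes an exclusive table lock for the first INSERT into an
# empty table, which would prevent concurrent DML on the table.
# Insert a row first, so that row-level locking and logging will be used.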
INSERT INTO d1.t VALUES(100);
connect (master_conn1, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--let $conn_id=`SELECT connection_id()`
SET @@session.binlog_format= statement;
XA START '1-stmt';
INSERT INTO d1.t VALUES (1);
XA END '1-stmt';
XA PREPARE '1-stmt';
--disconnect master_conn1
--connection master
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
connect (master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--let $conn_id=`SELECT connection_id()`
SET @@session.binlog_format= row;
XA START '1-row';
INSERT INTO d2.t VALUES (1);
XA END '1-row';
XA PREPARE '1-row';
--disconnect master_conn2
--connection master
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
XA START '2';
INSERT INTO d1.t VALUES (2);
XA END '2';
XA PREPARE '2';
XA COMMIT '2';
XA COMMIT '1-row';
XA COMMIT '1-stmt';
source include/show_binlog_events.inc;
# the proof: slave is in sync with the table updated by the prepared transactions.
--source include/sync_slave_sql_with_master.inc
--source include/stop_slave.inc
#
# Recover with Master server restart
#
--connection master
connect (master2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--connection master2
SET @@session.binlog_format= statement;
XA START '3-stmt';
INSERT INTO d1.t VALUES (3);
XA END '3-stmt';
XA PREPARE '3-stmt';
--disconnect master2
connect (master2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--connection master2
SET @@session.binlog_format= row;
XA START '3-row';
INSERT INTO d2.t VALUES (4);
XA END '3-row';
XA PREPARE '3-row';
--disconnect master2
--connection master
#
# Testing read-only
#
connect (master2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--connection master2
XA START '4';
SELECT * FROM d1.t;
XA END '4';
XA PREPARE '4';
--disconnect master2
#
# Log a few disconnected XA transactions for replication.
#
--let $bulk_trx_num=10
--let $i = $bulk_trx_num
while($i > 0)
{
--connect (master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,)
--let $conn_id=`SELECT connection_id()`
--eval XA START 'bulk_trx_$i'
--eval INSERT INTO d1.t VALUES ($i)
--eval INSERT INTO d2.t VALUES ($i)
--eval XA END 'bulk_trx_$i'
--eval XA PREPARE 'bulk_trx_$i'
--disconnect master_bulk_conn$i
--connection master
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
--dec $i
}
#
# Prove that the slave applier is able to resume the prepared
# XA transactions upon its restart.
#
--connection slave
--source include/start_slave.inc
--connection master
--source include/sync_slave_sql_with_master.inc
--source include/stop_slave.inc
--connection master
--let $i = $bulk_trx_num
while($i > 0)
{
--let $command=COMMIT
if (`SELECT $i % 2`)
{
--let $command=ROLLBACK
}
--eval XA $command 'bulk_trx_$i'
--dec $i
}
--let $rpl_server_number= 1
--source include/rpl_restart_server.inc
--connection slave
--source include/start_slave.inc
--connection master
--echo *** '3-stmt','3-row' xa-transactions must be in the list ***
XA RECOVER;
XA COMMIT '3-stmt';
XA ROLLBACK '3-row';
--source include/sync_slave_sql_with_master.inc
#
# Testing replication with boundary XID values and in two formats.
#
--connection master
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
# Max size XID incl max value of formatID
--let $formatid_range=`SELECT (1<<31)`
--let $max_formatid=`SELECT (1<<31) - 1`
connect (master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--let $conn_id=`SELECT connection_id()`
--let $gtrid=0123456789012345678901234567890123456789012345678901234567890124
--let $bqual=0123456789012345678901234567890123456789012345678901234567890124
--eval XA START '$gtrid','$bqual',$max_formatid
INSERT INTO d1.t VALUES (64);
--eval XA END '$gtrid','$bqual',$max_formatid
--eval XA PREPARE '$gtrid','$bqual',$max_formatid
--disconnect master_conn2
--connection master
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
# Max size XID with non-ascii chars
connect (master_conn3, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--let $conn_id=`SELECT connection_id()`
--let $gtrid_hex=FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
--let $bqual_hex=00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
--eval XA START X'$gtrid_hex',X'$bqual_hex',0
INSERT INTO d1.t VALUES (0);
--eval XA END X'$gtrid_hex',X'$bqual_hex',0
--eval XA PREPARE X'$gtrid_hex',X'$bqual_hex',0
--disconnect master_conn3
--connection master
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
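# Random XID: gtrid and bqual are each 1 to 31 random bytes
# (2 to 62 hex digits); formatID is a random value below 1<<31.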
--disable_query_log
connect (master_conn4, 127.0.0.1,root,,test,$MASTER_MYPORT,);
--let $conn_id=`SELECT connection_id()`
--let $gtridlen=`SELECT 2*(1 + round(rand()*100) % 31)`
--let $bquallen=`SELECT 2*(1 + round(rand()*100) % 31)`
--let $gtrid_rand=`SELECT substring(concat(MD5(rand()), MD5(rand())), 1, $gtridlen)`
--let $bqual_rand=`SELECT substring(concat(MD5(rand()), MD5(rand())), 1, $bquallen)`
--let $formt_rand=`SELECT floor((rand()*10000000000) % $formatid_range)`
--eval XA START X'$gtrid_rand',X'$bqual_rand',$formt_rand
INSERT INTO d1.t VALUES (0);
--eval XA END X'$gtrid_rand',X'$bqual_rand',$formt_rand
--eval XA PREPARE X'$gtrid_rand',X'$bqual_rand',$formt_rand
--enable_query_log
--disconnect master_conn4
--connection master
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
--eval XA COMMIT '$gtrid','$bqual',$max_formatid
--eval XA COMMIT X'$gtrid_hex',X'$bqual_hex',0
--disable_query_log
--echo XA COMMIT 'RANDOM XID'
--eval XA COMMIT X'$gtrid_rand',X'$bqual_rand',$formt_rand
--enable_query_log
--source include/sync_slave_sql_with_master.inc
#
# Testing ONE PHASE
#
--let $onephase_trx_num=10
--let $i = $onephase_trx_num
while($i > 0)
{
--connect (master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,)
--connection master_bulk_conn$i
--eval XA START 'one_phase_$i'
--eval INSERT INTO d1.t VALUES ($i)
--eval INSERT INTO d2.t VALUES ($i)
--eval XA END 'one_phase_$i'
--eval XA COMMIT 'one_phase_$i' ONE PHASE
--disconnect master_bulk_conn$i
--dec $i
}
--connection master
--source include/sync_slave_sql_with_master.inc
#
# Overall consistency check
#
--let $diff_tables= master:d1.t, slave:d1.t
--source include/diff_tables.inc
--let $diff_tables= master:d2.t, slave:d2.t
--source include/diff_tables.inc
#
# cleanup
#
--connection master
DELETE FROM d1.t;
DELETE FROM d2.t;
DROP TABLE d1.t, d2.t;
DROP DATABASE d1;
DROP DATABASE d2;
DROP VIEW v_processlist;
--source include/sync_slave_sql_with_master.inc
--source include/rpl_end.inc