MDEV-25180 Atomic ALTER TABLE

MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
           have default database

The purpose of this task is to ensure that ALTER TABLE is atomic even if
the MariaDB server is killed at any point during the ALTER TABLE.
This means that either the ALTER TABLE succeeds (including updates to
triggers, the status tables and the binary log) or everything is
reverted to its original state.

If the server crashes before the new version is fully up to date and
committed, it will revert to the original table and remove all
temporary files and tables.
If the new version is committed, crash recovery will use the new version,
and update triggers, the status tables and the binary log.
The one exception is ALTER TABLE .. RENAME .. where no changes are made
to the table definition. This one will work as RENAME and roll back unless
the whole statement completed, including updating the binary log (if
enabled).
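
The recovery rule above can be summarised in a small sketch. This is only an
illustrative model, not the server's code: AlterState, RecoveryAction and
decide_recovery() are hypothetical names, the three flags are a deliberate
simplification of what the ddl log records, and treating a fully completed
statement as "nothing to do" is an assumption.

```cpp
#include <cassert>

// Hypothetical, simplified view of an interrupted ALTER TABLE.
// The names and fields are illustrative assumptions only.
struct AlterState {
  bool new_version_committed;  // new table version is up to date and committed
  bool rename_only;            // ALTER TABLE .. RENAME .. without definition change
  bool statement_completed;    // whole statement finished, incl. binary log
};

enum class RecoveryAction {
  Nothing,     // statement completed, nothing left to recover (assumption)
  Revert,      // restore the original table, remove temporary files and tables
  RollForward  // use the new version; update triggers, status tables, binlog
};

// Decide what crash recovery should do for one interrupted ALTER TABLE.
inline RecoveryAction decide_recovery(const AlterState &s) {
  if (s.statement_completed)
    return RecoveryAction::Nothing;
  if (s.rename_only)
    return RecoveryAction::Revert;  // behaves like RENAME: roll back
  return s.new_version_committed ? RecoveryAction::RollForward
                                 : RecoveryAction::Revert;
}

// Small usage check of the rules above.
inline void decide_recovery_example() {
  AlterState crashed_before_commit{false, false, false};
  assert(decide_recovery(crashed_before_commit) == RecoveryAction::Revert);
}
```

In this model a rename-only statement is reverted even when its engine-level
work looks finished, which mirrors the "work as RENAME" exception.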

Other changes:
- Added handlerton->check_version() function to allow the ddl recovery
  code to check, in case of inplace alter table, if the table in the
  storage engine is of the new or old version.
- Added handler->table_version() so that an engine can report the current
  version of the table. This should be changed each time the table
  definition changes.
- Added ha_signal_ddl_recovery_done() and
  handlerton::signal_ddl_recovery_done() to inform all handlers when
  ddl recovery has been done. (Needed by InnoDB).
- Added handlerton call inplace_alter_table_committed, to signal engine
  that ddl_log has been closed for the alter table query.
- Added new handlerton flag
  HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
  should call hton->notify_tabledef_changed() during
  mysql_inplace_alter_table. This was required as MyRocks and InnoDB
  needed the call at different times.
- Added function server_uuid_value() to be able to generate a temporary
  xid when ddl recovery writes the query to the binary log. This is
  needed to be able to handle crashes during ddl log recovery.
- Moved freeing of the frm definition to end of mysql_alter_table() to
  remove duplicate code and have a common exit strategy.
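
How table_version() and check_version() can work together for an inplace
alter table is sketched below. This is a toy model under stated assumptions,
not the real handler/handlerton API: ToyEngine, commit_inplace_alter() and
is_new_version() are hypothetical, and the idea that recovery compares a
version number saved before the crash with the engine's current one is a
simplification.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a storage engine; not the real handler/handlerton API.
class ToyEngine {
 public:
  // Analogue of handler->table_version(): identifies the current definition.
  std::uint64_t table_version() const { return version_; }

  // An in-place ALTER TABLE commits under a new version number.
  void commit_inplace_alter(std::uint64_t new_version) { version_ = new_version; }

 private:
  std::uint64_t version_ = 1;  // version of the original table definition
};

// Analogue of handlerton->check_version(): returns true if the table in the
// engine already has the version that the interrupted ALTER would produce.
inline bool is_new_version(const ToyEngine &engine, std::uint64_t logged_version) {
  return engine.table_version() == logged_version;
}

// Small usage check: before the engine commit the old version is visible.
inline void check_version_example() {
  ToyEngine engine;
  assert(!is_new_version(engine, 2));
}
```

If the check reports the old version, recovery reverts; if it reports the new
version, recovery completes the remaining steps of the statement.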

-------
InnoDB part of atomic ALTER TABLE
(Implemented by Marko Mäkelä)
innodb_check_version(): Compare the saved dict_table_t::def_trx_id
to determine whether an ALTER TABLE operation was committed.

We must correctly recover dict_table_t::def_trx_id for this to work.
Before purge removes any trace of DB_TRX_ID from system tables, it
will make an effort to load the user table into the cache, so that
the dict_table_t::def_trx_id can be recovered.

ha_innobase::table_version(): return garbage, or the trx_id that would
be used for committing an ALTER TABLE operation.

In InnoDB, table names starting with #sql-ib will remain special:
they will be dropped on startup. This may be revisited later in
MDEV-18518 when we implement proper undo logging and rollback
for creating or dropping multiple tables in a transaction.

Table names starting with #sql will retain some special meaning:
dict_table_t::parse_name() will not consider such names for
MDL acquisition, and dict_table_rename_in_cache() will treat such
names specially when handling FOREIGN KEY constraints.

Simplify InnoDB DROP INDEX.
Prevent purge wakeup

To ensure that dict_table_t::def_trx_id will be recovered correctly
in case the server is killed before ddl_log_complete(), we will block
the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
between ha_innobase::commit_inplace_alter_table(commit=true)
(purge_sys.stop_SYS()) and purge_sys.resume_SYS().
The completion callback purge_sys.resume_SYS() must be between
ddl_log_complete() and MDL release.

--------

MyRocks support for atomic ALTER TABLE
(Implemented by Sergei Petrunia)

Implement these SE API functions:
- ha_rocksdb::table_version()
- hton->check_version = rocksdb_check_version

MyRocks data dictionary now stores table version for each table.
(Absence of table version record is interpreted as table_version=0,
which means no upgrade changes are needed.)

- For inplace alter table of a partitioned table, call the underlying
  handlerton when checking if the table is ok. This assumes that the
  partition engine commits all changes at once.
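
The rule that a missing table version record is read as table_version=0 can
be sketched as follows. The std::map is only a hypothetical in-memory
stand-in for the per-table version records, and VersionDictionary and
lookup_table_version() are illustrative names, not MyRocks code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the per-table version records.
using VersionDictionary = std::map<std::string, std::uint64_t>;

// Return the stored version, or 0 when the table has no version record
// (for example a table created before versions were stored).
inline std::uint64_t lookup_table_version(const VersionDictionary &dict,
                                          const std::string &table) {
  auto it = dict.find(table);
  return it == dict.end() ? 0 : it->second;
}

// Small usage check: an unknown table reports version 0.
inline void lookup_table_version_example() {
  VersionDictionary dict;
  assert(lookup_table_version(dict, "test.t1") == 0);
}
```

Defaulting to 0 means existing tables need no upgrade step before the first
ALTER TABLE that writes a version record.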
Monty 2021-03-18 12:41:08 +02:00 committed by Sergei Golubchik
commit 7762ee5dbe
60 changed files with 8968 additions and 461 deletions


@@ -0,0 +1 @@
--innodb-max-dirty-pages-pct=0


@@ -0,0 +1,183 @@
--source include/have_debug.inc
--source include/have_innodb.inc
--source include/have_log_bin.inc
--source include/not_valgrind.inc
#
# Testing of atomic alter table with crashes in a lot of different places
#
# Things tested:
# With myisam and InnoDB engines to ensure that we cover both normal and
# online alter table paths.
# Alter table with new columns
# Alter table which only touches .frm
# Alter table disable keys (has its own code path)
# Alter table with rename
# Alter table with rename and only options that touch .frm
# Alter table with rename and add new columns
# Alter table with storage engine change (with and without column definition
# changes)
# Alter table with storage engine change and rename
# Alter table to another database
--disable_query_log
call mtr.add_suppression("InnoDB: .* does not exist in the InnoDB internal");
--enable_query_log
let $MYSQLD_DATADIR= `SELECT @@datadir`;
create database test2;
if ($engine_count == "")
{
let $engine_count=2;
let $engines='myisam','innodb';
}
if ($extra_engine == "")
{
let $extra_engine=aria;
}
let $crash_count=13;
let $crash_points='ddl_log_alter_after_create_frm', 'ddl_log_alter_after_create_table', 'ddl_log_alter_after_prepare_inplace','ddl_log_alter_after_copy', 'ddl_log_alter_after_log', 'ddl_log_alter_after_rename_to_backup', 'ddl_log_alter_after_rename_to_backup_log', 'ddl_log_alter_rename_frm', 'ddl_log_alter_after_rename_to_original', 'ddl_log_alter_before_rename_triggers', 'ddl_log_alter_after_rename_triggers', 'ddl_log_alter_after_delete_backup', 'ddl_log_alter_after_drop_original_table';
let $statement_count=16;
let $statements='ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new"',
'ALTER TABLE t1 COMMENT "new"',
'ALTER TABLE t1 change column a c int COMMENT "new"',
'ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename t2',
'ALTER TABLE t1 disable keys',
'ALTER TABLE t1 ADD COLUMN c INT, ALGORITHM=copy, COMMENT "new"',
'ALTER TABLE t1 rename t2',
'ALTER TABLE t1 COMMENT "new", rename t2',
'ALTER TABLE t1 change column a c int COMMENT "new", rename t2',
'ALTER TABLE t1 ENGINE=$extra_engine, COMMENT "new"',
'ALTER TABLE t1 change column a c int COMMENT "new", engine=$extra_engine',
'ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename t2, engine=$extra_engine',
'ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename test2.t2',
'ALTER TABLE t1 COMMENT "new", rename test2.t2',
'ALTER TABLE t1 ADD key(b), COMMENT "new"',
'ALTER TABLE t1 DROP INDEX a';
# If there is a need of testing one specific state (crash point and query),
# one can use the comments below to execute one specific test combination
#let $crash_count=1;
#let $crash_points='ddl_log_alter_after_create_frm';
#let $statement_count= 1;
#let $statements='ALTER TABLE t1 ADD COLUMN c int, COMMENT "new"';
#let $engine_count=1;
#let $engines='rocksdb';
#--source include/have_rocksdb.inc
let $old_debug=`select @@debug_dbug`;
let $e=0;
let $keep_include_silent=1;
let $grep_script=ALTER;
--disable_query_log
while ($e < $engine_count)
{
inc $e;
let $engine=`select ELT($e, $engines)`;
let $default_engine=$engine;
--echo
--echo engine: $engine
--echo
let $r=0;
while ($r < $statement_count)
{
inc $r;
let $statement=`select ELT($r, $statements)`;
--echo
--echo query: $statement
--echo
let $c=0;
while ($c < $crash_count)
{
inc $c;
let $crash=`select ELT($c, $crash_points)`;
--eval create table t1 (a int, b int, key(a)) engine=$engine
insert into t1 values (1,1),(2,2);
commit;
flush tables;
RESET MASTER;
--echo crash point: $crash
if ($crash_count > 1)
{
--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
}
# The following can be used for testing one specific failure
# if ($crash == "ddl_log_alter_after_log")
# {
# if ($r == 2)
# {
# --remove_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
# }
# }
--disable_reconnect
--eval set @@debug_dbug="+d,$crash",@debug_crash_counter=1
let $errno=0;
--error 0,2013
--eval $statement;
let $error=$errno;
--enable_reconnect
--source include/wait_until_connected_again.inc
--disable_query_log
--eval set @@debug_dbug="$old_debug"
if ($error == 0)
{
echo "No crash!";
}
if ($error != 0)
{
--list_files $MYSQLD_DATADIR/test t*
--list_files $MYSQLD_DATADIR/test *sql*
--list_files $MYSQLD_DATADIR/test2 t*
--list_files $MYSQLD_DATADIR/test2 *sql*
# Check which tables still exist
--error 0,1
--file_exists $MYSQLD_DATADIR/test/t1.frm
let $error2=$errno;
if ($error2 == 0)
{
show create table t1;
select count(*) from t1;
}
if ($error2 == 1)
{
--error 0,1
--file_exists $MYSQLD_DATADIR/test/t2.frm
let $error3=$errno;
if ($error3 == 0)
{
show create table t2;
select count(*) from t2;
}
if ($error3 == 1)
{
--echo "Table is in test2"
show create table test2.t2;
select count(*) from test2.t2;
}
}
--let $binlog_file=master-bin.000001
--source include/show_binlog_events.inc
if ($error)
{
--let $binlog_file=master-bin.000002
--source include/show_binlog_events.inc
}
}
--disable_warnings
drop table if exists t1,t2;
drop table if exists test2.t2;
--enable_warnings
}
}
}
drop database test2;
--enable_query_log


@@ -0,0 +1,7 @@
#
# Test atomic alter table with aria
let $engine_count=1;
let $engines='aria';
let $extra_engine=myisam;
--source alter_table.test


@@ -0,0 +1,109 @@
--source include/have_debug.inc
--source include/have_sequence.inc
--source include/have_innodb.inc
--source include/have_log_bin.inc
--source include/not_valgrind.inc
#
# Testing of query > 4K. For this we do not have to run many tests as we
# only want to test the query storage, which is identical for all cases.
#
--disable_query_log
call mtr.add_suppression("InnoDB: .* does not exist in the InnoDB internal");
--enable_query_log
let $MYSQLD_DATADIR= `SELECT @@datadir`;
let $engine_count=1;
let $engines='myisam';
let $crash_count=1;
let $crash_points='ddl_log_alter_after_log';
let $statement_count=2;
let $statements='ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new"',
'ALTER TABLE t1 COMMENT "new"';
# If there is a need of testing one specific state (crash point and query),
# one can remove the comments below and modify them.
#let $crash_count=1;
#let $crash_points='ddl_log_alter_before_rename_triggers';
#let $statement_count= 1;
#let $statements='ALTER TABLE t1 change column b c int, COMMENT "new"';
let $old_debug=`select @@debug_dbug`;
let $e=0;
let $keep_include_silent=1;
let $grep_script=ALTER;
--disable_query_log
while ($e < $engine_count)
{
inc $e;
let $engine=`select ELT($e, $engines)`;
let $default_engine=$engine;
--echo
--echo engine: $engine
--echo
let $r=0;
while ($r < $statement_count)
{
inc $r;
let $statement=`select ELT($r, $statements)`;
--echo
--echo query: $statement
let $statement=`select concat(replace('$statement', "new", repeat("x",2000)), "/* long code comment: ", repeat("y",6000), "*/")`;
--echo
let $c=0;
while ($c < $crash_count)
{
inc $c;
let $crash=`select ELT($c, $crash_points)`;
--eval create table t1 (a int, b int) engine=$engine
insert into t1 (a) values (1),(2);
flush tables;
RESET MASTER;
--echo crash point: $crash
--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
--disable_reconnect
--eval set @@debug_dbug="+d,$crash",@debug_crash_counter=1
let $errno=0;
--error 0,2013
--eval $statement;
let $error=$errno;
--enable_reconnect
--source include/wait_until_connected_again.inc
--disable_query_log
--eval set @@debug_dbug="$old_debug"
if ($error == 0)
{
echo "No crash!";
}
if ($error != 0)
{
--list_files $MYSQLD_DATADIR/test t*
--list_files $MYSQLD_DATADIR/test *sql*
show create table t1;
select sum(a) from t1;
--let $binlog_file=master-bin.000001
--source include/show_binlog_events.inc
if ($error)
{
--let $binlog_file=master-bin.000002
--source include/show_binlog_events.inc
}
}
--disable_warnings
drop table if exists t1,t2;
--enable_warnings
}
}
}
--enable_query_log


@@ -0,0 +1,6 @@
--source include/have_rocksdb.inc
let $engine_count=1;
let $engines='rocksdb';
set global rocksdb_flush_log_at_trx_commit=1;
--source alter_table.test


@@ -0,0 +1,131 @@
engine: myisam
query: ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename t2
crash point: ddl_log_alter_before_rename_triggers
t1_trg.TRN
t2.MYD
t2.MYI
t2.TRG
t2.frm
Table Create Table
t2 CREATE TABLE `t2` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL,
`c` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='new'
count(*)
2
sum(a)
1003
master-bin.000002 # Query # # use `test`; ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename t2
crash point: ddl_log_alter_after_rename_triggers
t1_trg.TRN
t2.MYD
t2.MYI
t2.TRG
t2.frm
Table Create Table
t2 CREATE TABLE `t2` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL,
`c` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='new'
count(*)
2
sum(a)
1003
master-bin.000002 # Query # # use `test`; ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename t2
crash point: ddl_log_alter_after_drop_original_table
t1_trg.TRN
t2.MYD
t2.MYI
t2.TRG
t2.frm
Table Create Table
t2 CREATE TABLE `t2` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL,
`c` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='new'
count(*)
2
sum(a)
1003
master-bin.000002 # Query # # use `test`; ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename t2
query: ALTER TABLE t1 COMMENT "new", rename t2
crash point: ddl_log_alter_before_rename_triggers
t1_trg.TRN
t2.MYD
t2.MYI
t2.TRG
t2.frm
Table Create Table
t2 CREATE TABLE `t2` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='new'
count(*)
2
sum(a)
1003
master-bin.000002 # Query # # use `test`; ALTER TABLE t1 COMMENT "new", rename t2
crash point: ddl_log_alter_after_rename_triggers
t1_trg.TRN
t2.MYD
t2.MYI
t2.TRG
t2.frm
Table Create Table
t2 CREATE TABLE `t2` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='new'
count(*)
2
sum(a)
1003
master-bin.000002 # Query # # use `test`; ALTER TABLE t1 COMMENT "new", rename t2
crash point: ddl_log_alter_after_drop_original_table
"No crash!"
query: ALTER TABLE t1 change column b c int, COMMENT "new", rename t2
crash point: ddl_log_alter_before_rename_triggers
t1_trg.TRN
t2.MYD
t2.MYI
t2.TRG
t2.frm
Table Create Table
t2 CREATE TABLE `t2` (
`a` int(11) DEFAULT NULL,
`c` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='new'
count(*)
2
sum(a)
1003
master-bin.000002 # Query # # use `test`; ALTER TABLE t1 change column b c int, COMMENT "new", rename t2
crash point: ddl_log_alter_after_rename_triggers
t1_trg.TRN
t2.MYD
t2.MYI
t2.TRG
t2.frm
Table Create Table
t2 CREATE TABLE `t2` (
`a` int(11) DEFAULT NULL,
`c` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='new'
count(*)
2
sum(a)
1003
master-bin.000002 # Query # # use `test`; ALTER TABLE t1 change column b c int, COMMENT "new", rename t2
crash point: ddl_log_alter_after_drop_original_table
"No crash!"


@@ -0,0 +1,140 @@
--source include/have_debug.inc
--source include/have_sequence.inc
--source include/have_innodb.inc
--source include/have_log_bin.inc
--source include/not_valgrind.inc
#
# Testing of atomic alter table with crashes in a lot of different places
#
# This is very similar to the alter_table.test, but includes testing of
# triggers with ALTER TABLE .. RENAME.
#
--disable_query_log
call mtr.add_suppression("InnoDB: .* does not exist in the InnoDB internal");
--enable_query_log
let $MYSQLD_DATADIR= `SELECT @@datadir`;
let $engine_count=1;
let $engines='myisam','innodb';
let $crash_count=3;
let $crash_points='ddl_log_alter_before_rename_triggers', 'ddl_log_alter_after_rename_triggers', 'ddl_log_alter_after_drop_original_table';
let $statement_count=3;
let $statements='ALTER TABLE t1 ADD COLUMN c INT, COMMENT "new", rename t2',
'ALTER TABLE t1 COMMENT "new", rename t2',
'ALTER TABLE t1 change column b c int, COMMENT "new", rename t2';
# If there is a need of testing one specific state (crash point and query),
# one can remove the comments below and modify them.
#let $crash_count=1;
#let $crash_points='ddl_log_alter_before_rename_triggers';
#let $statement_count= 1;
#let $statements='ALTER TABLE t1 change column b c int, COMMENT "new", rename t2';
let $old_debug=`select @@debug_dbug`;
let $e=0;
let $keep_include_silent=1;
let $grep_script=ALTER;
--disable_query_log
while ($e < $engine_count)
{
inc $e;
let $engine=`select ELT($e, $engines)`;
let $default_engine=$engine;
--echo
--echo engine: $engine
--echo
let $r=0;
while ($r < $statement_count)
{
inc $r;
let $statement=`select ELT($r, $statements)`;
--echo
--echo query: $statement
--echo
let $c=0;
while ($c < $crash_count)
{
inc $c;
let $crash=`select ELT($c, $crash_points)`;
--eval create table t1 (a int, b int) engine=$engine
insert into t1 (a) values (1),(2);
flush tables;
delimiter |;
create trigger t1_trg before insert on t1 for each row
begin
if isnull(new.a) then
set new.a:= 1000;
end if;
end|
delimiter ;|
RESET MASTER;
--echo crash point: $crash
if ($crash_count != 1)
{
--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
}
--disable_reconnect
--eval set @@debug_dbug="+d,$crash",@debug_crash_counter=1
let $errno=0;
--error 0,2013
--eval $statement;
let $error=$errno;
--enable_reconnect
--source include/wait_until_connected_again.inc
--disable_query_log
--eval set @@debug_dbug="$old_debug"
if ($error == 0)
{
echo "No crash!";
}
if ($error != 0)
{
--list_files $MYSQLD_DATADIR/test t*
--list_files $MYSQLD_DATADIR/test *sql*
# Check which tables still exist
--error 0,1
--file_exists $MYSQLD_DATADIR/test/t1.frm
let $error2=$errno;
if ($error2 == 0)
{
show create table t1;
# Ensure that triggers work
insert into t1 (a) values(null);
select sum(a) from t1;
}
if ($error2 == 1)
{
show create table t2;
select count(*) from t2;
# Ensure that triggers work
insert into t2 (a) values(null);
select sum(a) from t2;
}
--let $binlog_file=master-bin.000001
--source include/show_binlog_events.inc
if ($error)
{
--let $binlog_file=master-bin.000002
--source include/show_binlog_events.inc
}
}
--disable_warnings
drop table if exists t1,t2;
--enable_warnings
}
}
}
--enable_query_log