2018-05-14 15:16:25 +02:00
|
|
|
--source include/have_innodb.inc
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
# The embedded server does not support restarting in mysql-test-run.
|
|
|
|
-- source include/not_embedded.inc
|
2018-05-14 15:16:25 +02:00
|
|
|
-- source include/no_valgrind_without_big.inc
|
MDEV-12026: Implement innodb_checksum_algorithm=full_crc32
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.
Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.
We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.
For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.
ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.
The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.
In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.
We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.
This is joint work with Marko Mäkelä.
2019-02-19 20:00:00 +01:00
|
|
|
-- source include/innodb_checksum_algorithm.inc
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
|
|
|
|
let MYSQLD_DATADIR=`select @@datadir`;
|
|
|
|
let PAGE_SIZE=`select @@innodb_page_size`;
|
|
|
|
|
|
|
|
-- disable_query_log
|
|
|
|
call mtr.add_suppression("InnoDB: innodb_force_recovery is on.");
|
2018-05-14 15:16:25 +02:00
|
|
|
call mtr.add_suppression("InnoDB: Ignoring tablespace for.*bug16720368");
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
call mtr.add_suppression("Found 1 prepared XA transactions");
|
|
|
|
call mtr.add_suppression("InnoDB: Operating system error.*in a file operation");
|
|
|
|
call mtr.add_suppression("InnoDB: \(The error means\|If you are\)");
|
2018-05-14 15:16:25 +02:00
|
|
|
call mtr.add_suppression("InnoDB: Ignoring tablespace `test/bug16720368` because it could not be opened");
|
|
|
|
call mtr.add_suppression("InnoDB: Tablespace .* was not found at.*bug16735660");
|
|
|
|
call mtr.add_suppression("InnoDB: Set innodb_force_recovery=1 to ignore this and to permanently lose all changes to the tablespace.");
|
|
|
|
call mtr.add_suppression("InnoDB: Plugin initialization aborted*");
|
|
|
|
call mtr.add_suppression("Plugin 'InnoDB' init function returned error.");
|
|
|
|
call mtr.add_suppression("Plugin 'InnoDB' registration as a STORAGE ENGINE failed.");
|
2019-04-03 15:10:20 +02:00
|
|
|
call mtr.add_suppression("InnoDB: Table `test`\\.`bug16720368` is corrupted");
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
-- enable_query_log
|
|
|
|
|
|
|
|
-- echo #
|
|
|
|
-- echo # Bug#16720368 INNODB CRASHES ON BROKEN #SQL*.IBD FILE AT STARTUP
|
|
|
|
-- echo #
|
|
|
|
|
|
|
|
SET GLOBAL innodb_file_per_table=1;
|
2018-10-10 05:14:14 +02:00
|
|
|
SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
|
|
|
|
CREATE TABLE bug16720368_1 (a INT PRIMARY KEY) ENGINE=InnoDB;
|
|
|
|
|
|
|
|
connect (con1,localhost,root);
|
|
|
|
CREATE TABLE bug16720368 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
|
|
|
|
INSERT INTO bug16720368 (a) VALUES (1),(2),(3),(4),(5),(6),(7),(8);
|
2018-10-10 05:14:14 +02:00
|
|
|
--source include/wait_all_purged.inc
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
|
|
|
|
connection default;
|
|
|
|
|
|
|
|
-- echo # Cleanly shutdown mysqld
|
|
|
|
-- source include/shutdown_mysqld.inc
|
|
|
|
|
|
|
|
disconnect con1;
|
|
|
|
|
|
|
|
-- echo # Corrupt FIL_PAGE_OFFSET in bug16720368.ibd,
|
2019-02-16 11:06:52 +01:00
|
|
|
-- echo # and recompute innodb_checksum_algorithm=crc32
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
perl;
|
2019-02-16 11:06:52 +01:00
|
|
|
do "$ENV{MTR_SUITE_DIR}/include/crc32.pl";
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
my $file = "$ENV{MYSQLD_DATADIR}/test/bug16720368.ibd";
|
|
|
|
open(FILE, "+<$file") || die "Unable to open $file";
|
2019-02-16 11:06:52 +01:00
|
|
|
binmode FILE;
|
|
|
|
my $ps= $ENV{PAGE_SIZE};
|
|
|
|
my $page;
|
|
|
|
die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
|
2019-04-05 10:41:03 +02:00
|
|
|
my $full_crc32 = unpack("N",substr($page,54,4)) & 0x10; # FIL_SPACE_FLAGS
|
2019-04-03 15:10:20 +02:00
|
|
|
sysseek(FILE, 3*$ps, 0) || die "Unable to seek $file\n";
|
2019-02-16 11:06:52 +01:00
|
|
|
die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
|
|
|
|
substr($page,4,4)=pack("N",0xc001cafe);
|
|
|
|
my $polynomial = 0x82f63b78; # CRC-32C
|
MDEV-12026: Implement innodb_checksum_algorithm=full_crc32
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.
Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.
We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.
For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.
ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.
The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.
In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.
We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.
This is joint work with Marko Mäkelä.
2019-02-19 20:00:00 +01:00
|
|
|
if ($full_crc32)
|
|
|
|
{
|
|
|
|
my $ck = mycrc32(substr($page, 0, $ps-4), 0, $polynomial);
|
|
|
|
substr($page, $ps-4, 4) = pack("N", $ck);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
my $ck= pack("N",mycrc32(substr($page, 4, 22), 0, $polynomial) ^
|
2019-02-16 11:06:52 +01:00
|
|
|
mycrc32(substr($page, 38, $ps - 38 - 8), 0, $polynomial));
|
MDEV-12026: Implement innodb_checksum_algorithm=full_crc32
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.
Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.
We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.
For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.
ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.
The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.
In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.
We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.
This is joint work with Marko Mäkelä.
2019-02-19 20:00:00 +01:00
|
|
|
substr($page,0,4)=$ck;
|
|
|
|
substr($page,$ps-8,4)=$ck;
|
|
|
|
}
|
2019-04-03 15:10:20 +02:00
|
|
|
sysseek(FILE, 3*$ps, 0) || die "Unable to rewind $file\n";
|
2019-02-16 11:06:52 +01:00
|
|
|
syswrite(FILE, $page, $ps)==$ps || die "Unable to write $file\n";
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
close(FILE) || die "Unable to close $file";
|
|
|
|
EOF
|
|
|
|
|
|
|
|
-- source include/start_mysqld.inc
|
|
|
|
|
2018-05-14 15:16:25 +02:00
|
|
|
--error ER_NO_SUCH_TABLE_IN_ENGINE
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
SELECT COUNT(*) FROM bug16720368;
|
2018-05-14 15:16:25 +02:00
|
|
|
--error ER_NO_SUCH_TABLE_IN_ENGINE
|
2019-04-03 15:10:20 +02:00
|
|
|
INSERT INTO bug16720368 VALUES(1);
|
|
|
|
INSERT INTO bug16720368_1 VALUES(1);
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
|
|
|
|
-- echo # Shut down the server to uncorrupt the data.
|
|
|
|
-- source include/shutdown_mysqld.inc
|
|
|
|
|
|
|
|
# Uncorrupt the FIL_PAGE_OFFSET.
|
|
|
|
perl;
|
2019-02-16 11:06:52 +01:00
|
|
|
do "$ENV{MTR_SUITE_DIR}/include/crc32.pl";
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
my $file = "$ENV{MYSQLD_DATADIR}/test/bug16720368.ibd";
|
|
|
|
open(FILE, "+<$file") || die "Unable to open $file";
|
2019-02-16 11:06:52 +01:00
|
|
|
binmode FILE;
|
|
|
|
my $ps= $ENV{PAGE_SIZE};
|
|
|
|
my $page;
|
|
|
|
die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
|
MDEV-12026: Implement innodb_checksum_algorithm=full_crc32
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.
Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.
We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.
For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.
ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.
The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.
In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.
We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.
This is joint work with Marko Mäkelä.
2019-02-19 20:00:00 +01:00
|
|
|
my $full_crc32 = unpack("N",substr($page,54,4)) & 0x10; # FIL_SPACE_FLAGS
|
2019-04-03 15:10:20 +02:00
|
|
|
sysseek(FILE, 3*$ps, 0) || die "Unable to seek $file\n";
|
2019-02-16 11:06:52 +01:00
|
|
|
die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
|
2019-04-03 15:10:20 +02:00
|
|
|
substr($page,4,4)=pack("N",3);
|
2019-02-16 11:06:52 +01:00
|
|
|
my $polynomial = 0x82f63b78; # CRC-32C
|
MDEV-12026: Implement innodb_checksum_algorithm=full_crc32
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.
Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.
We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.
For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.
ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.
The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.
In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.
We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.
This is joint work with Marko Mäkelä.
2019-02-19 20:00:00 +01:00
|
|
|
if ($full_crc32)
|
|
|
|
{
|
|
|
|
my $ck = mycrc32(substr($page, 0, $ps-4), 0, $polynomial);
|
|
|
|
substr($page, $ps-4, 4) = pack("N", $ck);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
my $ck= pack("N",mycrc32(substr($page, 4, 22), 0, $polynomial) ^
|
2019-02-16 11:06:52 +01:00
|
|
|
mycrc32(substr($page, 38, $ps - 38 - 8), 0, $polynomial));
|
MDEV-12026: Implement innodb_checksum_algorithm=full_crc32
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.
Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.
We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.
For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.
ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.
The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.
In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.
We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.
This is joint work with Marko Mäkelä.
2019-02-19 20:00:00 +01:00
|
|
|
substr($page,0,4)=$ck;
|
|
|
|
substr($page,$ps-8,4)=$ck;
|
|
|
|
}
|
2019-04-03 15:10:20 +02:00
|
|
|
sysseek(FILE, 3*$ps, 0) || die "Unable to rewind $file\n";
|
2019-02-16 11:06:52 +01:00
|
|
|
syswrite(FILE, $page, $ps)==$ps || die "Unable to write $file\n";
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
close(FILE) || die "Unable to close $file";
|
|
|
|
EOF
|
|
|
|
|
|
|
|
-- source include/start_mysqld.inc
|
|
|
|
|
|
|
|
INSERT INTO bug16720368 VALUES(9,1);
|
|
|
|
SELECT COUNT(*) FROM bug16720368;
|
|
|
|
# A debug assertion would fail in buf_block_align_instance()
|
|
|
|
# if we did not uncorrupt the page number first.
|
|
|
|
DROP TABLE bug16720368, bug16720368_1;
|
|
|
|
|
|
|
|
-- echo #
|
|
|
|
-- echo # Bug#16735660 ASSERT TABLE2 == NULL, ROLLBACK OF RESURRECTED TXNS,
|
|
|
|
-- echo # DICT_TABLE_ADD_TO_CACHE
|
|
|
|
-- echo #
|
|
|
|
|
|
|
|
SET GLOBAL innodb_file_per_table=1;
|
|
|
|
|
|
|
|
CREATE TEMPORARY TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
|
|
|
|
BEGIN;
|
|
|
|
INSERT INTO t1 VALUES(42);
|
|
|
|
|
|
|
|
-- connect (con1,localhost,root)
|
|
|
|
|
|
|
|
CREATE TABLE bug16735660 (a INT PRIMARY KEY) ENGINE=InnoDB;
|
|
|
|
|
|
|
|
XA START 'x';
|
2018-10-10 05:31:43 +02:00
|
|
|
--source ../include/no_checkpoint_start.inc
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
INSERT INTO bug16735660 VALUES(1),(2),(3);
|
|
|
|
XA END 'x';
|
|
|
|
XA PREPARE 'x';
|
2018-10-10 05:31:43 +02:00
|
|
|
--connection default
|
|
|
|
--let CLEANUP_IF_CHECKPOINT=XA ROLLBACK 'x';DROP TABLE bug16735660;
|
|
|
|
--source ../include/no_checkpoint_end.inc
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
|
2018-05-14 15:16:25 +02:00
|
|
|
-- disconnect con1
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
-- move_file $MYSQLD_DATADIR/test/bug16735660.ibd $MYSQLD_DATADIR/bug16735660.omg
|
|
|
|
|
|
|
|
-- echo # Attempt to start without an *.ibd file.
|
2018-05-14 15:16:25 +02:00
|
|
|
let SEARCH_FILE= $MYSQLTEST_VARDIR/log/mysqld.1.err;
|
|
|
|
--source include/start_mysqld.inc
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
|
2018-05-14 15:16:25 +02:00
|
|
|
let SEARCH_PATTERN= \[ERROR\] InnoDB: Tablespace [0-9]+ was not found at .*test.bug16735660.ibd;
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
-- source include/search_pattern_in_file.inc
|
|
|
|
|
|
|
|
-- move_file $MYSQLD_DATADIR/bug16735660.omg $MYSQLD_DATADIR/test/bug16735660.ibd
|
|
|
|
|
2018-05-14 15:16:25 +02:00
|
|
|
-- source include/restart_mysqld.inc
|
Bug#19330255 WL#7142 - CRASH DURING ALTER TABLE LEADS TO DATA DICTIONARY INCONSISTENCY
The server crashes on a SELECT because of space id mismatch. The
mismatch happens if the server crashes during an ALTER TABLE.
There are actually two cases of inconsistency, and three fixes needed
for the InnoDB problems.
We have dictionary data (tablespace or table name) in 3 places:
(a) The *.frm file is for the old table definition.
(b) The InnoDB data dictionary is for the new table definition.
(c) The file system did not rename the tablespace files yet.
In this fix, we will not care if the *.frm file is in sync with the
InnoDB data dictionary and file system. We will concentrate on the
mismatch between (b) and (c).
Two scenarios have been mentioned in this bug report. The simpler one
first:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
The files were not yet renamed in the file system.
2a. The server is killed, without making a log checkpoint.
3a. The server refuses to start up, because replaying MLOG_FILE_RENAME2
fails.
I failed to repeat this myself. I repeated step 3a with a saved
dataset. The problem seems to be that MLOG_FILE_RENAME2 replay is
incorrectly being skipped when there is no page-redo log or
MLOG_FILE_NAME record for the old name of the tablespace.
FIX#1: Recover the id-to-name mapping also from MLOG_FILE_RENAME2
records when scanning the redo log. It is not necessary to write
MLOG_FILE_NAME records in addition to MLOG_FILE_RENAME2 records for
renaming tablespace files.
The scenario in the original Description involves a log checkpoint:
1. The changes to SYS_TABLES were committed, and MLOG_FILE_RENAME2
records were written in a single mini-transaction commit.
2. A log checkpoint and a server kill was injected.
3. Crash recovery will see no records (other than the MLOG_CHECKPOINT).
4. dict_check_tablespaces_and_store_max_id() will emit a message about
a non-found table #sql-ib22*.
5. A mismatch is triggering the assertion failure.
In my test, at step 4 the SYS_TABLES root page (0:8) contains these 3
records right before the page supremum:
* delete-marked (committed) name=#sql-ib21* record, with space=10.
* name=#sql-ib22*, space=9.
* name=t1, space=10.
space=10 is the rebuilt table (#sql-ib21*.ibd in the file system).
space=9 is the old table (t1.ibd in the file system).
The function dict_check_tablespaces_and_store_max_id() will enter
t1.ibd with space_id=10 into the fil_system cache without noticing
that t1.ibd contains space_id=9, because it invokes
fil_open_single_table_tablespace() with validate=false.
In MySQL 5.6, the space_id from all *.ibd files are being read when
the redo log checkpoint LSN disagrees with the FIL_PAGE_FILE_FLUSH_LSN
in the system tablespace. This field is only updated during a clean
shutdown, after performing the final log checkpoint.
FIX#2: dict_check_tablespaces_and_store_max_id() should pass
validate=true to fil_open_single_table_tablespace() when a non-clean
shutdown is detected, forcing the first page of each *.ibd file to be
read. (We do not want to slow down startup after a normal shutdown.)
With FIX#2, the SELECT would fail to find the table. This would
introduce a regression, because before WL#7142, a copy of the table
was accessible after recovery.
FIX#3: Maintain a list of MLOG_FILE_RENAME2 records that have been
written to the redo log, but not performed yet in the file system.
When performing a checkpoint, re-emit these records to the redo
log. In this way, a mismatch between (b) and (c) should be impossible.
fil_name_process(): Refactored from fil_name_parse(). Adds an item to
the id-to-filename mapping.
fil_name_parse(): Parses and applies a MLOG_FILE_NAME,
MLOG_FILE_DELETE or MLOG_FILE_RENAME2 record. This implements FIX#1.
fil_name_write_rename(): A wrapper function for writing
MLOG_FILE_RENAME2 records.
fil_op_replay_rename(): Apply MLOG_FILE_RENAME2 records. Replaces
fil_op_log_parse_or_replay(), whose logic was moved to fil_name_parse().
fil_tablespace_exists_in_mem(): Return fil_space_t* instead of bool.
dict_check_tablespaces_and_store_max_id(): Add the parameter
"validate" to implement FIX#2.
log_sys->append_on_checkpoint: Extra log records to append in case of
a checkpoint. Needed for FIX#3.
log_append_on_checkpoint(): New function, to update
log_sys->append_on_checkpoint.
mtr_write_log(): New function, to append mtr_buf_t to the redo log.
fil_names_clear(): Append the data from log_sys->append_on_checkpoint
if needed.
ha_innobase::commit_inplace_alter_table(): Add any MLOG_FILE_RENAME2
records to log_sys->append_on_checkpoint(), and remove them once the
files have been renamed in the file system.
mtr_buf_copy_t: A helper functor for copying a mini-transaction log.
rb#6282 approved by Jimmy Yang
2014-08-11 09:43:11 +02:00
|
|
|
|
|
|
|
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
|
|
|
|
SELECT * FROM bug16735660;
|
|
|
|
|
|
|
|
XA RECOVER;
|
|
|
|
XA ROLLBACK 'x';
|
|
|
|
|
|
|
|
SELECT * FROM bug16735660;
|
|
|
|
DROP TABLE bug16735660;
|