If we fail to open a tablespace while looking for FILE_CHECKPOINT, we
set the corruption flag. Specifically, if encryption key is missing, we
would not be able to open an encrypted tablespace and the flag could be
set. We miss checking for this flag and report "Missing FILE_CHECKPOINT"
Address review comment to improve the test. Flush pages before starting
no-checkpoint block. It should improve the number of cases where the
test is skipped because some intermediate checkpoint is triggered.
Before MariaDB 10.3.5, the binlog position was stored in the TRX_SYS page,
while after it is stored in rollback segments. There is code to read the
legacy position from TRX_SYS to handle upgrades. The problem was if the
legacy position happens to compare larger than the position found in
rollback segments; in this case, the old TRX_SYS position would incorrectly
be preferred over the newer position from rollback segments.
Fixed by always preferring a position from rollback segments over a legacy
position.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
A few regression tests invoke heavy flushing of the buffer pool
and may trigger warnings that tablespaces could not be deleted
because of pending writes. Those warnings are to be expected
during the execution of such tests.
The warnings are also frequently seen with Valgrind or MemorySanitizer.
For those, the global suppression in have_innodb.inc does the trick.
The debug parameter innodb_simulate_comp_failures injected compression
failures for ROW_FORMAT=COMPRESSED tables, breaking the pre-existing
logic that I had implemented in the InnoDB Plugin for MySQL 5.1 to prevent
compressed page overflows. A much better check is already achieved by
defining UNIV_ZIP_COPY at the compilation time.
(Only UNIV_ZIP_DEBUG is part of cmake -DWITH_INNODB_EXTRA_DEBUG=ON.)
Let us introduce a dummy variable innodb_max_purge_lag_wait
for waiting that the InnoDB history list length is below
the user-specified limit. Specifically,
SET GLOBAL innodb_max_purge_lag_wait=0;
should wait for all history to be purged. This could be useful
when upgrading from an older version to MariaDB 10.3 or later,
to avoid hitting MDEV-15912.
Note: the history cannot be purged if there exist transactions
that may see old versions.
Reviewed by: Vladislav Vaintroub
innobase_pk_order_preserved(): Treat an added AUTO_INCREMENT
column in the same way as an added existing column.
In either case, the column values are not guaranteed to
be constant, and thus the ordering may change if such a column
is added before any existing PRIMARY KEY columns.
prepare_inplace_alter_table_dict(): Initialize
dict_table_t::persistent_autoinc before invoking
innobase_pk_order_preserved().
Now there can be only one log file instead of several which
logically work as a single file.
Possible names of redo log files: ib_logfile0,
ib_logfile101 (for just created one)
innodb_log_fiels_in_group: value of this variable is not used
by InnoDB. Possible values are still 1..100, to not break upgrade
LOG_FILE_NAME: add constant of value "ib_logfile0"
LOG_FILE_NAME_PREFIX: add constant of value "ib_logfile"
get_log_file_path(): convenience function that returns full
path of a redo log file
SRV_N_LOG_FILES_MAX: removed
srv_n_log_files: we can't remove this for compatibility reasons,
but now server doesn't use this variable
log_sys_t::file::fd: now just one, not std::vector
log_sys_t::log_capacity: removed word 'group'
find_and_check_log_file(): part of logic from huge srv_start()
moved here
recv_sys_t::files: file descriptors of redo log files.
There can be several of those in case we're upgrading
from older MariaDB version.
recv_sys_t::remove_extra_log_files: whether to remove
ib_logfile{1,2,3...} after successfull upgrade.
recv_sys_t::read(): open if needed and read from one
of several log files
recv_sys_t::files_size(): open if needed and return files count
redo_file_sizes_are_correct(): check that redo log files
sizes are equal. Just to log an error for a user.
Corresponding check was moved from srv0start.cc
namespace deprecated: put all deprecated variables here to
prevent usage of it by us, developers
row_drop_table_for_mysql(): If a #sql2 table is open in another
thread (purge) while we attempting to drop it, rename it to #sql-ib
name to hide it from the SQL layer.
Adjust tests accordingly to hide #sql-ib tables, which can
continue to exist until the background DROP TABLE completes.
- Note that some issues was also fixed in 10.2 and 10.4. I also fixed them
here to be able to continue with making 10.5 valgrind safe again
- Disable connection threads warnings when doing shutdown
Use DEBUG_SYNC to hang the execution at the interesting point,
and then kill and restart the server externally. This will work
also with Valgrind. DBUG_SUICIDE() causes Valgrind to hang,
and it could also cause uninteresting reports about memory leaks.
While we are at it, let us clean up innodb.innodb_bulk_create_index_debug
so that it will actually test the desired functionality also in future
versions (with instant ADD COLUMN and DROP COLUMN) and avoid
some unnecessary restarts.
We are adding two DEBUG_SYNC points for ALTER TABLE, because there were
none that would be executed right before ha_commit_trans().
Shorten some VARCHAR attributes to a more reasonable length.
INNODB_METRICS: Rename the column STATUS to ENABLED, and make it Boolean.
Replace with INT(1) many Boolean attributes that were declared as VARCHAR
containing 'NO','YES','disabled','enabled','Uninitialized','Initialized'.
Replace some VARCHAR attributes with ENUM.
Replace some BIGINT with INT when 32 bits are sufficient.
Remove INNODB_SYS_TABLESPACES.SPACE_TYPE. The type of a tablespace
can be derived from the tablespace ID. A fixed number is used for
the system tablespace and the temporary tablespace. All other tablespaces
are single-table or single-partition tablespaces.
i_s_locks_row_t::lock_type, lock_get_type_str(): Remove.
This is a redundant field. Table and record locks can be
distinguished by whether i_s_locks_row_t::lock_index is NULL.
fill_trx_row(): Do not unnecessarily copy the constant strings that
trx->op_info is pointing to.
i_s_locks_row_t::lock_mode: Replace string with integer.
lock_get_mode_str(), lock_get_trx_id(), lock_get_trx(): Remove.
field_store_ulint(): Remove.
In tests that directly write InnoDB data file pages,
compute the innodb_checksum_algorithm=crc32 checksums,
instead of writing the 0xdeadbeef value used by
innodb_checksum_algorithm=none. In this way, these tests
will not cause failures when executing
./mtr --mysqld=--loose-innodb-checksum-algorithm=strict_crc32
Write a test case that computes valid crc32 checksums for
an encrypted page, but zeroes out the payload area, so
that the checksum after decryption fails.
xb_fil_cur_read(): Validate the page number before trying
any checksum calculation or decrypting or decompression.
Also, skip zero-filled pages. For page_compressed pages,
ensure that the FIL_PAGE_TYPE was changed. Also, reject
FIL_PAGE_PAGE_COMPRESSED_ENCRYPTED if no decryption was attempted.
Remove the separate test innodb.alter_instant, because
it can be easily mistaken for innodb.instant_alter, which
in turn is covering various instant ALTER TABLE operations.
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
thd_rpl_stmt_based(): A new predicate to check if statement-based
replication is active. (This can also hold when replication is not
in use, but binlog is.)
que_thr_stop(), row_ins_duplicate_error_in_clust(),
row_ins_sec_index_entry_low(), row_ins(): On a duplicate key error,
only lock all index records when statement-based replication is in use.
While MariaDB Server 10.2 is not really guaranteed to be compatible
with Percona XtraBackup 2.4 (for example, the MySQL 5.7 undo log format
change that could be present in XtraBackup, but was reverted from
MariaDB in MDEV-12289), we do not want to disrupt users who have
deployed xtrabackup and MariaDB Server 10.2 in their environments.
With this change, MariaDB 10.2 will continue to use the backup-unsafe
TRUNCATE TABLE code, so that neither the undo log nor the redo log
formats will change in an incompatible way.
Undo tablespace truncation will keep using the redo log only. Recovery
or backup with old code will fail to shrink the undo tablespace files,
but the contents will be recovered just fine.
In the MariaDB Server 10.2 series only, we introduce the configuration
parameter innodb_unsafe_truncate and make it ON by default. To allow
MariaDB Backup (mariabackup) to work properly with TRUNCATE TABLE
operations, use loose_innodb_unsafe_truncate=OFF.
MariaDB Server 10.3.10 and later releases will always use the
backup-safe TRUNCATE TABLE, and this parameter will not be
added there.
recv_recovery_rollback_active(): Skip row_mysql_drop_garbage_tables()
unless innodb_unsafe_truncate=OFF. It is too unsafe to drop orphan
tables if RENAME operations are not transactional within InnoDB.
LOG_HEADER_FORMAT_10_3: Replaces LOG_HEADER_FORMAT_CURRENT.
log_init(), log_group_file_header_flush(),
srv_prepare_to_delete_redo_log_files(),
innobase_start_or_create_for_mysql(): Choose the redo log format
and subformat based on the value of innodb_unsafe_truncate.
Allow combination of non-instant, non-rebuilding operations with
changes of table options that do not require a rebuild.
For example, DROP INDEX or ADD INDEX can be performed with
ALGORITHM=NOCOPY together with changing such table options.
Changing the table options alone would be allowed with ALGORITHM=INSTANT.
INNOBASE_ALTER_NOCREATE: A new set of flags, for operations that
are refused for ALGORITHM=INSTANT and do not involve creating
index trees.
Move ALTER_RENAME_INDEX to the proper place (INNOBASE_ALTER_INSTANT).
innobase_need_rebuild(): Do not require a rebuild if
INNOBASE_ALTER_NOREBUILD operations are combined with ALTER_OPTIONS.
ha_innobase::prepare_inplace_alter_table(),
ha_innobase::inplace_alter_table(): Use the fast path if
ALTER_OPTIONS is combined with INNOBASE_ALTER_NOCREATE.
In this case, the actual changes would be deferred to
ha_innobase::commit_inplace_alter_table().
This is a merge from 10.2, but the 10.2 version of this will not
be pushed into 10.2 yet, because the 10.2 version would include
backports of MDEV-14717 and MDEV-14585, which would introduce
a crash recovery regression: Tables could be lost on
table-rebuilding DDL operations, such as ALTER TABLE,
OPTIMIZE TABLE or this new backup-friendly TRUNCATE TABLE.
The test innodb.truncate_crash occasionally loses the table due to
the following bug:
MDEV-17158 log_write_up_to() sometimes fails
Remove the innodb_undo suite, and move and adapt the tests.
Remove unnecessary restarts, and add innodb_page_size_small.inc
for combinations.
innodb.undo_truncate is the merge of innodb_undo.truncate
and innodb_undo.truncate_multi_client.
Add the global status variable innodb_undo_truncations.
Without this, the test innodb.undo_truncate would occasionally
report that truncation did not happen. The test was only waiting
for the history list length to reach 0, but the undo tablespace
truncation would only take place some time after that.
Undo tablespace truncation will only occasionally occur with
innodb_page_size=32k, and typically never occur (with this amount
of undo log operations) with innodb_page_size=64k. We disable
these combinations.
innodb.undo_truncate_recover was formerly called
innodb_undo.truncate_recover.
Implement undo tablespace truncation via normal redo logging.
Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.
Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.
In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.
ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.
rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.
== TRUNCATE TABLE ==
WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.
In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.
A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.
ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.
ha_innobase::delete(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.
ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.
create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.
row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.
row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().
dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.
row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.
The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.
We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.
== Undo tablespace truncation ==
MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similar to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.
We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.
recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.
namespace undo: Remove some unnecessary declarations.
fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.
fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.
buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.
fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.
fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.
os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.
fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].
recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.
trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.
trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.
recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.
recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.
buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).
trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
NULL values when there is no DEFAULT
- Merged the alter_non_null test case to alter_not_null test case.
Renamed the alter_non_null_debug to alter_not_null_debug test case
NULL values when there is no DEFAULT
Copy and inplace algorithm works similarly for
NULL to NOT NULL conversion for the following cases:
(1) strict sql mode - Should give error.
(2) non-strict sql mode - Should give warnings alone
(3) alter ignore table command. - Should give warnings alone.