This is a backport of the following fix from MySQL 5.7.23.
Some code refactoring has been omitted, and the test case has
been adapted to MariaDB.
commit 7a689acaa65e9d602575f7aa53fe36a64a07460f
Author: Krzysztof Kapuścik <krzysztof.kapuscik@oracle.com>
Date: Tue Mar 13 12:34:03 2018 +0100
Bug#27082268 Invalid FTS sync synchronization
The fix closes two issues:
Bug #27082268 - INNODB: FAILING ASSERTION: SYM_NODE->TABLE != NULL DURING FTS SYNC
Bug #27095935 - DEADLOCK BETWEEN FTS_DROP_INDEX AND FTS_OPTIMIZE_SYNC_TABLE
Both issues were related to a FTS cache sync being done during
operations that perfomed DDL actions on internal FTS tables
(ALTER TABLE, TRUNCATE). In some cases the FTS tables and/or
internal cache structures could get removed while still being
used to perform FTS synchronization leading to crashes. In other
the sync operations could not get finishes as it was waiting for
dict lock which was taken by thread waiting for the background
sync to be finished.
The changes done includes:
- Stopping background operations during ALTER TABLE and TRUNCATE.
- Removal of unused code in FTS.
- Cleanup of FTS sync related code to make it more readable and
easier to maintain.
RB#18262
We did not merge Percona XtraDB 5.6.40-84.0 yet.
The changes in it are mostly cosmetic, except for
2 bug fixes from Oracle MySQL 5.6.40, which could
be security bugs.
This was achieved by taking the applicable parts
of an earlier InnoDB commit to XtraDB:
git diff 15ec8c2f28f08517ecbffb959d756b4bdd53ab45{~,} storage/innobase|
sed -e s+/innobase/+/xtradb/+|patch -p1
ha_innobase::commit_inplace_alter_table(): Defer the freeing of ctx->trx
until after the operation has been successfully committed. In this way,
rollback on a partitioned table will be possible.
rollback_inplace_alter_table(): Handle ctx->new_table == NULL when
ctx->trx != NULL.
While the test case crashes a MariaDB 10.2 debug build only,
let us apply the fix to the earliest applicable MariaDB series (10.0)
to avoid any data corruption on a table-rebuilding ALTER TABLE
using ALGORITHM=INPLACE.
innobase_create_key_defs(): Use altered_table->s->primary_key
when a new primary key is being created.
InnoDB limited the maximum number of bytes per character to 4.
But, the filename character set that was introduced in MySQL 5.1
uses up to 5 bytes per character.
To allow InnoDB tables to be created with wider characters, let
us split the mbminmaxlen fields into mbminlen, mbmaxlen, and increase
the limit to 7 bytes per character. This will increase the payload size
of dtype_t and dict_col_t by one bit. The storage size will be unchanged
(54 bits and 77 bits will use the same number of bytes as the
previous sizes 53 and 76 bits).
When MariaDB 10.1.0 introduced table options for encryption and
compression, it unnecessarily changed
ha_innobase::check_if_supported_inplace_alter() so that ALGORITHM=COPY
is forced when these parameters differ.
A better solution is to move the check to innobase_need_rebuild().
In that way, the ALGORITHM=INPLACE interface (yes, the syntax is
very misleading) can be used for rebuilding the table much more
efficiently, with merge sort, with no undo logging, and allowing
concurrent DML operations.
Reverted incorrect changes done on MDEV-7367 and MDEV-9469. Fixes properly
also related bugs:
MDEV-13668: InnoDB unnecessarily rebuilds table when renaming a column and adding index
MDEV-9469: 'Incorrect key file' on ALTER TABLE
MDEV-9548: Alter table (renaming and adding index) fails with "Incorrect key file for table"
MDEV-10535: ALTER TABLE causes standalone/wsrep cluster crash
MDEV-13640: ALTER TABLE CHANGE and ADD INDEX on auto_increment column fails with "Incorrect key file for table..."
Root cause for all these bugs is the fact that MariaDB .frm file
can contain virtual columns but InnoDB dictionary does not and
previous fixes were incorrect or unnecessarily forced table
rebuilt. In index creation key_part->fieldnr can be bigger than
number of columns in InnoDB data dictionary. We need to skip not
stored fields when calculating correct column number for InnoDB
data dictionary.
dict_table_get_col_name_for_mysql
Remove
innobase_match_index_columns
Revert incorrect change done on MDEV-7367
innobase_need_rebuild
Remove unnecessary rebuild force when column is renamed.
innobase_create_index_field_def
Calculate InnoDB column number correctly and remove
unnecessary column name set.
innobase_create_index_def, innobase_create_key_defs
Remove unneeded fields parameter. Revert unneeded memset.
prepare_inplace_alter_table_dict
Remove unneeded col_names parameter
index_field_t
Remove unneeded col_name member.
row_merge_create_index
Remove unneeded col_names parameter and resolution.
Effected tests:
innodb-alter-table : Add test case for MDEV-13668
innodb-alter : Remove MDEV-13668, MDEV-9469 FIXMEs
and restore original tests
innodb-wl5980-alter : Remove MDEV-13668, MDEV-9469 FIXMEs
and restore original tests
In all InnoDB row formats, the pointers or lengths stored in the record
header can be at most 14 bits, that is, count up to 16383.
In ROW_FORMAT=REDUNDANT, this limits the maximum possible record length
to 16383 bytes. In other ROW_FORMAT, it could merely limit the maximum
length of variable-length fields.
When MySQL 5.7 introduced innodb_page_size=32k and 64k, the maximum
record length was limited to 16383 bytes (I hope 16383, not 16384,
to be able to distinguish from a record whose length is 0 bytes).
This change is present in MariaDB Server 10.2.
btr_cur_optimistic_update(): Restrict maximum record size to 16K-1
for REDUNDANT and 64K page size.
dict_index_too_big_for_tree(): The maximum allowed record size
is half a B-tree page or 16K(-1 for REDUNDANT) for 64K page size.
convert_error_code_to_mysql(): Fix error message to print
correct limits.
my_error_innodb(): Fix error message to print correct limits.
page_zip_rec_needs_ext() : record size was already restricted to 16K.
Restrict REDUNDANT to 16K-1.
rem0rec.h: Introduce REDUNDANT_REC_MAX_DATA_SIZE (16K-1)
and COMPRESSED_REC_MAX_DATA_SIZE (16K).
This fixes warnings that were emitted when running InnoDB test
suites on a debug server that was compiled with GCC 7.1.0 using
the flags -O3 -fsanitize=undefined.
thd_requested_durability(): XtraDB can call this with trx->mysql_thd=NULL.
Remove the function in InnoDB, because it is not used there.
calc_row_difference(): Do not call memcmp(o_ptr, NULL, 0).
innobase_index_name_is_reserved(): This can be called with
key_info=NULL, num_of_keys=0.
innobase_dropping_foreign(), innobase_check_foreigns_low(),
innobase_check_foreigns(): This can be called with
drop_fk=NULL, n_drop_fk=0.
rec_convert_dtuple_to_rec_comp(): Do not invoke memcpy(end, NULL, 0).
This only merges MDEV-12253, adapting it to MDEV-12602 which is already
present in 10.2 but not yet in the 10.1 revision that is being merged.
TODO: Error handling in crash recovery needs to be improved.
If a page cannot be decrypted (or read), we should cleanly abort
the startup. If innodb_force_recovery is specified, we should
ignore the problematic page and apply redo log to other pages.
Currently, the test encryption.innodb-redo-badkey randomly fails
like this (the last messages are from cmake -DWITH_ASAN):
2017-05-05 10:19:40 140037071685504 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1635994
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Missing MLOG_FILE_NAME or MLOG_FILE_DELETE before MLOG_CHECKPOINT for tablespace 1
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[2201] with error Data structure corruption
2017-05-05 10:19:41 140037071685504 [Note] InnoDB: Starting shutdown...
i=================================================================
==5226==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x612000018588 in thread T0
#0 0x736750 in operator delete(void*) (/mariadb/server/build/sql/mysqld+0x736750)
#1 0x1e4833f in LatchCounter::~LatchCounter() /mariadb/server/storage/innobase/include/sync0types.h:599:4
#2 0x1e480b8 in LatchMeta<LatchCounter>::~LatchMeta() /mariadb/server/storage/innobase/include/sync0types.h:786:17
#3 0x1e35509 in sync_latch_meta_destroy() /mariadb/server/storage/innobase/sync/sync0debug.cc:1622:3
#4 0x1e35314 in sync_check_close() /mariadb/server/storage/innobase/sync/sync0debug.cc:1839:2
#5 0x1dfdc18 in innodb_shutdown() /mariadb/server/storage/innobase/srv/srv0start.cc:2888:2
#6 0x197e5e6 in innobase_init(void*) /mariadb/server/storage/innobase/handler/ha_innodb.cc:4475:3
Problem was that bpage was referenced after it was already freed
from LRU. Fixed by adding a new variable encrypted that is
passed down to buf_page_check_corrupt() and used in
buf_page_get_gen() to stop processing page read.
This patch should also address following test failures and
bugs:
MDEV-12419: IMPORT should not look up tablespace in
PageConverter::validate(). This is now removed.
MDEV-10099: encryption.innodb_onlinealter_encryption fails
sporadically in buildbot
MDEV-11420: encryption.innodb_encryption-page-compression
failed in buildbot
MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8
Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
and replaced these with dict_table_t::file_unreadable. Table
ibd file is missing if fil_get_space(space_id) returns NULL
and encrypted if not. Removed dict_table_t::is_corrupted field.
Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(),
buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().
Added test cases when enrypted page could be read while doing
redo log crash recovery. Also added test case for row compressed
blobs.
btr_cur_open_at_index_side_func(),
btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
NULL.
buf_page_get_zip(): Issue error if page read fails.
buf_page_get_gen(): Use dberr_t for error detection and
do not reference bpage after we hare freed it.
buf_mark_space_corrupt(): remove bpage from LRU also when
it is encrypted.
buf_page_check_corrupt(): @return DB_SUCCESS if page has
been read and is not corrupted,
DB_PAGE_CORRUPTED if page based on checksum check is corrupted,
DB_DECRYPTION_FAILED if page post encryption checksum matches but
after decryption normal page checksum does not match. In read
case only DB_SUCCESS is possible.
buf_page_io_complete(): use dberr_t for error handling.
buf_flush_write_block_low(),
buf_read_ahead_random(),
buf_read_page_async(),
buf_read_ahead_linear(),
buf_read_ibuf_merge_pages(),
buf_read_recv_pages(),
fil_aio_wait():
Issue error if page read fails.
btr_pcur_move_to_next_page(): Do not reference page if it is
NULL.
Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
that will return true if tablespace exists and pages read from
tablespace are not corrupted or page decryption failed.
Removed buf_page_t::key_version. After page decryption the
key version is not removed from page frame. For unencrypted
pages, old key_version is removed at buf_page_encrypt_before_write()
dict_stats_update_transient_for_index(),
dict_stats_update_transient()
Do not continue if table decryption failed or table
is corrupted.
dict0stats.cc: Introduced a dict_stats_report_error function
to avoid code duplication.
fil_parse_write_crypt_data():
Check that key read from redo log entry is found from
encryption plugin and if it is not, refuse to start.
PageConverter::validate(): Removed access to fil_space_t as
tablespace is not available during import.
Fixed error code on innodb.innodb test.
Merged test cased innodb-bad-key-change5 and innodb-bad-key-shutdown
to innodb-bad-key-change2. Removed innodb-bad-key-change5 test.
Decreased unnecessary complexity on some long lasting tests.
Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
fil_get_first_space(), fil_get_next_space(),
fil_get_first_space_safe(), fil_get_next_space_safe()
functions.
fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
accessed from row compressed tables. Fixed out of page frame
bug for row compressed tables in
fil_space_verify_crypt_checksum() found using ASAN. Incorrect
function was called for compressed table.
Added new tests for discard, rename table and drop (we should allow them
even when page decryption fails). Alter table rename is not allowed.
Added test for restart with innodb-force-recovery=1 when page read on
redo-recovery cant be decrypted. Added test for corrupted table where
both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION is corrupted.
Adjusted the test case innodb_bug14147491 so that it does not anymore
expect crash. Instead table is just mostly not usable.
fil0fil.h: fil_space_acquire_low is not visible function
and fil_space_acquire and fil_space_acquire_silent are
inline functions. FilSpace class uses fil_space_acquire_low
directly.
recv_apply_hashed_log_recs() does not return anything.
MDEV-11581: Mariadb starts InnoDB encryption threads
when key has not changed or data scrubbing turned off
Background: Key rotation is based on background threads
(innodb-encryption-threads) periodically going through
all tablespaces on fil_system. For each tablespace
current used key version is compared to max key age
(innodb-encryption-rotate-key-age). This process
naturally takes CPU. Similarly, in same time need for
scrubbing is investigated. Currently, key rotation
is fully supported on Amazon AWS key management plugin
only but InnoDB does not have knowledge what key
management plugin is used.
This patch re-purposes innodb-encryption-rotate-key-age=0
to disable key rotation and background data scrubbing.
All new tables are added to special list for key rotation
and key rotation is based on sending a event to
background encryption threads instead of using periodic
checking (i.e. timeout).
fil0fil.cc: Added functions fil_space_acquire_low()
to acquire a tablespace when it could be dropped concurrently.
This function is used from fil_space_acquire() or
fil_space_acquire_silent() that will not print
any messages if we try to acquire space that does not exist.
fil_space_release() to release a acquired tablespace.
fil_space_next() to iterate tablespaces in fil_system
using fil_space_acquire() and fil_space_release().
Similarly, fil_space_keyrotation_next() to iterate new
list fil_system->rotation_list where new tables.
are added if key rotation is disabled.
Removed unnecessary functions fil_get_first_space_safe()
fil_get_next_space_safe()
fil_node_open_file(): After page 0 is read read also
crypt_info if it is not yet read.
btr_scrub_lock_dict_func()
buf_page_check_corrupt()
buf_page_encrypt_before_write()
buf_merge_or_delete_for_page()
lock_print_info_all_transactions()
row_fts_psort_info_init()
row_truncate_table_for_mysql()
row_drop_table_for_mysql()
Use fil_space_acquire()/release() to access fil_space_t.
buf_page_decrypt_after_read():
Use fil_space_get_crypt_data() because at this point
we might not yet have read page 0.
fil0crypt.cc/fil0fil.h: Lot of changes. Pass fil_space_t* directly
to functions needing it and store fil_space_t* to rotation state.
Use fil_space_acquire()/release() when iterating tablespaces
and removed unnecessary is_closing from fil_crypt_t. Use
fil_space_t::is_stopping() to detect when access to
tablespace should be stopped. Removed unnecessary
fil_space_get_crypt_data().
fil_space_create(): Inform key rotation that there could
be something to do if key rotation is disabled and new
table with encryption enabled is created.
Remove unnecessary functions fil_get_first_space_safe()
and fil_get_next_space_safe(). fil_space_acquire()
and fil_space_release() are used instead. Moved
fil_space_get_crypt_data() and fil_space_set_crypt_data()
to fil0crypt.cc.
fsp_header_init(): Acquire fil_space_t*, write crypt_data
and release space.
check_table_options()
Renamed FIL_SPACE_ENCRYPTION_* TO FIL_ENCRYPTION_*
i_s.cc: Added ROTATING_OR_FLUSHING field to
information_schema.innodb_tablespace_encryption
to show current status of key rotation.
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
Memory was leaked when ALTER TABLE is attempted on a table
that contains corrupted indexes.
The memory leak was reported by AddressSanitizer for the test
innodb.innodb_corrupt_bit. The leak was introduced into
MariaDB Server 10.0.26, 10.1.15, 10.2.1 by the following:
commit c081c978a2
Merge: 1d21b22155a482e76e65
Author: Sergei Golubchik <serg@mariadb.org>
Date: Tue Jun 21 14:11:02 2016 +0200
Merge branch '5.5' into bb-10.0
MariaDB Server 10.0.28 and 10.1.19 merged code from Percona XtraDB
that introduced support for compressed columns. Much but not all
of this code was disabled by placing #ifdef HAVE_PERCONA_COMPRESSED_COLUMNS
around it.
Among the unused but not disabled code is code to access
some new system tables related to compressed columns.
The creation of these system tables SYS_ZIP_DICT and SYS_ZIP_DICT_COLS
would cause a crash in --innodb-read-only mode when upgrading
from an earlier version to 10.0.28 or 10.1.19.
Let us remove all the dead code related to compressed columns.
Users who already upgraded to 10.0.28 and 10.1.19 will have the two
above mentioned empty tables in their InnoDB system tablespace.
Subsequent versions of MariaDB Server will completely ignore those tables.