When InnoDB is invoking posix_fallocate() to extend data files, it
was missing a call to fsync() to update the file system metadata.
If file system recovery is needed, the file size could be incorrect.
When the setting innodb_flush_method=O_DIRECT_NO_FSYNC
that was introduced in MariaDB 10.0.11 (and MySQL 5.6) is enabled,
InnoDB would wrongly skip fsync() after extending files.
Furthermore, the merge commit d8b45b0c00
inadvertently removed XtraDB error checking for posix_fallocate()
which this fix is restoring.
fil_flush(): Add the parameter bool metadata=false to request that
fil_buffering_disabled() be ignored.
fil_extend_space_to_desired_size(): Invoke fil_flush() with the
extra parameter. After successful posix_fallocate(), invoke
os_file_flush(). Note: The bookkeeping for fil_flush() would not be
updated the posix_fallocate() code path, so the "redundant"
fil_flush() should be a no-op.
row_drop_table_for_mysql(): Fix a regression introduced in MDEV-16515.
Similar to the follow-up fixes MDEV-16647 and MDEV-17470, we must make
the internal tables of FULLTEXT INDEX immune to kills, to avoid noise
and resource leakage on DROP TABLE or ALTER TABLE. (Orphan internal tables
would be dropped at the next InnoDB startup only.)
Problem:
========
MLOG_FILE_WRITE_CRYPT_DATA redo log fails to apply type for
the crypt_data present in the space. While processing the double-write
buffer pages, page fails to decrypt. It leads to warning message.
Fix:
====
Set the type while parsing MLOG_FILE_WRITE_CRYPT_DATA redo log.
If type and length is of invalid type then mark it as corrupted.
When performing a hash search via HASH_SEARCH we first look at a key of a node
and then at its pointer to the next node in chain. If we have those in one cache
line instead of a two we reduce memory reads.
I found dict_table_t, fil_space_t and buf_page_t suitable for such improvement.
During database recovery, a transaction with wsrep XID is
recovered from InnoDB in prepared state. However, when the
transaction is looked up with trx_get_trx_by_xid() in
innobase_commit_by_xid(), trx->xid gets cleared in
trx_get_trx_by_xid_low() and commit time serialization history
write does not update the wsrep XID in trx sys header for
that recovered trx. As a result the transaction gets
committed during recovery but the wsrep position does not
get updated appropriately.
As a fix, we preserve trx->xid for Galera over transaction
commit in recovery phase.
Fix authored by: Teemu Ollakka (GaleraCluster) and Marko Mäkelä.
modified: mysql-test/suite/galera/disabled.def
modified: mysql-test/suite/galera/r/galera_gcache_recover_full_gcache.result
modified: mysql-test/suite/galera/r/galera_gcache_recover_manytrx.result
modified: mysql-test/suite/galera/t/galera_gcache_recover_full_gcache.test
modified: mysql-test/suite/galera/t/galera_gcache_recover_manytrx.test
modified: storage/innobase/trx/trx0trx.cc
modified: storage/xtradb/trx/trx0trx.cc
This is a regression after MDEV-13671.
The bug is related to key part prefix lengths wich are stored in SYS_FIELDS.
Storage format is not obvious and was handled incorrectly which led to data
dictionary corruption.
SYS_FIELDS.POS actually contains prefix length too in case if any key part
has prefix length.
innobase_rename_column_try(): fixed prefixes handling
Tests for prefixed indexes added too.
Closes#1063
Orphan #sql* tables may remain after ALTER TABLE
was interrupted by timeout or KILL or client disconnect.
This is a regression caused by MDEV-16515.
Similar to temporary tables (MDEV-16647), we had better ignore the
KILL when dropping the original table in the final part of ALTER TABLE.
Closes#1020
This fixes a regression that was introduced in MySQL 5.6.6
in an error handling code path, in the following change:
commit 024f363d6b5f09b20d1bba411af55be95c7398d3
Author: kevin.lewis@oracle.com <>
Date: Fri Jun 15 09:01:42 2012 -0500
Bug #14169459 INNODB; DROP TABLE DOES NOT DELETE THE IBD FILE
FOR A TEMPORARY TABLE.
In MDEV-13103, I made a mistake in the error handling of
page_compressed=1 decryption when the default
innodb_compression_algorithm=zlib is used.
Due to this mistake, with certain versions of zlib,
MariaDB would fail to detect a corrupted page.
The problem was uncovered by the following tests:
mariabackup.unencrypted_page_compressed
mariabackup.encrypted_page_compressed
Starting with commit 6b6987154a
the encrypted page checksum is only computed with crc32.
Encrypted data pages that were written earlier can contain
other checksums, but new ones will only contain crc32.
Because of this, it does not make sense to implement strict
checks of innodb_checksum_algorithm for other than strict_crc32.
fil_space_verify_crypt_checksum(): Treat strict_innodb as innodb
and strict_none as none. That is, allow a match from any of the
algorithms none, innodb, crc32. (This is how it worked before the
second MDEV-12112 fix.)
Thanks to Thirunarayanan Balathandayuthapani for pointing this out.
The initial fix only covered a part of Mariabackup.
This fix hardens InnoDB and XtraDB in a similar way, in order
to reduce the probability of mistaking a corrupted encrypted page
for a valid unencrypted one.
This is based on work by Thirunarayanan Balathandayuthapani.
fil_space_verify_crypt_checksum(): Assert that key_version!=0.
Let the callers guarantee that. Now that we have this assertion,
we also know that buf_page_is_zeroes() cannot hold.
Also, remove all diagnostic output and related parameters,
and let the relevant callers emit such messages.
Last but not least, validate the post-encryption checksum
according to the innodb_checksum_algorithm (only accepting
one checksum for the strict variants), and no longer
try to validate the page as if it was unencrypted.
buf_page_is_zeroes(): Move to the compilation unit of the only callers,
and declare static.
xb_fil_cur_read(), buf_page_check_corrupt(): Add a condition before
calling fil_space_verify_crypt_checksum(). This is a non-functional
change.
buf_dblwr_process(): Validate the page only as encrypted or unencrypted,
but not both.
Also, apply the MDEV-17957 changes to encrypted page checksums,
and remove error message output from the checksum function,
because these messages would be useless noise when mariabackup
is retrying reads of corrupted-looking pages, and not that
useful during normal server operation either.
The error messages in fil_space_verify_crypt_checksum()
should be refactored separately.
Problem:
Innodb_checksum_algorithm checks for all checksum algorithm to
validate the page checksum even though the algorithm is specified as
strict_crc32, strict_innodb, strict_none.
Fix:
Remove the checks for all checksum algorithm to validate the page
checksum if the algo is specified as strict_* values.
ha_innobase::prepare_inplace_alter_table(): check max column length for every
index in a table, not just added in this particular ALTER TABLE with ADD INDEX ones.
There was an incorrect check for MariaDB and InnoDB
tables fields count. Corruption was reported when there was no corruption.
Also, a warning message had incorrect field numbers for both MariaDB and InnoDB
tables.
ha_innobase::open(): fixed check and message
Background: Used encryption key_id is stored to encryption metadata
i.e. crypt_data that is stored on page 0 of the tablespace of the
table. crypt_data is created only if implicit encryption/not encryption
is requested i.e. ENCRYPTED=[YES|NO] table option is used
fil_create_new_single_table_tablespace on fil0fil.cc.
Later if encryption is enabled all tables that use default encryption
mode (i.e. no encryption table option is set) are encrypted with
default encryption key_id that is 1. See fil_crypt_start_encrypting_space on
fil0crypt.cc.
ha_innobase::check_table_options()
If default encryption is used and encryption is disabled, you may
not use nondefault encryption_key_id as it is not stored anywhere.