Reduce the number of calls to encryption_get_key_get_latest_version
when doing key rotation with two different methods:
(1) We need to fetch key information when tablespace not yet
have a encryption information, invalid keys are handled now
differently (see below). There was extra call to detect
if key_id is not found on key rotation.
(2) If key_id is not found from encryption plugin, do not
try fetching new key_version for it as it will fail anyway.
We store return value from encryption_get_key_get_latest_version
call and if it returns ENCRYPTION_KEY_VERSION_INVALID there
is no need to call it again.
crashes server
This bug is the result of merging the Oracle MySQL follow-up fix
BUG#22963169 MYSQL CRASHES ON CREATE FULLTEXT INDEX
without merging the base bug fix:
Bug#79475 Insert a token of 84 4-bytes chars into fts index causes
server crash.
Unlike the above mentioned fixes in MySQL, our fix will not change
the storage format of fulltext indexes in InnoDB or XtraDB
when a character encoding with mbmaxlen=2 or mbmaxlen=3
and the length of a word is between 128 and 84*mbmaxlen bytes.
The Oracle fix would allocate 2 length bytes for these cases.
Compatibility with other MySQL and MariaDB releases is ensured by
persisting the used maximum length in the SYS_COLUMNS table in the
InnoDB data dictionary.
This fix also removes some unnecessary strcmp() calls when checking
for the legacy default collation my_charset_latin1
(my_charset_latin1.name=="latin1_swedish_ci").
fts_create_one_index_table(): Store the actual length in bytes.
This metadata will be written to the SYS_COLUMNS table.
fts_zip_initialize(): Initialize only the first byte of the buffer.
Actually the code should not even care about this first byte, because
the length is set as 0.
FTX_MAX_WORD_LEN: Define as HA_FT_MAXCHARLEN * 4 aka 336 bytes,
not as 254 bytes.
row_merge_create_fts_sort_index(): Set the actual maximum length of the
column in bytes, similar to fts_create_one_index_table().
row_merge_fts_doc_tokenize(): Remove the redundant parameter word_dtype.
Use the actual maximum length of the column. Calculate the extra_size
in the same way as row_merge_buf_encode() does.
trx_state_eq(): Add the parameter bool relaxed=false, to
allow trx->state==TRX_STATE_NOT_STARTED where a different
state is expected, if an error has been reported.
trx_release_savepoint_for_mysql(): Pass relaxed=true to
trx_state_eq(). That is, allow the transaction to be idle
when ROLLBACK TO SAVEPOINT is attempted after an error
has been reported to the client.
buf_block_init(): Initialize buf_page_t::flush_type.
For some reason, Valgrind 3.12.0 would seem to flag some
bits in adjacent bitfields as uninitialized, even though only
the two bits of flush_type were left uninitialized. Initialize
the field to get rid of many warnings.
buf_page_init_low(): Initialize buf_page_t::old.
For some reason, Valgrind 3.12.0 would seem to flag all 32
bits uninitialized when buf_page_init_for_read() invokes
buf_LRU_add_block(bpage, TRUE). This would trigger bogus warnings
for buf_page_t::freed_page_clock being uninitialized.
(The V-bits would later claim that only "old" is initialized
in the 32-bit word.) Perhaps recent compilers
(GCC 6.2.1 and clang 4.0.0) generate more optimized x86_64 code
for bitfield operations, confusing Valgrind?
mach_write_to_1(), mach_write_to_2(), mach_write_to_3():
Rewrite the assertions that ensure that the most significant
bits are zero. Apparently, clang 4.0.0 would optimize expressions
of the form ((n | 0xFF) <= 0x100) to (n <= 0x100). The redundant
0xFF was added in the first place in order to suppress a
Valgrind warning. (Valgrind would warn about comparing uninitialized
values even in the case when the uninitialized bits do not affect
the result of the comparison.)
In InnoDB and XtraDB functions that declare pointer parameters as nonnull,
remove nullness checks, because GCC would optimize them away anyway.
Use #ifdef instead of #if when checking for a configuration flag.
Clang says that left shifts of negative values are undefined.
So, use ~0U instead of ~0 in a number of macros.
Some functions that were defined as UNIV_INLINE were declared as
UNIV_INTERN. Consistently use the same type of linkage.
ibuf_merge_or_delete_for_page() could pass bitmap_page=NULL to
buf_page_print(), conflicting with the __attribute__((nonnull)).
This is not a fix, this is instrumentation to find out is MySQL frm dictionary
and InnoDB data dictionary really out-of-sync when this assertion is fired,
or is there some other reason (bug).
Analysis: Problem is that page is encrypted but encryption information
on page 0 has already being changed.
Fix: If page header contains key_version != 0 and even if based on
current encryption information tablespace is not encrypted we
need to check is page corrupted. If it is not, then we know that
page is not encrypted. If page is corrupted, we need to try to
decrypt it and then compare the stored and calculated checksums
to see is page corrupted or not.
Two problems:
(1) When pushing warning to sql-layer we need to check that thd != NULL
to avoid NULL-pointer reference.
(2) At tablespace key rotation if used key_id is not found from
encryption plugin tablespace should not be rotated.
MDEV-10394: Innodb system table space corrupted
Analysis: After we have read the page in buf_page_io_complete try to
find if the page is encrypted or corrupted. Encryption was determined
by reading FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION field from FIL-header
as a key_version. However, this field is not always zero even when
encryption is not used. Thus, incorrect key_version could lead situation where
decryption is tried to page that is not encrypted.
Fix: We still read key_version information from FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION
field but also check if tablespace has encryption information before trying
encrypt the page.
(Fixing both InnoDB and XtraDB)
Re-opening a TABLE object (after e.g. FLUSH TABLES or open table cache
eviction) causes ha_innobase to call
dict_stats_update(DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY).
Inside this call, the following is done:
dict_stats_empty_table(table);
dict_stats_copy(table, t);
On the other hand, commands like UPDATE make this call to get the "rows in
table" statistics in table->stats.records:
ha_innobase->info(HA_STATUS_VARIABLE|HA_STATUS_NO_LOCK)
note the HA_STATUS_NO_LOCK parameter. It means, no locks are taken by
::info() If the ::info() call happens between dict_stats_empty_table
and dict_stats_copy calls, the UPDATE's optimizer will get an estimate
of table->stats.records=1, which causes it to pick a full table scan,
which in turn will take a lot of row locks and cause other bad
consequences.
Problem was that NULL-pointer was accessed inside a macro when
page read from tablespace is encrypted but decrypt fails because
of incorrect key file.
Removed unsafe macro using inlined function where used pointers
are checked.
Analysis: By design InnoDB was reading first page of every .ibd file
at startup to find out is tablespace encrypted or not. This is
because tablespace could have been encrypted always, not
encrypted newer or encrypted based on configuration and this
information can be find realible only from first page of .ibd file.
Fix: Do not read first page of every .ibd file at startup. Instead
whenever tablespace is first time accedded we will read the first
page to find necessary information about tablespace encryption
status.
TODO: Add support for SYS_TABLEOPTIONS where all table options
encryption information included will be stored.
Followup from 5.5 patch. Removing memory barriers on intel is wrong as
this doesn't prevent the compiler and/or processor from reorganizing reads
before the mutex release. Forcing a memory barrier before reading the waiters will
guarantee that no speculative reading takes place.
- fixes in innodb to skip wsrep processing (like kill victim) when running in native mysql mode
- similar fixes in mysql server side
- forcing tc_log_dummy in native mysql mode when no binlog used. wsrep hton messes up handler counter
and used to lead in using tc_log_mmap instead. Bad news is that tc_log_mmap does not seem to work at all
When checking is any of the renamed columns part of the
columns for new indexes we accessed NULL pointer if checked
column used on index was added on same statement. Additionally,
we tried to check too many indexes, added_index_count
is enough here.
Fix memory barrier issues on releasing mutexes. We must have a full
memory barrier between releasing a mutex lock and reading its waiters.
This prevents us from missing to release waiters due to reading the
number of waiters speculatively before releasing the lock. If threads
try and wait between us reading the waiters count and releasing the
lock, those threads might stall indefinitely.
Also, we must use proper ACQUIRE/RELEASE semantics for atomic
operations, not ACQUIRE/ACQUIRE.