buf_block_init(): Initialize buf_page_t::flush_type.
For some reason, Valgrind 3.12.0 would seem to flag some
bits in adjacent bitfields as uninitialized, even though only
the two bits of flush_type were left uninitialized. Initialize
the field to get rid of many warnings.
buf_page_init_low(): Initialize buf_page_t::old.
For some reason, Valgrind 3.12.0 would seem to flag all 32
bits uninitialized when buf_page_init_for_read() invokes
buf_LRU_add_block(bpage, TRUE). This would trigger bogus warnings
for buf_page_t::freed_page_clock being uninitialized.
(The V-bits would later claim that only "old" is initialized
in the 32-bit word.) Perhaps recent compilers
(GCC 6.2.1 and clang 4.0.0) generate more optimized x86_64 code
for bitfield operations, confusing Valgrind?
mach_write_to_1(), mach_write_to_2(), mach_write_to_3():
Rewrite the assertions that ensure that the most significant
bits are zero. Apparently, clang 4.0.0 would optimize expressions
of the form ((n | 0xFF) <= 0x100) to (n <= 0x100). The redundant
0xFF was added in the first place in order to suppress a
Valgrind warning. (Valgrind would warn about comparing uninitialized
values even in the case when the uninitialized bits do not affect
the result of the comparison.)
In InnoDB and XtraDB functions that declare pointer parameters as nonnull,
remove nullness checks, because GCC would optimize them away anyway.
Use #ifdef instead of #if when checking for a configuration flag.
Clang says that left shifts of negative values are undefined.
So, use ~0U instead of ~0 in a number of macros.
Some functions that were defined as UNIV_INLINE were declared as
UNIV_INTERN. Consistently use the same type of linkage.
ibuf_merge_or_delete_for_page() could pass bitmap_page=NULL to
buf_page_print(), conflicting with the __attribute__((nonnull)).
Analysis: Problem is that page is encrypted but encryption information
on page 0 has already being changed.
Fix: If page header contains key_version != 0 and even if based on
current encryption information tablespace is not encrypted we
need to check is page corrupted. If it is not, then we know that
page is not encrypted. If page is corrupted, we need to try to
decrypt it and then compare the stored and calculated checksums
to see is page corrupted or not.
Two problems:
(1) When pushing warning to sql-layer we need to check that thd != NULL
to avoid NULL-pointer reference.
(2) At tablespace key rotation if used key_id is not found from
encryption plugin tablespace should not be rotated.
MDEV-10394: Innodb system table space corrupted
Analysis: After we have read the page in buf_page_io_complete try to
find if the page is encrypted or corrupted. Encryption was determined
by reading FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION field from FIL-header
as a key_version. However, this field is not always zero even when
encryption is not used. Thus, incorrect key_version could lead situation where
decryption is tried to page that is not encrypted.
Fix: We still read key_version information from FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION
field but also check if tablespace has encryption information before trying
encrypt the page.
The changes are deliberately kept minimal
- some functions are made global instead of static (they will be used in
xtrabackup later on)
- functions got additional parameter, deliberately unused for now :
fil_load_single_tablespaces
srv_undo_tablespaces_init
- Global variables added, also unused for now :
srv_archive_recovery
srv_archive_recovery_limit_lsn
srv_apply_log_only
srv_backup_mode
srv_close_files
- To make xtrabackup link with sql.lib on Windows, added some missing
source files to sql.lib
- Fixed os_thread_ret_t to be DWORD on Windows
(Fixing both InnoDB and XtraDB)
Re-opening a TABLE object (after e.g. FLUSH TABLES or open table cache
eviction) causes ha_innobase to call
dict_stats_update(DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY).
Inside this call, the following is done:
dict_stats_empty_table(table);
dict_stats_copy(table, t);
On the other hand, commands like UPDATE make this call to get the "rows in
table" statistics in table->stats.records:
ha_innobase->info(HA_STATUS_VARIABLE|HA_STATUS_NO_LOCK)
note the HA_STATUS_NO_LOCK parameter. It means, no locks are taken by
::info() If the ::info() call happens between dict_stats_empty_table
and dict_stats_copy calls, the UPDATE's optimizer will get an estimate
of table->stats.records=1, which causes it to pick a full table scan,
which in turn will take a lot of row locks and cause other bad
consequences.
Problem was that NULL-pointer was accessed inside a macro when
page read from tablespace is encrypted but decrypt fails because
of incorrect key file.
Removed unsafe macro using inlined function where used pointers
are checked.
Analysis: By design InnoDB was reading first page of every .ibd file
at startup to find out is tablespace encrypted or not. This is
because tablespace could have been encrypted always, not
encrypted newer or encrypted based on configuration and this
information can be find realible only from first page of .ibd file.
Fix: Do not read first page of every .ibd file at startup. Instead
whenever tablespace is first time accedded we will read the first
page to find necessary information about tablespace encryption
status.
TODO: Add support for SYS_TABLEOPTIONS where all table options
encryption information included will be stored.
Followup from 5.5 patch. Removing memory barriers on intel is wrong as
this doesn't prevent the compiler and/or processor from reorganizing reads
before the mutex release. Forcing a memory barrier before reading the waiters will
guarantee that no speculative reading takes place.
- fixes in innodb to skip wsrep processing (like kill victim) when running in native mysql mode
- similar fixes in mysql server side
- forcing tc_log_dummy in native mysql mode when no binlog used. wsrep hton messes up handler counter
and used to lead in using tc_log_mmap instead. Bad news is that tc_log_mmap does not seem to work at all