MySQL 5.6 do not work with MariaDB 10.1
Analysis: Problem is that tablespace flags bit DATA_DIR
is on different position on MySQL 5.6 compared to
MariaDB 10.1.
Fix: If we detect that there is difference between dictionary
flags and tablespace flags we remove DATA_DIR flag and compare
again. Remote tablespace is tried to locate even in case
when DATA_DIR flag is not set.
Analysis: When pages in doublewrite buffer are analyzed compressed
pages do not have correct checksum.
Fix: Decompress page before checksum is compared. If decompression
fails we still check checksum and corrupted pages are found.
If decompression succeeds, page now contains the original
checksum.
Problem was that in-place online alter table was used on a table
that had mismatch between MySQL frm file and InnoDB data dictionary.
Fixed so that traditional "Copy" method is used if the MySQL frm
and InnoDB data dictionary is not consistent.
Using __ppc_get_timebase will translate to mfspr instruction
The mfspr instruction will block FXU1 until complete but the other
Pipelines are available for execution of instructions from other
SMT threads on the same core.
The latency time to read the timebase SPR is ~10 cycles.
So any impact on other threads is limited other FXU1 only instructions
(basically other mfspr/mtspr ops).
Suggested by Steven J. Munroe, Linux on Power Toolchain Architect,
Linux Technology Center
IBM Corporation
Bug#18842925 : SET THREAD PRIORITY IN INNODB MUTEX SPINLOOP
Like "pause" instruction for hyper-threading at Intel CPUs,
POWER has special instructions only for hinting priority of hardware-threads.
Approved by Sunny in rb#6256
Backport of the 5.7 fix - c92102a6ef
(excluded cache line size patch)
Suggestion by Stewart Smith
UT_RELAX_CPU(): Use a compiler barrier.
ut_delay(): Remove the dummy global variable ut_always_false.
RB: 11399
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Backported from MySQL-5.7 - patch 5e3efb0396
Suggestion by Stewart Smith
Make sure that we read all possible encryption keys from checkpoint
and if log block checksum does not match, print all found
checkpoint encryption keys.
Problem was that link file (.isl) is also opened using O_DIRECT
mode and if this fails the whole create table fails on internal
error.
Fixed by not using O_DIRECT on link files as they are used only
on create table and startup and do not contain real data.
O_DIRECT failures are successfully ignored for data files
if O_DIRECT is not supported by file system on used
data directory.
Analysis:
-- InnoDB has n (>0) redo-log files.
-- In the first page of redo-log there is 2 checkpoint records on fixed location (checkpoint is not encrypted)
-- On every checkpoint record there is up to 5 crypt_keys containing the keys used for encryption/decryption
-- On crash recovery we read all checkpoints on every file
-- Recovery starts by reading from the latest checkpoint forward
-- Problem is that latest checkpoint might not always contain the key we need to decrypt all the
redo-log blocks (see MDEV-9422 for one example)
-- Furthermore, there is no way to identify is the log block corrupted or encrypted
For example checkpoint can contain following keys :
write chk: 4 [ chk key ]: [ 5 1 ] [ 4 1 ] [ 3 1 ] [ 2 1 ] [ 1 1 ]
so over time we could have a checkpoint
write chk: 13 [ chk key ]: [ 14 1 ] [ 13 1 ] [ 12 1 ] [ 11 1 ] [ 10 1 ]
killall -9 mysqld causes crash recovery and on crash recovery we read as
many checkpoints as there is log files, e.g.
read [ chk key ]: [ 13 1 ] [ 12 1 ] [ 11 1 ] [ 10 1 ] [ 9 1 ]
read [ chk key ]: [ 14 1 ] [ 13 1 ] [ 12 1 ] [ 11 1 ] [ 10 1 ] [ 9 1 ]
This is problematic, as we could still scan log blocks e.g. from checkpoint 4 and we do
not know anymore the correct key.
CRYPT INFO: for checkpoint 14 search 4
CRYPT INFO: for checkpoint 13 search 4
CRYPT INFO: for checkpoint 12 search 4
CRYPT INFO: for checkpoint 11 search 4
CRYPT INFO: for checkpoint 10 search 4
CRYPT INFO: for checkpoint 9 search 4 (NOTE: NOT FOUND)
For every checkpoint, code generated a new encrypted key based on key
from encryption plugin and random numbers. Only random numbers are
stored on checkpoint.
Fix: Generate only one key for every log file. If checkpoint contains only
one key, use that key to encrypt/decrypt all log blocks. If checkpoint
contains more than one key (this is case for databases created
using MariaDB server version 10.1.0 - 10.1.12 if log encryption was
used). If looked checkpoint_no is found from keys on checkpoint we use
that key to decrypt the log block. For encryption we use always the
first key. If the looked checkpoint_no is not found from keys on checkpoint
we use the first key.
Modified code also so that if log is not encrypted, we do not generate
any empty keys. If we have a log block and no keys is found from
checkpoint we assume that log block is unencrypted. Log corruption or
missing keys is found by comparing log block checksums. If we have
a keys but current log block checksum is correct we again assume
log block to be unencrypted. This is because current implementation
stores checksum only before encryption and new checksum after
encryption but before disk write is not stored anywhere.
There was two problems. Firstly, if page in ibuf is encrypted but
decrypt failed we should not allow InnoDB to start because
this means that system tablespace is encrypted and not usable.
Secondly, if page decrypt is detected we should return false
from buf_page_decrypt_after_read.
MDEV-9469: 'Incorrect key file' on ALTER TABLE
InnoDB needs to rebuild table if column name is changed and
added index (or foreign key) is created based on this new
name in same alter table.
Provided IBM System Z have outdated compiler version, which supports gcc sync
builtins but not gcc atomic builtins. It also has weak memory model.
InnoDB attempted to verify if __sync_lock_test_and_set() is available by
checking IB_STRONG_MEMORY_MODEL. This macro has nothing to do with availability
of __sync_lock_test_and_set(), the right one is HAVE_ATOMIC_BUILTINS.
In wsrep BF we have already took lock_sys and trx
mutex either on wsrep_abort_transaction() or
before wsrep_kill_victim(). In replication we
could own lock_sys mutex taken in
lock_deadlock_check_and_resolve().
Analysis: We have reserved ROW_MERGE_RESERVE_SIZE ( = 4) for
encryption key_version. When calculating is there more
space on sort buffer, this value needs to be substracted
from current available space.
Backport pull request #125 from grooverdan/MDEV-8923_innodb_buffer_pool_dump_pct to 10.0
WL#6504 InnoDB buffer pool dump/load enchantments
This patch consists of two parts:
1. Dump only the hottest N% of the buffer pool(s)
2. Prevent hogging the server duing BP load
From MySQL - commit b409342c43ce2edb68807100a77001367c7e6b8e
Add testcases for innodb_buffer_pool_dump_pct_basic.
Part of the code authored by Daniel Black
- Make accelerated checksum available to InnoDB and XtraDB.
- Fall back to slice-by-eight if not available. The mode used is printed on startup.
- Will only build on POWER systems at the moment until CMakeLists are modified
to only add the crc32_power8/ files when building on POWER.
running MySQL-5.7 unittest/gunit/innodb/ut0crc32-t
Before:
1..2
Using software crc32 implementation, CPU is little-endian
ok 1
Using software crc32 implementation, CPU is little-endian
normal CRC32: real 0.148006 sec
normal CRC32: user 0.148000 sec
normal CRC32: sys 0.000000 sec
big endian CRC32: real 0.144293 sec
big endian CRC32: user 0.144000 sec
big endian CRC32: sys 0.000000 sec
ok 2
After:
1..2
Using POWER8 crc32 implementation, CPU is little-endian
ok 1
Using POWER8 crc32 implementation, CPU is little-endian
normal CRC32: real 0.008097 sec
normal CRC32: user 0.008000 sec
normal CRC32: sys 0.000000 sec
big endian CRC32: real 0.147043 sec
big endian CRC32: user 0.144000 sec
big endian CRC32: sys 0.000000 sec
ok 2
Author CRC32 ASM code: Anton Blanchard <anton@au.ibm.com>
ref: https://github.com/antonblanchard/crc32-vpmsum
Signed-off-by: Daniel Black <daniel.black@au.ibm.com>