Backport pull request #125 from grooverdan/MDEV-8923_innodb_buffer_pool_dump_pct to 10.0
WL#6504 InnoDB buffer pool dump/load enchantments
This patch consists of two parts:
1. Dump only the hottest N% of the buffer pool(s)
2. Prevent hogging the server duing BP load
From MySQL - commit b409342c43ce2edb68807100a77001367c7e6b8e
Add testcases for innodb_buffer_pool_dump_pct_basic.
Part of the code authored by Daniel Black
As galera node (slave) received query log events from an async
replication master, it partially wrote the updates made to replication
state table (mysql.gtid_slave_pos) to galera transaction writeset post
TOI. As a result, the transaction handle, thus created within galera,
was never freed/purged as the corresponding trx did not commit.
Thus, it kept piling up for every query log event and was only reclaimed
upon server shutdown when the transaction map object got destructed.
Fixed by making sure that updates in replication slave state table
are not written to galera transaction writeset and thus, not replicated
to other nodes.
fix innodb auto-increment handling
three bugs:
1. innobase_next_autoinc treated the case of current<offset incorrectly
2. ha_innobase::get_auto_increment didn't recalculate current when increment changed
3. ha_innobase::get_auto_increment didn't pass offset down to innobase_next_autoinc
Analysis: There were two problems. (1) if partition table was
created using lower_case_tables = 1 on windows we did find the
correct table but we did not set share->ib_table correctly.
(2) we did open table on dictionary but did not increase
mysql_open_tables.
Fix: In xtradb allow access to tables with incorrect
lower case names (warning is printed to error log). If
table is opened increase mysql_open_tables count to avoid
crash on flush tables.
Analysis: debug only assertion I_S function (IS is XtraDB feature) is calling
buf_block_get_frame on any page it reads, which debug-asserts that the page is
buffer-fixed, which is not the case in I_S query.
Fixed by holding the buffer page mutex while the fields are read directly.
WL#6504 InnoDB buffer pool dump/load enchantments
This patch consists of two parts:
1. Dump only the hottest N% of the buffer pool(s)
2. Prevent hogging the server duing BP load
From MySQL - commit b409342c43ce2edb68807100a77001367c7e6b8e
Analysis: Lengths which are not UNIV_SQL_NULL, but bigger than the following
number indicate that a field contains a reference to an externally
stored part of the field in the tablespace. The length field then
contains the sum of the following flag and the locally stored len.
This was incorrectly set to
define UNIV_EXTERN_STORAGE_FIELD (UNIV_SQL_NULL - UNIV_PAGE_SIZE_MAX)
When it should be
define UNIV_EXTERN_STORAGE_FIELD (UNIV_SQL_NULL - UNIV_PAGE_SIZE_DEF)
Additionally, we need to disable support for > 16K page size for
row compressed tables because a compressed page directory entry
reserves 14 bits for the start offset and 2 bits for flags.
This limits the uncompressed page size to 16k. To support
larger pages page directory entry needs to be larger.
Analysis: Current implementation will write and read at least one block
(sort_buffer_size bytes) from disk / index even if that block does not
contain any records.
Fix: Avoid writing / reading empty blocks to temporary files (disk).
Analysis: We are alreading holing lock_sys mutex when we call thd::awake.
This could lead mutex deadlock if trx->current_lock_mutex_owner is not
correctly set.
Fix: Make sure that trx->current_lock_mutex_owner is correctly set.
Folloup: Made encryption rules too strict (and incorrect). Allow creating
table with ENCRYPTED=OFF with all values of ENCRYPTION_KEY_ID but create
warning that nondefault values are ignored. Allow creating table with
ENCRYPTED=DEFAULT if used key_id is found from key file (there was
bug on this) and give error if key_id is not found.
Analysis: Problem sees to be the fact that we allow creating or altering
table to use encryption_key_id that does not exists in case where
original table is not encrypted currently. Secondly we should not
do key rotation to tables that are not encrypted or tablespaces
that can't be found from tablespace cache.
Fix: Do not allow creating unencrypted table with nondefault encryption key
and do not rotate tablespaces that are not encrypted (FIL_SPACE_ENCRYPTION_OFF)
or can't be found from tablespace cache.
Added encryption support for online alter table where InnoDB temporary
files are used. Added similar support also for tables containing
full text-indexes.
Made sure that table remains encrypted during discard and import
tablespace.
Analysis: Server tried to continue reading tablespace using a cursor after
we had resolved that pages in the tablespace can't be decrypted.
Fixed by addind check is tablespace still encrypted.
Analysis: Problem was that in fil_read_first_page we do find that
table has encryption information and that encryption service
or used key_id is not available. But, then we just printed
fatal error message that causes above assertion.
Fix: When we open single table tablespace if it has encryption
information (crypt_data) store this crypt data to the table
structure. When we open a table and we find out that tablespace
is not available, check has table a encryption information
and from there is encryption service or used key_id is not available.
If it is, add additional warning for SQL-layer.
Analysis: Problem was that in fil_read_first_page we do find that
table has encryption information and that encryption service
or used key_id is not available. But, then we just printed
fatal error message that causes above assertion.
Fix: When we open single table tablespace if it has encryption
information (crypt_data) store this crypt data to the table
structure. When we open a table and we find out that tablespace
is not available, check has table a encryption information
and from there is encryption service or used key_id is not available.
If it is, add additional warning for SQL-layer.
When wsrep is enabled, for any update on innodb tables, the
corresponding keys are appended to galera's transaction writeset
(wsrep_append_keys()). However, for LOAD DATA, this got skipped
if binary logging was disabled or it was non-ROW based.
As a result, while the updates from LOAD DATA on non-partitioned
tables replicated fine as wsrep implicitly enables binary logging
(if not enabled, explicitly), the same did not work on partitioned
tables as for partitioned tables the binary logging gets disabled
temporarily (ha_partition::write_row()).
Fixed by removing the unwanted conditions from the check.
Also backported some changes from 10.0-galera to make sure
wsrep_load_data_splitting affects LOAD DATA commands only.
Analysis: Problem was that when a new tablespace is created a default
encryption info is also created and stored to the tablespace. Later a
new encryption information was created with correct key_id but that
does not affect on IV.
Fix: Push encryption mode and key_id to lower levels and create
correct encryption info when a new tablespace is created.
This fix does not contain test case because, currently incorrect
encryption key causes page corruption and a lot of error messages
to error log causing mtr to fail.
Analysis: Handler used table flag HA_REQUIRE_PRIMARY_KEY but a bug on
sql_table.cc function mysql_prepare_create_table internally marked
secondary key with NOT NULL colums as unique key and did not then
fail on requirement that table should have primary key or unique key.
Analysis: Handler table flag HA_REQUIRE_PRIMARY_KEY alone is not enough
to force primary or unique key, if table has at least one NOT NULL
column and secondary key for that column.
Fix: Add additional check that table really has primary key or
unique key for InnoDB terms.