mirror of
https://github.com/MariaDB/server.git
synced 2025-04-11 01:35:33 +02:00

User-visible changes: The INFORMATION_SCHEMA views INNODB_BUFFER_PAGE and INNODB_BUFFER_PAGE_LRU will report a dummy value FLUSH_TYPE=0 and will no longer report the PAGE_STATE value READY_FOR_USE. We will remove some fields from buf_page_t and move much code to member functions of buf_pool_t and buf_page_t, so that the access rules of data members can be enforced consistently. Evicting or adding pages in buf_pool.LRU will remain covered by buf_pool.mutex. Evicting or adding pages in buf_pool.page_hash will remain covered by both buf_pool.mutex and the buf_pool.page_hash X-latch. After this fix, buf_pool.page_hash lookups can entirely avoid acquiring buf_pool.mutex, only relying on buf_pool.hash_lock_get() S-latch. Similarly, buf_flush_check_neighbors() can will rely solely on buf_pool.mutex, no buf_pool.page_hash latch at all. The buf_pool.mutex is rather contended in I/O heavy benchmarks, especially when the workload does not fit in the buffer pool. The first attempt to alleviate the contention was the buf_pool_t::mutex split in commit4ed7082eef
which introduced buf_block_t::mutex, which we are now removing. Later, multiple instances of buf_pool_t were introduced in commitc18084f71b
and recently removed by us in commit1a6f708ec5
(MDEV-15058). UNIV_BUF_DEBUG: Remove. This option to enable some buffer pool related debugging in otherwise non-debug builds has not been used for years. Instead, we have been using UNIV_DEBUG, which is enabled in CMAKE_BUILD_TYPE=Debug. buf_block_t::mutex, buf_pool_t::zip_mutex: Remove. We can mainly rely on std::atomic and the buf_pool.page_hash latches, and in some cases depend on buf_pool.mutex or buf_pool.flush_list_mutex just like before. We must always release buf_block_t::lock before invoking unfix() or io_unfix(), to prevent a glitch where a block that was added to the buf_pool.free list would apper X-latched. See commitc5883debd6
how this glitch was finally caught in a debug environment. We move some buf_pool_t::page_hash specific code from the ha and hash modules to buf_pool, for improved readability. buf_pool_t::close(): Assert that all blocks are clean, except on aborted startup or crash-like shutdown. buf_pool_t::validate(): No longer attempt to validate n_flush[] against the number of BUF_IO_WRITE fixed blocks, because buf_page_t::flush_type no longer exists. buf_pool_t::watch_set(): Replaces buf_pool_watch_set(). Reduce mutex contention by separating the buf_pool.watch[] allocation and the insert into buf_pool.page_hash. buf_pool_t::page_hash_lock<bool exclusive>(): Acquire a buf_pool.page_hash latch. Replaces and extends buf_page_hash_lock_s_confirm() and buf_page_hash_lock_x_confirm(). buf_pool_t::READ_AHEAD_PAGES: Renamed from BUF_READ_AHEAD_PAGES. buf_pool_t::curr_size, old_size, read_ahead_area, n_pend_reads: Use Atomic_counter. buf_pool_t::running_out(): Replaces buf_LRU_buf_pool_running_out(). buf_pool_t::LRU_remove(): Remove a block from the LRU list and return its predecessor. Incorporates buf_LRU_adjust_hp(), which was removed. buf_page_get_gen(): Remove a redundant call of fsp_is_system_temporary(), for mode == BUF_GET_IF_IN_POOL_OR_WATCH, which is only used by BTR_DELETE_OP (purge), which is never invoked on temporary tables. buf_free_from_unzip_LRU_list_batch(): Avoid redundant assignments. buf_LRU_free_from_unzip_LRU_list(): Simplify the loop condition. buf_LRU_free_page(): Clarify the function comment. buf_flush_check_neighbor(), buf_flush_check_neighbors(): Rewrite the construction of the page hash range. We will hold the buf_pool.mutex for up to buf_pool.read_ahead_area (at most 64) consecutive lookups of buf_pool.page_hash. buf_flush_page_and_try_neighbors(): Remove. Merge to its only callers, and remove redundant operations in buf_flush_LRU_list_batch(). buf_read_ahead_random(), buf_read_ahead_linear(): Rewrite. Do not acquire buf_pool.mutex, and iterate directly with page_id_t. ut_2_power_up(): Remove. my_round_up_to_next_power() is inlined and avoids any loops. fil_page_get_prev(), fil_page_get_next(), fil_addr_is_null(): Remove. buf_flush_page(): Add a fil_space_t* parameter. Minimize the buf_pool.mutex hold time. buf_pool.n_flush[] is no longer updated atomically with the io_fix, and we will protect most buf_block_t fields with buf_block_t::lock. The function buf_flush_write_block_low() is removed and merged here. buf_page_init_for_read(): Use static linkage. Initialize the newly allocated block and acquire the exclusive buf_block_t::lock while not holding any mutex. IORequest::IORequest(): Remove the body. We only need to invoke set_punch_hole() in buf_flush_page() and nowhere else. buf_page_t::flush_type: Remove. Replaced by IORequest::flush_type. This field is only used during a fil_io() call. That function already takes IORequest as a parameter, so we had better introduce for the rarely changing field. buf_block_t::init(): Replaces buf_page_init(). buf_page_t::init(): Replaces buf_page_init_low(). buf_block_t::initialise(): Initialise many fields, but keep the buf_page_t::state(). Both buf_pool_t::validate() and buf_page_optimistic_get() requires that buf_page_t::in_file() be protected atomically with buf_page_t::in_page_hash and buf_page_t::in_LRU_list. buf_page_optimistic_get(): Now that buf_block_t::mutex no longer exists, we must check buf_page_t::io_fix() after acquiring the buf_pool.page_hash lock, to detect whether buf_page_init_for_read() has been initiated. We will also check the io_fix() before acquiring hash_lock in order to avoid unnecessary computation. The field buf_block_t::modify_clock (protected by buf_block_t::lock) allows buf_page_optimistic_get() to validate the block. buf_page_t::real_size: Remove. It was only used while flushing pages of page_compressed tables. buf_page_encrypt(): Add an output parameter that allows us ot eliminate buf_page_t::real_size. Replace a condition with debug assertion. buf_page_should_punch_hole(): Remove. buf_dblwr_t::add_to_batch(): Replaces buf_dblwr_add_to_batch(). Add the parameter size (to replace buf_page_t::real_size). buf_dblwr_t::write_single_page(): Replaces buf_dblwr_write_single_page(). Add the parameter size (to replace buf_page_t::real_size). fil_system_t::detach(): Replaces fil_space_detach(). Ensure that fil_validate() will not be violated even if fil_system.mutex is released and reacquired. fil_node_t::complete_io(): Renamed from fil_node_complete_io(). fil_node_t::close_to_free(): Replaces fil_node_close_to_free(). Avoid invoking fil_node_t::close() because fil_system.n_open has already been decremented in fil_space_t::detach(). BUF_BLOCK_READY_FOR_USE: Remove. Directly use BUF_BLOCK_MEMORY. BUF_BLOCK_ZIP_DIRTY: Remove. Directly use BUF_BLOCK_ZIP_PAGE, and distinguish dirty pages by buf_page_t::oldest_modification(). BUF_BLOCK_POOL_WATCH: Remove. Use BUF_BLOCK_NOT_USED instead. This state was only being used for buf_page_t that are in buf_pool.watch. buf_pool_t::watch[]: Remove pointer indirection. buf_page_t::in_flush_list: Remove. It was set if and only if buf_page_t::oldest_modification() is nonzero. buf_page_decrypt_after_read(), buf_corrupt_page_release(), buf_page_check_corrupt(): Change the const fil_space_t* parameter to const fil_node_t& so that we can report the correct file name. buf_page_monitor(): Declare as an ATTRIBUTE_COLD global function. buf_page_io_complete(): Split to buf_page_read_complete() and buf_page_write_complete(). buf_dblwr_t::in_use: Remove. buf_dblwr_t::buf_block_array: Add IORequest::flush_t. buf_dblwr_sync_datafiles(): Remove. It was a useless wrapper of os_aio_wait_until_no_pending_writes(). buf_flush_write_complete(): Declare static, not global. Add the parameter IORequest::flush_t. buf_flush_freed_page(): Simplify the code. recv_sys_t::flush_lru: Renamed from flush_type and changed to bool. fil_read(), fil_write(): Replaced with direct use of fil_io(). fil_buffering_disabled(): Remove. Check srv_file_flush_method directly. fil_mutex_enter_and_prepare_for_io(): Return the resolved fil_space_t* to avoid a duplicated lookup in the caller. fil_report_invalid_page_access(): Clean up the parameters. fil_io(): Return fil_io_t, which comprises fil_node_t and error code. Always invoke fil_space_t::acquire_for_io() and let either the sync=true caller or fil_aio_callback() invoke fil_space_t::release_for_io(). fil_aio_callback(): Rewrite to replace buf_page_io_complete(). fil_check_pending_operations(): Remove a parameter, and remove some redundant lookups. fil_node_close_to_free(): Wait for n_pending==0. Because we no longer do an extra lookup of the tablespace between fil_io() and the completion of the operation, we must give fil_node_t::complete_io() a chance to decrement the counter. fil_close_tablespace(): Remove unused parameter trx, and document that this is only invoked during the error handling of IMPORT TABLESPACE. row_import_discard_changes(): Merged with the only caller, row_import_cleanup(). Do not lock up the data dictionary while invoking fil_close_tablespace(). logs_empty_and_mark_files_at_shutdown(): Do not invoke fil_close_all_files(), to avoid a !needs_flush assertion failure on fil_node_t::close(). innodb_shutdown(): Invoke os_aio_free() before fil_close_all_files(). fil_close_all_files(): Invoke fil_flush_file_spaces() to ensure proper durability. thread_pool::unbind(): Fix a crash that would occur on Windows after srv_thread_pool->disable_aio() and os_file_close(). This fix was submitted by Vladislav Vaintroub. Thanks to Matthias Leich and Axel Schwenke for extensive testing, Vladislav Vaintroub for helpful comments, and Eugene Kosov for a review.
94 lines
4 KiB
Text
94 lines
4 KiB
Text
-- source include/have_innodb.inc
|
|
-- source include/have_file_key_management_plugin.inc
|
|
# embedded does not support restart
|
|
-- source include/not_embedded.inc
|
|
# MDEV-20839 innodb-redo-badkey sporadically fails on buildbot with page dump
|
|
# We only require a debug build to avoid getting page dumps, making use of
|
|
# MDEV-19766: Disable page dump output in debug builds
|
|
-- source include/have_debug.inc
|
|
|
|
call mtr.add_suppression("Plugin 'file_key_management'");
|
|
call mtr.add_suppression("Plugin 'InnoDB' init function returned error.");
|
|
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[1-4]\\.ibd' cannot be decrypted");
|
|
call mtr.add_suppression("failed to read or decrypt \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\]");
|
|
call mtr.add_suppression("InnoDB: Unable to decompress .*.test.t[12]\\.ibd\\[page id: space=[1-9][0-9]*, page number=[0-9]+\\]");
|
|
call mtr.add_suppression("InnoDB: Database page corruption on disk or a failed read of file '.*test.t[12]\\.ibd'");
|
|
call mtr.add_suppression("InnoDB: Failed to read page .* from file '.*'");
|
|
call mtr.add_suppression("InnoDB: Plugin initialization aborted");
|
|
call mtr.add_suppression("Plugin 'InnoDB' registration as a STORAGE ENGINE failed");
|
|
# for innodb_checksum_algorithm=full_crc32 only
|
|
call mtr.add_suppression("\\[ERROR\\] InnoDB: Cannot decrypt \\[page id: space=");
|
|
|
|
-- let $restart_parameters=--file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys2.txt
|
|
-- source include/restart_mysqld.inc
|
|
|
|
--echo # Wait max 10 min for key encryption threads to encrypt all spaces
|
|
--let $wait_timeout= 600
|
|
--let $wait_condition=SELECT COUNT(*) = 0 FROM INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION WHERE MIN_KEY_VERSION = 0
|
|
--source include/wait_condition.inc
|
|
|
|
SET GLOBAL innodb_file_per_table = ON;
|
|
|
|
create table t1(a int not null primary key auto_increment, c char(250), b blob, index(b(10))) engine=innodb row_format=compressed encrypted=yes encryption_key_id=4;
|
|
create table t2(a int not null primary key auto_increment, c char(250), b blob, index(b(10))) engine=innodb row_format=compressed;
|
|
create table t3(a int not null primary key auto_increment, c char(250), b blob, index(b(10))) engine=innodb encrypted=yes encryption_key_id=4;
|
|
create table t4(a int not null primary key auto_increment, c char(250), b blob, index(b(10))) engine=innodb;
|
|
|
|
begin;
|
|
--disable_query_log
|
|
--let $i = 20
|
|
begin;
|
|
while ($i)
|
|
{
|
|
insert into t1(c,b) values (repeat('secret1',20), repeat('secret2',6000));
|
|
dec $i;
|
|
}
|
|
--enable_query_log
|
|
|
|
insert into t2 select * from t1;
|
|
insert into t3 select * from t1;
|
|
insert into t4 select * from t1;
|
|
commit;
|
|
|
|
--source ../../suite/innodb/include/no_checkpoint_start.inc
|
|
|
|
#
|
|
# We test redo log page read at recv_read_page using
|
|
# incorrect keys from std_data/keys.txt. If checkpoint
|
|
# happens we will skip this test. If no checkpoint
|
|
# happens, InnoDB refuses to start as used
|
|
# encryption key is incorrect.
|
|
#
|
|
SET GLOBAL innodb_flush_log_at_trx_commit=1;
|
|
begin;
|
|
update t1 set c = repeat('secret3', 20);
|
|
update t2 set c = repeat('secret4', 20);
|
|
update t3 set c = repeat('secret4', 20);
|
|
update t4 set c = repeat('secret4', 20);
|
|
insert into t1 (c,b) values (repeat('secret5',20), repeat('secret6',6000));
|
|
insert into t2 (c,b) values (repeat('secret7',20), repeat('secret8',6000));
|
|
insert into t3 (c,b) values (repeat('secret9',20), repeat('secre10',6000));
|
|
insert into t4 (c,b) values (repeat('secre11',20), repeat('secre12',6000));
|
|
COMMIT;
|
|
let $cleanup= drop table t1,t2,t3,t4;
|
|
--let CLEANUP_IF_CHECKPOINT= $cleanup;
|
|
--source ../../suite/innodb/include/no_checkpoint_end.inc
|
|
|
|
--error 1
|
|
-- source include/start_mysqld.inc
|
|
--source include/kill_mysqld.inc
|
|
|
|
#
|
|
# Now test with innodb-force-recovery=1 i.e. ignore corrupt pages
|
|
#
|
|
|
|
-- let $restart_parameters=--innodb-force-recovery=1
|
|
--error 1
|
|
-- source include/start_mysqld.inc
|
|
|
|
--source include/kill_mysqld.inc
|
|
|
|
-- let $restart_parameters=--file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys2.txt
|
|
-- source include/start_mysqld.inc
|
|
|
|
drop table t1, t2,t3,t4;
|