mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-30 18:41:56 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	3794673111	MDEV-28836: Memory alignment cleanup Table_cache_instance: Define the structure aligned at the CPU cache line, and remove a pad[] data member. Krunal Bauskar reported this to improve performance on ARMv8. aligned_malloc(): Wrapper for the Microsoft _aligned_malloc() and the ISO/IEC 9899:2011 <stdlib.h> aligned_alloc(). Note: The parameters are in the Microsoft order (size, alignment), opposite of aligned_alloc(alignment, size). Note: The standard defines that size must be an integer multiple of alignment. It is enforced by AddressSanitizer but not by GNU libc on Linux. aligned_free(): Wrapper for the Microsoft _aligned_free() and the standard free(). HAVE_ALIGNED_ALLOC: A new test. Unfortunately, support for aligned_alloc() may still be missing on some platforms. We will fall back to posix_memalign() for those cases. HAVE_MEMALIGN: Remove, along with any use of the nonstandard memalign(). PFS_ALIGNEMENT (sic): Removed; we will use CPU_LEVEL1_DCACHE_LINESIZE. PFS_ALIGNED: Defined using the C++11 keyword alignas. buf_pool_t::page_hash_table::create(), lock_sys_t::hash_table::create(): lock_sys_t::hash_table::resize(): Pad the allocation size to an integer multiple of the alignment. Reviewed by: Vladislav Vaintroub	2022-06-21 16:59:49 +03:00
Marko Mäkelä	55f02c24a6	MDEV-28845 fixup: Prevent an infinite loop buf_page_create_low(): Before retrying, release the exclusive page latch in order to prevent an infinite loop in buf_pool_t::corrupted_evict().	2022-06-21 14:40:40 +03:00
Marko Mäkelä	20b8e5a07e	Merge 10.8 into 10.9	2022-06-17 11:31:21 +03:00
Marko Mäkelä	cb19e211ec	Merge 10.7 into 10.8	2022-06-16 11:15:21 +03:00
Marko Mäkelä	a8c22dae8b	Merge 10.6 into 10.7	2022-06-16 10:50:58 +03:00
Marko Mäkelä	253806dffc	MDEV-28845 InnoDB: Failing assertion: bpage->can_relocate() in buf0lru.cc Since commit `0b47c126e3` (MDEV-13542) we treat all-zero pages as corrupted ones. During a stress test, a read-ahead of an all-zero page was triggered and the page read was completed concurrently with buf_page_create_low(). This caused the assertion to fail, because buf_page_create_low() was waiting for the page latch. buf_page_get_low(): Only invoke buf_pool_t::corrupted_evict() if the block was not already marked as corrupted. buf_page_create_low(): On page identifier mismatch, retry the buf_pool.page_hash lookup. buf_pool_t::corrupted_evict(): Set the state of the block to FREED so that a concurrent buf_page_get_low() will refuse to load the page. Wait for the page latch to be vacant before proceeding to remove the block from buf_pool.page_hash and buf_pool.LRU. page_id_t::set_corrupted(), page_id_t::is_corrupted(): Accessors for indicating a corrupted page identifier. Tested by Matthias Leich	2022-06-15 17:00:05 +03:00
Marko Mäkelä	9fe784ff7e	Merge 10.8 into 10.9	2022-06-15 10:01:51 +03:00
Marko Mäkelä	813986a647	Merge 10.7 into 10.8	2022-06-14 16:19:29 +03:00
Marko Mäkelä	ddf511c44d	Merge 10.6 into 10.7	2022-06-14 10:17:36 +03:00
Marko Mäkelä	1f1fa7e09c	Merge 10.5 into 10.6	2022-06-14 09:49:47 +03:00
Marko Mäkelä	4849d94fe6	MDEV-28828 SIGSEGV in buf_flush_LRU_list_batch In commit `73fee39ea6` (MDEV-27985) a regression was introduced that would cause bpage=nullptr to be referenced. buf_flush_LRU_list_batch(): Always terminate the loop upon encountering a null pointer.	2022-06-14 09:14:24 +03:00
Marko Mäkelä	6dea701e0f	Merge 10.8 into 10.9	2022-06-09 14:53:34 +03:00
Marko Mäkelä	0af9346079	Merge 10.7 into 10.8	2022-06-09 14:37:53 +03:00
Marko Mäkelä	d61839c71e	MDEV-28708 Increased congestion on buf_pool.flush_list_mutex In commit `f80deb9590` (MDEV-27868) a fix for a correctness regression caused a performance regression by increasing the amount of work that is executed while holding buf_pool.flush_list_mutex. buf_page_t::set_temp_modified(): Relax an assertion, to allow an already dirty block to be marked as dirty. buf_page_t::flush_list_requests: Note that the variable is not always protected by buf_pool.flush_list_mutex. Already dirty blocks that are being written to will increment the counter without holding buf_pool.flush_list_mutex. mtr_t::process_freed_pages(): Handle pages that were freed during the execution of the mini-transaction. ReleaseUnlogged, mtr_t::release_unlogged(): Release modified pages when no log was written. This is for pages of the temporary tablespace, or for IMPORT TABLESPACE. ReleaseModified: Renamed from ReleaseBlocks. Assume that buf_pool.flush_list_mutex was acquired by the caller. ReleaseSimple: A combination of ReleaseLatches and ReleaseModified, for the case that for any modified pages, some earlier modifications are already waiting to be written. mtr_t::commit(): Invoke one of release_unlogged(), ReleaseModified, ReleaseSimple, ReleaseAll. Acquire and release buf_pool.flush_list_mutex at most once. memo_slot_release(): Simplify the code. mtr_t::sx_latch_at_savepoint(), mtr_t::x_latch_at_savepoint(): Reduce the size of the critical section. fil_space_t::update_last_freed_lsn(), fil_space_t::clear_freed_ranges(), fil_space_t::add_free_range(): Assume that freed_range_mutex is held by the caller. buf_pool_t::prepare_insert_into_flush_list(): Determine the insert position for buf_pool_t::insert_into_flush_list(). Remove any clean blocks from buf_pool.flush_list that were encountered while searching. buf_pool_t::insert_into_flush_list(): Insert the block at the predetermined position.	2022-06-09 14:12:49 +03:00
Marko Mäkelä	fe75e5e5b1	Merge 10.6 into 10.7	2022-06-09 14:11:43 +03:00
Marko Mäkelä	77b3959b5c	MDEV-28457 Crash in page_dir_find_owner_slot() A prominent remaining source of crashes on corrupted index pages is page directory corruption. A frequent caller of page_dir_find_owner_slot() is page_rec_get_prev(). Some of those calls can be replaced with simpler logic that is less prone to fail. page_dir_find_owner_slot(), page_rec_get_prev(), page_rec_get_prev_const(), btr_pcur_move_to_prev(), btr_pcur_move_to_prev_on_page(), btr_cur_upd_rec_sys(), page_delete_rec_list_end(), rtr_page_copy_rec_list_end_no_locks(), rtr_page_copy_rec_list_start_no_locks(): Return an error code on failure. fil_space_t::io(), buf_page_get_low(): Use DB_CORRUPTION for out-of-bounds page reads. PageBulk::getSplitRec(), PageBulk::copyOut(): Simplify the code. btr_validate_level(): Prevent some more CHECK TABLE crashes on corrupted pages. btr_block_get(), btr_pcur_move_to_next_page(): Implement some checks that were previously only part of IndexPurge::next(). IndexPurge::next(): Use btr_pcur_move_to_next_page().	2022-06-08 14:53:24 +03:00
Marko Mäkelä	892c426371	MDEV-13542: Do not crash on decryption failure fil_page_type_validate(): Remove. This debug check was mostly redundant and added little value to the code paths that deal with page_compressed or encrypted pages. fil_get_page_type_name(): Remove; unused function. fil_space_decrypt(): Return an error if the page is not supposed to be encrypted. It is possible that an unencrypted page contains a nonzero key_version field even though it is not supposed to be encrypted. Previously we would crash in such a situation. buf_page_decrypt_after_read(): Simplify the code. Remove some unnecessary error message about temporary tablespace corruption. This is where we would usually invoke fil_space_decrypt().	2022-06-08 09:48:12 +03:00
Marko Mäkelä	5a33a37682	Merge 10.8 into 10.9	2022-06-07 09:20:07 +03:00
Marko Mäkelä	57d4a242da	Merge 10.7 into 10.8	2022-06-06 16:22:09 +03:00
Marko Mäkelä	7e39470e33	Merge 10.6 into 10.7	2022-06-06 14:56:20 +03:00
Marko Mäkelä	0b47c126e3	MDEV-13542: Crashing on corrupted page is unhelpful The approach to handling corruption that was chosen by Oracle in commit `177d8b0c12` is not really useful. Not only did it actually fail to prevent InnoDB from crashing, but it is making things worse by blocking attempts to rescue data from or rebuild a partially readable table. We will try to prevent crashes in a different way: by propagating errors up the call stack. We will never mark the clustered index persistently corrupted, so that data recovery may be attempted by reading from the table, or by rebuilding the table. This should also fix MDEV-13680 (crash on btr_page_alloc() failure); it was extensively tested with innodb_file_per_table=0 and a non-autoextend system tablespace. We should now avoid crashes in many cases, such as when a page cannot be read or allocated, or an inconsistency is detected when attempting to update multiple pages. We will not crash on double-free, such as on the recovery of DDL in system tablespace in case something was corrupted. Crashes on corrupted data are still possible. The fault injection mechanism that is introduced in the subsequent commit may help catch more of them. buf_page_import_corrupt_failure: Remove the fault injection, and instead corrupt some pages using Perl code in the tests. btr_cur_pessimistic_insert(): Always reserve extents (except for the change buffer), in order to prevent a subsequent allocation failure. btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages(). btr_assert_not_corrupted(), btr_corruption_report(): Remove. Similar checks are already part of btr_block_get(). FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE. dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(), trx_undo_page_get_s_latched(): Replaced with error-checking calls. trx_rseg_t::get(mtr_t): Replaces trx_rsegf_get(). trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed. 
trx_sys_create_sys_pages(): Merged with trx_sysf_create(). dict_check_tablespaces_and_store_max_id(): Do not access DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot(). Merge dict_check_sys_tables() with this function. dir_pathname(): Replaces os_file_make_new_pathname(). row_undo_ins_remove_sec(): Do not modify the undo page by adding a terminating NUL byte to the record. btr_decryption_failed(): Report decryption failures dict_set_corrupted_by_space(), dict_set_encrypted_by_space(), dict_set_corrupted_index_cache_only(): Remove. dict_set_corrupted(): Remove the constant parameter dict_locked=false. Never flag the clustered index corrupted in SYS_INDEXES, because that would deny further access to the table. It might be possible to repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case no B-tree leaf page is corrupted. dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(), row_purge_skip_uncommitted_virtual_index(): Remove, and refactor the callers to read dict_index_t::type only once. dict_table_is_corrupted(): Remove. dict_index_t::is_btree(): Determine if the index is a valid B-tree. BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove. UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger assertion failures, but error codes being returned. buf_corrupt_page_release(): Replaced with a direct call to buf_pool.corrupted_evict(). fil_invalid_page_access_msg(): Never crash on an invalid read; let the caller of buf_page_get_gen() decide. btr_pcur_t::restore_position(): Propagate failure status to the caller by returning CORRUPTED. opt_search_plan_for_table(): Simplify the code. row_purge_del_mark(), row_purge_upd_exist_or_extern_func(), row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(), row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free() when no secondary indexes exist. row_undo_mod_upd_exist_sec(): Simplify the code. 
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT if the clustered index (and therefore the table) is corrupted, similar to what we do in row_insert_for_mysql(). fut_get_ptr(): Replace with buf_page_get_gen() calls. buf_page_get_gen(): Return nullptr and err=DB_CORRUPTION if the page is marked as freed. For other modes than BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED, we will return nullptr for freed pages, so that the callers can be simplified. The purge of transaction history will be a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on corrupted data. buf_page_get_low(): Never crash on a corrupted page, but simply return nullptr. fseg_page_is_allocated(): Replaces fseg_page_is_free(). fts_drop_common_tables(): Return an error if the transaction was rolled back. fil_space_t::set_corrupted(): Report a tablespace as corrupted if it was not reported already. fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report out-of-bounds page access or other errors. Clean up mtr_t::page_lock() buf_page_get_low(): Validate the page identifier (to check for recently read corrupted pages) after acquiring the page latch. buf_page_t::read_complete(): Flag uninitialized (all-zero) pages with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch. mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi(). recv_sys_t::free_corrupted_page(): Only set_corrupt_fs() if any log records exist for the page. We do not mind if read-ahead produces corrupted (or all-zero) pages that were not actually needed during recovery. recv_recover_page(): Return whether the operation succeeded. recv_sys_t::recover_low(): Simplify the logic. Check for recovery error. Thanks to Matthias Leich for testing this extensively and to the authors of https://rr-project.org for making it easy to diagnose and fix any failures that were found during the testing.	2022-06-06 14:03:22 +03:00
Marko Mäkelä	aa45850687	Cleanup: Make fil_space_t::freed_ranges private fil_space_t::is_freed(): Check if a page is in freed_ranges. fil_space_t::flush_freed(): Replaces buf_flush_freed_pages().	2022-06-06 11:55:29 +03:00
Haidong Ji	41068a890e	MDEV-27314 Condense innodb buffer pool resize message InnoDB buffer pool resize messages are more succinct from this change: Before: ``` 2022-05-07 17:10:33 0 [Note] InnoDB: Completed resizing buffer pool from 14745600 to 19660800 bytes. 2022-05-07 17:10:33 0 [Note] InnoDB: Completed resizing buffer pool. 2022-05-07 17:10:33 8 [Note] InnoDB: Completed resizing buffer pool. (New size: 19660800 bytes). ``` After: ``` 2022-05-07 17:10:33 0 [Note] InnoDB: Completed resizing buffer pool from 14745600 to 19660800 bytes. ``` Additionally, the INNODB_BUFFER_POOL_RESIZE_STATUS has more complete info: it contains both the old and new buffer pool size values.	2022-05-26 12:10:29 +10:00
Sergei Golubchik	bf2bdd1a1a	Merge branch '10.8' into 10.9	2022-05-19 14:07:55 +02:00
Sergei Golubchik	b7ffccf49b	Merge branch '10.7' into 10.8	2022-05-18 13:26:48 +02:00
Sergei Golubchik	99a433ed1c	Merge branch '10.6' into 10.7	2022-05-18 10:34:38 +02:00
Marko Mäkelä	daa2680c78	Merge 10.5 into 10.6	2022-05-12 08:11:57 +03:00
Vlad Lesin	3fabdc3ca8	MDEV-28473 field_ref_zero is not initialized in xtrabackup_prepare_func() The solution is to initialize field_ref_zero in main_low() before xtrabackup_backup_func() and xtrabackup_prepare_func() calls.	2022-05-11 17:20:31 +03:00
Sergei Golubchik	a70a1cf3f4	Merge branch '10.3' into 10.4	2022-05-08 23:03:08 +02:00
Oleksandr Byelkin	9614fde1aa	Merge branch '10.2' into 10.3	2022-05-03 10:59:54 +02:00
Marko Mäkelä	504a3b32f6	Merge 10.8 into 10.9	2022-04-28 15:54:03 +03:00
Marko Mäkelä	133c2129cd	Merge 10.7 into 10.8	2022-04-27 10:43:00 +03:00
Marko Mäkelä	f21a875600	MDEV-28415 ALTER TABLE on a large table hangs InnoDB buf_flush_page(): Never wait for a page latch, even in checkpoint flushing (flush_type == BUF_FLUSH_LIST), to prevent a hang of the page cleaner threads when a large number of pages is latched. In mysql/mysql-server@9542f3015b it was claimed that such a hang only affects CREATE FULLTEXT INDEX. Their fix was to retain buffer-fix but release exclusive latch on non-leaf pages, and subsequently write to those pages while they are not associated with the mini-transaction, which would trip a debug assertion in the MariaDB version of mtr_t::memo_modify_page() and cause potential corruption when using the default MariaDB setting innodb_log_optimize_ddl=OFF. This change essentially backports a small part of commit `7cffb5f6e8` (MDEV-23399) from MariaDB Server 10.5.7.	2022-04-27 07:57:04 +03:00
Marko Mäkelä	638afc4acf	Merge 10.6 into 10.7	2022-04-26 18:59:40 +03:00
Marko Mäkelä	e135edec3a	Merge 10.5 into 10.6	2022-04-26 15:21:20 +03:00
Marko Mäkelä	c009ce7dd0	MDEV-27094 Debug builds include useless InnoDB "disabled" options This is a backport of commit `4489a89c71` in order to remove the test innodb.redo_log_during_checkpoint that would cause trouble in the DBUG subsystem invoked by safe_mutex_lock() via log_checkpoint(). Before commit `7cffb5f6e8` these mutexes were of different type. The following options were introduced in commit `2e814d4702` (mariadb-10.2.2) and have little use: innodb_disable_resize_buffer_pool_debug had no effect even in MariaDB 10.2.2 or MySQL 5.7.9. It was introduced in mysql/mysql-server@5c4094cf49 to work around a problem that was fixed in mysql/mysql-server@2957ae4f99 (but the parameter was not removed). innodb_page_cleaner_disabled_debug and innodb_master_thread_disabled_debug are only used by the test innodb.redo_log_during_checkpoint that will be removed as part of this commit. innodb_dict_stats_disabled_debug is only used by that test, and it is redundant because one could simply use innodb_stats_persistent=OFF or the STATS_PERSISTENT=0 attribute of the table in the test to achieve the same effect.	2022-04-22 12:48:40 +03:00
Marko Mäkelä	f84b5d782a	Fix clang -Wunused-but-set-variable	2022-04-21 11:35:07 +03:00
Marko Mäkelä	d1edb011ee	Cleanup: Remove os0thread Let us use the common pthread_t wrapper for Microsoft Windows. This fixes up commit `dbe941e06f`	2022-04-19 13:49:52 +03:00
Marko Mäkelä	e98013cb5c	Merge 10.8 into 10.9	2022-04-13 13:39:00 +03:00
Nayuta Yanagisawa	cbf9d8a8d5	Merge 10.7 into 10.8	2022-04-13 17:52:27 +09:00
Marko Mäkelä	aa3a9d1ef5	Merge 10.6 into 10.7	2022-04-12 16:11:29 +03:00
Marko Mäkelä	7bccf3dd74	MDEV-28274 Assertion s <= READ_FIX failed in buf_page_t::set_state buf_page_t::set_state(): Relax a debug assertion. It is fine to update a read-fixed block descriptor to be both read-fixed and buffer-fixed. buf_pool_t::watch_unset(): Fix some incorrect logic that was implemented in commit `e9e6db9355`. Thanks to Elena Stepanova for the test case.	2022-04-11 10:22:40 +03:00
Marko Mäkelä	6cb6ba8b7b	Merge 10.8 into 10.9	2022-04-06 13:33:33 +03:00
Marko Mäkelä	b2baeba415	Merge 10.7 into 10.8	2022-04-06 13:28:25 +03:00
Marko Mäkelä	2d8e38bc94	Merge 10.6 into 10.7	2022-04-06 13:00:09 +03:00
Marko Mäkelä	ff99413804	MDEV-25975: Merge 10.5 into 10.6	2022-04-06 12:45:14 +03:00
Marko Mäkelä	5d8dcfd86c	MDEV-25975: Merge 10.4 into 10.5	2022-04-06 10:30:49 +03:00
Marko Mäkelä	8680eedb26	Merge 10.8 into 10.9	2022-03-30 09:41:14 +03:00
Marko Mäkelä	5c69e93630	Merge 10.7 into 10.8	2022-03-30 09:34:07 +03:00
Marko Mäkelä	a4d753758f	Merge 10.6 into 10.7	2022-03-30 08:52:05 +03:00
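
The allocation wrappers described in commit 3794673111 (MDEV-28836) can be illustrated with a minimal sketch. This is not the MariaDB implementation: the names `sketch_aligned_malloc()`, `sketch_aligned_free()` and `pad_to_alignment()` are hypothetical, and how `HAVE_ALIGNED_ALLOC` gets defined by the build system is assumed. The sketch only shows the points the commit message makes: the Microsoft parameter order (size, alignment), the posix_memalign() fallback, and padding the size to an integer multiple of the alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#if defined(_MSC_VER)
# include <malloc.h>
#endif

// Hypothetical helper: round size up to the next multiple of alignment.
// Assumes alignment is a nonzero power of two.
inline std::size_t pad_to_alignment(std::size_t size, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical wrapper; parameters are in the Microsoft order
// (size, alignment), the opposite of aligned_alloc(alignment, size).
inline void *sketch_aligned_malloc(std::size_t size, std::size_t alignment)
{
#if defined(_MSC_VER)
  return _aligned_malloc(size, alignment);
#elif defined(HAVE_ALIGNED_ALLOC)
  // ISO C11 requires size to be an integer multiple of alignment;
  // the caller is expected to pad, e.g. with pad_to_alignment().
  return aligned_alloc(alignment, size);
#else
  // Fallback for platforms that lack aligned_alloc().
  void *ptr = nullptr;
  if (posix_memalign(&ptr, alignment, size) != 0)
    return nullptr;
  return ptr;
#endif
}

// Hypothetical counterpart of the wrapper above.
inline void sketch_aligned_free(void *ptr)
{
#if defined(_MSC_VER)
  _aligned_free(ptr);
#else
  free(ptr);
#endif
}
```

A caller would pad the requested size first, for example `sketch_aligned_malloc(pad_to_alignment(100, 64), 64)`, and release the memory with `sketch_aligned_free()`.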

1605 commits