mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-31 02:51:44 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	40eff3f868	MDEV-26827 fixup: hangs and !os_aio_pending_writes() assertion failures buf_LRU_get_free_block(): Always wake up the page cleaner if needed before exiting the inner loop. srv_prepare_to_delete_redo_log_file(): Replace a debug assertion with a wait in debug builds. Starting with commit `7e31a8e7fa` the debug assertion ut_ad(!os_aio_pending_writes()) could occasionally fail, while it would hold in core dumps of crashes. The failure can be reproduced more easily by adding a sleep to the write completion callback function, right before releasing to write_slots. srv_start(): Remove a bogus debug assertion ut_ad(!os_aio_pending_writes()) that could fail in mariadb-backup --prepare. In an rr replay trace, we had buf_pool.flush_list.count==0 but write_slots->m_cache.m_pos==1 and buf_page_t::write_complete() was executing u_unlock().	2023-04-21 17:52:47 +03:00
Marko Mäkelä	e55e761eae	MDEV-31084 assert(waiting) failed in TP_connection_generic::wait_end buf_flush_wait_flushed(): Correct the logic for registering a wait around buf_flush_wait() that commit `a091d6ac4e` recently broke. This should be easily repeatable when using a non-default startup parameter: thread-handling=pool-of-threads	2023-04-21 16:49:59 +03:00
Marko Mäkelä	27ff972be2	MDEV-26827 fixup: Do not hog buf_pool.mutex buf_flush_LRU_list_batch(): When evicting clean pages, release and reacquire the buf_pool.mutex after every 32 pages. Also, eliminate some conditional branches.	2023-04-19 18:57:18 +03:00
Marko Mäkelä	a091d6ac4e	MDEV-26827 fixup: Do not duplicate io_slots::pending_io_count() os_aio_pending_reads_approx(), os_aio_pending_reads(): Replaces buf_pool.n_pend_reads. os_aio_pending_writes(): Replaces buf_dblwr.pending_writes(). buf_dblwr_t::write_cond, buf_dblwr_t::writes_pending: Remove.	2023-04-12 13:49:57 +03:00
Marko Mäkelä	5bada1246d	Merge 10.5 into 10.6	2023-04-11 16:15:19 +03:00
Marko Mäkelä	375991a531	MDEV-26827 fixup for DDL race condition buf_flush_try_neighbors(): Tolerate count<2 in case the tablespace is being dropped.	2023-04-11 14:42:08 +03:00
Oleksandr Byelkin	ac5a534a4c	Merge remote-tracking branch '10.4' into 10.5	2023-03-31 21:32:41 +02:00
Marko Mäkelä	dfa90257f6	MDEV-30936 clang 15.0.7 -fsanitize=memory fails massively handle_slave_io(), handle_slave_sql(), os_thread_exit(): Remove a redundant pthread_exit(nullptr) call, because it would cause SIGSEGV. mysql_print_status(): Add MEM_MAKE_DEFINED() to work around some missing instrumentation around mallinfo2(). que_graph_free_stat_list(): Invoke que_node_get_next(node) before que_graph_free_recursive(node). That is the logical and MSAN_OPTIONS=poison_in_dtor=1 compatible way of freeing memory. ins_node_t::~ins_node_t(): Invoke mem_heap_free(entry_sys_heap). que_graph_free_recursive(): Rely on ins_node_t::~ins_node_t(). fts_t::~fts_t(): Invoke mem_heap_free(fts_heap). fts_free(): Replace with direct calls to fts_t::~fts_t(). The failures in free_root() due to MSAN_OPTIONS=poison_in_dtor=1 will be covered in MDEV-30942.	2023-03-28 11:44:24 +03:00
Marko Mäkelä	07460c31e3	MDEV-30900 Crash on macOS due to zero-initialized buf_dblwr.write_cond buf_dblwr_t::init(), buf_dblwr_t::close(): Cover also write_cond, which was added in commit `a55b951e60` without explicit initialization. On GNU/Linux, PTHREAD_COND_INITIALIZER is a zero-initializer. That is why the default zero initialization happened to work on that platform.	2023-03-24 07:59:27 +02:00
Marko Mäkelä	e0560fc4cf	Remove a bogus UNIV_ZIP_DEBUG check buf_LRU_block_remove_hashed(): Ever since commit `2e814d4702` we could get page_zip_validate() failures after an ALTER TABLE operation was aborted and BtrBulk::pageCommit() had never been executed on some blocks.	2023-03-21 14:36:38 +02:00
Marko Mäkelä	32a53a66df	MDEV-26827 fixup: Remove a bogus assertion We can have dirty_blocks=0 when buf_flush_page_cleaner() is being woken up to write out or evict pages from the buf_pool.LRU list.	2023-03-20 10:32:35 +02:00
Marko Mäkelä	a55b951e60	MDEV-26827 Make page flushing even faster For more convenient monitoring of something that could greatly affect the volume of page writes, we add the status variable Innodb_buffer_pool_pages_split that was previously only available via information_schema.innodb_metrics as "innodb_page_splits". This was suggested by Axel Schwenke. buf_flush_page_count: Replaced with buf_pool.stat.n_pages_written. We protect buf_pool.stat (except n_page_gets) with buf_pool.mutex and remove unnecessary export_vars indirection. buf_pool.flush_list_bytes: Moved from buf_pool.stat.flush_list_bytes. Protected by buf_pool.flush_list_mutex. buf_pool_t::page_cleaner_status: Replaces buf_pool_t::n_flush_LRU_, buf_pool_t::n_flush_list_, and buf_pool_t::page_cleaner_is_idle. Protected by buf_pool.flush_list_mutex. We will exclusively broadcast buf_pool.done_flush_list by the buf_flush_page_cleaner thread, and only wait for it when communicating with buf_flush_page_cleaner. There is no need to keep a count of pending writes by the buf_pool.flush_list processing. A single flag suffices for that. Waits for page write completion can be performed by simply waiting on block->page.lock, or by invoking buf_dblwr.wait_for_page_writes(). buf_LRU_block_free_non_file_page(): Broadcast buf_pool.done_free and set buf_pool.try_LRU_scan when freeing a page. This would be executed also as part of buf_page_write_complete(). buf_page_write_complete(): Do not broadcast buf_pool.done_flush_list, and do not acquire buf_pool.mutex unless buf_pool.LRU eviction is needed. Let buf_dblwr count all writes to persistent pages and broadcast a condition variable when no outstanding writes remain. buf_flush_page_cleaner(): Prioritize LRU flushing and eviction right after "furious flushing" (lsn_limit). Simplify the conditions and reduce the hold time of buf_pool.flush_list_mutex. Refuse to shut down or sleep if buf_pool.ran_out(), that is, LRU eviction is needed. 
buf_pool_t::page_cleaner_wakeup(): Add the optional parameter for_LRU. buf_LRU_get_free_block(): Protect buf_lru_free_blocks_error_printed with buf_pool.mutex. Invoke buf_pool.page_cleaner_wakeup(true) to ensure that buf_flush_page_cleaner() will process the LRU flush request. buf_do_LRU_batch(), buf_flush_list(), buf_flush_list_space(): Update buf_pool.stat.n_pages_written when submitting writes (while holding buf_pool.mutex), not when completing them. buf_page_t::flush(), buf_flush_discard_page(): Require that the page U-latch be acquired upfront, and remove buf_page_t::ready_for_flush(). buf_pool_t::delete_from_flush_list(): Remove the parameter "bool clear". buf_flush_page(): Count pending page writes via buf_dblwr. buf_flush_try_neighbors(): Take the block of page_id as a parameter. If the tablespace is dropped before our page has been written out, release the page U-latch. buf_pool_invalidate(): Let the caller ensure that there are no outstanding writes. buf_flush_wait_batch_end(false), buf_flush_wait_batch_end_acquiring_mutex(false): Replaced with buf_dblwr.wait_for_page_writes(). buf_flush_wait_LRU_batch_end(): Replaces buf_flush_wait_batch_end(true). buf_flush_list(): Remove some broadcast of buf_pool.done_flush_list. buf_flush_buffer_pool(): Invoke also buf_dblwr.wait_for_page_writes(). buf_pool_t::io_pending(), buf_pool_t::n_flush_list(): Remove. Outstanding writes are reflected by buf_dblwr.pending_writes(). buf_dblwr_t::init(): New function, to initialize the mutex and the condition variables, but not the backing store. buf_dblwr_t::is_created(): Replaces buf_dblwr_t::is_initialised(). buf_dblwr_t::pending_writes(), buf_dblwr_t::writes_pending: Keeps track of writes of persistent data pages. buf_flush_LRU(): Allow calls while LRU flushing may be in progress in another thread. Tested by Matthias Leich (correctness) and Axel Schwenke (performance)	2023-03-16 17:19:58 +02:00
Marko Mäkelä	9593cccf28	MDEV-26055: Improve adaptive flushing Adaptive flushing is enabled by setting innodb_max_dirty_pages_pct_lwm>0 (not default) and innodb_adaptive_flushing=ON (default). There is also the parameter innodb_adaptive_flushing_lwm (default: 10 per cent of the log capacity). It should enable some adaptive flushing even when innodb_max_dirty_pages_pct_lwm=0. That is not being changed here. This idea was first presented by Inaam Rana several years ago, and I discussed it with Jean-François Gagné at FOSDEM 2023. buf_flush_page_cleaner(): When we are not near the log capacity limit (neither buf_flush_async_lsn nor buf_flush_sync_lsn are set), also try to move clean blocks from the buf_pool.LRU list to buf_pool.free or initiate writes (but not the eviction) of dirty blocks, until the remaining I/O capacity has been consumed. buf_flush_LRU_list_batch(): Add the parameter bool evict, to specify whether dirty least recently used pages (from buf_pool.LRU) should be evicted immediately after they have been written out. Callers outside buf_flush_page_cleaner() will pass evict=true, to retain the existing behaviour. buf_do_LRU_batch(): Add the parameter bool evict. Return counts of evicted and flushed pages. buf_flush_LRU(): Add the parameter bool evict. Assume that the caller holds buf_pool.mutex and will invoke buf_dblwr.flush_buffered_writes() afterwards. buf_flush_list_holding_mutex(): A low-level variant of buf_flush_list() whose caller must hold buf_pool.mutex and invoke buf_dblwr.flush_buffered_writes() afterwards. buf_flush_wait_batch_end_acquiring_mutex(): Remove. It is enough to have buf_flush_wait_batch_end(). page_cleaner_flush_pages_recommendation(): Avoid some floating-point arithmetics. buf_flush_page(), buf_flush_check_neighbor(), buf_flush_check_neighbors(), buf_flush_try_neighbors(): Rename the parameter "bool lru" to "bool evict". buf_free_from_unzip_LRU_list_batch(): Remove the parameter. 
Only actual page writes will contribute towards the limit. buf_LRU_free_page(): Evict freed pages of temporary tables. buf_pool.done_free: Broadcast whenever a block is freed (and buf_pool.try_LRU_scan is set). buf_pool_t::io_buf_t::reserve(): Retry indefinitely. During the test encryption.innochecksum we easily run out of these buffers for PAGE_COMPRESSED or ENCRYPTED pages. Tested by Matthias Leich and Axel Schwenke	2023-03-16 17:09:08 +02:00
Marko Mäkelä	54c0ac72e3	MDEV-30134 Assertion failed in buf_page_t::unfix() in buf_pool_t::watch_unset() buf_pool_t::watch_set(): Always buffer-fix a block if one was found, no matter if it is a watch sentinel or a buffer page. The type of the block descriptor will be rechecked in buf_page_t::watch_unset(). Do not expect the caller to acquire the page hash latch. Starting with commit `bd5a6403ca` it is safe to release buf_pool.mutex before acquiring a buf_pool.page_hash latch. buf_page_get_low(): Adjust to the changed buf_pool_t::watch_set(). This simplifies the logic and fixes a bug that was reproduced when using debug builds and the setting innodb_change_buffering_debug=1.	2023-02-16 08:29:44 +02:00
Marko Mäkelä	9c15799462	MDEV-30397: MariaDB crash due to DB_FAIL reported for a corrupted page buf_read_page_low(): Map the buf_page_t::read_complete() return value DB_FAIL to DB_PAGE_CORRUPTED. The purpose of the DB_FAIL return value is to avoid error log noise when read-ahead brings in an unused page that is typically filled with NUL bytes. If a synchronous read is bringing in a corrupted page where the page frame does not contain the expected tablespace identifier and page number, that must be treated as an attempt to read a corrupted page. The correct error code for this is DB_PAGE_CORRUPTED. The error code DB_FAIL is not handled by row_mysql_handle_errors(). This was missed in commit `0b47c126e3` (MDEV-13542).	2023-02-16 08:28:14 +02:00
Marko Mäkelä	96a3b11d13	Merge 10.5 into 10.6	2023-02-14 15:23:23 +02:00
Thirunarayanan Balathandayuthapani	3eea2e8e10	MDEV-30551 InnoDB recovery hangs when buffer pool ran out of memory - During a non-last batch of multi-batch recovery, InnoDB holds log_sys.mutex and preallocates the block, which may initiate a page flush, which may initiate a log flush, which needs to acquire log_sys.mutex again. This leads to an assertion failure. So InnoDB recovery should release log_sys.mutex before preallocating the block.	2023-02-14 14:35:35 +05:30
Marko Mäkelä	de4030e4d4	MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT This also fixes part of MDEV-29835 Partial server freeze which is caused by violations of the latching order that was defined in https://dev.mysql.com/worklog/task/?id=6326 (WL#6326: InnoDB: fix index->lock contention). Unless the current thread is holding an exclusive dict_index_t::lock, it must acquire page latches in a strict parent-to-child, left-to-right order. Not all cases of MDEV-29835 are fixed yet. Failure to follow the correct latching order will cause deadlocks of threads due to lock order inversion. As part of these changes, the BTR_MODIFY_TREE mode is modified so that an Update latch (U a.k.a. SX) will be acquired on the root page, and eXclusive latches (X) will be acquired on all pages leading to the leaf page, as well as any left and right siblings of the pages along the path. The DEBUG_SYNC test innodb.innodb_wl6326 will be removed, because at the time the DEBUG_SYNC point is hit, the thread is actually holding several page latches that will be blocking a concurrent SELECT statement. We also remove double bookkeeping that was caused due to excessive information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo store information of latched pages, and ensure that mtr_memo_slot_t::object is never a null pointer. The tree_blocks[] and tree_savepoints[] were redundant. buf_page_get_low(): If innodb_change_buffering_debug=1, to avoid a hang, do not try to evict blocks if we are holding a latch on a modified page. The test innodb.innodb-change-buffer-recovery will be removed, because change buffering may no longer be forced by debug injection when the change buffer comprises multiple pages. Remove a debug assertion that could fail when innodb_change_buffering_debug=1 fails to evict a page. For other cases, the assertion is redundant, because we already checked that right after the got_block: label. 
The test innodb.innodb-change-buffering-recovery will be removed, because due to this change, we will be unable to evict the desired page. mtr_t::lock_register(): Register a change of a page latch on an unmodified buffer-fixed block. mtr_t::x_latch_at_savepoint(), mtr_t::sx_latch_at_savepoint(): Replaced by the use of mtr_t::upgrade_buffer_fix(), which now also handles RW_S_LATCH. mtr_t::set_modified(): For temporary tables, invoke buf_page_t::set_modified() here and not in mtr_t::commit(). We will never set the MTR_MEMO_MODIFY flag on other than persistent data pages, nor set mtr_t::m_modifications when temporary data pages are modified. mtr_t::commit(): Only invoke the buf_flush_note_modification() loop if persistent data pages were modified. mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo. This avoids many redundant entries in mtr_t::m_memo, as well as redundant calls to buf_page_get_gen() for blocks that had already been looked up in a mini-transaction. btr_get_latched_root(): Return a pointer to an already latched root page. This replaces btr_root_block_get() in cases where the mini-transaction has already latched the root page. btr_page_get_parent(): Fetch a parent page that was already latched in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched(). If needed, upgrade the root page U latch to X. This avoids bloating mtr_t::m_memo as well as performing redundant buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for B-tree defragmentation, we will invoke btr_cur_search_to_nth_level(). btr_cur_search_to_nth_level(): This will only be used for non-leaf (level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be removed altogether, or retained for the case of CHECK TABLE without QUICK. btr_cur_t::left_block: Remove. btr_pcur_move_backward_from_page() can retrieve the left sibling from the end of mtr_t::m_memo. btr_cur_t::open_leaf(): Some clean-up. 
btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level() for searches to level=0 (the leaf level). We will never release parent page latches before acquiring leaf page latches. If we need to temporarily release the level=1 page latch in the BTR_SEARCH_PREV or BTR_MODIFY_PREV latch_mode, we will reposition the cursor on the child node pointer so that we will land on the correct leaf page. btr_cur_t::pessimistic_search_leaf(): Implement new BTR_MODIFY_TREE latching logic in the case that page splits or merges will be needed. The parent pages (and their siblings) should already be latched on the first dive to the leaf and be present in mtr_t::m_memo; there should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost suffices; it must be revised in MDEV-29835 and work-arounds removed for cases where mtr_t::get_already_latched() fails to find a block. rtr_search_to_nth_level(): A SPATIAL INDEX version of btr_search_to_nth_level() that can search to any level (including the leaf level). rtr_search_leaf(), rtr_insert_leaf(): Wrappers for rtr_search_to_nth_level(). rtr_search(): Replaces rtr_pcur_open(). rtr_latch_leaves(): Replaces btr_cur_latch_leaves(). Note that unlike in the B-tree code, there is no error handling in case the sibling pages are corrupted. rtr_cur_restore_position(): Remove an unused constant parameter. btr_pcur_open_on_user_rec(): Remove the constant parameter mode=PAGE_CUR_GE. row_ins_clust_index_entry_low(): Use a new mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC. BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove. BTR_CONT_MODIFY_TREE: Note that this is only used by rtr_search_to_nth_level(). btr_pcur_optimistic_latch_leaves(): Replaces btr_cur_optimistic_latch_leaves(). ibuf_delete_rec(): Acquire exclusive ibuf.index->lock in order to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV). 
btr_blob_log_check_t(): Acquire a U latch on the root page, so that btr_page_alloc() in btr_store_big_rec_extern_fields() will avoid a deadlock. btr_store_big_rec_extern_fields(): Assert that the root page latch is being held. Tested by: Matthias Leich Reviewed by: Vladislav Lesin	2023-01-24 14:09:21 +02:00
Marko Mäkelä	2b3423c462	Merge 10.3 into 10.4	2023-01-17 18:03:58 +02:00
Marko Mäkelä	489b556947	MDEV-30422 Merge new release of InnoDB 5.7.41 to 10.3 MySQL 5.7.41 includes one InnoDB change mysql/mysql-server@d2d6b2dd00 that seems to be applicable to MariaDB Server 10.3 and 10.4. Even though commit `5b9ee8d819` seems to have fixed sporadic failures on our CI systems, it is theoretically possible that another race condition remained. buf_flush_page_cleaner_coordinator(): In the final loop, wait also for buf_get_n_pending_read_ios() to reach 0. In this way, if a secondary index leaf page was read into the buffer pool and ibuf_merge_or_delete_for_page() modified that page or some change buffer pages, the flush loop would execute until the buffer pool really is in a clean state. This potential data corruption bug does not affect MariaDB Server 10.5 or later, thanks to commit `b42294bc64` which removed change buffer merges that are not explicitly requested.	2023-01-17 17:52:16 +02:00
Marko Mäkelä	a8a5c8a1b8	Merge 10.5 into 10.6	2022-12-13 16:58:58 +02:00
Marko Mäkelä	1dc2f35598	Merge 10.4 into 10.5	2022-12-13 14:39:18 +02:00
Marko Mäkelä	fdf43b5c78	Merge 10.3 into 10.4	2022-12-13 11:37:33 +02:00
Marko Mäkelä	15ab2e122d	MDEV-30132 Crash after recovery, with InnoDB: Tried to read ... os_file_read(): Merged with os_file_read_no_error_handling(). Crashing on a partial page read is as unhelpful as crashing on a corrupted page read (commit `0b47c126e3`). Report the file name if it is available via IORequest.	2022-11-30 10:54:03 +02:00
Marko Mäkelä	fdc582fd98	Merge 10.5 into 10.6	2022-11-28 12:20:17 +02:00
Marko Mäkelä	e0d672f30b	MDEV-30089 Metrics not incremented for 1st iteration in buf_LRU_free_from_common_LRU_list() In commit `a03dd94be8` as well as mysql/mysql-server@6ef8c34344 the iterations were changed so that the variable "scanned" would remain 0 when the first list item qualifies for eviction. buf_LRU_free_from_unzip_LRU_list(), buf_LRU_free_from_common_LRU_list(): Increment "scanned" when a block can be freed. buf_LRU_free_from_common_LRU_list(): Remove a redundant condition. Whenever this function is invoked, buf_pool.LRU should be nonempty, hence something should always be scanned. Thanks to Jean-François Gagné for reporting this.	2022-11-28 11:34:00 +02:00
Daniel Black	183ca823bb	MDEV-25417: Remove innodb buffer pool load throttling The very lightest of load would decimate any buffer pool loading to ~1 page per second. As seen in MDEV-29343, this resulted in a load taking over an hour on a high-end system. Since MDEV-26547 the fetching is asynchronous, however the loading has equal access to the IO as the SQL queries.	2022-11-28 11:25:47 +02:00
Marko Mäkelä	6d40274f65	Merge 10.5 into 10.6	2022-11-23 18:13:28 +02:00
Thirunarayanan Balathandayuthapani	71c93fb8fd	MDEV-28462 Race condition between instant alter and AHI access - InnoDB AHI tries to access the concurrent instant alter column, which leads to an ASAN failure. Instant alter column should acquire the clustered index search latch in exclusive mode before changing the table cache definition. - Removed the default parameter for the function btr_search_drop_page_hash_index() - Addressed the DWITH_INNODB_AHI=0 compilation failure by passing two parameters from all callers of btr_search_drop_page_hash_index()	2022-11-22 15:24:44 +05:30
Marko Mäkelä	4e5e8166b4	MDEV-19514 fixup: Fix recovery with innodb_change_buffering_debug=1 During crash recovery, recv_sys.apply(true) invokes mlog_init.mark_ibuf_exist(), which in turn may invoke recv_sys.apply(true) via the buf_flush_sync() call in buf_page_get_low(). The simplest fix is to disable the innodb_change_buffering_debug=1 instrumentation during crash recovery.	2022-11-21 17:55:35 +02:00
Marko Mäkelä	ae6ebafd81	Merge 10.5 into 10.6	2022-11-14 15:44:55 +02:00
Marko Mäkelä	e0e096faaa	MDEV-29982 Improve the InnoDB log overwrite error message The InnoDB write-ahead log ib_logfile0 is of fixed size, specified by innodb_log_file_size. If the tail of the log manages to overwrite the head (latest checkpoint) of the log, crash recovery will be broken. Let us clarify the messages about this, including adding a message on the completion of a log checkpoint that notes that the dangerous situation is over. To reproduce the dangerous scenario, we will introduce the debug injection label ib_log_checkpoint_avoid_hard, which will avoid log checkpoints even harder than the previous ib_log_checkpoint_avoid. log_t::overwrite_warned: The first known dangerous log sequence number. Set in log_close() and cleared in log_write_checkpoint_info(), which will output a "Crash recovery was broken" message.	2022-11-14 12:18:03 +02:00
Marko Mäkelä	2ac1edb1c3	Merge 10.5 into 10.6	2022-11-08 17:37:22 +02:00
Marko Mäkelä	a732d5e2ba	Merge 10.4 into 10.5	2022-11-08 17:01:28 +02:00
Marko Mäkelä	8fb176c3c1	MDEV-27121 fixup: mariabackup.mdev-14447,full_crc32	2022-11-08 16:59:36 +02:00
Marko Mäkelä	93b4f84ab2	Merge 10.3 into 10.4	2022-11-08 16:04:01 +02:00
Marko Mäkelä	eabb3b35d5	MDEV-27121 fixup: mariabackup.mdev-14447 fault injection	2022-11-08 08:53:49 +02:00
Marko Mäkelä	65d0c57c1a	Merge 10.3 into 10.4	2022-10-05 20:30:57 +03:00
Vlad Lesin	c0eda62aec	MDEV-27927 row_sel_try_search_shortcut_for_mysql() does not latch a page, violating read view isolation btr_search_guess_on_hash() would only acquire an index page latch if it is invoked with ahi_latch=NULL. If it's invoked from row_sel_try_search_shortcut_for_mysql() with ahi_latch!=NULL, a page will not be latched, and row_search_mvcc() will get a pointer to the record, which can be changed by some other transaction before the record is stored in the result buffer by the row_sel_store_mysql_rec() call. The ahi_latch argument of btr_cur_search_to_nth_level_func() and btr_pcur_open_with_no_init_func() is used only for row_sel_try_search_shortcut_for_mysql(). btr_cur_search_to_nth_level_func(..., ahi_latch !=0, ...) is invoked only from btr_pcur_open_with_no_init_func(..., ahi_latch !=0, ...), which, in turn, is invoked only from row_sel_try_search_shortcut_for_mysql(). I suppose that the separate case with ahi_latch!=0 was intentionally implemented to protect the row_sel_store_mysql_rec() call in row_search_mvcc() just after the row_sel_try_search_shortcut_for_mysql() call. After the ahi_latch was moved from row_search_mvcc() to row_sel_try_search_shortcut_for_mysql(), there is no need for it at all if btr_search_guess_on_hash() latches a page unconditionally. And if btr_search_guess_on_hash() latched the page, any access to the record in row_sel_try_search_shortcut_for_mysql() after the btr_pcur_open_with_no_init() call will be protected with the page latch. The fix is to remove the ahi_latch argument from btr_pcur_open_with_no_init_func(), btr_cur_search_to_nth_level_func() and btr_search_guess_on_hash(). There will be no test, as to test it we would need to freeze some SELECT execution at the point between the row_sel_try_search_shortcut_for_mysql() and row_sel_store_mysql_rec() calls in row_search_mvcc(), and change the record in some other transaction to let row_sel_store_mysql_rec() store the changed record in the result buffer. 
But we can't do this with the fix, as the page will be latched in the btr_search_guess_on_hash() call.	2022-10-05 17:35:21 +03:00
Marko Mäkelä	0c0a569028	Merge 10.3 into 10.4	2022-09-20 12:38:25 +03:00
Marko Mäkelä	c22dff21a5	InnoDB cleanup: Replace UNIV_LINUX, UNIV_SOLARIS, UNIV_AIX Let us use the normal platform-specific preprocessor symbols __linux__, __sun__, _AIX instead of some homebrew ones. The preprocessor symbol UNIV_HPUX must have lost its meaning by `f6deb00a56` (note: the symbol UNIV_HPUX10 is being checked for, but only UNIV_HPUX is defined).	2022-09-19 12:20:53 +03:00
Marko Mäkelä	9203249987	MDEV-27983: InnoDB hangs after loading a ROW_FORMAT=COMPRESSED page If multiple threads invoke buf_page_get_low() on a ROW_FORMAT=COMPRESSED page that does not reside in the buffer pool, then one of the threads will end up acquiring an exclusive page latch (the "if" statement right before the new wait_for_unzip: label) and other threads will end up waiting for a shared latch while holding a buffer-fix. The exclusive latch holder would then wait for the buffer-fixes to be released while the buffer-fix holders are waiting for the shared latch. buf_page_get_low(): Prevent the hang that was introduced in commit `9436c778c3` (MDEV-27058), by releasing the buffer-fix, sleeping some time, and retrying the page lookup.	2022-08-31 17:52:23 +03:00
Marko Mäkelä	bdf62ece6c	MDEV-29374 InnoDB recovery fails with "Data structure corruption" recv_sys_t::free_corrupted_page(): Identify the corrupted page in an error or warning message. buf_page_free(): Just in case, register the page as modified. This should already have been done in mtr_t::free() as part of fseg_free_page_low(). mtr_t::memo_push(): Simplify a condition, so that when invoked with MTR_MEMO_PAGE_X_MODIFY, we will do the right thing. fseg_free_page_low(): Remove an accidentally added return statement that prevented mtr_t::free() from being called. This fixes a regression that was introduced in commit `0b47c126e3` (MDEV-13542).	2022-08-31 17:52:16 +03:00
Marko Mäkelä	76bb671e42	Merge 10.5 into 10.6	2022-08-25 16:02:44 +03:00
Marko Mäkelä	9929301ecd	Merge 10.4 into 10.5	2022-08-25 15:31:19 +03:00
Marko Mäkelä	851058a3e6	Merge 10.3 into 10.4	2022-08-25 15:17:20 +03:00
Marko Mäkelä	d1a80c42ee	MDEV-29384 Hangs caused by innodb_adaptive_hash_index=ON buf_defer_drop_ahi(): Remove. Ever since commit `c7f8cfc9e7` (MDEV-27700) it is safe to invoke btr_search_drop_page_hash_index(block, true) to remove an orphan adaptive hash index. Any attempt to upgrade page latches is prone to deadlocks. Recently, we observed a few hangs that involved nothing more than a small table consisting of one clustered index page, one secondary index page and some undo pages.	2022-08-25 15:14:38 +03:00
Marko Mäkelä	fbb2b1f55f	Merge 10.5 into 10.6	2022-08-23 08:47:21 +03:00
Marko Mäkelä	3b656ac8c1	Merge 10.4 into 10.5	2022-08-22 19:49:56 +03:00
Marko Mäkelä	b68ae6dc1d	Merge 10.3 into 10.4	2022-08-22 16:22:09 +03:00
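
Commit 27ff972be2 in the list above describes releasing and reacquiring buf_pool.mutex after every 32 evicted pages so that a long batch does not hog the mutex. The following is a minimal C++ sketch of that general pattern, not the InnoDB code: ToyPool, evict_clean_pages() and the yields counter are hypothetical names introduced only for illustration, and std::mutex stands in for buf_pool.mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>

// Hypothetical stand-in for a buffer pool; this is NOT InnoDB's buf_pool_t.
struct ToyPool {
  std::mutex mutex;        // plays the role of buf_pool.mutex
  std::list<int> lru;      // clean pages, least recently used first
  std::size_t yields = 0;  // how often the mutex was released mid-batch
};

// Evict up to max_pages clean pages from the front of the LRU list.
// After every 32 evictions the mutex is released and reacquired, so
// that other threads waiting for it get a chance to run.
std::size_t evict_clean_pages(ToyPool &pool, std::size_t max_pages) {
  constexpr std::size_t yield_interval = 32;  // value taken from the commit message
  std::size_t evicted = 0;
  std::unique_lock<std::mutex> lock(pool.mutex);
  while (evicted < max_pages && !pool.lru.empty()) {
    pool.lru.pop_front();
    ++evicted;
    if (evicted % yield_interval == 0) {
      ++pool.yields;  // updated while still holding the mutex
      lock.unlock();  // let waiters acquire the mutex
      lock.lock();    // then continue the batch
    }
  }
  return evicted;
}
```

Because the list may change while the mutex is released, the loop rereads the list head on every iteration instead of caching an iterator across the unlock.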


1345 commits
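
Commit e0e096faaa in the list above explains that the fixed-size write-ahead log becomes unsafe for crash recovery once its tail overwrites the latest checkpoint, and that log_t::overwrite_warned records the first known dangerous log sequence number until a checkpoint clears it. The sketch below illustrates that bookkeeping under simplifying assumptions: ToyLog, append() and checkpoint() are hypothetical names, the capacity check ignores file headers and other details of the real log format, and a checkpoint is assumed to advance to the current LSN.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a fixed-size circular log; NOT InnoDB's log_t.
struct ToyLog {
  std::uint64_t capacity;              // usable bytes in the log file
  std::uint64_t checkpoint_lsn = 0;    // head: start of the latest checkpoint
  std::uint64_t lsn = 0;               // tail: end of the written log
  std::uint64_t overwrite_warned = 0;  // first dangerous LSN, 0 if none

  explicit ToyLog(std::uint64_t cap) : capacity(cap) {}

  // Append bytes to the log. If the tail has moved more than one full
  // capacity past the checkpoint, remember the first LSN at which that
  // was noticed.
  void append(std::uint64_t bytes) {
    lsn += bytes;
    if (overwrite_warned == 0 && lsn - checkpoint_lsn > capacity)
      overwrite_warned = lsn;
  }

  // Complete a checkpoint at the current LSN. Returns true if a
  // dangerous situation had been recorded and is now over.
  bool checkpoint() {
    checkpoint_lsn = lsn;
    const bool was_dangerous = overwrite_warned != 0;
    overwrite_warned = 0;
    return was_dangerous;
  }
};
```

Keeping only the first dangerous LSN, rather than a flag, lets a message report where the overwrite began.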