mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-28 01:34:17 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	0076eb3d4e	Merge 10.5 into 10.6	2024-06-24 13:09:47 +03:00
Marko Mäkelä	acc077ffa1	MDEV-34443 ha_innobase::info_low() does not distinguish HA_STATUS_VARIABLE_EXTRA ha_innobase::info_low(): For HA_STATUS_VARIABLE without HA_STATUS_VARIABLE_EXTRA, let us avoid unnecessary and costly updates of the data_free statistics, which are only needed for SHOW TABLE STATUS. This optimization had been enabled in commit `247ecb7597` but not utilized until now.	2024-06-24 10:39:13 +03:00
Vladislav Vaintroub	7c5fdc9b6a	MDEV-33748 get rid of pthread_(get_/set_)specific, use thread_local Apart from better performance when accessing thread local variables, we'll get rid of things that depend on initialization/cleanup of pthread_key_t variables. Where appropriate, use compiler-dependent pre-C++11 thread-local equivalents, where it makes sense, to avoid initialization check overhead that non-static thread_local can suffer from.	2024-06-21 13:46:41 +02:00
Dave Gosselin	db0c28eff8	MDEV-33746 Supply missing override markings Find and fix missing virtual override markings. Updates cmake maintainer flags to include -Wsuggest-override and -Winconsistent-missing-override.	2024-06-20 11:32:13 -04:00
Vlad Lesin	0a199cb810	MDEV-34108 Inappropriate semi-consistent read in RC if innodb_snapshot_isolation=ON The fixes in `b8a6719889` have not disabled semi-consistent read for innodb_snapshot_isolation=ON mode, they just allowed read uncommitted version of a record, that's why the test for MDEV-26643 worked well. The semi-consistent read should be disabled on upper level in row_search_mvcc() for READ COMMITTED isolation level. Reviewed by Marko Mäkelä.	2024-06-20 16:11:54 +03:00
Thirunarayanan Balathandayuthapani	ab448d4b34	MDEV-34389 Avoid log overwrite in early recovery - InnoDB tries to write FILE_CHECKPOINT marker during early recovery when log file size is insufficient. While updating the log checkpoint at the end of the recovery, InnoDB must already have written out all pending changes to the persistent files. To complete the checkpoint, InnoDB has to write some log records for the checkpoint and to update the checkpoint header. If the server gets killed before updating the checkpoint header then it would lead the logfile to be unrecoverable. - This patch avoids FILE_CHECKPOINT marker during early recovery and narrows down the window of opportunity to make the log file unrecoverable.	2024-06-20 17:54:57 +05:30
Iaroslav Babanin	5d49a2add7	MDEV-33935 fix deadlock counter - The deadlock counter was moved from Deadlock::find_cycle into Deadlock::report, because the find_cycle method is called multiple times during deadlock detection flow, which means it shouldn't have such side effects. But report() can, which called only once for a victim transaction. - Also the deadlock_detect.test and *.result test case has been extended to handle the fix.	2024-06-19 20:43:33 +03:00
Jan Lindström	ee974ca5e0	MDEV-31658 : Deadlock found when trying to get lock during applying Problem was that there was two non-conflicting local idle transactions in node_1 that both inserted a key to primary key. Then two transactions from other nodes inserted also a key to primary key so that insert from node_2 conflicted one of the local transactions in node_1 so that there would be duplicate key if both are committed. For this insert from other node tries to acquire S-lock for this record and because this insert is high priority brute force (BF) transaction it will kill idle local transaction. Concurrently, second insert from node_3 conflicts the second idle insert transaction in node_1. Again, it tries to acquire S-lock for this record and kills idle local transaction. At this point we have two non-conflicting high priority transactions holding S-lock on different records in node_1. For example like this: rec s-lock-node2-rec s-lock-node3-rec rec. Because these high priority BF-transactions do not wait each other insert from node3 that has later seqno compared to insert from node2 can continue. It will try to acquire insert intention for record it tries to insert (to avoid duplicate key to be inserted by local transaction). Hower, it will note that there is conflicting S-lock in same gap between records. This will lead deadlock error as we have defined that BF-transactions may not wait for record lock but we can't kill conflicting BF-transaction because it has lower seqno and it should commit first. BF-transactions are executed concurrently because their values to primary key are different i.e. they do not conflict. Galera certification will make sure that inserts from other nodes i.e these high priority BF-transactions can't insert duplicate keys. Local transactions naturally can but they will be killed when BF-transaction acquires required record locks. Therefore, we can allow situation where there is conflicting S-lock and insert intention lock regardless of their seqno order and let both continue with no wait. This will lead to situation where we need to allow BF-transaction to wait when lock_rec_has_to_wait_in_queue is called because this function is also called from lock_rec_queue_validate and because lock is waiting there would be assertion in ut_a(lock->is_gap() \|\| lock_rec_has_to_wait_in_queue(cell, lock)); lock_wait_wsrep_kill Add debug sync points for BF-transactions killing local transaction. wsrep_assert_no_bf_bf_wait Print also requested lock information lock_rec_has_to_wait Add function to handle wsrep transaction lock wait cases. lock_rec_has_to_wait_wsrep New function to handle wsrep transaction lock wait exceptions. lock_rec_has_to_wait_in_queue Remove wsrep exception, in this function all conflicting locks need to wait in queue. Conflicts between BF and local transactions are handled in lock_wait. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-06-19 14:09:11 +02:00
Marko Mäkelä	f9e717cb48	MDEV-34426: Assertion failure on bootstrap Let us relax an assertion that would fail in ./mtr main.1st --mysqld=--innodb-undo-tablespaces=127	2024-06-19 15:08:19 +03:00
Marko Mäkelä	34813c1aa0	Merge 10.6 into 10.11	2024-06-19 15:04:07 +03:00
Marko Mäkelä	5b26a07698	MDEV-34178: Enable spinloop for index_lock In an I/O bound concurrent INSERT test conducted by Mark Callaghan, spin loops on dict_index_t::lock turn out to be beneficial. This is a mixed bag; enabling the spin loops will improve throughput and latency on some workloads and degrade in others. Reviewed by: Debarun Banerjee Tested by: Matthias Leich Performance tested by: Axel Schwenke	2024-06-19 13:41:11 +03:00
Marko Mäkelä	f8d213bd0e	MDEV-34178: Improve the spin loops srw_mutex_impl<spinloop>::wait_and_lock(): Invoke srw_pause() and reload the lock word on each loop. Thanks to Mark Callaghan for suggesting this. ssux_lock_impl<spinloop>::rd_wait(): Actually implement a spin loop on the rw-lock component without blocking on the mutex component. If there is a conflict with wr_lock(), wait for writer.lock to be released without actually acquiring it. Reviewed by: Debarun Banerjee Tested by: Matthias Leich	2024-06-19 13:40:57 +03:00
Marko Mäkelä	6cde03aedc	MDEV-34178: Improve PERFORMANCE_SCHEMA instrumentation When MariaDB is built with PERFORMANCE_SCHEMA support enabled and with futex-based rw-locks (not srw_lock_), we were unnecessarily releasing and reacquiring lock.writer in srw_lock_impl::psi_wr_lock() and ssux_lock::psi_wr_lock(). If there is a conflict with rd_lock(), let us hold the lock.writer and execute u_wr_upgrade() to wait for rd_unlock(). Reviewed by: Debarun Banerjee Tested by: Matthias Leich	2024-06-19 13:30:23 +03:00
Marko Mäkelä	2bd661ca10	MDEV-34178: Simplify the U lock The U lock mode of the sux_lock that was introduced in commit `03ca6495df` (MDEV-24142) is unnecessarily complex. Internally, sux_lock comprises two parts, each with their own wait queue inside the operating system kernel: a mutex and a rw-lock. We can map the operations as follows: x_lock(): (X,X) u_lock(): (X,_) s_lock(): (_,S) The Update lock mode, which is mutually exclusive with itself and with X (exclusive) locks but not with shared (S) locks, was unnecessarily acquiring a shared lock on the second component. The mutual exclusion is guaranteed by the first component. We might simplify the #ifdef SUX_LOCK_GENERIC case further by omitting srw_mutex_impl::lock, because it is kind-of duplicating the mutex that we will use for having a wait queue. However, the predicate buf_page_t::can_relocate() would depend on the predicate is_locked_or_waiting(), which is not available for pthread_mutex_t. Reviewed by: Debarun Banerjee Tested by: Matthias Leich	2024-06-18 18:22:28 +03:00
Alexander Barkov	c4bf4ce948	Merge remote-tracking branch 'origin/11.2' into 11.4	2024-06-17 15:46:39 +04:00
Marko Mäkelä	a21e49cbcc	Merge 11.1 into 11.2	2024-06-17 12:02:03 +03:00
Marko Mäkelä	d34289a3e2	Merge 10.11 into 11.1	2024-06-17 09:21:50 +03:00
Marko Mäkelä	346a0c1402	Merge 10.6 into 10.11	2024-06-17 09:08:07 +03:00
Marko Mäkelä	e60acae655	Merge 10.5 into 10.6	2024-06-17 08:40:07 +03:00
Marko Mäkelä	4b4c371fe7	MDEV-34297 fixup: -Wconversion on 32-bit	2024-06-14 13:21:19 +03:00
Thirunarayanan Balathandayuthapani	3271588bb7	MDEV-34381 During innodb_undo_truncate=ON recovery, InnoDB may fail to shrink undo* files - During recovery, InnoDB may fail to shrink the undo tablespaces when there are no pages to recover while applying the redo log. This issue exists only when innodb_undo_truncate is enabled. trx_lists_init_at_db_start() could've applied the redo logs for undo tablespace page0.	2024-06-14 12:46:02 +05:30
Marko Mäkelä	5b89cab44f	Merge 10.6 into 10.11	2024-06-13 08:16:49 +03:00
Thirunarayanan Balathandayuthapani	b40f9d3d5c	MDEV-34374 Shrinking tablespace logic fails to handle error condition - InnoDB ignores the error while traversing the used extents during shrinking process. Made changes in fsp_traverse_extents() to handle error condition correctly	2024-06-12 17:36:12 +05:30
Brandon Nesterenko	d3a7e46bb4	MDEV-34365: UBSAN runtime error: call to function io_callback(tpool::aiocb) On an UBSAN clang-15 build, if running with UBSAN option halt_on_error=1 (the issue doesn't show up without it), MTR fails during mysqld --bootstrap with UBSAN error: call to function io_callback(tpool::aiocb) through pointer to incorrect function type 'void ()(void )' This patch corrects the parameter type of io_callback to match its expected type defined by callback_func, i.e. (void*). Reviewed By: ============ <TODO>	2024-06-12 08:39:41 +03:00
Marko Mäkelä	fc9005adc4	Merge 10.5 into 10.6	2024-06-12 07:51:28 +03:00
Thirunarayanan Balathandayuthapani	5b39ded713	MDEV-34156 InnoDB fails to apply the redo log for compressed tablespace Problem: ======= During recovery, InnoDB fails to apply the redo log for compressed tablespace. The reason is that InnoDB assumes that pages has been freed while applying the redo log for it. During multiple scan of redo logs, InnoDB stores the freed page information when it have sufficient buffer pool pages. Once it ran out of memory, InnoDB doesn't store freed page information. But InnoDB assigns the freed page ranges to tablespace in recv_init_crash_recovery_spaces() even though InnoDB doesn't have complete freed range information. While applying the redo log, InnoDB wrongly assumes that page has been freed and it could lead to corruption of tablespace. This issue is caused by commit `941af1fa58` (MDEV-31803) and commit `2f9e264781` (MDEV-29911). Solution: ======== During recovery, set recovery size and freed page information for all tablespace irrespective of memory.	2024-06-11 16:14:36 +05:30
Marko Mäkelä	b81d717387	Merge 10.6 into 10.11	2024-06-11 12:50:10 +03:00
Yuchen Pei	d524cb5b3d	MDEV-34002 Initialise fields in spider_db_handler Otherwise it may result in nonsensical values like 190 for a boolean.	2024-06-11 09:18:42 +10:00
Marko Mäkelä	27834ebc91	Merge 10.5 into 10.6	2024-06-10 15:22:15 +03:00
Marko Mäkelä	a2bd936c52	MDEV-33161 Function pointer signature mismatch in LF_HASH In cmake -DWITH_UBSAN=ON builds with clang but not with GCC, -fsanitize=undefined will flag several runtime errors on function pointer mismatch related to the lock-free hash table LF_HASH. Let us use matching function signatures and remove function pointer casts in order to avoid potential bugs due to undefined behaviour. These errors could be caught at compilation time by -Wcast-function-type-strict, which is available starting with clang-16, but not available in any version of GCC as of now. The old GCC flag -Wcast-function-type is enabled as part of -Wextra, but it specifically does not catch these errors. Reviewed by: Vladislav Vaintroub	2024-06-10 12:35:33 +03:00
Brandon Nesterenko	bf0aa99aeb	MDEV-34237: On Startup: UBSAN: runtime error: call to function MDL_lock::lf_hash_initializer lf_hash_insert through pointer to incorrect function type 'void ()(st_lf_hash , void , const void )' A few different incorrect function type UBSAN issues have been grouped into this patch. The only real potentially undefined behavior is an error about show_func_mutex_instances_lost, which when invoked in sql_show.cc::show_status_array(), puts 5 arguments onto the stack; however, the implementing function only actually has 3 parameters (so only 3 would be popped). This was fixed by adding in the remaining parameters to satisfy the type mysql_show_var_func. The rest of the findings are pointer type mismatches that wouldn't lead to actual undefined behavior. The lf_hash_initializer function type definition is typedef void (lf_hash_initializer)(LF_HASH hash, void dst, const void src); but the MDL_lock and table cache's implementations of this function do not have that signature. The MDL_lock has specific MDL object parameters: static void lf_hash_initializer(LF_HASH hash __attribute__((unused)), MDL_lock lock, MDL_key key_arg) and the table cache has specific TDC parameters: static void tdc_hash_initializer(LF_HASH , TDC_element element, LEX_STRING key) leading to UBSAN runtime errors when invoking these functions. This patch fixes these type mis-matches by changing the implementing functions to use void * and const void * for their respective parameters, and later casting them to their expected type in the function body. Note too the functions tdc_hash_key and tc_purge_callback had a similar problem to tdc_hash_initializer and was fixed similarly. Reviewed By: ============ Sergei Golubchik <serg@mariadb.com>	2024-06-08 19:59:59 -06:00
Thirunarayanan Balathandayuthapani	4b4dbb23ea	MDEV-34169 Don't allow innodb_open_files to be lesser than number of non-user tablespace. fil_space_t::try_to_close(): Don't try to close the tablespace which is acquired by the caller of the function Added the suppression message in open_files_limit test case	2024-06-07 20:50:39 +05:30
Thirunarayanan Balathandayuthapani	b7a75fbb8a	MDEV-34169 Don't allow innodb_open_files to be lesser than number of non-user tablespace. - InnoDB only closes the user tablespace when the number of open files exceeds innodb_open_files limit. In that case, InnoDB should make sure that innodb_open_files value should be greater than number of undo tablespace, system and temporary tablespace files.	2024-06-07 15:37:11 +05:30
Marko Mäkelä	a687cf8661	Merge 10.5 into 10.6	2024-06-07 10:03:51 +03:00
Thirunarayanan Balathandayuthapani	a02773f7c0	MDEV-34057 Inconsistent FTS state in concurrent scenarios Problem: ======= - This commit is a merge of mysql commit 129ee47ef994652081a11ee9040c0488e5275b14. InnoDB FTS can be in inconsistent state when sync operation terminates the server before committing the operation. This could lead to incorrect synced doc id and incorrect query results. Solution: ======== - During sync commit operation, InnoDB should pass the sync transaction to update the max doc id in the config table. fts_read_synced_doc_id() : This function is used to read only synced doc id from the config table.	2024-06-06 19:09:13 +05:30
Marko Mäkelä	699d38d951	MDEV-34296 extern thread_local is a CPU waste In commit 99bd22605938c42d876194f2ec75b32e658f00f5 (MDEV-31558) we wrongly thought that there would be minimal overhead for accessing a thread-local variable mariadb_stats. It turns out that in C++11, each access to an extern thread_local variable requires conditionally invoking an initialization function. In fact, the initializer expression of mariadb_stats is dynamic, and those calls were actually unavoidable. In C++20, one could declare constinit thread_local variables, but the address of a thread_local variable (&mariadb_dummy_stats) is not a compile-time constant. We did not want to declare mariadb_dummy_stats without thread_local, because then the dummy accesses could lead to cache line contention between threads. mariadb_stats: Declare as __thread or __declspec(thread) so that there will be no dynamic initialization, but zero-initialization. mariadb_dummy_stats: Remove. It is a lesser evil to let the environment perform zero-initialization and check if !mariadb_stats. Reviewed by: Sergei Petrunia	2024-06-06 14:38:42 +03:00
Marko Mäkelä	9fac857f26	MDEV-34283 A misplaced btr_cur_need_opposite_intention() check may fail to prevent hangs btr_cur_t::search_leaf(): Invoke btr_cur_need_opposite_intention() after positioning page_cur.rec so that the record will be in the intended page. This is something that was broken in commit `f2096478d5` or commit `de4030e4d4` or related changes. btr_cur_need_opposite_intention(): Add a debug assertion that would catch the misuse. The "next line of defence" that should have caught this bug in debug builds are assertions that mtr_t::m_memo contains MTR_MEMO_X_LOCK for the dict_index_t::lock. When btr_cur_need_opposite_intention() holds, we should escalate to acquiring an exclusive index->lock in btr_cur_t::pessimistic_search_leaf(). Reviewed by: Debarun Banerjee	2024-06-06 13:03:34 +03:00
Marko Mäkelä	bc3660925d	MDEV-34307 On startup, [FATAL] InnoDB: Page ... still fixed or dirty buf_pool_invalidate(): Properly wait for os_aio_wait_until_no_pending_writes() to ensure so that there are no pending buf_page_t::write_complete() or buf_page_write_complete() operations. This will avoid a failure of buf_pool.assert_all_freed(). This bug should affect debug builds only. At this point, the buf_pool.flush_list should be clear and all changes should have been written out. The loop around buf_LRU_scan_and_free_block() should have eventually completed and freed all pages as soon as buf_page_t::write_complete() had a chance to release the page latches. It is worth noting that buf_flush_wait() is working as intended. As soon as buf_flush_page_cleaner() invokes buf_pool.get_oldest_modification() it will observe that buf_page_t::write_complete() had assigned oldest_modification_ to 1, and remove such blocks from buf_pool.flush_list. Upon reaching buf_pool.flush_list.count=0 the buf_flush_page_cleaner() will mark itself idle and wake buf_flush_wait() by broadcasting buf_pool.done_flush_list. This regression was introduced in commit `a55b951e60` (MDEV-26827). Reviewed by: Debarun Banerjee	2024-06-06 10:18:42 +03:00
Vladislav Vaintroub	ce9efb4e02	MDEV-34296 tpool - declare thread_local_waiter "static thread_local"	2024-06-05 16:55:11 +02:00
mariadb-DebarunBanerjee	b12c14e3b4	MDEV-34265 Possible hang during IO burst with innodb_flush_sync enabled When checkpoint age goes beyond the sync flush threshold and buf_flush_sync_lsn is set, page cleaner enters into "furious flush" stage to aggressively flush dirty pages from flush list and pull checkpoint LSN above safe margin. In this stage, page cleaner skips doing LRU flush and eviction. In 10.6, all other threads entirely rely on page cleaner to generate free pages. If free pages get over while page cleaner is busy in "furious flush" stage, a session thread could wait for free page in the middle of a min-transaction(mtr) while holding latches on other pages. It, in turn, can prevent page cleaner to flush such pages preventing checkpoint LSN to move forward creating a deadlock situation. Even otherwise, it could create a stall and hang like situation for large BP with plenty of dirty pages to flush before the stage could finish. Fix: During furious flush, check and evict LRU pages after each flush iteration.	2024-06-05 18:11:29 +05:30
Vladislav Vaintroub	40abd973ab	MDEV-34236 Mroonga build with ASAN/UBSAN with GCC 12+ extremely slow. Workaround by disabling sanitizer for single source file.	2024-06-05 11:58:53 +02:00
Monty	38cbef8b3f	MDEV-22935 Erroneous Aria Index / Optimizer behaviour The problem was in the Aria part of the range optimizer, maria_records_in_range(), which wrong concluded that there was no rows in the range. This error would happen in the unlikely case when searching for a range on a partial key and there was a match for the first key part in the upper part of the b-tree (node) and also a match in the underlying node page. In other words, for this bug to happen one have to use Aria, have a multi part key with a lot of identical values for the first key part and do a range search on the second part of the key. Fixed by ensuring that we do not stop searching for partial keys found on node. Other things: - Added some comments - Changed a variable name to more clearly explain it's purpose. - Fixed wrong cast in _ma_record_pos() that could cause problems on 32 bit systems.	2024-06-05 10:29:49 +03:00
Marko Mäkelä	c6d36c3e7c	MDEV-34297 get_rnd_value() of ib_counter_t is unnecessarily complex The shared counter template ib_counter_t uses the function my_timer_cycles() as a source of pseudo-random numbers to pick a shard. On some platforms, my_timer_cycles() could return the constant value 0. get_rnd_value(): Remove. my_pseudo_random(): Implement as an alias of my_timer_cycles() or a wrapper for pthread_self(). Reviewed by: Vladislav Vaintroub	2024-06-05 09:54:14 +03:00
Yuchen Pei	042a0d85ad	MDEV-27186 spider/partition: Report error on info() failure Like MDEV-28105, spider may attempt to connect to remote server in info(), and it may emit an error upon failure to connect. In this case, the downstream caller ha_partition::open() should return the error to avoid inconsistency. This fixes MDEV-27186, MDEV-27237, MDEV-27334, MDEV-28241, MDEV-34101.	2024-06-05 10:13:30 +10:00
Yuchen Pei	581712b989	MDEV-33490 MENT-1504 Fix some english strings in spider.	2024-06-04 12:25:08 +10:00
Thirunarayanan Balathandayuthapani	58a0e1e3dd	MDEV-34223 Innodb - add status variable for number of bulk inserts - Added a counter innodb_num_bulk_insert_operation in INFORMATION_SCHEMA.GLOBAL_STATUS. This counter is incremented whenever a InnoDB undergoes bulk insert operation. - Change the innodb_instant_alter_column to atomic variable.	2024-06-03 16:27:22 +05:30
Yuchen Pei	2d3e2c58b6	Merge branch '10.11' into 11.1	2024-05-31 10:54:31 +10:00
Yuchen Pei	22774ddb53	Merge branch '10.6' into 10.11	2024-05-31 09:13:40 +10:00
Yuchen Pei	f2302a62e3	Merge branch '10.5' into 10.6	2024-05-31 09:10:17 +10:00
Yuchen Pei	25476ba1ae	MDEV-29027 ASAN errors in spider_db_free_result after partition DDL Spider calls ha_spider::close() at least twice on ALTER TABLE ... ADD PARTITION. The first call frees wide_handler and the second call accesses wide_handler->trx->thd (heap-use-after-free). In general, there seems to be no problem with using THD obtained by the macro current_thd() except in background threads. Thus, we simply replace wide_handler->trx->thd with current_thd(). Original author: Nayuta Yanagasawa	2024-05-31 09:06:55 +10:00

1 2 3 4 5 ...

28041 commits