mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	3a3a4f044f	Merge 10.4 into 10.5	2024-01-03 12:07:51 +02:00
Marko Mäkelä	96130b1898	MDEV-33157 WSREP: Fix function pointer mismatch wsrep_plugin_init(), wsrep_plugin_deinit(): Remove these dummy functions in order to fix an error that would be flagged by cmake -DWITH_UBSAN=ON when using clang. wsrep_show_ready(), wsrep_show_bf_aborts(): Correct the signature.	2024-01-03 08:52:50 +02:00
Teemu Ollakka	3f59bbeeae	MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:39:43 +02:00
Teemu Ollakka	6966d7fe4b	MDEV-29293 MariaDB stuck on starting commit state This is a backport from 10.5. The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:33:37 +02:00
Oleksandr Byelkin	10e135b679	Merge branch 'bb-10.4-release' into bb-10.5-release	2023-05-02 15:47:10 +02:00
Daniele Sciascia	ef227762b1	MDEV-30838 Assertion `m_thd == _current_thd()' - Update wsrep-lib which contains fix for the assertion - Fix error handling for appending fragment to streaming log, make sure tables are closed after rollback. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-02 03:42:39 +02:00
Jan Lindström	179c283372	Merge branch 10.4 into 10.5	2023-01-14 08:25:57 +02:00
sjaakola	66c05326d2	MDEV-29684 Fixes for cluster wide write conflict resolving Cluster conflict victim's THD is marked with wsrep_aborter. THD::wsrep_aorter holds the thread ID of the hight priority tread, which is currently carrying out BF aborting for this victim. However, the BF abort operation is not always successful, and in such case the wsrep_aborter mark should be removed. In the old code, this wsrep_aborter resetting did not happen, and this could lead to a situation where the sticky wsrep_aborter mark prevents any further attempt to BF abort this transaction. This commit fixes this issue, and resets wsrep_aborter after unsuccesful BF abort attempt. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-13 13:11:03 +02:00
Jan Lindström	8c5d323326	Additional fixes * galera_many_rows : reduce the time used * wsrep_thd.cc : remove incorrect assertion * disabled.def : disable failing test cases	2022-10-25 12:58:10 +03:00
Jan Lindström	5fffdbc8d5	Fixes after 10.4 --> 10.5 merge * MDEV-29142 : Ignore inconsistency warning as we kill cluster * galera_parallel_apply_3nodes : Disabled because it is unstable * MDEV-26597 : Add missing code * galera_sr.galera_sr_ws_size2 : Remove incorrect assertion	2022-10-12 12:11:28 +03:00
Marko Mäkelä	977c385df3	Merge 10.4 into 10.5	2022-10-12 11:29:32 +03:00
Sergei Golubchik	2aab7f2d0a	MDEV-26597 post-fix: cannot add new error messages in 10.4 followup for `e8acec8974`	2022-10-11 16:20:22 +02:00
Jan Lindström	e8acec8974	MDEV-26597 : Assertion `!wsrep_has_changes(thd) \|\| (thd->lex->sql_command == SQLCOM_CREATE_TABLE && !thd->is_current_stmt_binlog_format_row())' failed. If repl.max_ws_size is set too low following CREATE TABLE could fail during commit. In this case wsrep_commit_empty should allow rolling it back if provider state is s_aborted. Furhermore, original ER_ERROR_DURING_COMMIT does not really tell anything clear for user. Therefore, this commit adds a new error ER_TOO_BIG_WRITESET. This will change some test cases output.	2022-10-09 10:09:47 +03:00
Julius Goryavsky	55bb933a88	Merge branch 10.4 into 10.5	2021-12-26 12:51:04 +01:00
sjaakola	c1846c4fcf	MDEV-26803 PA unsafety with FK cascade delete operation This commit has a mtr test where two two transactions delete a row from two separate tables, which will cascade a FK delete for the same row in a third table. Second replica node is configured with 2 applier threads, and the test will fail if these two transactions are applied in parallel. The actual fix, in this commit, is to mark a transaction as unsafe for parallel applying when it traverses into cascade delete operation. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-12-17 09:38:23 +02:00
Marko Mäkelä	87ff4ba7c8	Merge 10.4 into 10.5	2021-08-26 08:46:57 +03:00
Marko Mäkelä	15b691b7bd	After-merge fix `f84e28c119` In a rebase of the merge, two preceding commits were accidentally reverted: commit `112b23969a` (MDEV-26308) commit `ac2857a5fb` (MDEV-25717) Thanks to Daniele Sciascia for noticing this.	2021-08-25 17:35:44 +03:00
Marko Mäkelä	f84e28c119	Merge 10.3 into 10.4	2021-08-18 16:51:52 +03:00
Daniele Sciascia	ac2857a5fb	MDEV-25717 Assertion `owning_thread_id_ == wsrep::this_thread::get_id()' A test case to reproduce the issue. The actual fix is in galera library. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-08-18 12:28:11 +03:00
Leandro Pacheco	112b23969a	MDEV-26308 : Galera test failure on galera.galera_split_brain Contains following fixes: * allow TOI commands to timeout while trying to acquire TOI with override lock_wait_timeout with a LONG_TIMEOUT only after succesfully entering TOI * only ignore lock_wait_timeout on TOI * fix galera_split_brain test as TOI operation now returns ER_LOCK_WAIT_TIMEOUT after lock_wait_timeout * explicitly test for TOI Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-08-18 08:57:33 +03:00
Jan Lindström	224c950462	MDEV-23101 : SIGSEGV in lock_rec_unlock() when Galera is enabled Remove incorrect BF (brute force) handling from lock_rec_has_to_wait_in_queue and move condition to correct callers. Add a function to report BF lock waits and assert if incorrect BF-BF lock wait happens. wsrep_report_bf_lock_wait Add a new function to report BF lock wait. wsrep_assert_no_bf_bf_wait Add a new function to check do we have a BF-BF wait and if we have report this case and assert as it is a bug. lock_rec_has_to_wait Use new wsrep_assert_bf_wait to check BF-BF wait. lock_rec_create_low lock_table_create Use new function to report BF lock waits. lock_rec_insert_by_trx_age lock_grant_and_move_on_page lock_grant_and_move_on_rec Assert that trx is not Galera as VATS is not compatible with Galera. lock_rec_add_to_queue If there is conflicting lock in a queue make sure that transaction is BF. lock_rec_has_to_wait_in_queue Remove incorrect BF handling. If there is conflicting locks in a queue all transactions must wait. lock_rec_dequeue_from_page lock_rec_unlock If there is conflicting lock make sure it is not BF-BF case. lock_rec_queue_validate Add Galera record locking rules comment and use new function to report BF lock waits. All attempts to reproduce the original assertion have been failed. Therefore, there is no test case on this commit.	2020-09-10 13:18:12 +03:00
Julius Goryavsky	5f55f69e4a	Merge 10.1 into 10.2	2020-06-05 18:32:37 +02:00
sjaakola	8ec0e9111a	MDEV-22763 backporting MDEV-20225 fix into 10.1 Backported the support for aborting and replaying stored procedure and fix for trigger key assigments from 10.4 version. Backported also two mtr tests: wsrep_sp_bf_abort and MDEV-20225	2020-06-03 15:34:44 +02:00
Eugene Kosov	89ff4176c1	MDEV-22437 make THR_THD* variable thread_local Now all access goes through _current_thd() and set_current_thd() functions. Some functions like THD::store_globals() can not fail now.	2020-05-05 18:13:31 +03:00
Jan Lindström	57ec527841	MDEV-17062 : Test failure on galera.MW-336 Add mutex protection while we calculate required slave thread change and create them. Add error handling.	2020-01-20 15:54:30 +02:00
Jan Lindström	c4195305b2	MDEV-17062 : Test failure on galera.MW-336 Add mutex protection while we calculate required slave thread change and create them. Add error handling.	2020-01-17 12:51:18 +02:00
Daniele Sciascia	aab6cefe8d	MDEV-20848 Fixes for MTR test galera_sr.GCF-1060 (#1421 ) This patch contains two fixes: * wsrep_handle_mdl_conflict(): handle the case where SR transaction is in aborting state. Previously, a BF-BF conflict was reported, and the process would abort. * wsrep_thd_bf_abort(): do not restore thread vars after calling wsrep_bf_abort(). Thread vars are already restored in wsrep-lib if necessary. This also removes the assumption that the caller of wsrep_thd_bf_abort() is the given bf_thd, which is not the case. Also in this patch: * Remove unnecessary check for active victim transaction in wsrep_thd_bf_abort(): the exact same check is performed later in wsrep_bf_abort(). * Make wsrep_thd_bf_abort() and wsrep_log_thd() const-correct. * Change signature of wsrep_abort_thd() to take THD pointers instead of void pointers.	2019-12-04 09:21:14 +02:00
Teemu Ollakka	9487e0b259	MDEV-19826 10.4 seems to crash with "pool-of-threads" (#1370 ) MariaDB 10.4 was crashing when thread-handling was set to pool-of-threads and wsrep was enabled. There were two apparent reasons for the crash: - Connection handling in threadpool_common.cc was missing calls to control wsrep client state. - Thread specific storage which contains thread variables (THR_KEY_mysys) was not handled appropriately by wsrep patch when pool-of-threads was configured. This patch addresses the above issues in the following way: - Wsrep client state open/close was moved in thd_prepare_connection() and end_connection() to have common handling for one-thread-per-connection and pool-of-threads. - Thread local storage handling in wsrep patch was reworked by introducing set of wsrep_xxx_threadvars() calls which replace calls to THD store_globals()/reset_globals() and deal with thread handling specifics internally. Wsrep-lib was updated to version which relaxes internal concurrency related sanity checks. Rollback code from wsrep_rollback_process() was extracted to separate calls for better readability. Post rollback thread was removed as it was completely unused.	2019-08-30 08:42:24 +03:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Marko Mäkelä	d0116e10a5	Revert MDEV-18464 and MDEV-12009 This reverts commit `21b2fada7a` and commit `81d71ee6b2`. The MDEV-18464 change introduces a few data race issues. Contrary to the documentation, the field trx_t::victim is not always being protected by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems that KILL QUERY could wrongly avoid acquiring both mutexes when invoking lock_trx_handle_wait_low(), in case another thread had already set trx->victim=true. We also revert MDEV-12009, because it should depend on the MDEV-18464 fix being present.	2019-03-28 12:39:50 +02:00
Jan Lindström	81d71ee6b2	MDEV-12009: Allow to force kill user threads/query which are flagged as high priority by Galera As noted on kill_one_thread SUPER should be able to kill even system threads i.e. threads/query flagged as high priority or wsrep applier thread. Normal user, should not able to kill threads/query flagged as high priority (BF) or wsrep applier thread.	2019-03-28 08:43:44 +02:00
Brave Galera Crew	36a2a185fe	Galera4	2019-01-23 15:30:00 +04:00
Monty	0ee879ff8a	Improve performance for calculating memory allocation Extend interface for 'show variables' with current scope	2015-02-01 15:24:22 +02:00
Sergei Golubchik	03ec3511a8	cleanup: galera misc cleanups also disable galera-specific output in mysql_tzinfo_to_sql, it'll be enabled later.	2014-10-10 22:27:36 +02:00
Sergei Golubchik	7aabc2ded2	fixing embedded: WaaS. Wsrep as a Service.	2014-10-01 23:48:34 +02:00
Sergei Golubchik	3620910eea	cleanup: galera merge, simple changes	2014-10-01 23:38:27 +02:00
Jan Lindström	df4dd593f2	MDEV-6247: Merge 10.0-galera to 10.1. Merged lp:maria/maria-10.0-galera up to revision 3879. Added a new functions to handler API to forcefully abort_transaction, producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These were added for future possiblity to add more storage engines that could use galera replication.	2014-08-26 15:43:46 +03:00

38 commits