mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-27 17:33:44 +01:00

Author	SHA1	Message	Date
Julius Goryavsky	cefdc3e67d	Merge branch '10.5' into '10.6'	2024-12-03 13:08:12 +01:00
Jan Lindström	f219fb8489	MDEV-35355 : Galera test failure on galera_sr.mysql-wsrep-features#165 Problem was that in DeadlockChecker::trx_rollback() we hold lock_sys before we call wsrep_handle_SR_rollback() where THD::LOCK_thd_data (and in some cases THD::LOCK_thd_kill) are acquired. This is against current mutex ordering rules. However, acquiring THD::LOCK_thd_data is not necessary because we always are in victim_thd context, either client session is rolling back or rollbacker thread should be in control. Therefore, we should always use wsrep_thd_self_abort() and then no additional mutexes are required. Fixed by removing locking of THD::LOCK_thd_data and using only wsrep_thd_self_abort(). In debug builds added assertions to verify that we are always in victim_thd context. This fix is for MariaDB 10.5 and we already have a test case that sporadically failed in Jenkins before this fix. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-12-03 03:19:35 +01:00
Denis Protivensky	231900e5bb	MDEV-34836: TOI on parent table must BF abort SR in progress on a child Applied SR transaction on the child table was not BF aborted by TOI running on the parent table for several reasons: Although SR correctly collected FK-referenced keys to parent, TOI in Galera disregards common certification index and simply sets itself to depend on the latest certified write set seqno. Since this write set was the fragment of SR transaction, TOI was allowed to run in parallel with SR presuming it would BF abort the latter. At the same time, DML transactions in the server don't grab MDL locks on FK-referenced tables, thus parent table wasn't protected by an MDL lock from SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction. In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not enough to trigger MDL conflict in Galera. InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to the fact that it was believed MDL locking should always produce conflicts correctly. The fix brings conflict resolution rules similar to MDL-level checks to InnoDB, thus accounting for the problematic case. Apart from that, wsrep_thd_is_SR() is patched to return true only for executing SR transactions. It should be safe as any other SR state is either the same as for any single write set (thus making the two logically equivalent), or it reflects an SR transaction as being aborting or prepared, which is handled separately in BF-aborting logic, and for regular execution path it should not matter at all. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-09-24 11:14:01 +02:00
Jan Lindström	ee974ca5e0	MDEV-31658 : Deadlock found when trying to get lock during applying Problem was that there were two non-conflicting local idle transactions in node_1 that both inserted a key to primary key. Then two transactions from other nodes inserted also a key to primary key so that insert from node_2 conflicted with one of the local transactions in node_1 so that there would be duplicate key if both are committed. For this insert from other node tries to acquire S-lock for this record and because this insert is high priority brute force (BF) transaction it will kill idle local transaction. Concurrently, second insert from node_3 conflicts with the second idle insert transaction in node_1. Again, it tries to acquire S-lock for this record and kills idle local transaction. At this point we have two non-conflicting high priority transactions holding S-lock on different records in node_1. For example like this: rec s-lock-node2-rec s-lock-node3-rec rec. Because these high priority BF-transactions do not wait for each other, the insert from node3 that has later seqno compared to insert from node2 can continue. It will try to acquire insert intention for record it tries to insert (to avoid duplicate key to be inserted by local transaction). However, it will note that there is conflicting S-lock in same gap between records. This will lead to a deadlock error as we have defined that BF-transactions may not wait for record lock but we can't kill conflicting BF-transaction because it has lower seqno and it should commit first. BF-transactions are executed concurrently because their values to primary key are different i.e. they do not conflict. Galera certification will make sure that inserts from other nodes i.e. these high priority BF-transactions can't insert duplicate keys. Local transactions naturally can but they will be killed when BF-transaction acquires required record locks. 
Therefore, we can allow situation where there is conflicting S-lock and insert intention lock regardless of their seqno order and let both continue with no wait. This will lead to situation where we need to allow BF-transaction to wait when lock_rec_has_to_wait_in_queue is called because this function is also called from lock_rec_queue_validate and because lock is waiting there would be assertion in ut_a(lock->is_gap() || lock_rec_has_to_wait_in_queue(cell, lock)); lock_wait_wsrep_kill Add debug sync points for BF-transactions killing local transaction. wsrep_assert_no_bf_bf_wait Print also requested lock information lock_rec_has_to_wait Add function to handle wsrep transaction lock wait cases. lock_rec_has_to_wait_wsrep New function to handle wsrep transaction lock wait exceptions. lock_rec_has_to_wait_in_queue Remove wsrep exception, in this function all conflicting locks need to wait in queue. Conflicts between BF and local transactions are handled in lock_wait. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-06-19 14:09:11 +02:00
Julius Goryavsky	b88c20ce1b	Merge branch 10.4 into 10.5	2024-05-06 13:55:42 +02:00
Jan Lindström	fbfb5a6f59	MDEV-33928 : Assertion failure on wsrep_thd_is_aborting Problem was an assertion assuming we always hold THD::LOCK_thd_data mutex, which is not true. In most cases this is true but function is also used from InnoDB lock manager and there we can't take THD::LOCK_thd_data to obey mutex ordering. Removed assertion as wsrep transaction state can't change even in that case. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-23 00:40:36 +02:00
Jan Lindström	a2fee2da0b	MDEV-33928 : Assertion failure on wsrep_thd_is_aborting Problem was an assertion assuming we always hold THD::LOCK_thd_data mutex, which is not true. In most cases this is true but function is also used from InnoDB lock manager and there we can't take THD::LOCK_thd_data to obey mutex ordering. Removed assertion as wsrep transaction state can't change even in that case. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-22 16:57:48 +02:00
Jan Lindström	c5ac9836b3	MDEV-33039 Galera test failure on mysql-wsrep-features#165 We should not set debug sync point when holding a mutex to avoid mutex ordering failure. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-03-27 01:25:22 +01:00
Marko Mäkelä	e23c695250	Merge 10.5 into 10.6	2024-01-02 17:37:58 +02:00
sjaakola	c89f769f24	MDEV-31905 GTID inconsistency This commit fixes GTID inconsistency which was injected by mariabackup SST. Donor node now writes new info file: donor_galera_info, which is streamed along the mariabackup donation to the joiner node. The donor_galera_info file contains both GTID and gtid domain_id, and joiner will use these to initialize the GTID state. Commit has new mtr test case: galera_3nodes.galera_gtid_consistency, which exercises potentially harmful mariabackup SST scenarios. The test has also scenario with IST joining. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-12-22 00:10:23 +01:00
Sergei Golubchik	e95bba9c58	Merge branch '10.5' into 10.6	2023-12-17 11:20:43 +01:00
Teemu Ollakka	f307160218	MDEV-29293 MariaDB stuck on starting commit state This commit contains a merge from 10.5-MDEV-29293-squash into 10.6. Although the bug MDEV-29293 was not reproducible with 10.6, the fix contains several improvements for wsrep KILL query and BF abort handling, and addresses the following issues: * MDEV-30307 KILL command issued inside a transaction is problematic for galera replication: This commit will remove KILL TOI replication, so Galera side transaction context is not lost during KILL. * MDEV-21075 KILL QUERY maintains nodes data consistency but breaks GTID sequence: This is fixed as well as KILL does not use TOI, and thus does not change GTID state. * MDEV-30372 Assertion in wsrep-lib state: This was caused by BF abort or KILL when local transaction was in the middle of group commit. This commit disables THD::killed handling during commit, so the problem is avoided. * MDEV-30963 Assertion failure !lock.was_chosen_as_deadlock_victim in trx0trx.h:1065: The assertion happened when the victim was BF aborted via MDL while it was committing. This commit changes MDL BF aborts so that transactions which are committing cannot be BF aborted via MDL. The RQG grammar attached in the issue could not reproduce the crash anymore. Original commit message from 10.5 fix: MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). 
If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connection's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connection. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Make galera_var_retry_autocommit result more readable by echoing cases and expectations into result. Only one expected result for reap to verify that server returns expected status for query. * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. 
Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_bf_abort_registering to check that registering trx gets BF aborted through MDL. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:42:05 +02:00
Teemu Ollakka	3f59bbeeae	MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connection's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connection. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. 
Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:39:43 +02:00
Teemu Ollakka	6966d7fe4b	MDEV-29293 MariaDB stuck on starting commit state This is a backport from 10.5. The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connection's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connection. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. 
* Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:33:37 +02:00
Denis Protivensky	39f4674599	MDEV-24623 Replicate bulk insert as table-level exclusive key - introduce table key construction function in wsrep service interface - don't add row keys when replicating bulk insert - don't start bulk insert on applier or when transaction is not active - don't start bulk insert on system versioned tables - implement actual bulk insert table-level key replication Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-24 11:54:25 +02:00
Marko Mäkelä	851c56771e	Merge 10.5 into 10.6	2023-01-23 13:15:41 +02:00
Daniele Sciascia	eeb8ebb152	MDEV-29774 BF abort no longer wakes up debug_sync waiters Since commit `d7d3ad698a`, "hard" kill is required to interrupt debug sync waits. Affected the following tests: - galera_var_retry_autocommit, - galera_bf_abort_at_after_statement - galera_parallel_apply_3nodes Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-19 08:24:41 +02:00
Marko Mäkelä	a8c5635cf1	Merge 10.5 into 10.6	2023-01-17 20:02:29 +02:00
sjaakola	95de5248c7	MDEV-26391 BF abortable mariabackup execution This commit changes backup execution (namely the block ddl phase), so that node is not paused from cluster. Instead, the following backup execution is declared as vulnerable for possible cluster level conflicts, especially with DDL statement applying. With this, the mariabackup execution may be aborted, if DDL statements happen during backup execution. This abortable backup execution is optional feature and may be enabled/disabled by wsrep_mode: BF_ABORT_MARIABACKUP. Note that old style node desync and pause, despite of WSREP_MODE_BF_MARIABACKUP is needed if node is operating as SST donor. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-17 10:03:05 +02:00
Jan Lindström	179c283372	Merge branch 10.4 into 10.5	2023-01-14 08:25:57 +02:00
sjaakola	a44d896f98	10.4-MDEV-29684 Fixes for cluster wide write conflict resolving If two high priority threads have lock conflict, we look at the order of these transactions and honor the earlier transaction. for_locking parameter in lock_rec_has_to_wait() has become obsolete and it is now removed from the code. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-14 07:50:04 +02:00
sjaakola	66c05326d2	MDEV-29684 Fixes for cluster wide write conflict resolving Cluster conflict victim's THD is marked with wsrep_aborter. THD::wsrep_aborter holds the thread ID of the high priority thread, which is currently carrying out BF aborting for this victim. However, the BF abort operation is not always successful, and in such case the wsrep_aborter mark should be removed. In the old code, this wsrep_aborter resetting did not happen, and this could lead to a situation where the sticky wsrep_aborter mark prevents any further attempt to BF abort this transaction. This commit fixes this issue, and resets wsrep_aborter after unsuccessful BF abort attempt. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-13 13:11:03 +02:00
Marko Mäkelä	6286a05d80	Merge 10.4 into 10.5	2022-09-26 13:34:38 +03:00
Marko Mäkelä	3c92050d1c	Fix build without either ENABLED_DEBUG_SYNC or DBUG_OFF There are separate flags DBUG_OFF for disabling the DBUG facility and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility. Let us allow debug builds without DEBUG_SYNC. Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to define ENABLED_DEBUG_SYNC.	2022-09-23 17:37:52 +03:00
Marko Mäkelä	3f5726768f	Merge 10.5 into 10.6	2022-01-04 09:26:38 +02:00
Julius Goryavsky	55bb933a88	Merge branch 10.4 into 10.5	2021-12-26 12:51:04 +01:00
sjaakola	c1846c4fcf	MDEV-26803 PA unsafety with FK cascade delete operation This commit has a mtr test where two transactions delete a row from two separate tables, which will cascade a FK delete for the same row in a third table. Second replica node is configured with 2 applier threads, and the test will fail if these two transactions are applied in parallel. The actual fix, in this commit, is to mark a transaction as unsafe for parallel applying when it traverses into cascade delete operation. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-12-17 09:38:23 +02:00
sjaakola	ef2dbb8dbc	MDEV-23328 Server hang due to Galera lock conflict resolution Mutex order violation when wsrep bf thread kills a conflicting trx, the stack is wsrep_thd_LOCK() wsrep_kill_victim() lock_rec_other_has_conflicting() lock_clust_rec_read_check_and_lock() row_search_mvcc() ha_innobase::index_read() ha_innobase::rnd_pos() handler::ha_rnd_pos() handler::rnd_pos_by_record() handler::ha_rnd_pos_by_record() Rows_log_event::find_row() Update_rows_log_event::do_exec_row() Rows_log_event::do_apply_event() Log_event::apply_event() wsrep_apply_events() and mutexes are taken in the order lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data When a normal KILL statement is executed, the stack is innobase_kill_query() kill_handlerton() plugin_foreach_with_mask() ha_kill_query() THD::awake() kill_one_thread() and mutexes are victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex This patch is the plan D variant for fixing potential mutex locking order exercised by BF aborting and KILL command execution. In this approach, KILL command is replicated as TOI operation. This guarantees total isolation for the KILL command execution in the first node: there is no concurrent replication applying and no concurrent DDL executing. Therefore there is no risk of BF aborting to happen in parallel with KILL command execution either. Potential mutex deadlocks between the different mutex access paths with KILL command execution and BF aborting cannot therefore happen. TOI replication is used, in this approach, purely as means to provide isolated KILL command execution in the first node. KILL command should not (and must not) be applied in secondary nodes. In this patch, we make this sure by skipping KILL execution in secondary nodes, in applying phase, where we bail out if applier thread is trying to execute KILL command. This is effective, but skipping the applying of KILL command could happen much earlier as well. 
This also fixed unprotected calls to wsrep_thd_abort that will use wsrep_abort_transaction. This is fixed by holding THD::LOCK_thd_data while we abort transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-10-29 20:40:35 +02:00
Jan Lindström	d5bc05798f	MDEV-25114: Crash: WSREP: invalid state ROLLED_BACK (FATAL) Revert "MDEV-23328 Server hang due to Galera lock conflict resolution" This reverts commit `eac8341df4`.	2021-10-29 20:38:11 +02:00
sjaakola	5c230b21bf	MDEV-23328 Server hang due to Galera lock conflict resolution Mutex order violation when wsrep bf thread kills a conflicting trx, the stack is wsrep_thd_LOCK() wsrep_kill_victim() lock_rec_other_has_conflicting() lock_clust_rec_read_check_and_lock() row_search_mvcc() ha_innobase::index_read() ha_innobase::rnd_pos() handler::ha_rnd_pos() handler::rnd_pos_by_record() handler::ha_rnd_pos_by_record() Rows_log_event::find_row() Update_rows_log_event::do_exec_row() Rows_log_event::do_apply_event() Log_event::apply_event() wsrep_apply_events() and mutexes are taken in the order lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data When a normal KILL statement is executed, the stack is innobase_kill_query() kill_handlerton() plugin_foreach_with_mask() ha_kill_query() THD::awake() kill_one_thread() and mutexes are victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex This patch is the plan D variant for fixing potential mutex locking order exercised by BF aborting and KILL command execution. In this approach, KILL command is replicated as TOI operation. This guarantees total isolation for the KILL command execution in the first node: there is no concurrent replication applying and no concurrent DDL executing. Therefore there is no risk of BF aborting to happen in parallel with KILL command execution either. Potential mutex deadlocks between the different mutex access paths with KILL command execution and BF aborting cannot therefore happen. TOI replication is used, in this approach, purely as means to provide isolated KILL command execution in the first node. KILL command should not (and must not) be applied in secondary nodes. In this patch, we make this sure by skipping KILL execution in secondary nodes, in applying phase, where we bail out if applier thread is trying to execute KILL command. This is effective, but skipping the applying of KILL command could happen much earlier as well. 
This also fixed unprotected calls to wsrep_thd_abort that will use wsrep_abort_transaction. This is fixed by holding THD::LOCK_thd_data while we abort transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-10-29 09:52:52 +03:00
Jan Lindström	aa7ca987db	MDEV-25114: Crash: WSREP: invalid state ROLLED_BACK (FATAL) Revert "MDEV-23328 Server hang due to Galera lock conflict resolution" This reverts commit `eac8341df4`.	2021-10-29 09:52:40 +03:00
Marko Mäkelä	ead38354e6	Merge 10.4 into 10.5	2021-10-04 19:32:13 +03:00
mkaruza	2f5ae0da71	MDEV-25883 Galera Cluster hangs while "DELETE FROM mysql.wsrep_cluster" Using `innodb_thread_concurrency` will call `wsrep_thd_is_aborting` to check WSREP thread state. This call should be protected by taking `LOCK_thd_data` before entering function. Applier and TOI threads should not be affected by usage of `innodb_thread_concurrency` variable so returning before any checks. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-09-30 12:25:26 +03:00
Marko Mäkelä	80ed136e6d	Merge 10.4 into 10.5	2021-04-21 09:01:01 +03:00
Daniele Sciascia	eb4123eefc	More fixes to variable wsrep_on * Disallow setting wsrep_on = 1 if wsrep_provider is unset. Also, move wsrep_on_basic from sys_vars to wsrep suite: this test now requires to run with wsrep_provider set * Disallow setting @@session.wsrep_on = 1 when @@global.wsrep_on = 0 * Handle the case where a new connection turns @@global.wsrep_on from off to on. In this case we would miss a call to wsrep_open, causing unexpected states in wsrep::client_state (causing assertions). * Disable wsrep.MDEV-22443 because it is no longer possible to enable wsrep_on, if server is started with wsrep_provider='none' Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-04-20 08:24:14 +03:00
Sergei Golubchik	25d9d2e37f	Merge branch 'bb-10.4-release' into bb-10.5-release	2021-02-15 16:43:15 +01:00
Sergei Golubchik	eac8341df4	MDEV-23328 Server hang due to Galera lock conflict resolution adaptation of `29bbcac0ee` for 10.4	2021-02-12 18:17:06 +01:00
Sergei Golubchik	00a313ecf3	Merge branch 'bb-10.3-release' into bb-10.4-release Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution" was null-merged. 10.4 version of the fix is coming up separately	2021-02-12 17:44:22 +01:00
Marko Mäkelä	882ce206db	Merge 10.4 into 10.5	2020-09-23 11:32:43 +03:00
Marko Mäkelä	3a423088ac	Merge 10.3 into 10.4	2020-09-21 12:29:00 +03:00
sjaakola	7bffe468b2	MDEV-21910 Deadlock between BF abort and manual KILL command When high priority replication slave applier encounters lock conflict in innodb, it will force the conflicting lock holder transaction (victim) to rollback. This is a must in multi-master synchronous replication model to avoid cluster lock-up. This high priority victim abort (aka "brute force" (BF) abort), is started from innodb lock manager while holding the victim's transaction's (trx) mutex. Depending on the execution state of the victim transaction, it may happen that the BF abort will call for THD::awake() to wake up the victim transaction for the rollback. Now, if BF abort requires THD::awake() to be called, then the applier thread executed locking protocol of: victim trx mutex -> victim THD::LOCK_thd_data. If, at the same time another DBMS super user issues KILL command to abort the same victim, it will execute locking protocol of: victim THD::LOCK_thd_data -> victim trx mutex. These two locking protocols acquire mutexes in opposite order, hence unresolvable mutex locking deadlock may occur. The fix in this commit adds THD::wsrep_aborter flag to synchronize who can kill the victim. This flag is set both when BF is called for from innodb and by KILL command. Either path of victim killing will bail out if victim's wsrep_killed is already set to avoid mutex conflicts with the other aborter execution. THD::wsrep_aborter records the aborter THD's ID. This is needed to preserve the right to kill the victim from different locations for the same aborter thread. It is also good error logging, to see who is responsible for the abort. A new test case was added in galera.galera_bf_kill_debug.test for scenario where wsrep applier thread and manual KILL command try to kill same idle victim	2020-07-22 08:20:10 +03:00
sjaakola	5a7794d3a8	MDEV-21910 Deadlock between BF abort and manual KILL command When high priority replication slave applier encounters lock conflict in innodb, it will force the conflicting lock holder transaction (victim) to rollback. This is a must in multi-master synchronous replication model to avoid cluster lock-up. This high priority victim abort (aka "brute force" (BF) abort), is started from innodb lock manager while holding the victim's transaction's (trx) mutex. Depending on the execution state of the victim transaction, it may happen that the BF abort will call for THD::awake() to wake up the victim transaction for the rollback. Now, if BF abort requires THD::awake() to be called, then the applier thread executed locking protocol of: victim trx mutex -> victim THD::LOCK_thd_data. If, at the same time another DBMS super user issues KILL command to abort the same victim, it will execute locking protocol of: victim THD::LOCK_thd_data -> victim trx mutex. These two locking protocols acquire mutexes in opposite order, hence unresolvable mutex locking deadlock may occur. The fix in this commit adds THD::wsrep_aborter flag to synchronize who can kill the victim. This flag is set both when BF is called for from innodb and by KILL command. Either path of victim killing will bail out if victim's wsrep_killed is already set to avoid mutex conflicts with the other aborter execution. THD::wsrep_aborter records the aborter THD's ID. This is needed to preserve the right to kill the victim from different locations for the same aborter thread. It is also good error logging, to see who is responsible for the abort. A new test case was added in galera.galera_bf_kill_debug.test for scenario where wsrep applier thread and manual KILL command try to kill same idle victim	2020-06-26 09:56:23 +03:00
Julius Goryavsky	198a4fee3c	MDEV-22729: Additional fix for branch 10.5	2020-06-24 13:02:37 +02:00
Marko Mäkelä	4a0b56f604	Merge 10.4 into 10.5	2020-05-31 10:28:59 +03:00
sjaakola	1af6e92f0b	MDEV-22666 galera.MW-328A hang The hang can happen when a lock connection issues KILL CONNECTION for a victim that is in the committing phase. A two-resource deadlock occurs, where the killer is holding the victim's LOCK_thd_data and requires the victim's trx mutex. The victim, on the other hand, holds its own trx mutex but requires LOCK_thd_data in wsrep_commit_ordered(). Hence a classic two-thread deadlock happens. The fix in this commit changes InnoDB commit so that wsrep_commit_ordered() is not called while holding the trx mutex. With this, the wsrep patch's commit-time mutex locking does not violate the locking protocol of the KILL command (i.e. LOCK_thd_data -> trx mutex). Also, a new test case has been added in galera.galera_bf_kill.test for the scenario where a client connection is killed in the committing phase.	2020-05-25 19:30:23 +03:00
Aleksey Midenkov	f2a944516e	Merge 10.4 into 10.5	2020-05-15 17:13:35 +03:00
Jan Lindström	523d67a272	MDEV-22494 : Galera assertion lock_sys.mutex.is_owned() at lock_trx_handle_wait_low Problem was that the trx->lock.was_chosen_as_wsrep_victim variable was not set back to false after it was set to true. wsrep_thd_bf_abort: Add assertions for correct mutex status and take necessary mutexes before calling thd->awake_no_mutex(). innobase_rollback_trx(): Reset trx->lock.was_chosen_as_wsrep_victim. wsrep_abort_slave_trx(): Removed unused function. wsrep_innobase_kill_one_trx(): Added function comment, removed unnecessary parameters and added debug assertions to enforce correct usage. Added more debug output to help with error analysis. wsrep_abort_transaction(): Added debug assertions and removed unused variables. trx0trx.h: Removed assert_trx_is_free macro and replaced it with assert_freed() member function. trx_create(): Use the above assert_freed() and initialize wsrep variables. trx_free(): Use assert_freed(). trx_t::commit_in_memory(): Reset lock.was_chosen_as_wsrep_victim. trx_rollback_for_mysql(): Reset trx->lock.was_chosen_as_wsrep_victim. Added test case galera_bf_kill.	2020-05-15 09:04:02 +03:00
Marko Mäkelä	5203bc10f1	Merge 10.4 into 10.5	2020-03-21 11:37:10 +02:00
mkaruza	d87c16be79	MDEV-20616: MariaDB-Galera 10.4.8 | Transaction aborted | Sig 6 Shutdown When connections go to the same node and a deadlock happens, BF abort should not happen for the victim thread. Fixed by guarding `wsrep_handle_SR_rollback()` so that it is called only for SR transactions. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Daniele Sciascia <daniele.sciascia@galeracluster.com>	2020-03-03 10:29:45 +01:00
Jan Lindström	e6a50e41da	MDEV-20051: Add new mode to wsrep_OSU_method in which Galera checks storage engine of the affected table Introduced a new wsrep_strict_ddl configuration variable with which Galera checks the storage engine of the affected table. If the table is not InnoDB (the only storage engine currently fully supporting Galera replication), the DDL statement will return error code: ER_GALERA_REPLICATION_NOT_SUPPORTED eng "DDL-statement is forbidden as table storage engine does not support Galera replication" However, when wsrep_replicate_myisam=ON, DDL statements on MyISAM tables are allowed. If the affected table uses an allowed storage engine, Galera will run normal TOI. For now, this new setting should be set globally on all nodes in a cluster. When this setting is set, the following DDL statements accessing tables that do not support Galera replication are refused: * CREATE TABLE (e.g. CREATE TABLE t1(a int) engine=Aria) * ALTER TABLE * TRUNCATE TABLE * CREATE VIEW * CREATE TRIGGER * CREATE INDEX * DROP INDEX * RENAME TABLE * DROP TABLE Statements on PROCEDURE, EVENT, FUNCTION are allowed, as the affected tables are known only at execution. Furthermore, USER, ROLE, SERVER, DATABASE statements are also allowed, as they do not really have an affected table.	2020-02-11 15:17:50 +02:00

64 commits