mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Teemu Ollakka	6966d7fe4b	MDEV-29293 MariaDB stuck on starting commit state This is a backport from 10.5. The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:33:37 +02:00
sjaakola	c1846c4fcf	MDEV-26803 PA unsafety with FK cascade delete operation This commit has a mtr test where two two transactions delete a row from two separate tables, which will cascade a FK delete for the same row in a third table. Second replica node is configured with 2 applier threads, and the test will fail if these two transactions are applied in parallel. The actual fix, in this commit, is to mark a transaction as unsafe for parallel applying when it traverses into cascade delete operation. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2021-12-17 09:38:23 +02:00
Marko Mäkelä	44d70c01f0	Merge 10.3 into 10.4	2021-03-19 11:42:44 +02:00
Marko Mäkelä	19052b6deb	Merge 10.2 into 10.3	2021-03-18 12:34:48 +02:00
Julius Goryavsky	7345d37141	MDEV-24853: Duplicate key generated during cluster configuration change Incorrect processing of an auto-incrementing field in the WSREP-related code during applying transactions results in a duplicate key being created. This is due to the fact that at the beginning of the write_row() and update_row() functions, the values of the auto-increment parameters are used, which are read from the parameters of the current thread, but further along the code other values are used, which are read from global variables (when applying a transaction). This can happen when the cluster configuration has changed while applying a transaction (for example in the high_priority_service mode for Galera 4). Further during IST processing duplicating key is detected, and processing of the DB_DUPLICATE_KEY return code (inside innodb, in the write_row() handler) results in a call to the wsrep_thd_self_abort() function.	2021-03-08 11:15:08 +01:00
Sergei Golubchik	00a313ecf3	Merge branch 'bb-10.3-release' into bb-10.4-release Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution" was null-merged. 10.4 version of the fix is coming up separately	2021-02-12 17:44:22 +01:00
Sergei Golubchik	6212cf86a2	galera fixes related to THD::LOCK_thd_kill win	2021-02-02 14:08:07 +01:00
Marko Mäkelä	3a423088ac	Merge 10.3 into 10.4	2020-09-21 12:29:00 +03:00
Marko Mäkelä	cbcb4ecabb	Merge 10.2 into 10.3	2020-09-21 11:04:04 +03:00
Jan Lindström	224c950462	MDEV-23101 : SIGSEGV in lock_rec_unlock() when Galera is enabled Remove incorrect BF (brute force) handling from lock_rec_has_to_wait_in_queue and move condition to correct callers. Add a function to report BF lock waits and assert if incorrect BF-BF lock wait happens. wsrep_report_bf_lock_wait Add a new function to report BF lock wait. wsrep_assert_no_bf_bf_wait Add a new function to check do we have a BF-BF wait and if we have report this case and assert as it is a bug. lock_rec_has_to_wait Use new wsrep_assert_bf_wait to check BF-BF wait. lock_rec_create_low lock_table_create Use new function to report BF lock waits. lock_rec_insert_by_trx_age lock_grant_and_move_on_page lock_grant_and_move_on_rec Assert that trx is not Galera as VATS is not compatible with Galera. lock_rec_add_to_queue If there is conflicting lock in a queue make sure that transaction is BF. lock_rec_has_to_wait_in_queue Remove incorrect BF handling. If there is conflicting locks in a queue all transactions must wait. lock_rec_dequeue_from_page lock_rec_unlock If there is conflicting lock make sure it is not BF-BF case. lock_rec_queue_validate Add Galera record locking rules comment and use new function to report BF lock waits. All attempts to reproduce the original assertion have been failed. Therefore, there is no test case on this commit.	2020-09-10 13:18:12 +03:00
sjaakola	5a7794d3a8	MDEV-21910 Deadlock between BF abort and manual KILL command When high priority replication slave applier encounters lock conflict in innodb, it will force the conflicting lock holder transaction (victim) to rollback. This is a must in multi-master sychronous replication model to avoid cluster lock-up. This high priority victim abort (aka "brute force" (BF) abort), is started from innodb lock manager while holding the victim's transaction's (trx) mutex. Depending on the execution state of the victim transaction, it may happen that the BF abort will call for THD::awake() to wake up the victim transaction for the rollback. Now, if BF abort requires THD::awake() to be called, then the applier thread executed locking protocol of: victim trx mutex -> victim THD::LOCK_thd_data If, at the same time another DBMS super user issues KILL command to abort the same victim, it will execute locking protocol of: victim THD::LOCK_thd_data -> victim trx mutex. These two locking protocol acquire mutexes in opposite order, hence unresolvable mutex locking deadlock may occur. The fix in this commit adds THD::wsrep_aborter flag to synchronize who can kill the victim This flag is set both when BF is called for from innodb and by KILL command. Either path of victim killing will bail out if victim's wsrep_killed is already set to avoid mutex conflicts with the other aborter execution. THD::wsrep_aborter records the aborter THD's ID. This is needed to preserve the right to kill the victim from different locations for the same aborter thread. It is also good error logging, to see who is reponsible for the abort. A new test case was added in galera.galera_bf_kill_debug.test for scenario where wsrep applier thread and manual KILL command try to kill same idle victim	2020-06-26 09:56:23 +03:00
Marko Mäkelä	b63446984c	Merge 10.3 into 10.4	2020-04-27 17:38:17 +03:00
Marko Mäkelä	2e12d471ea	Merge 10.2 into 10.3	2020-04-27 14:24:41 +03:00
Marko Mäkelä	c06845d6f0	Merge 10.1 into 10.2	2020-04-27 13:28:13 +03:00
Sergei Golubchik	dd4124c224	MDEV-22271 Excessive stack memory usage due to WSREP_LOG fix embedded innodb_plugin tests followup for `7198c6ab2d`	2020-04-27 09:13:02 +02:00
Daniele Sciascia	aab6cefe8d	MDEV-20848 Fixes for MTR test galera_sr.GCF-1060 (#1421 ) This patch contains two fixes: * wsrep_handle_mdl_conflict(): handle the case where SR transaction is in aborting state. Previously, a BF-BF conflict was reported, and the process would abort. * wsrep_thd_bf_abort(): do not restore thread vars after calling wsrep_bf_abort(). Thread vars are already restored in wsrep-lib if necessary. This also removes the assumption that the caller of wsrep_thd_bf_abort() is the given bf_thd, which is not the case. Also in this patch: * Remove unnecessary check for active victim transaction in wsrep_thd_bf_abort(): the exact same check is performed later in wsrep_bf_abort(). * Make wsrep_thd_bf_abort() and wsrep_log_thd() const-correct. * Change signature of wsrep_abort_thd() to take THD pointers instead of void pointers.	2019-12-04 09:21:14 +02:00
Marko Mäkelä	ec40980ddd	Merge 10.3 into 10.4	2019-11-01 15:23:18 +02:00
Oleksandr Byelkin	55b2281a5d	Merge branch '10.2' into 10.3	2019-10-31 10:58:06 +01:00
Jan Lindström	36a9694378	MDEV-18562 [ERROR] InnoDB: WSREP: referenced FK check fail: Lock wait index Lock wait can happen on secondary index when doing FK checks for wsrep. We should just return error to upper layer and applier will retry operation when needed.	2019-10-30 10:14:56 +02:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Jan Lindström	71848585f8	Fix InnoDB dynamic plugin compile errors on wsrep patch.	2019-04-10 11:19:38 +03:00
Marko Mäkelä	d0116e10a5	Revert MDEV-18464 and MDEV-12009 This reverts commit `21b2fada7a` and commit `81d71ee6b2`. The MDEV-18464 change introduces a few data race issues. Contrary to the documentation, the field trx_t::victim is not always being protected by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems that KILL QUERY could wrongly avoid acquiring both mutexes when invoking lock_trx_handle_wait_low(), in case another thread had already set trx->victim=true. We also revert MDEV-12009, because it should depend on the MDEV-18464 fix being present.	2019-03-28 12:39:50 +02:00
Jan Lindström	81d71ee6b2	MDEV-12009: Allow to force kill user threads/query which are flagged as high priority by Galera As noted on kill_one_thread SUPER should be able to kill even system threads i.e. threads/query flagged as high priority or wsrep applier thread. Normal user, should not able to kill threads/query flagged as high priority (BF) or wsrep applier thread.	2019-03-28 08:43:44 +02:00
Marko Mäkelä	117291db8b	Merge 10.2 into 10.3	2019-03-19 16:04:59 +02:00
sysprg	26432e49d3	MDEV-17262: mysql crashed on galera while node rejoined cluster (#895 ) This patch contains a fix for the MDEV-17262/17243 issues and new mtr test. These issues (MDEV-17262/17243) have two reasons: 1) After an intermediate commit, a transaction loses its status of "transaction that registered in the MySQL for 2pc coordinator" (in the InnoDB) due to the fact that since version 10.2 the write_row() function (which located in the ha_innodb.cc) does not call trx_register_for_2pc(m_prebuilt->trx) during the processing of split transactions. It is necessary to restore this call inside the write_row() when an intermediate commit was made (for a split transaction). Similarly, we need to set the flag of the started transaction (m_prebuilt->sql_stat_start) after intermediate commit. The table->file->extra(HA_EXTRA_FAKE_START_STMT) called from the wsrep_load_data_split() function (which located in sql_load.cc) will also do this, but it will be too late. As a result, the call to the wsrep_append_keys() function from the InnoDB engine may be lost or function may be called with invalid transaction identifier. 2) If a transaction with the LOAD DATA statement is divided into logical mini-transactions (of the 10K rows) and binlog is rotated, then in rare cases due to the wsrep handler re-registration at the boundary of the split, the last portion of data may be lost. Since splitting of the LOAD DATA into mini-transactions is technical, I believe that we should not allow these mini-transactions to fall into separate binlogs. Therefore, it is necessary to prohibit the rotation of binlog in the middle of processing LOAD DATA statement. https://jira.mariadb.org/browse/MDEV-17262 and https://jira.mariadb.org/browse/MDEV-17243	2019-03-18 07:39:51 +02:00
Sergei Golubchik	b64fde8f38	Merge branch '10.2' into 10.3	2019-03-17 13:06:41 +01:00
Teemu Ollakka	1ef50a34ec	10.4 wsrep group commit fixes (#1224 ) * MDEV-16509 Improve wsrep commit performance with binlog disabled Release commit order critical section early after trx_commit_low() if binlog is not transaction coordinator. In order to avoid two phase commit, binlog_hton is not registered for THD during IO_CACHE population. Implemented a test which verifies that the transactions release commit order early. This optimization will change behavior during recovery as the commit is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25 and wsrep-recover to match the behavior. * MDEV-18730 Ordering for wsrep binlog group commit Previously out of order execution was allowed for wsrep commits. Established proper ordering by populating wait_for_commit for every wsrep THD and making group commit leader to wait for prior commits before proceeding to trx_group_commit_leader(). * MDEV-18730 Added a test case to verify correct commit ordering * MDEV-16509, MDEV-18730 Review fixes Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton should be registered. Whitespace/syntax fixes and cleanups. * MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test If the commit to InnoDB is done in one phase, the native InnoDB behavior is that the transaction is committed in memory before it is persisted to disk. This means that the innodb_disallow_writes=ON may not prevent transaction to become visible to other readers before commit is completely over. On the other hand, if the commit is two phase (as it is with binlog), the transaction will be blocked in prepare phase. Fixed the test to use binlog, which enforces two phase commit, which in turn makes commit to block before the changes become visible to other connections. This guarantees that the test produces expected result.	2019-03-15 07:09:13 +02:00
Jan Lindström	d0ebb155fe	MDEV-18577: Indexes problem on import dump SQL Problem was that we skipped background persistent statistics calculation on applier nodes if thread is marked as high priority (a.k.a BF). However, on applier nodes all DDL which is replicate will be executed as high priority i.e BF. Fixed by allowing background persistent statistics calculation on applier nodes even when thread is marked as BF. This could lead BF lock waits but for queries on that node needs that statistics.	2019-03-13 10:18:12 +02:00
Marko Mäkelä	2a791c53ad	Merge 10.3 into 10.4	2019-03-06 09:00:52 +02:00
Julius Goryavsky	50b3632fa4	MDEV-9519: Data corruption will happen on the Galera cluster size change If we have a 2+ node cluster which is replicating from an async master and the binlog_format is set to STATEMENT and multi-row inserts are executed on a table with an auto_increment column such that values are automatically generated by MySQL, then the server node generates wrong auto_increment values, which are different from what was generated on the async master. In the title of the MDEV-9519 it was proposed to ban start slave on a Galera if master binlog_format = statement and wsrep_auto_increment_control = 1, but the problem can be solved without such a restriction. The causes and fixes: 1. We need to improve processing of changing the auto-increment values after changing the cluster size. 2. If wsrep auto_increment_control switched on during operation of the node, then we should immediately update the auto_increment_increment and auto_increment_offset global variables, without waiting of the next invocation of the wsrep_view_handler_cb() callback. In the current version these variables retain its initial values if wsrep_auto_increment_control is switched on during operation of the node, which leads to inconsistent results on the different nodes in some scenarios. 3. If wsrep auto_increment_control switched off during operation of the node, then we must return the original values of the auto_increment_increment and auto_increment_offset global variables, as the user has set. To make this possible, we need to add a "shadow copies" of these variables (which stores the latest values set by the user). https://jira.mariadb.org/browse/MDEV-9519	2019-02-26 08:09:04 +02:00
Julius Goryavsky	2c734c980e	MDEV-9519: Data corruption will happen on the Galera cluster size change If we have a 2+ node cluster which is replicating from an async master and the binlog_format is set to STATEMENT and multi-row inserts are executed on a table with an auto_increment column such that values are automatically generated by MySQL, then the server node generates wrong auto_increment values, which are different from what was generated on the async master. In the title of the MDEV-9519 it was proposed to ban start slave on a Galera if master binlog_format = statement and wsrep_auto_increment_control = 1, but the problem can be solved without such a restriction. The causes and fixes: 1. We need to improve processing of changing the auto-increment values after changing the cluster size. 2. If wsrep auto_increment_control switched on during operation of the node, then we should immediately update the auto_increment_increment and auto_increment_offset global variables, without waiting of the next invocation of the wsrep_view_handler_cb() callback. In the current version these variables retain its initial values if wsrep_auto_increment_control is switched on during operation of the node, which leads to inconsistent results on the different nodes in some scenarios. 3. If wsrep auto_increment_control switched off during operation of the node, then we must return the original values of the auto_increment_increment and auto_increment_offset global variables, as the user has set. To make this possible, we need to add a "shadow copies" of these variables (which stores the latest values set by the user). https://jira.mariadb.org/browse/MDEV-9519	2019-02-26 07:45:11 +02:00
Julius Goryavsky	243f829c1c	MDEV-9519: Data corruption will happen on the Galera cluster size change If we have a 2+ node cluster which is replicating from an async master and the binlog_format is set to STATEMENT and multi-row inserts are executed on a table with an auto_increment column such that values are automatically generated by MySQL, then the server node generates wrong auto_increment values, which are different from what was generated on the async master. In the title of the MDEV-9519 it was proposed to ban start slave on a Galera if master binlog_format = statement and wsrep_auto_increment_control = 1, but the problem can be solved without such a restriction. The causes and fixes: 1. We need to improve processing of changing the auto-increment values after changing the cluster size. 2. If wsrep auto_increment_control switched on during operation of the node, then we should immediately update the auto_increment_increment and auto_increment_offset global variables, without waiting of the next invocation of the wsrep_view_handler_cb() callback. In the current version these variables retain its initial values if wsrep_auto_increment_control is switched on during operation of the node, which leads to inconsistent results on the different nodes in some scenarios. 3. If wsrep auto_increment_control switched off during operation of the node, then we must return the original values of the auto_increment_increment and auto_increment_offset global variables, as the user has set. To make this possible, we need to add a "shadow copies" of these variables (which stores the latest values set by the user). https://jira.mariadb.org/browse/MDEV-9519	2019-02-25 11:19:07 +02:00
Brave Galera Crew	36a2a185fe	Galera4	2019-01-23 15:30:00 +04:00
Sergei Golubchik	57e0da50bb	Merge branch '10.2' into 10.3	2018-09-28 16:37:06 +02:00
Sergei Golubchik	a6246cab16	fix failures of innodb_plugin tests in --embedded Post-fix for `7e8ed15b95` Also, apply the same innodb fix to xtradb.	2018-09-04 09:19:50 +02:00
Teemu Ollakka	33aad1d273	MDEV-15505 Fixes to compilation without -DWITH_WSREP:BOOL=ON Removed including wsrep_api.h from service_wsrep.h. This caused various kinds of collisions with definitions when wsrep is not supposed to be built in. Defined functions wsrep_xid_seqno() and wsrep_xid_uuid() in wsrep_dummy.cc. Replaced wsrep_seqno_t with long long where wsrep_api.h is not included. Removed wsrep_xid_seqno() macro from wsrep_mysqld.h and made wsrep code using wsrep_xid_seqno() in handler.cc to be compiled in only if WITH_WSREP is ON. Included wsrep_api.h for mariabackup if WITH_WSREP is ON.	2018-03-21 12:02:09 +02:00
Michael Widenius	4aaa38d26e	Enusure that my_global.h is included first - Added sql/mariadb.h file that should be included first by files in sql directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables that must be done before my_global.h is included) - Removed a lot of include my_global.h from include files - Removed include's of some files that my_global.h automatically includes - Removed duplicated include's of my_sys.h - Replaced include my_config.h with my_global.h	2017-08-24 01:05:44 +02:00
Nirbhay Choubey	0251232f8c	Fix to ensure updates in gtid_slave_state table do not get binlogged. Also, renamed wsrep_skip_append_keys to wsrep_ignore_table. Test case : galera.galera_as_slave_gtid.test	2016-02-24 23:32:37 -05:00
Sergei Golubchik	7697bf0bd7	Merge branch 'github/10.0-galera' into 10.1 Note: some tests fail, just as they failed before the merge!	2015-12-22 10:32:33 +01:00
Nirbhay Choubey	dced5146bd	Merge branch '10.0-galera' into 10.1	2015-07-14 16:05:29 -04:00
Sergei Golubchik	8655136222	remove wsrep_hton dependency from innodb/xtradb	2015-01-08 21:27:30 +01:00
Sergei Golubchik	7aabc2ded2	fixing embedded: WaaS. Wsrep as a Service.	2014-10-01 23:48:34 +02:00

45 commits