mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-28 01:34:17 +01:00

Author	SHA1	Message	Date
Jan Lindström	93475aff8d	MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate Replaced WSREP_ON macro by single global variable WSREP_ON that is then updated at server startup and on wsrep_on and wsrep_provider update functions.	2020-04-24 13:12:46 +03:00
Daniele Sciascia	bdcecfa22c	MDEV-22021: Galera database could get inconsistent with rollback to savepoint When binlog is disabled, WSREP will not behave correctly when SAVEPOINT ROLLBACK is executed and we will not rollback transaction.	2020-03-31 14:18:21 +03:00
Oleksandr Byelkin	b8c0e49670	Merge commit '10.3' into 10.4	2020-03-11 13:27:10 +01:00
Oleksandr Byelkin	440452628d	Merge branch '10.2' into 10.3	2020-03-06 23:28:26 +01:00
seppo	4618c974e4	MDEV-21723 Async slave thread BF abort and replaying fixes (#1448) If async replication slave thread conflicts with cluster replication, then the async slave transaction should be BF aborted, and depending on the state of async slave transaction execution, potentially also replayed. There were problems in such BF abort implementation and the replaying was not started. This pull request contains fixes which make sure that if async slave thread is marked to abort and replay, it will completely carry out the rollback and release all locks and resources before starting the replaying. After replaying, the async slave transaction is treated as successful, so the slave thread will continue as usual, handling next replication event. There is also new mtr test: galera.galera_slave_replay, which stresses both a certification failure for async slave thread and a successful BF abort followed by replaying.	2020-02-23 10:29:42 +02:00
Marko Mäkelä	259185764b	MDEV-17062: Fix a typo in an error message	2020-01-20 16:08:25 +02:00
Jan Lindström	57ec527841	MDEV-17062 : Test failure on galera.MW-336 Add mutex protection while we calculate required slave thread change and create them. Add error handling.	2020-01-20 15:54:30 +02:00
Marko Mäkelä	6373ec3ec7	Merge 10.2 into 10.3	2020-01-18 16:56:16 +02:00
Jan Lindström	c4195305b2	MDEV-17062 : Test failure on galera.MW-336 Add mutex protection while we calculate required slave thread change and create them. Add error handling.	2020-01-17 12:51:18 +02:00
Daniele Sciascia	72a5a4f1d5	MDEV-20780 Fixes for failures on galera_sr_ddl_master (#1425 ) Test galera_sr_ddl_master would sometimes fail due to leftover streaming replication fragments. Rollbacker thread would attempt to open streaming_log table to remove the fragments, but would fail in check_stack_overrun(). Ultimately the check_stack_overrun() failure was caused by rollbacker missing to switch the victim's THD thread stack to rollbacker's thread stack. Also in this patch: - Remove duplicate functionality in rollbacker helper functions, and extract rollbacker fragment removal into function wsrep_remove_streaming_fragments() - Reuse open_for_write() in wsrep_schema::remove_fragments - Partially revert changes to galera_sr_ddl_master test from commit `44a11a7c08`. Removed unnecessary wait condition and isolation level setting	2019-12-11 14:08:06 +02:00
Daniele Sciascia	aab6cefe8d	MDEV-20848 Fixes for MTR test galera_sr.GCF-1060 (#1421 ) This patch contains two fixes: * wsrep_handle_mdl_conflict(): handle the case where SR transaction is in aborting state. Previously, a BF-BF conflict was reported, and the process would abort. * wsrep_thd_bf_abort(): do not restore thread vars after calling wsrep_bf_abort(). Thread vars are already restored in wsrep-lib if necessary. This also removes the assumption that the caller of wsrep_thd_bf_abort() is the given bf_thd, which is not the case. Also in this patch: * Remove unnecessary check for active victim transaction in wsrep_thd_bf_abort(): the exact same check is performed later in wsrep_bf_abort(). * Make wsrep_thd_bf_abort() and wsrep_log_thd() const-correct. * Change signature of wsrep_abort_thd() to take THD pointers instead of void pointers.	2019-12-04 09:21:14 +02:00
Alexander Barkov	dc588e3d3f	Merge remote-tracking branch 'origin/10.3' into 10.4	2019-10-01 10:45:52 +04:00
Alexander Barkov	7e44c455f4	Merge remote-tracking branch 'origin/10.2' into 10.3	2019-10-01 09:37:40 +04:00
Marko Mäkelä	46b785262b	Fix -Wunused for CMAKE_BUILD_TYPE=RelWithDebInfo For release builds, do not declare unused variables. unpack_row(): Omit a debug-only variable from WSREP diagnostic message. create_wsrep_THD(): Fix -Wmaybe-uninitialized for the PSI_thread_key.	2019-09-30 12:49:53 +03:00
Marko Mäkelä	368e64aaed	MDEV-19826: Avoid unused variable in cmake -DPLUGIN_PERFSCHEMA=NO	2019-09-13 10:42:10 +03:00
Teemu Ollakka	9487e0b259	MDEV-19826 10.4 seems to crash with "pool-of-threads" (#1370 ) MariaDB 10.4 was crashing when thread-handling was set to pool-of-threads and wsrep was enabled. There were two apparent reasons for the crash: - Connection handling in threadpool_common.cc was missing calls to control wsrep client state. - Thread specific storage which contains thread variables (THR_KEY_mysys) was not handled appropriately by wsrep patch when pool-of-threads was configured. This patch addresses the above issues in the following way: - Wsrep client state open/close was moved in thd_prepare_connection() and end_connection() to have common handling for one-thread-per-connection and pool-of-threads. - Thread local storage handling in wsrep patch was reworked by introducing set of wsrep_xxx_threadvars() calls which replace calls to THD store_globals()/reset_globals() and deal with thread handling specifics internally. Wsrep-lib was updated to version which relaxes internal concurrency related sanity checks. Rollback code from wsrep_rollback_process() was extracted to separate calls for better readability. Post rollback thread was removed as it was completely unused.	2019-08-30 08:42:24 +03:00
Marko Mäkelä	efb8485d85	Merge 10.3 into 10.4, except for MDEV-20265 The MDEV-20265 commit `e746f451d5` introduces DBUG_ASSERT(right_op == r_tbl) in st_select_lex::add_cross_joined_table(), and that assertion would fail in several tests that exercise joins. That commit was skipped in this merge, and a separate fix of MDEV-20265 will be necessary in 10.4.	2019-08-23 08:06:17 +03:00
Jan Lindström	7b4de10477	MDEV-20378: Galera uses uninitialized memory Problem was that wsrep thread argument was deleted in the wrong place. Furthermore, scan method incorrectly used unsafe c_ptr(). Finally, fixed wsrep thread initialization to correctly set up thread_id and pass correct argument to functions and fix signedness problem causing compiler errors.	2019-08-20 10:32:04 +03:00
Aleksey Midenkov	6dd3f24090	MDEV-19740 Debug build of 10.3.15 FTBFS * Replace LINT_INIT for non-struct types with ctor initializers; * Check BUILD_DEPS list is not empty so REMOVE_DUPLICATES won't throw error.	2019-08-19 10:38:24 +03:00
Marko Mäkelä	1d15a28e52	Merge 10.3 into 10.4	2019-08-14 18:06:51 +03:00
Marko Mäkelä	65d48b4a7b	Merge 10.2 to 10.3	2019-08-13 19:28:51 +03:00
Jan Lindström	5edc4ea4d9	MDEV-20324: Galera threads are not registered to performance schema Galera threads were not registered to performance schema and used pthread_create when mysql_thread_create should have been used. Added test case to verify current galera performance schema instrumentation does work.	2019-08-13 12:52:01 +03:00
Marko Mäkelä	e9c1701e11	Merge 10.3 into 10.4	2019-07-25 18:42:06 +03:00
Eugene Kosov	0f83c8878d	Merge 10.2 into 10.3	2019-07-16 18:39:21 +03:00
Jan Lindström	ec49976e38	MDEV-19746: Galera test failures because of wsrep_slave_threads identification Problem was that tests select INFORMATION_SCHEMA.PROCESSLIST processes from user system user and empty state. Thus, there is not clear state for slave threads. Changes: - Added new status variables that store current amount of applier threads (wsrep_applier_thread_count) and rollbacker threads (wsrep_rollbacker_thread_count). This will make clear how many slave threads of certain type there is. - Added THD state "wsrep applier idle" when applier slave thread is waiting for work. This makes finding slave/applier threads easier. - Added force-restart option for mtr to always restart servers between tests to avoid race on start of the test - Added wait_condition_with_debug to wait until the passed statement returns true, or the operation times out. If operation times out, the additional error statement will be executed Changes to be committed: new file: mysql-test/include/force_restart.inc new file: mysql-test/include/wait_condition_with_debug.inc modified: mysql-test/mysql-test-run.pl modified: mysql-test/suite/galera/disabled.def modified: mysql-test/suite/galera/r/MW-336.result modified: mysql-test/suite/galera/r/galera_kill_applier.result modified: mysql-test/suite/galera/r/galera_var_slave_threads.result new file: mysql-test/suite/galera/t/MW-336.cnf modified: mysql-test/suite/galera/t/MW-336.test modified: mysql-test/suite/galera/t/galera_kill_applier.test modified: mysql-test/suite/galera/t/galera_parallel_autoinc_largetrx.test modified: mysql-test/suite/galera/t/galera_parallel_autoinc_manytrx.test modified: mysql-test/suite/galera/t/galera_var_slave_threads.test modified: mysql-test/suite/wsrep/disabled.def modified: mysql-test/suite/wsrep/r/variables.result modified: mysql-test/suite/wsrep/t/variables.test modified: sql/mysqld.cc modified: sql/wsrep_mysqld.cc modified: sql/wsrep_mysqld.h modified: sql/wsrep_thd.cc modified: sql/wsrep_var.cc	2019-07-15 10:17:07 +03:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Marko Mäkelä	d0116e10a5	Revert MDEV-18464 and MDEV-12009 This reverts commit `21b2fada7a` and commit `81d71ee6b2`. The MDEV-18464 change introduces a few data race issues. Contrary to the documentation, the field trx_t::victim is not always being protected by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems that KILL QUERY could wrongly avoid acquiring both mutexes when invoking lock_trx_handle_wait_low(), in case another thread had already set trx->victim=true. We also revert MDEV-12009, because it should depend on the MDEV-18464 fix being present.	2019-03-28 12:39:50 +02:00
Jan Lindström	81d71ee6b2	MDEV-12009: Allow to force kill user threads/query which are flagged as high priority by Galera As noted on kill_one_thread SUPER should be able to kill even system threads i.e. threads/query flagged as high priority or wsrep applier thread. Normal user, should not able to kill threads/query flagged as high priority (BF) or wsrep applier thread.	2019-03-28 08:43:44 +02:00
Marko Mäkelä	117291db8b	Merge 10.2 into 10.3	2019-03-19 16:04:59 +02:00
sysprg	26432e49d3	MDEV-17262: mysql crashed on galera while node rejoined cluster (#895 ) This patch contains a fix for the MDEV-17262/17243 issues and new mtr test. These issues (MDEV-17262/17243) have two reasons: 1) After an intermediate commit, a transaction loses its status of "transaction that registered in the MySQL for 2pc coordinator" (in the InnoDB) due to the fact that since version 10.2 the write_row() function (which located in the ha_innodb.cc) does not call trx_register_for_2pc(m_prebuilt->trx) during the processing of split transactions. It is necessary to restore this call inside the write_row() when an intermediate commit was made (for a split transaction). Similarly, we need to set the flag of the started transaction (m_prebuilt->sql_stat_start) after intermediate commit. The table->file->extra(HA_EXTRA_FAKE_START_STMT) called from the wsrep_load_data_split() function (which located in sql_load.cc) will also do this, but it will be too late. As a result, the call to the wsrep_append_keys() function from the InnoDB engine may be lost or function may be called with invalid transaction identifier. 2) If a transaction with the LOAD DATA statement is divided into logical mini-transactions (of the 10K rows) and binlog is rotated, then in rare cases due to the wsrep handler re-registration at the boundary of the split, the last portion of data may be lost. Since splitting of the LOAD DATA into mini-transactions is technical, I believe that we should not allow these mini-transactions to fall into separate binlogs. Therefore, it is necessary to prohibit the rotation of binlog in the middle of processing LOAD DATA statement. https://jira.mariadb.org/browse/MDEV-17262 and https://jira.mariadb.org/browse/MDEV-17243	2019-03-18 07:39:51 +02:00
Sergei Golubchik	b64fde8f38	Merge branch '10.2' into 10.3	2019-03-17 13:06:41 +01:00
Jan Lindström	d0ebb155fe	MDEV-18577: Indexes problem on import dump SQL Problem was that we skipped background persistent statistics calculation on applier nodes if thread is marked as high priority (a.k.a BF). However, on applier nodes all DDL which is replicate will be executed as high priority i.e BF. Fixed by allowing background persistent statistics calculation on applier nodes even when thread is marked as BF. This could lead BF lock waits but for queries on that node needs that statistics.	2019-03-13 10:18:12 +02:00
Marko Mäkelä	2a791c53ad	Merge 10.3 into 10.4	2019-03-06 09:00:52 +02:00
seppo	785092ee23	LOCK_thread_count and COND_thread_count removed from wsrep modules (#1197 ) Refactored wsrep patch to not use LOCK_thread_count and COND_thread_count anymore. This has partially been replaced by using old LOCK_wsrep_slave_threads mutex. For slave thread count change waiting, new COND_wsrep_slave_threads signal has been added Added LOCK_wsrep_cluster_config mutex to control that cluster address change cannot happen in parallel Protected wsrep_slave_threads variable changes with LOCK_cluster_config mutex This is for avoiding concurrent slave thread count and cluster joining operations to happen Fixes according to Teemu's review	2019-02-26 13:39:05 -05:00
Julius Goryavsky	50b3632fa4	MDEV-9519: Data corruption will happen on the Galera cluster size change If we have a 2+ node cluster which is replicating from an async master and the binlog_format is set to STATEMENT and multi-row inserts are executed on a table with an auto_increment column such that values are automatically generated by MySQL, then the server node generates wrong auto_increment values, which are different from what was generated on the async master. In the title of the MDEV-9519 it was proposed to ban start slave on a Galera if master binlog_format = statement and wsrep_auto_increment_control = 1, but the problem can be solved without such a restriction. The causes and fixes: 1. We need to improve processing of changing the auto-increment values after changing the cluster size. 2. If wsrep auto_increment_control switched on during operation of the node, then we should immediately update the auto_increment_increment and auto_increment_offset global variables, without waiting of the next invocation of the wsrep_view_handler_cb() callback. In the current version these variables retain its initial values if wsrep_auto_increment_control is switched on during operation of the node, which leads to inconsistent results on the different nodes in some scenarios. 3. If wsrep auto_increment_control switched off during operation of the node, then we must return the original values of the auto_increment_increment and auto_increment_offset global variables, as the user has set. To make this possible, we need to add a "shadow copies" of these variables (which stores the latest values set by the user). https://jira.mariadb.org/browse/MDEV-9519	2019-02-26 08:09:04 +02:00
Julius Goryavsky	2c734c980e	MDEV-9519: Data corruption will happen on the Galera cluster size change If we have a 2+ node cluster which is replicating from an async master and the binlog_format is set to STATEMENT and multi-row inserts are executed on a table with an auto_increment column such that values are automatically generated by MySQL, then the server node generates wrong auto_increment values, which are different from what was generated on the async master. In the title of the MDEV-9519 it was proposed to ban start slave on a Galera if master binlog_format = statement and wsrep_auto_increment_control = 1, but the problem can be solved without such a restriction. The causes and fixes: 1. We need to improve processing of changing the auto-increment values after changing the cluster size. 2. If wsrep auto_increment_control switched on during operation of the node, then we should immediately update the auto_increment_increment and auto_increment_offset global variables, without waiting of the next invocation of the wsrep_view_handler_cb() callback. In the current version these variables retain its initial values if wsrep_auto_increment_control is switched on during operation of the node, which leads to inconsistent results on the different nodes in some scenarios. 3. If wsrep auto_increment_control switched off during operation of the node, then we must return the original values of the auto_increment_increment and auto_increment_offset global variables, as the user has set. To make this possible, we need to add a "shadow copies" of these variables (which stores the latest values set by the user). https://jira.mariadb.org/browse/MDEV-9519	2019-02-26 07:45:11 +02:00
Julius Goryavsky	243f829c1c	MDEV-9519: Data corruption will happen on the Galera cluster size change If we have a 2+ node cluster which is replicating from an async master and the binlog_format is set to STATEMENT and multi-row inserts are executed on a table with an auto_increment column such that values are automatically generated by MySQL, then the server node generates wrong auto_increment values, which are different from what was generated on the async master. In the title of the MDEV-9519 it was proposed to ban start slave on a Galera if master binlog_format = statement and wsrep_auto_increment_control = 1, but the problem can be solved without such a restriction. The causes and fixes: 1. We need to improve processing of changing the auto-increment values after changing the cluster size. 2. If wsrep auto_increment_control switched on during operation of the node, then we should immediately update the auto_increment_increment and auto_increment_offset global variables, without waiting of the next invocation of the wsrep_view_handler_cb() callback. In the current version these variables retain its initial values if wsrep_auto_increment_control is switched on during operation of the node, which leads to inconsistent results on the different nodes in some scenarios. 3. If wsrep auto_increment_control switched off during operation of the node, then we must return the original values of the auto_increment_increment and auto_increment_offset global variables, as the user has set. To make this possible, we need to add a "shadow copies" of these variables (which stores the latest values set by the user). https://jira.mariadb.org/browse/MDEV-9519	2019-02-25 11:19:07 +02:00
Brave Galera Crew	36a2a185fe	Galera4	2019-01-23 15:30:00 +04:00
Marko Mäkelä	2f4c391958	Merge 10.2 into 10.3	2018-09-06 22:35:45 +03:00
Marko Mäkelä	206528f722	Merge 10.1 into 10.2	2018-08-31 15:10:02 +03:00
Marko Mäkelä	3b5d3cd68e	Revert MDEV-9519 due to regressions This reverts commit `75dfd4acb9`.	2018-08-31 12:36:31 +03:00
Marko Mäkelä	7830fb7f45	Merge 10.2 into 10.3	2018-08-28 12:22:56 +03:00
Marko Mäkelä	9258097fa3	Merge 10.1 into 10.2	2018-08-21 15:20:34 +03:00
Julius Goryavsky	75dfd4acb9	This is patch for the https://jira.mariadb.org/browse/MDEV-9519 issue: If we have a 2+ node cluster which is replicating from an async master and the binlog_format is set to STATEMENT and multi-row inserts are executed on a table with an auto_increment column such that values are automatically generated by MySQL, then the server node generates wrong auto_increment values, which are different from what was generated on the async master. The causes and fixes: 1. We need to improve processing of changing the auto-increment values after changing the cluster size. 2. If wsrep auto_increment_control switched on during operation of the node, then we should immediately update the auto_increment_increment and auto_increment_offset global variables, without waiting of the next invocation of the wsrep_view_handler_cb() callback. In the current version these variables retain its initial values if wsrep_auto_increment_control is switched on during operation of the node, which leads to inconsistent results on the different nodes in some scenarios. 3. If wsrep auto_increment_control switched off during operation of the node, then we must return the original values of the auto_increment_increment and auto_increment_offset global variables, as the user has set. To make this possible, we need to add a "shadow copies" of these variables (which stores the latest values set by the user).	2018-08-15 14:17:28 +03:00
Sergei Golubchik	c9717dc019	Merge branch '10.2' into 10.3	2018-05-11 13:15:10 +02:00
Sergei Golubchik	9b1824dcd2	Merge branch '10.1' into 10.2	2018-05-10 13:01:42 +02:00
sjaakola	2f0b8f3e02	MDEV-16005 sporadic failures with galera tests MW-328B and MW-328C These tests can sporadically show mutex deadlock warnings between LOCK_wsrep_thd and LOCK_thd_data mutexes. This means that these mutexes can be locked in opposite order by different threads, and thus result in deadlock situation. To fix such issue, the locking policy of these mutexes should be revised and enforced to be uniform. However, a quick code review shows that the number of lock/unlock operations for these mutexes combined is between 100-200, and all these mutex invocations should be checked/fixed. On the other hand, it turns out that LOCK_wsrep_thd is used for protecting access to wsrep variables of THD (wsrep_conflict_state, wsrep_query_state), whereas LOCK_thd_data protects query, db and mysys_var variables in THD. Extending LOCK_thd_data to protect also wsrep variables looks like a viable solution, as there should not be a use case where separate threads need simultaneous access to wsrep variables and THD data variables. In this commit LOCK_wsrep_thd mutex is refactored to be replaced by LOCK_thd_data. Bluntly replacing LOCK_wsrep_thd by LOCK_thd_data results in double locking of LOCK_thd_data, and some adjustments have been performed to fix such situations.	2018-04-24 16:57:39 +03:00

112 commits