mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Oleksandr Byelkin	fecd78b837	Merge branch '10.10' into 10.11	2023-11-08 16:46:47 +01:00
Oleksandr Byelkin	04d9a46c41	Merge branch '10.6' into 10.10	2023-11-08 16:23:30 +01:00
Oleksandr Byelkin	b83c379420	Merge branch '10.5' into 10.6	2023-11-08 15:57:05 +01:00
Kristian Nielsen	2a4c573339	MDEV-32728: Wrong mutex usage 'LOCK_thd_data' and 'wait_mutex' Checking for kill with thd_kill_level() or check_killed() runs apc requests, which takes the LOCK_thd_kill mutex. But this is dangerous, as checking for kill needs to be called while holding many different mutexes, and can lead to cyclic mutex dependency and deadlock. But running apc is only "best effort", so skip running the apc if the LOCK_thd_kill is not available. The apc will then be run on next check of kill signal. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-11-08 14:50:43 +01:00
Oleksandr Byelkin	6cfd2ba397	Merge branch '10.4' into 10.5	2023-11-08 12:59:00 +01:00
Oleksandr Byelkin	fefd6d5559	MDEV-32656: ASAN errors in base_list_iterator::next / setup_table_map upon 2nd execution of PS Correctly supress error issuing when saving value in field for comporison	2023-11-08 12:08:23 +01:00
Alexander Barkov	d2d657e722	MDEV-31187 Add class Sql_mode_save_for_frm_handling	2023-10-23 13:44:31 +04:00
Marko Mäkelä	2ecc0443ec	Merge 10.10 into 10.11	2023-10-17 16:04:21 +03:00
Sergei Golubchik	f293b2b211	cleanup	2023-10-17 14:32:05 +02:00
Marko Mäkelä	d5e15424d8	Merge 10.6 into 10.10 The MDEV-29693 conflict resolution is from Monty, as well as is a bug fix where ANALYZE TABLE wrongly built histograms for single-column PRIMARY KEY. Also includes a fix for safe_malloc error reporting. Other things: - Copied main.log_slow from 10.4 to avoid mtr issue Disabled test: - spider/bugfix.mdev_27239 because we started to get +Error 1429 Unable to connect to foreign data source: localhost -Error 1158 Got an error reading communication packets - main.delayed - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED This part is disabled for now as it fails randomly with different warnings/errors (no corruption).	2023-10-14 13:36:11 +03:00
Monty	fdcb443e62	Remember first error in Dummy_error_handler Use Dummy_error_handler in open_stat_tables() to ignore all errors when opening statistics tables.	2023-10-10 11:12:26 +03:00
Monty	d4347177c7	Change SEL_ARG::MAX_SEL_ARGS to a user defined variable optimizer_max_sel_args This allows a user to to change the default value of MAX_SEL_ARGS (16000) in the rare case where they neeed more generated SEL_ARGS (as part of the range optimizer)	2023-10-03 08:25:31 +03:00
Monty	4e9322e2ff	MDEV-32203 Raise notes when an index cannot be used on data type mismatch Raise notes if indexes cannot be used: - in case of data type or collation mismatch (diferent error messages). - in case if a table field was replaced to something else (e.g. Item_func_conv_charset) during a condition rewrite. Added option to write warnings and notes to the slow query log for slow queries. New variables added/changed: - note_verbosity, with is a set of the following options: basic - All old notes unusable_keys - Print warnings about keys that cannot be used for select, delete or update. explain - Print unusable_keys warnings for EXPLAIN querys. The default is 'basic,explain'. This means that for old installations the only notable new behavior is that one will get notes about unusable keys when one does an EXPLAIN for a query. One can turn all of all notes by either setting note_verbosity to "" or setting sql_notes=0. - log_slow_verbosity has a new option 'warnings'. If this is set then warnings and notes generated are printed in the slow query log (up to log_slow_max_warnings times per statement). - log_slow_max_warnings - Max number of warnings written to slow query log. Other things: - One can now use =ALL for any 'set' variable to set all options at once. For example using "note_verbosity=ALL" in a config file or "SET @@note_verbosity=ALL' in SQL. - mysqldump will in the future use @@note_verbosity=""' instead of @sql_notes=0 to disable notes. - Added "enum class Data_type_compatibility" and changing the return type of all Field::can_optimize*() methods from "bool" to this new data type. Reviewer & Co-author: Alexander Barkov <bar@mariadb.com> - The code that prints out the notes comes mainly from Alexander	2023-10-03 08:25:31 +03:00
Monty	e3b36b8f1b	MDEV-31957 Concurrent ALTER and ANALYZE collecting statistics can result in stale statistical data Example of what causes the problem: T1: ANALYZE TABLE starts to collect statistics T2: ALTER TABLE starts by deleting statistics for all changed fields, then creates a temp table and copies data to it. T1: ANALYZE ends and writes to the statistics tables. T2: ALTER TABLE renames temp table in place of the old table. Now the statistics from analyze matches the old deleted tables. Fixed by waiting to delete old statistics until ALTER TABLE is the only one using the old table and ensure that rename of columns can handle swapping of column names. rename_columns_in_stat_table() (former rename_column_in_stat_tables()) now takes a list of columns to rename. It uses the following algorithm to update column_stats to be able to handle circular renames - While there are columns to be renamed and it is the first loop or last rename loop did change something. - Loop over all columns to be renamed - Change column name in column_stat - If fail because of duplicate key - If this is first change attempt for this column - Change column name to a temporary column name - If there was a conflicting row, replace it with the current row. else - Remove entry from column list - Loop over all remaining columns in the list - Remove the conflicting row - Change column from temporary name to final name in column_stat Other things: - Don't flush tables for every operation. Only flush when all updates are done. - Rename of columns was not handled in case of ALGORITHM=copy (old bug). - Fixed that we do not collect statistics for hidden hash columns used by UNIQUE constraint on long values. - Fixed that we do not collect statistics for blob columns referred by generated virtual columns. This was achieved by storing the fields for which we want to have statistics in table->has_value_set instead of in table->read_set. - Rename of indexes was not handled for persistent statistics. - This is now handled similar as rename of columns. Renamed columns are now stored in 'rename_stat_indexes' and handled in Alter_info::delete_statistics() together with drooped indexes. - ALTER TABLE .. ADD INDEX may instead of creating a new index rename an existing generated foreign key index. This was not reflected in the index_stats table because this was handled in mysql_prepare_create_table instead instead of in the mysql_alter() code. Fixed by adding a call in mysql_prepare_create_table() to drop the changed index. I also had to change the code that 'marked the index' to be ignored with code that would not destroy the original index name. Reviewer: Sergei Petrunia <sergey@mariadb.com>	2023-10-03 08:25:30 +03:00
Jan Lindström	f57deb314f	MDEV-31660 : Assertion `client_state.transaction().active() in wsrep_append_key At the moment we cannot support wsrep_forced_binlog_format=[MIXED\|STATEMENT] during CREATE TABLE AS SELECT. Statement will use ROW instead and give a warning. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-09-29 12:54:04 +02:00
Marko Mäkelä	8096139b3a	Merge 10.5 into 10.6	2023-09-19 10:47:26 +03:00
Marko Mäkelä	6c05edfdcd	Merge 10.4 into 10.5	2023-09-19 10:20:09 +03:00
Dmitry Shulga	68353dc92a	MDEV-23902: MariaDB crash on calling function On creation of a VIEW that depends on a stored routine an instance of the class Item_func_sp is allocated on a memory root of SP statement. It happens since mysql_make_view() calls the method THD::activate_stmt_arena_if_needed() before parsing definition of the view. On the other hand, when sp_head's rcontext is created an instance of the class Field referenced by the data member Item_func_sp::result_field is allocated on the Item_func_sp's Query_arena (call arena) that set up inside the method Item_sp::execute_impl just before calling the method sp_head::execute_function() On return from the method sp_head::execute_function() all items allocated on the Item_func_sp's Query_arena are released and its memory root is freed (see implementation of the method Item_sp::execute_impl). As a consequence, the pointer Item_func_sp::result_field references to the deallocated memory. Later, when the method sp_head::execute cleans up items allocated for just executed SP instruction the method Item_func_sp::cleanup is invoked and tries to delete an object referenced by data member Item_func_sp::result_field that points to already deallocated memory, that results in a server abnormal termination. To fix the issue the current active arena shouldn't be switched to a statement arena inside the function mysql_make_view() that invoked indirectly by the method sp_head::rcontext_create. It is implemented by introducing the new Query_arena's state STMT_SP_QUERY_ARGUMENTS that is set when explicit Query_arena is created for placing SP arguments and other caller's side items used during SP execution. Then the method THD::activate_stmt_arena_if_needed() checks Query_arena's state and returns immediately without switching to statement's arena.	2023-09-19 08:57:36 +07:00
Marko Mäkelä	0dd25f28f7	Merge 10.5 into 10.6	2023-09-11 14:46:39 +03:00
Marko Mäkelä	f8f7d9de2c	Merge 10.4 into 10.5	2023-09-11 11:29:31 +03:00
Kristian Nielsen	e937a64d46	MDEV-10356: rpl.rpl_parallel_temptable failure due to incorrect commit optimization of temptables The problem was that parallel replication of temporary tables using statement-based binlogging could overlap the COMMIT in one thread with a DML or DROP TEMPORARY TABLE in another thread using the same temporary table. Temporary tables are not safe for concurrent access, so this caused reference to freed memory and possibly other nastiness. The fix is to disable the optimisation with overlapping commits of one transaction with the start of a later transaction, when temporary tables are in use. Then the following event groups will be blocked from starting until the one using temporary tables is completed. This also fixes occasional test failures of rpl.rpl_parallel_temptable seen in Buildbot. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-09-07 14:40:05 +02:00
Oleksandr Byelkin	036df5f970	Merge branch '10.10' into 10.11	2023-08-08 14:57:31 +02:00
Oleksandr Byelkin	ced243a099	Merge branch '10.9' into 10.10	2023-08-05 20:34:09 +02:00
Oleksandr Byelkin	34a8e78581	Merge branch '10.6' into 10.9	2023-08-04 08:01:06 +02:00
Oleksandr Byelkin	5ea5291d97	Merge branch '10.5' into 10.6	2023-08-04 07:52:54 +02:00
Sergei Golubchik	ab1191c039	cleanup: key->key_create_info.check_for_duplicate_indexes -> key->old mark old keys in the ALTER TABLE with the `old` flag, not with the `key_create_info.check_for_duplicate_indexes`. This allows to mark old foreign keys too.	2023-08-01 22:43:16 +02:00
Sergei Golubchik	b8233b38da	cleanup: put db/table_name into Alter_info also, prefer Lex_table_name and Lex_ident over LEX_CSTRING	2023-08-01 22:43:16 +02:00
Sergei Golubchik	383baa812e	cleanup: invert return code	2023-08-01 22:42:24 +02:00
Oleksandr Byelkin	6bf8483cac	Merge branch '10.5' into 10.6	2023-08-01 15:08:52 +02:00
Oleksandr Byelkin	7564be1352	Merge branch '10.4' into 10.5	2023-07-26 16:02:57 +02:00
Marko Mäkelä	bce3ee704f	Merge 10.10 into 10.11	2023-07-26 14:44:43 +03:00
Marko Mäkelä	b1b47264d2	Merge 10.9 into 10.10	2023-07-26 14:17:36 +03:00
Marko Mäkelä	864bbd4d09	Merge 10.6 into 10.9	2023-07-26 13:42:23 +03:00
Sergei Petrunia	6e484c3bd9	MDEV-31577: Make ANALYZE FORMAT=JSON print innodb stats ANALYZE FORMAT=JSON output now includes table.r_engine_stats which has the engine statistics. Only non-zero members are printed. Internally: EXPLAIN data structures Explain_table_acccess and Explain_update now have handler* handler_for_stats pointer. It is used to read statistics from handler_for_stats->handler_stats. The following applies only to 10.9+, backport doesn't use it: Explain data structures exist after the tables are closed. We avoid walking invalid pointers using this: - SQL layer calls Explain_query::notify_tables_are_closed() before closing tables. - After that call, printing of JSON output is disabled. Non-JSON output can be printed but we don't access handler_for_stats when doing that.	2023-07-21 16:50:11 +03:00
Alexander Barkov	400c101332	MDEV-30662 SQL/PL package body does not appear in I_S.ROUTINES.ROUTINE_DEFINITION - Moving the code from a public function trim_whitespaces() to the class Lex_cstring as methods. This code may be useful in other contexts, and also this code becomes visible inside sql_class.h - Adding a helper method THD::strmake_lex_cstring_trim_whitespaces() - Unifying the way how CREATE PROCEDURE/CREATE FUNCTION and CREATE PACKAGE/CREATE PACKAGE BODY work: a) Now CREATE PACKAGE/CREATE PACKAGE BODY also calls Lex->sphead->set_body_start() to remember the cpp body start inside an sp_head member. b) adding a "const char *cpp_body_end" parameter to sp_head::set_stmt_end(). These changes made it possible to reuse sp_head::set_stmt_end() inside LEX::create_package_finalize() and remove the duplucate code. - Renaming sp_head::m_body_begin to m_cpp_body_begin and adding a comment to make it clear that this member is used only during parsing, and points to a fragment inside the cpp buffer. - Changed sp_head::set_body_start() and sp_head::set_stmt_end() to skip the calls related to "body_utf8" in cases when m_parent is not NULL. A non-NULL m_parent means that we're inside a package routine. "body_utf8" in such case belongs not to the current sphead itself, but to parent (the package) sphead. So an sphead instance of a package routine should neither initialize, nor finalize, nor change in any other ways the "body_utf8" related members of Lex_input_stream, and should not take over or copy "body_utf8" data from Lex_input_stream to "this".	2023-07-14 13:26:26 +04:00
Kristian Nielsen	5d61442c85	MDEV-31448: Killing a replica thread awaiting its GCO can hang/crash a parallel replica The problem is that when a worker thread is (user) killed in wait_for_prior_commit, the event group may complete out-of-order since the wait for prior commit was aborted by the kill. This fix ensures that event groups will always complete in-order, even in the error case. This is done in finish_event_group() by doing an extra wait_for_prior_commit(), if necessary, that ignores kills. This fix supersedes the fix for MDEV-30780, so the earlier fix for that is reverted in this patch. Also fix that an error from wait_for_prior_commit() inside finish_event_group() would not signal the error to wakeup_subsequent_commits(). Based on earlier work by Brandon Nesterenko and Andrei Elkin, with some changes to simplify the semantics of wait_for_prior_commit() and make the code more robust to future changes. Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-07-12 09:41:32 +02:00
Brandon Nesterenko	9808ebe195	MDEV-30978: On slave XA COMMIT/XA ROLLBACK fail to return an error in read-only mode Where a read-only server permits writes through replication, it should not permit user connections to commit/rollback XA transactions prepared via replication. The bug reported in MDEV-30978 shows that this can happen. This is because there is no read only check in the XA transaction logic, the most relevant one occurs in ha_commit_trans() for normal statements/transactions. This patch extends the XA transaction logic to check the read only status of the server before performing an XA COMMIT or ROLLBACK. Reviewed By: Andrei Elkin <andrei.elkin@mariadb.com>	2023-07-11 07:49:44 -06:00
Marko Mäkelä	7cde5c539b	Merge 10.6 into 10.9	2023-07-10 11:22:21 +03:00
Monty	99bd226059	MDEV-31558 Add InnoDB engine information to the slow query log The new statistics is enabled by adding the "engine", "innodb" or "full" option to --log-slow-verbosity Example output: # Pages_accessed: 184 Pages_read: 95 Pages_updated: 0 Old_rows_read: 1 # Pages_read_time: 17.0204 Engine_time: 248.1297 Page_read_time is time doing physical reads inside a storage engine. (Writes cannot be tracked as these are usually done in the background). Engine_time is the time spent inside the storage engine for the full duration of the read/write/update calls. It uses the same code as 'analyze statement' for calculating the time spent. The engine statistics is done with a generic interface that should be easy for any engine to use. It can also easily be extended to provide even more statistics. Currently only InnoDB has counters for Pages_% and Undo_% status. Engine_time works for all engines. Implementation details: class ha_handler_stats holds all engine stats. This class is included in handler and THD classes. While a query is running, all statistics is updated in the handler. In close_thread_tables() the statistics is added to the THD. handler::handler_stats is a pointer to where statistics should be collected. This is set to point to handler::active_handler_stats if stats are requested. If not, it is set to 0. handler_stats has also an element, 'active' that is 1 if stats are requested. This is to allow engines to avoid doing any 'if's while updating the statistics. Cloned or partition tables have the pointer set to the base table if status are requested. There is a small performance impact when using --log-slow-verbosity=engine: - All engine calls in 'select' will be timed. - IO calls for InnoDB reads will be timed. - Incrementation of counters are done on local variables and accesses are inline, so these should have very little impact. - Statistics has to be reset for each statement for the THD and each used handler. This is only 40 bytes, which should be neglectable. - For partition tables we have to loop over all partitions to update the handler_status as part of table_init(). Can be optimized in the future to only do this is log-slow-verbosity changes. For this to work we have to update handler_status for all opened partitions and also for all partitions opened in the future. Other things: - Added options 'engine' and 'full' to log-slow-verbosity. - Some of the new files in the test suite comes from Percona server, which has similar status information. - buf_page_optimistic_get(): Do not increment any counter, since we are only validating a pointer, not performing any buf_pool.page_hash lookup. - Added THD argument to save_explain_data_intern(). - Switched arguments for save_explain_.*_data() to have always THD first (generates better code as other functions also have THD first).	2023-07-07 12:53:18 +03:00
Marko Mäkelä	c04284e747	Merge 10.10 into 10.11	2023-06-07 15:01:43 +03:00
Marko Mäkelä	82230aa423	Merge 10.9 into 10.10	2023-06-07 14:48:37 +03:00
Brandon Nesterenko	8ed88e3455	Revert "MDEV-13915: STOP SLAVE takes very long time on a busy system" This reverts commit `0a99d457b3` because it should go into only 10.5+	2023-06-06 08:11:38 -06:00
Brandon Nesterenko	0a99d457b3	MDEV-13915: STOP SLAVE takes very long time on a busy system The problem is that a parallel replica would not immediately stop running/queued transactions when issued STOP SLAVE. That is, it allowed the current group of transactions to run, and sometimes the transactions which belong to the next group could be started and run through commit after STOP SLAVE was issued too, if the last group had started committing. This would lead to long periods to wait for all waiting transactions to finish. This patch updates a parallel replica to try and abort immediately and roll-back any ongoing transactions. The exception to this is any transactions which are non-transactional (e.g. those modifying sequences or non-transactional tables), and any prior transactions, will be run to completion. The specifics are as follows: 1. A new stage was added to SHOW PROCESSLIST output for the SQL Thread when it is waiting for a replica thread to either rollback or finish its transaction before stopping. This stage presents as “Waiting for worker thread to stop” 2. Worker threads which error or are killed no longer perform GCO cleanup if there is a concurrently running prior transaction. This is because a worker thread scheduled to run in a future GCO could be killed and incorrectly perform cleanup of the active GCO. 3. Refined cases when the FL_TRANSACTIONAL flag is added to GTID binlog events to disallow adding it to transactions which modify both transactional and non-transactional engines when the binlogging configuration allow the modifications to exist in the same event, i.e. when using binlog_direct_non_trans_update == 0 and binlog_format == statement. 4. A few existing MTR tests relied on the completion of certain transactions after issuing STOP SLAVE, and were re-recorded (potentially with added synchronizations) under the new rollback behavior. Reviewed By =========== Andrei Elkin <andrei.elkin@mariadb.com>	2023-06-05 10:03:06 -06:00
Marko Mäkelä	0796b7ad5e	Merge 10.6 into 10.9	2023-05-22 09:13:51 +03:00
Teemu Ollakka	f307160218	MDEV-29293 MariaDB stuck on starting commit state This commit contains a merge from 10.5-MDEV-29293-squash into 10.6. Although the bug MDEV-29293 was not reproducible with 10.6, the fix contains several improvements for wsrep KILL query and BF abort handling, and addresses the following issues: * MDEV-30307 KILL command issued inside a transaction is problematic for galera replication: This commit will remove KILL TOI replication, so Galera side transaction context is not lost during KILL. * MDEV-21075 KILL QUERY maintains nodes data consistency but breaks GTID sequence: This is fixed as well as KILL does not use TOI, and thus does not change GTID state. * MDEV-30372 Assertion in wsrep-lib state: This was caused by BF abort or KILL when local transaction was in the middle of group commit. This commit disables THD::killed handling during commit, so the problem is avoided. * MDEV-30963 Assertion failure !lock.was_chosen_as_deadlock_victim in trx0trx.h:1065: The assertion happened when the victim was BF aborted via MDL while it was committing. This commit changes MDL BF aborts so that transactions which are committing cannot be BF aborted via MDL. The RQG grammar attached in the issue could not reproduce the crash anymore. Original commit message from 10.5 fix: MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Make galera_var_retry_autocommit result more readable by echoing cases and expectations into result. Only one expected result for reap to verify that server returns expected status for query. * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_bf_abort_registering to check that registering trx gets BF aborted through MDL. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:42:05 +02:00
Teemu Ollakka	3f59bbeeae	MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:39:43 +02:00
Teemu Ollakka	6966d7fe4b	MDEV-29293 MariaDB stuck on starting commit state This is a backport from 10.5. The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:33:37 +02:00
Marko Mäkelä	656c2e18b1	Merge 10.10 into 10.11	2023-04-14 13:08:28 +03:00
Marko Mäkelä	a009280e60	Merge 10.9 into 10.10	2023-04-14 12:24:14 +03:00
Marko Mäkelä	44281b88f3	Merge 10.8 into 10.9	2023-04-14 11:32:36 +03:00

1 2 3 4 5 ...

4645 commits