mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-02-13 17:05:35 +01:00

Author	SHA1	Message	Date
Alexander Barkov	e56040fee8	Merge remote-tracking branch 'origin/10.5' into 10.6	2024-07-08 18:59:04 +04:00
Anson Chung	215fab68db	Perform simple fixes for cppcheck findings Rectify cases of mismatched brackets and address possible cases of division by zero by checking if the denominator is zero before dividing. No functional changes were made. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2024-07-08 10:51:48 +01:00
Yuchen Pei	d7042ec4da	Merge branch '10.5' into 10.6	2024-06-26 09:16:54 +08:00
Rex	d513a4ce74	MDEV-19520 Extend condition normalization to include 'NOT a' Having Item_func_not items in item trees breaks assumptions during the optimization phase about transformation possibilities in fix_fields(). Remove Item_func_not by extending normalization during parsing. Reviewed by Oleksandr Byelkin (sanja@mariadb.com)	2024-06-25 04:51:29 +11:00
Marko Mäkelä	0076eb3d4e	Merge 10.5 into 10.6	2024-06-24 13:09:47 +03:00
Marko Mäkelä	d9dd673fee	MDEV-12008 fixup: Do not add a new error code New error codes can only be added in the latest major version. Adding ER_KILL_DENIED_HIGH_PRIORITY would shift by one all error codes that were added in MariaDB Server 10.6 or later. This amends commit `1001dae186` Suggested by: Sergei Golubchik	2024-06-24 12:08:13 +03:00
Dave Gosselin	db0c28eff8	MDEV-33746 Supply missing override markings Find and fix missing virtual override markings. Updates cmake maintainer flags to include -Wsuggest-override and -Winconsistent-missing-override.	2024-06-20 11:32:13 -04:00
Jan Lindström	1001dae186	MDEV-12008 : Change error code for Galera unkillable threads Changed error code for Galera unkillable threads to be ER_KILL_DENIED_HIGH_PRIORITY giving message This is a high priority thread/query and cannot be killed without the compromising consistency of the cluster also a warning is produced Thread %lld is [wsrep applier|high priority] and cannot be killed Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-06-19 14:07:34 +02:00
Marko Mäkelä	27834ebc91	Merge 10.5 into 10.6	2024-06-10 15:22:15 +03:00
Alexander Barkov	21f56583bf	MDEV-32376 SHOW CREATE DATABASE statement crashes the server when db name contains some unicode characters, ASAN stack-buffer-overflow Adding the test for the length of lex->name into show_create_db(). Without this test writes beyond the end of db_name_buff were possible upon a too long database name.	2024-06-10 09:31:14 +04:00
Julius Goryavsky	0d85c905c4	MDEV-34269: post-fix code simplification The code is slightly simplified taking into account the fact that partition_ht() always returns a normal hton when there is no partitioning.	2024-06-07 18:26:08 +02:00
Jan Lindström	0172887980	MDEV-34269 : 10.11.8 cluster becomes inconsistent when using composite primary key and partitioning This is a regression from commit `3228c08fa8`. The problem is that when the table storage engine is determined, there should be a check whether the table is partitioned, and if it is, the storage engine implementing the partitions should be determined instead. The reported bug is reproducible only with --log-bin, so make sure the tests changed by `3228c08fa8` and the new test are run both with --log-bin and with binlog disabled. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-06-07 18:26:08 +02:00
Marko Mäkelä	5ba542e9ee	Merge 10.5 into 10.6	2024-05-30 14:27:07 +03:00
Vladislav Vaintroub	736449d30f	MDEV-34205: ASAN stack buffer overflow in strxnmov() in frm_file_exists Correct the second parameter for strxnmov to prevent potential buffer overflows. The second parameter must be one less than the size of the input buffer to avoid writing past the end of the buffer. While the second parameter is usually correct, there are exceptions that need fixing. This commit addresses the issue within frm_file_exists() and other affected places.	2024-05-23 22:08:27 +02:00
Alexander Barkov	310fd6ff69	Backporting bug fixes made by MDEV-31340 from 11.5 The patch for MDEV-31340 fixed the following bugs: MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0 MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0 MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0 MDEV-33088 Cannot create triggers in the database `MYSQL` MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0 MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0 MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0 MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0 MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS MDEV-33120 System log table names are case insensitive with lower-case-table-names=0 Backporting the fixes from 11.5 to 10.5	2024-05-21 14:58:01 +04:00
Marko Mäkelä	829cb1a49c	Merge 10.5 into 10.6	2024-04-17 14:14:58 +03:00
Dave Gosselin	a8a75ba2d0	Factor TABLE_LIST creation from add_table_to_list Ideally our methods and functions should do one thing, do that well, and do only that. add_table_to_list does far more than adding a table to a list, so this commit factors the TABLE_LIST creation out to a new TABLE_LIST constructor. It then uses placement new() to create it in the correct memory area (result of thd->calloc). Benefits of this approach: 1. add_table_to_list now returns as early as possible on an error 2. fewer side-effects incurred on creating the TABLE_LIST object 3. TABLE_LIST won't be calloc'd if copy_to_db fails 4. local declarations moved closer to their respective first uses 5. improved code readability and logical flow Also factored a couple of other functions to keep the happy path more to the left, which makes them easier to follow at a glance.	2024-04-16 10:09:43 -04:00
Oleksandr Byelkin	9b18275623	Merge branch '10.4' into 10.5	2024-04-16 11:04:14 +02:00
Kristian Nielsen	16aa4b5f59	Merge from 10.4 to 10.5 Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-15 17:46:49 +02:00
Daniele Sciascia	c71dc39529	MDEV-26499 Fix error "mysql_shutdown failed" during MTR tests - Fix to avoid mysqltest client getting killed abruptly during mysql_shutdown(). When Galera replication is shutdown, wait for THDs with `thd->stmt_da()->is_eof()` to disconnect (these are about to disconnect anyway). - Extract duplicate code from `wsrep_stop_replication()` and `wsrep_shutdown_replication()` in a new function. - No need to use a custom `shutdown_mysqld.inc` in galera suite. Delete it, so that the one in `mysql-test/include/` is used. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-03-27 04:31:45 +01:00
Kristian Nielsen	ef7abc881c	MDEV-10793: MDEV-33292: main.kill_processlist-6619 fails sporadically in buildbot There were several races in the main.kill_processlist-6619 testcase: - Lingering connections from a previous test case could be visible in SHOW PROCESSLIST and cause .result diff. - A sync point "dispatch_command_end" was ineffective, as it was consumed at the end of the SET DEBUG command itself. - The signal from sync point "before_execute_sql_command" could override an earlier signal, causing DEBUG_SYNC timeout and test failure. - The final SHOW PROCESSLIST could occasionally see a connection in state "Busy" instead of the expected "Sleep". Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-14 22:48:12 +01:00
Dmitry Shulga	d7758debae	MDEV-33218: Assertion `active_arena->is_stmt_prepare_or_first_stmt_execute() || active_arena->state == Query_arena::STMT_SP_QUERY_ARGUMENTS' failed in st_select_lex::fix_prepare_information In case there is a view that queried from a stored routine or a prepared statement and this temporary table is dropped between executions of SP/PS, then it leads to hitting an assertion at the SELECT_LEX::fix_prepare_information. The fired assertion was added by the commit `85f2e4f8e8` (MDEV-32466: Potential memory leak on executing of create view statement). Firing of this assertion means memory leaking on execution of SP/PS. Moreover, if the added assert be commented out, different result sets can be produced by the statement SELECT * FROM the hidden table. Both hitting the assertion and different result sets have the same root cause. This cause is usage of temporary table's metadata after the table itself has been dropped. To fix the issue, reload the cache of stored routines. To do it cache of stored routines is reset at the end of execution of the function dispatch_command(). Next time any stored routine be called it will be loaded from the table mysql.proc. This happens inside the method Sp_handler::sp_cache_routine where loading of a stored routine is performed in case it missed in cache. Loading is performed unconditionally while previously it was controlled by the parameter lookup_only. By that reason the signature of the method Sroutine_hash_entry::sp_cache_routine was changed by removing unused parameter lookup_only. Clearing of sp caches affects the test main.lock_sync since it forces opening and locking the table mysql.proc but the test assumes that each statement locks its tables once during its execution. To keep this invariant the debug sync points with names "before_lock_tables_takes_lock" and "after_lock_tables_takes_lock" are not activated on handling the table mysql.proc	2024-03-14 15:43:03 +07:00
Monty	9a132d423a	MDEV-33620 Improve times and states in show processlist for replication This makes it easier to find out what replication workers are doing and what they are waiting for. Things changed in processlist: - Slave_SQL time was not consistent. Now time for state "Slave has read all relay log; waiting for more updates" shows how long it has waited for getting the next event. - Slave_worker threads did often show "Closing tables" for a long time. Now the state is reverted to the previous state after "Closing tables" is done. - Commit and Rollback states were not shown for replication (and some other threads). Now Commit and Rollback states are always shown and the state is reverted to previous state when the Commit/Rollback have finished. Code changes: - Added thd->set_time_for_next_stage() for parallel replication when starting to wait for prior transactions to commit, group commit, and FTWRL and for free space in thread pool. Before we reset the time only after the above events. - Moved THD_STAGE_INFO(stage_rollback) and THD_STAGE_INFO(stage_commit) from sql_parse.cc to transaction.cc to ensure this is done for all commits and not only 'normal connection queries'. Test case changes: - close_thread_tables() reverting stage to previous stage caused the counter in performance_schema to be increased. In many cases it is the 'sql/starting' stage that was affected. - We only change to "Commit" stage if there is a need for a commit. This caused some "Commit" stages to disappear from perfschema reports. TODO in 11.#: - Slave_IO always shows "Waiting for master to send event" and the time is from SLAVE START. We should in 11.# change this to be the time since reading the last event.	2024-03-08 15:23:17 +02:00
Marko Mäkelä	691f923906	Merge 10.5 into 10.6	2024-02-13 20:42:59 +02:00
Marko Mäkelä	8ec12e0d6d	Merge 10.4 into 10.5	2024-02-12 11:38:13 +02:00
Jan Lindström	3228c08fa8	MDEV-22063 : Assertion `0' failed in wsrep::transaction::before_rollback Problem was that REPLACE was using a consistency check that started TOI and we tried to roll it back. Do not use wsrep_before_rollback and wsrep_after_rollback if we are running a consistency check, because no writeset keys are added in that case. Do not allow consistency check usage if the storage engine of the target table is not InnoDB; give a warning instead. REPLACE|SELECT INTO ... SELECT will now use TOI if the storage engine of the target table is not InnoDB, to maintain consistency between Galera nodes. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-29 06:34:46 +01:00
Monty	2fcb5d651b	Fixed possible mutex-wrong-order with KILL USER The old code collected a list of THD's, locked the THD's from getting deleted by locking two mutex and then later in a separate loop sent a kill signal to each THD. The problem with this approach is that, as THD's can be reused, the second time the THD is killed, the mutex can be taken in different order, which signals failures in safe_mutex. Fixed by sending the kill signal directly and not collect the THD's in a list to be signaled later. This is the same approach we are using in kill_zombie_dump_threads(). Other things: - Reset safe_mutex_t->locked_mutex when freed (Safety fix)	2024-01-23 13:03:11 +02:00
Michael Widenius	7af50e4df4	MDEV-32551: "Read semi-sync reply magic number error" warnings on master rpl_semi_sync_slave_enabled_consistent.test and the first part of the commit message comes from Brandon Nesterenko. A test to show how to induce the "Read semi-sync reply magic number error" message on a primary. In short, if semi-sync is turned on during the hand-shake process between a primary and replica, but later a user negates the rpl_semi_sync_slave_enabled variable while the replica's IO thread is running; if the io thread exits, the replica can skip a necessary call to kill_connection() in repl_semisync_slave.slave_stop() due to its reliance on a global variable. Then, the replica will send a COM_QUIT packet to the primary on an active semi-sync connection, causing the magic number error. The test in this patch exits the IO thread by forcing an error; though note a call to STOP SLAVE could also do this, but it ends up needing more synchronization. That is, the STOP SLAVE command also tries to kill the VIO of the replica, which makes a race with the IO thread to try and send the COM_QUIT before this happens (which would need more debug_sync to get around). See THD::awake_no_mutex for details as to the killing of the replica’s vio. Notes: - The MariaDB documentation does not make it clear that when one enables semi-sync replication it does not matter if one enables it first in the master or slave. Any order works. Changes done: - The rpl_semi_sync_slave_enabled variable is now a default value for when semisync is started. The variable does not anymore affect semisync if it is already running. This fixes the original reported bug. Internally we now use repl_semisync_slave.get_slave_enabled() instead of rpl_semi_sync_slave_enabled. To check if semisync is active on should check the @@rpl_semi_sync_slave_status variable (as before). 
- The semisync protocol conflicts in the way that the original MySQL/MariaDB client-server protocol was designed (client-server send and reply packets are strictly ordered and includes a packet number to allow one to check if a packet is lost). When using semi-sync the master and slave can send packets at 'any time', so packet numbering does not work. The 'solution' has been that each communication starts with packet number 1, but in some cases there is still a chance that the packet number check can fail. Fixed by adding a flag (pkt_nr_can_be_reset) in the NET struct that one can use to signal that packet number checking should not be done. This is flag is set when semi-sync is used. - Added Master_info::semi_sync_reply_enabled to allow one to configure some slaves with semisync and other other slaves without semisync. Removed global variable semi_sync_need_reply that would not work with multi-master. - Repl_semi_sync_master::report_reply_packet() can now recognize the COM_QUIT packet from semisync slave and not give a "Read semi-sync reply magic number error" error for this case. The slave will be removed from the Ack listener. - On Windows, don't stop semisync Ack listener just because one slave connection is using socket_id > FD_SETSIZE. - Removed busy loop in Ack_receiver::run() by using "Self-pipe trick" to signal new slave and stop Ack_receiver. - Changed some Repl_semi_sync_slave functions that always returns 0 from int to void. - Added Repl_semi_sync_slave::slave_reconnect(). - Removed dummy_function Repl_semi_sync_slave::reset_slave(). - Removed some duplicate semisync notes from the error log. - Add test of "if (get_slave_enabled() && semi_sync_need_reply)" before calling Repl_semi_sync_slave::slave_reply(). (Speeds up the code as we can skip all initializations). - If epl_semisync_slave.slave_reply() fails, we disable semisync for that connection. - We do not call semisync.switch_off() if there are no active slaves. 
Instead we check in Repl_semi_sync_master::commit_trx() if there are no active threads. This simplices the code. - Changed assert() to DBUG_ASSERT() to ensure that the DBUG log is flushed in case of asserts. - Removed the internal rpl_semi_sync_slave_status as it is not needed anymore. The @@rpl_semi_sync_slave_status status variable is now mapped to rpl_semi_sync_enabled. - Removed rpl_semi_sync_slave_enabled as it is not needed anymore. Repl_semi_sync_slave::get_slave_enabled() contains the active status. - Added checking that we do not add a slave twice with Ack_receiver::add_slave(). This could happen with old code. - Removed Repl_semi_sync_master::check_and_switch() as it is not needed anymore. - Ensure that when we call Ack_receiver::remove_slave() that the slave is removed from the listener before function returns. - Call listener.listen_on_sockets() outside of mutex for better performance and less contested mutex. - Ensure that listening is ignoring newly added slaves when checking for responses. - Fixed the master ack_receiver listener is not killed if there are no connected slaves (and thus stop semisync handling of future connections). This could happen if all slaves sockets where would be marked as unreliable. - Added unlink() to base_ilist_iterator and remove() to I_List_iterator. This enables us to remove 'dead' slaves in Ack_recever::run(). - kill_zombie_dump_threads() now does killing of dump threads properly. - It can now kill several threads (should be impossible but could happen if IO slaves reconnects very fast). - We now wait until the dump thread is done before starting the dump. - Added an error if kill_zombie_dump_threads() fails. - Set thd->variables.server_id before calling kill_zombie_dump_threads(). This simplies the code. - Added a lot of comments both in code and tests. - Removed DBUG_EVALUATE_IF "failed_slave_start" as it is not used. Test changes: - rpl.rpl_session_var2 added which runs rpl.rpl_session_var test with semisync enabled. 
- Some timings changed slightly with startup of slave which caused rpl_binlog_dump_slave_gtid_state_info.text to fail as it checked the error log file before the slave had started properly. Fixed by adding wait_for_pattern_in_file.inc that allows waiting for the pattern to appear in the log file. - Tests have been updated so that we first set rpl_semi_sync_master_enabled on the master and then set rpl_semi_sync_slave_enabled on the slaves (this is according to how the MariaDB documentation documents how to set up semi-sync). - Error text "Master server does not have semi-sync enabled" has been replaced with "Master server does not support semi-sync" for the case when the master supports semi-sync but semi-sync is not enabled. Other things: - Some trivial cleanups in Repl_semi_sync_master::update_sync_header(). - We should in 11.3 change the default value for rpl-semi-sync-master-wait-no-slave from TRUE to FALSE as TRUE does not make much sense as default. The main difference with using FALSE is that we do not wait for semisync Ack if there are no slave threads. In the case of TRUE we wait once, which did not bring any notable benefits except slower startup of master configured for using semisync. Co-author: Brandon Nesterenko <brandon.nesterenko@mariadb.com> This solves the problem reported in MDEV-32960 where a new slave may not be registered in time and the master disables semi sync because of that.	2024-01-23 13:03:11 +02:00
Sergei Golubchik	e95bba9c58	Merge branch '10.5' into 10.6	2023-12-17 11:20:43 +01:00
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Monty	83214c3406	Improve reporting from sf_report_leaked_memory() Other things: - Added DBUG_EXECUTE_IF("print_allocated_thread_memory") at end of query to easier find not freed memory allocated by THD - Removed free_root() from plugin_init() that did nothing.	2023-11-27 19:08:14 +02:00
Anel Husakovic	ff0bade2f8	MDEV-28367: BACKUP LOCKS on table to be accessible to those with database LOCK TABLES privileges - Allow database level access via `LOCK TABLES` to execute statement `BACKUP [un]LOCK <object>` - `BACKUP UNLOCK` works only with `RELOAD` privilege. In case there is `LOCK TABLES` privilege without `RELOAD` privilege, we check if backup lock is taken before. If it is not we raise an error of missing `RELOAD` privilege. - We had to remove any error/warnings from calling functions because `thd->get_stmt_da()->m_status` will be set to error and will break `my_ok()`. - Added missing test coverage of `RELOAD` privilege to `main.grant.test` Reviewer: <daniel@mariadb.org>	2023-11-23 11:52:12 +11:00
Oleksandr Byelkin	b83c379420	Merge branch '10.5' into 10.6	2023-11-08 15:57:05 +01:00
Oleksandr Byelkin	6cfd2ba397	Merge branch '10.4' into 10.5	2023-11-08 12:59:00 +01:00
Monty	2447172afb	Ensure that process "State" is properly cleaned after query execution In some cases "SHOW PROCESSLIST" could show "Reset for next command" as State, even if the previous query had finished properly. Fixed by clearing State after end of command and also setting the State for the "Connect" command. Other things: - Changed usage of 'thd->set_command(COM_SLEEP)' to 'thd->mark_connection_idle()'. - Changed thread_state_info() to return "" instead of NULL. This is just a safety measure and in line with the logic of the rest of the function.	2023-11-07 10:07:30 +02:00
Sergei Golubchik	547dfc0e01	MDEV-32500 Information schema leaks table names and structure to unauthorized users standard table KEY_COLUMN_USAGE should only show keys where a user has some privileges on every column of the key standard table TABLE_CONSTRAINTS should show tables where a user has any non-SELECT privilege on the table or on any column of the table standard table REFERENTIAL_CONSTRAINTS is defined in terms of TABLE_CONSTRAINTS, so the same rule applies. If the user has no rights to see the REFERENCED_TABLE_NAME value, it should be NULL SHOW INDEX (and STATISTICS table) is non-standard, but it seems reasonable to use the same logic as for KEY_COLUMN_USAGE.	2023-10-23 17:40:03 +02:00
Sergei Golubchik	81c88ab7cd	MDEV-28820 MyISAM wrong server status flags MyISAM tables no longer take transactional metadata locks unless there already is an active transaction.	2023-10-17 14:32:05 +02:00
Marko Mäkelä	0dd25f28f7	Merge 10.5 into 10.6	2023-09-11 14:46:39 +03:00
Marko Mäkelä	f8f7d9de2c	Merge 10.4 into 10.5	2023-09-11 11:29:31 +03:00
Sergei Golubchik	e78ce63291	MDEV-17711 Assertion `arena_for_set_stmt== 0' failed in LEX::set_arena_for_set_stmt upon SET STATEMENT restore SET STATEMENT variables between statements in a multi-statement	2023-09-06 22:38:41 +02:00
Daniel Black	9b1b4a6f69	MDEV-31545 Revert "Fix gcc warning for wsrep_plug" This reverts commit `38fe266ea9`. The correct fix was pushed to the 10.4 branch (`fbc157ab33`)	2023-08-30 16:24:38 +10:00
Oleksandr Byelkin	6bf8483cac	Merge branch '10.5' into 10.6	2023-08-01 15:08:52 +02:00
Oleksandr Byelkin	7564be1352	Merge branch '10.4' into 10.5	2023-07-26 16:02:57 +02:00
Oleksandr Byelkin	f52954ef42	Merge commit '10.4' into 10.5	2023-07-20 11:54:52 +02:00
Daniel Black	fbc157ab33	MDEV-31545 GCC 13 -Wdangling-pointer in execute_show_status() The pointer was used deep in the call path. Resolve this by setting the pointer to NULL at the end of the function. Tested with gcc-13.3.1 (fc38) The warning disable `38fe266ea9` can be reverted in 10.6+ on merge.	2023-07-19 11:20:29 +10:00
Marko Mäkelä	2855bc53bc	Merge 10.5 into 10.6	2023-07-05 16:40:22 +03:00
Alexander Barkov	0d3720c12a	MDEV-30680 Warning: Memory not freed: 280 on mangled query, LeakSanitizer: detected memory leaks The parser works as follows: The rule expr_lex returns a pointer to a newly created sp_expr_lex instance which is not linked to any MariaDB structures yet - it is pointed only from a Bison stack variable. The sp_expr_lex instance gets linked to other structures (such as sp_instr_jump_if_not) later, after scanning some following grammar. Problem before the fix: If a parse error happened immediately after expr_lex (before it got linked), the created sp_expr_lex value got lost causing a memory leak. Fix: - Using Bison's "destructor" directive to free the results of expr_lex on parse/oom errors. - Moving the call for LEX::cleanup_lex_after_parse_error() from MYSQL_YYABORT and yyerror inside parse_sql(). This is needed because Bison calls destructors after yyerror(), while it's important to delete the sp_expr_lex instance before LEX::cleanup_lex_after_parse_error(). The latter frees the memory root containing the sp_expr_lex instance. After this change the code block are executed in the following order: - yyerror() -- now only raises the error to DA (no cleanup done any more) - %destructor { delete $$; } <expr_lex> -- destructs the sp_expr_lex instance - LEX::cleanup_lex_after_parse_error() -- frees the memory root containing the sp_expr_lex instance - Removing the "delete sublex" related code from restore_lex(): - restore_lex() is called in most cases on success, when delete is not needed. - There is one place when restore_lex() is called on error: In sp_create_assignment_instr(). But in this case LEX::sp_lex_in_use is true anyway. The patch adds a new DBUG_ASSERT(lex->sp_lex_in_use) to guard this.	2023-06-29 13:34:22 +04:00
Vicentiu Ciorbaru	38fe266ea9	Fix gcc warning for wsrep_plug	2023-06-25 16:15:08 +03:00
Teemu Ollakka	f307160218	MDEV-29293 MariaDB stuck on starting commit state This commit contains a merge from 10.5-MDEV-29293-squash into 10.6. Although the bug MDEV-29293 was not reproducible with 10.6, the fix contains several improvements for wsrep KILL query and BF abort handling, and addresses the following issues: * MDEV-30307 KILL command issued inside a transaction is problematic for galera replication: This commit will remove KILL TOI replication, so Galera side transaction context is not lost during KILL. * MDEV-21075 KILL QUERY maintains nodes data consistency but breaks GTID sequence: This is fixed as well as KILL does not use TOI, and thus does not change GTID state. * MDEV-30372 Assertion in wsrep-lib state: This was caused by BF abort or KILL when local transaction was in the middle of group commit. This commit disables THD::killed handling during commit, so the problem is avoided. * MDEV-30963 Assertion failure !lock.was_chosen_as_deadlock_victim in trx0trx.h:1065: The assertion happened when the victim was BF aborted via MDL while it was committing. This commit changes MDL BF aborts so that transactions which are committing cannot be BF aborted via MDL. The RQG grammar attached in the issue could not reproduce the crash anymore. Original commit message from 10.5 fix: MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). 
If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Make galera_var_retry_autocommit result more readable by echoing cases and expectations into result. Only one expected result for reap to verify that server returns expected status for query. * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. 
Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_bf_abort_registering to check that registering trx gets BF aborted through MDL. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:42:05 +02:00
Teemu Ollakka	3f59bbeeae	MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to avoid KILL to interrupt commit processing. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connetion. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. 
Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:39:43 +02:00
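
Note on commit 736449d30f (MDEV-34205) above: its message states that the length argument passed to strxnmov() must be one less than the size of the destination buffer. A minimal sketch of why, assuming a helper that copies at most len characters and then appends a terminating NUL. bounded_concat() and copy_bounded() are hypothetical stand-ins written for this illustration; they are not the server's strxnmov() implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy characters from src to dst until src ends
// or dst reaches end. Returns the new write position.
static char *copy_bounded(char *dst, const char *end, const char *src)
{
  while (*src != '\0' && dst < end)
    *dst++ = *src++;
  return dst;
}

// Hypothetical stand-in for a strxnmov()-style call (assumption, not the
// real mysys function): concatenates a and b into dst, copying at most
// len characters, then writes a terminating NUL. It can therefore write
// up to len + 1 bytes, so callers must pass sizeof(buffer) - 1.
static char *bounded_concat(char *dst, std::size_t len,
                            const char *a, const char *b)
{
  const char *end = dst + len;
  dst = copy_bounded(dst, end, a);
  dst = copy_bounded(dst, end, b);
  *dst = '\0';  // the extra byte the caller has to leave room for
  return dst;   // points at the terminating NUL
}
```

Called as bounded_concat(buf, sizeof(buf) - 1, ...), the terminating NUL always lands inside buf. Passing sizeof(buf) instead would let the NUL be written one byte past the end when the input is too long, which is the kind of overflow the commit message describes.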


7794 commits