mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Sergei Golubchik	cb7c99674e	sporadic failure of perfschema.func_file_io --- func_file_io.result +++ func_file_io.reject @@ -134,7 +134,7 @@ Variable_name Value Performance_schema_accounts_lost 0 Performance_schema_cond_classes_lost 0 -Performance_schema_cond_instances_lost 0 +Performance_schema_cond_instances_lost 5 Performance_schema_digest_lost 0 Performance_schema_file_classes_lost 0 Performance_schema_file_handles_lost 0	2024-05-05 21:37:07 +02:00
Kristian Nielsen	4b4db4a8e5	MDEV-34042: Deadlock kill of XA PREPARE can break replication / rpl.rpl_parallel_multi_domain_xa sporadic failure Refinement of the original patch. Move the code to reset the kill up into the parent class Xid_apply_log_event, to also fix the similar issue for XA COMMIT. Increase the number of slave retries in the test case rpl.rpl_parallel_multi_domain_xa to fix some sporadic failures. The test generates massive amounts of conflicting transactions in multiple independent domains, which can cause multiple rollback+retry for a transaction as it conflicts with transactions in other domains one-by-one. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-05-05 19:01:56 +02:00
Sergei Golubchik	3ee6f69d49	sporadic failures of binlog_encryption.rpl_parallel_gco_wait_kill CURRENT_TEST: binlog_encryption.rpl_parallel_gco_wait_kill mysqltest: In included file "./suite/rpl/t/rpl_parallel_gco_wait_kill.test": included from /home/buildbot/amd64-ubuntu-2004-debug/build/mysql-test/suite/binlog_encryption/rpl_parallel_gco_wait_kill.test at line 2: At line 334: Can't initialize replace from 'replace_result $thd_id THD_ID' An sql thread can reach the "Slave has read all relay log" state and then start reading relay log again. Let's use a more generic pattern to retrieve the sql thread ID even if it's not in the "read all relay log" state.	2024-05-02 22:14:19 +02:00
Kristian Nielsen	e365877bae	MDEV-33798: ROW base optimistic deadlock with concurrent writes on same table One case is conflicting transactions T1 and T2 with different domain id, in optimistic parallel replication in non-GTID mode. Then T2 will wait_for_prior_commit on T1; and if T1 got a row lock wait on T2 it would hang, as different domains caused the deadlock kill to be skipped in thd_rpl_deadlock_check(). More generally, if we have transactions T1 and T2 in one domain/master connection, and independent transactions U in another, then we can still deadlock like this: T1 row lock wait on U; U row lock wait on T2; T2 wait_for_prior_commit on T1. This commit enforces the deadlock kill in these cases. If the waited-for transaction is speculatively applied, then it will be deadlock killed in case of a conflict, even if the two transactions are in different domains or master connections. Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-05-02 21:07:45 +02:00
Sergei Golubchik	dba9d19249	atomic.alter_table test is too slow for MSAN	2024-04-30 21:59:38 +02:00
Thirunarayanan Balathandayuthapani	156761db3b	MDEV-31161 Assertion failures upon adding a too long key to table with COMPRESSED row Problem: ======= During InnoDB non-rebuild online alter operation, InnoDB set the dummy log to clustered index online log. This can be used by concurrent DML to identify whether the table undergoes online DDL. InnoDB fails to reset the dummy log of clustered index in case of error happened during prepare phase. Solution: ======== Reset the InnoDB clustered index online log in case of error during prepare phase.	2024-04-30 20:40:29 +05:30
Sergei Golubchik	b663c935a4	don't use normal diffs in *.rdiff files they aren't robust enough and can easily apply incorrectly (this fixes the failure of innodb.insert_into_empty,4k after the merge)	2024-04-30 16:57:07 +02:00
Sergei Golubchik	0aae11ac28	Merge branch '10.6' into 10.11	2024-04-30 16:56:49 +02:00
Thirunarayanan Balathandayuthapani	f378e76434	MDEV-33980 mariadb-backup --backup is missing retry logic for undo tablespaces Problem: ======== - Currently mariabackup has to reread pages in case they are modified concurrently by the server. But while reading the undo tablespace, mariabackup failed to reread the page in case of error. Fix: === Mariabackup --backup functionality should have retry logic while reading the undo tablespaces.	2024-04-30 16:15:26 +05:30
Andrei	ae03374f29	MDEV-34030 rpl.rpl_using_gtid_default can fail in (BB) mtr The test's header is not written to follow strictly a correct order of checks by mtr at test start which may lead to an error. E.g ./mtr --mysqld=--binlog-format=row rpl.rpl_using_gtid_default to At line 175: query 'SET GLOBAL gtid_slave_pos= ""' failed: ER_SLAVE_MUST_STOP (1198): This operation cannot be performed as you have a running slave ''; run STOP SLAVE '' first Fixed to require the binlog format first in the test header.	2024-04-30 12:40:50 +03:00
Andrei	6a63204c36	MDEV-34029 rpl.rpl_heartbeat can fail when (BB) mtr reorders tests rpl.rpl_heartbeat turns out to miss a standard include/master-slave header which made it potentially in BB and actually with manual mtr failing as it may have used a previous slave GTID state. Fixed with installing the standard rpl suite header/footer in the test file.	2024-04-30 12:40:50 +03:00
Rucha Deodhar	d7df63e1c9	MDEV-19487: JSON_TYPE doesnt detect the type of String Values (returns NULL) and for Date/DateTime returns "INTEGER" Analysis: When the first character of json is scanned it is number. Based on that integer is returned. Fix: Scan rest of the json before returning the final result to ensure json is valid in the first place in order to have a valid type.	2024-04-29 22:32:17 +05:30
Thirunarayanan Balathandayuthapani	a586b6dbc8	MDEV-22855 Assertion `!field->prefix_len \|\| field->fixed_len == field->prefix_len' failed in btr_node_ptr_max_size Problem: ======== - InnoDB wrongly calculates the record size in btr_node_ptr_max_size() when prefix index of the column has to be stored externally. Fix: ==== - InnoDB should add the maximum field size to record size when the field is a fixed length one.	2024-04-29 16:42:26 +05:30
Alexander Barkov	c6e3fe29d4	MDEV-30646 View created via JSON_ARRAYAGG returns incorrect json object Backporting `add782a13e` from 10.6, this fixes the problem.	2024-04-29 13:47:45 +04:00
Sergei Golubchik	c1f3eff53f	Merge branch '10.5' into 10.6	2024-04-29 10:08:58 +02:00
Alexander Barkov	dc25d600ee	MDEV-21058 CREATE TABLE with generated column and RLIKE results in sigabrt Regexp_processor_pcre::fix_owner() called Regexp_processor_pcre::compile(), which could fail on the regex syntax error in the pattern and put an error into the diagnostics area. However, the callers: - Item_func_regex::fix_length_and_dec() - Item_func_regexp_instr::fix_length_and_dec() still returned "false" in such cases, which made the code crash later inside Diagnostics_area::set_ok_status(). Fix: - Change the return type of fix_owner() from "void" to "bool" and return "true" whenever an error is put to the DA (e.g. on the syntax error in the pattern). - Fixing fix_length_and_dec() of the mentioned Item_func_xxx classes to return "true" if fix_owner() returned "true".	2024-04-29 11:08:07 +04:00
mkaruza	136358036d	MDEV-18590: galera.versioning_trx_id: Test failure: mysqltest: Result content mismatch Replicated events have time associated with them from originating node which will be used for commit timestamp. Associated time can be set in past before event is even applied. For WSREP replication we don't need to use time information from event. Addressed review comments: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-27 18:40:58 +02:00
Jan Lindström	1532f12058	MDEV-33898 : Galera test failure on galera.MW-369 Tests using MW-369.inc sometimes hung after signaling two debug sync points inside a Galera library. Replaced Galera library sync point with server code sync point when possible and added more wait_conditions to make sure we are in correct state. Tests affected: MW-369, MW-402, MDEV-27276, and mysql-wsrep#332. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-26 20:21:44 +02:00
Julius Goryavsky	288ea9e146	galera SST scripts: parsing CN in certificates This commit contains a fix for the code that extracts and parses the CN (common name, domain name) record from certificates using the openssl utility. This code is also made common to the rsync and mariabackup scripts. There is also some systematization of the use of 'printf' and 'echo' builtins/utilities.	2024-04-26 20:21:44 +02:00
Sergei Golubchik	7ff649315e	sporadic failures of rpl.rpl_parallel_multi_domain_xa it's a slow test, the slave needs to catch up, reading >1500 transactions. A default MASTER_GTID_WAIT() timeout in sync_with_master_gtid.inc is 120 seconds, which might be not enough for a slow/overloaded slave. Let's wait forever or until ./mtr --testcase-timeout, whatever comes first.	2024-04-26 14:24:32 +02:00
Jan Lindström	b3e531a3cc	MDEV-33896 : Galera test failure on galera_3nodes.MDEV-29171 Based on logs we might start SST before donor has reached Primary state. Because this test shutdowns all nodes we need to make sure when we start nodes that previous nodes have reached Primary state and joined the cluster. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-25 16:32:06 +02:00
Sergei Golubchik	9e92582024	sporadic failures of rpl.rpl_parallel_sbm the test waits for the event to get stuck on MASTER_DELAY, but on a slow/overloaded slave the event might pass MASTER_DELAY before the test starts waiting. Wait for the event to get stuck on the LOCK TABLES (after MASTER_DELAY), the event cannot avoid that.	2024-04-25 12:47:23 +02:00
Marko Mäkelä	0936c13809	MDEV-33993 Possible server hang on DROP INDEX or RENAME INDEX commit_try_norebuild(): Add the parameter statistics_exist, similar to commit_try_rebuild(). If the InnoDB statistics tables did not exist, we will not attempt to update statistics later on during the transaction. Thanks to Matthias Leich for originally reproducing this scenario.	2024-04-25 13:44:10 +03:00
Kristian Nielsen	553a4d6271	MDEV-33602: Sporadic test failure in rpl.rpl_gtid_stop_start The test could fail with a duplicate key error because switching to non-GTID mode could start at the wrong old-style position. The position could be wrong when the previous GTID connect was stopped before receiving the fake GTID list event which gives the old-style position corresponding to the GTID connected position. Work-around by injecting an extra event and syncing the slave before switching to non-GTID mode. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-25 11:00:45 +02:00
Thirunarayanan Balathandayuthapani	8c8b7da017	MDEV-33979 Disallow bulk insert operation during partition update statement Problem: ======== - Partition update operation enables the bulk insert for the transaction while moving the row between partitions. This leads to debug assert failure while removing the row from one of the partition. Solution: ======== - Disallow the bulk insert operation for non-insert operation of partition table.	2024-04-25 10:50:34 +05:30
Sergei Golubchik	9cf718859f	cleanup: use THD_STAGE_INFO, not thd_proc_info and put master-slave.inc last in the series of includes	2024-04-24 22:08:52 +02:00
Sergei Golubchik	7d5e08de6b	MDEV-20157 perfschema.stage_mdl_function failed in buildbot with wrong result MDL wait consists of short 1 second waits (this is not configurable) repeated until lock_wait_timeout is reached. The stage is changed to Waiting and back every second. To have predictable result in the test the query should filter all sequences of X, "Waiting for MDL", X, leaving just X.	2024-04-24 18:09:58 +02:00
Sergei Golubchik	259394aed7	disable mariabackup.incremental_encrypted,64k on 32bit it allocates 1GB of memory, it causes failures in CI	2024-04-24 18:09:20 +02:00
Sergei Golubchik	e2f95ebbcb	fix galera_3nodes.galera_gtid_consistency to work with nc like other galera tests do	2024-04-24 18:09:20 +02:00
Brandon Nesterenko	8c7992165b	MDEV-33672: 10.11 Fix for Two Phase Alter Flags Extends `89c907bd4f` to account for binlog_two_phase_alter flags in a Gtid log event. I.e., if the FL_COMMIT_ALTER_E1 or FL_ROLLBACK_ALTER_E2 flags are set in the event flags, yet the length of the event is too short to hold the value, then set the event as invalid	2024-04-24 13:19:36 +02:00
Thirunarayanan Balathandayuthapani	0c55d854fe	MDEV-33334 mariadb-backup fails to preserve innodb_encrypt_tables Problem: ======== mariabackup --prepare fails to write the pages in encrypted format. This issue happens only for default encrypted table when innodb_encrypt_tables variable is enabled. Fix: ==== backup process should write the value of innodb_encrypt_tables variable in configuration file. prepare should enable the variable based on configuration file.	2024-04-24 16:27:31 +05:30
Thirunarayanan Balathandayuthapani	c3460e6904	MDEV-33970 Assertion `!m.first->second.is_bulk_insert()' failed in trx_undo_report_row_operation() In case of partition insert, InnoDB fails to end the bulk insert for one of the partition. It leads to bulk insert operation for the consecutive delete statement. trx_t::bulk_insert_apply_for_table(): Irrespective of bulk insert value, InnoDB should end the bulk insert for the table.	2024-04-23 16:26:02 +05:30
Sergei Golubchik	e73181112f	MDEV-16944 fix galera tests followup for `061adae9a2`	2024-04-23 10:55:35 +02:00
Sergei Golubchik	f243c73788	sporadic failures of rpl.rpl_rewrite_db_sys_vars first stop the slave, then run commands on the master that are supposed to fail on the slave, then start the slave. if you swap first two steps, the slave might get and execute those commands before it's stopped, which will fail the test. also, improve debugability	2024-04-22 21:02:11 +02:00
Sergei Golubchik	a74846354e	fix failing large_tests.maria_recover_encrypted update results	2024-04-22 18:38:39 +02:00
Sergei Golubchik	466bc8f7e0	fix failing large_tests.maria_recover_encrypted update results	2024-04-22 17:22:11 +02:00
Sergei Golubchik	018d537ec1	Merge branch '10.6' into 10.11	2024-04-22 15:23:10 +02:00
Sergei Golubchik	75488a57f2	archive.archive and main.mysqlbinlog_row_compressed fixes for zlib-ng	2024-04-22 00:14:03 +02:00
Sergei Golubchik	e83d92ee5e	sporadic failures of rpl.rpl_semi_sync_fail_over in the $case=2 - it's wrong to kill after the first binlog EOF, because that might happen between INSERT(4) and INSERT(5). So, wait for the slave to acknowledge INSERT(5) before killing the master, that is, both connection threads must pass repl_semisync_master.wait_after_sync()	2024-04-21 22:54:52 +02:00
Sergei Golubchik	6242783f24	rpl.rpl_semi_sync_fail_over improve debugability	2024-04-21 14:03:26 +02:00
Sergei Golubchik	a4b6409ff6	sporadic failures of binlog_encryption.rpl_parallel_slave_bgc_kill do CHANGE MASTER before sync_with_master to have the slave in a predictable fully synced state before the next test	2024-04-21 01:17:31 +02:00
Sergei Golubchik	d8368ae289	Merge '10.5' into 10.6	2024-04-20 14:47:26 +02:00
Kristian Nielsen	0c249ad718	MDEV-30232: rpl.rpl_gtid_crash fails sporadically in BB The root cause of the failure is a bug in the Linux network stack: https://lore.kernel.org/netdev/87sf0ldk41.fsf@urd.knielsen-hq.org/T/#u If the slave does a connect(2) at the exact same time that kill -9 of the master process closes the listening socket, the FIN or RST packet is lost in the kernel, and the slave ends up timing out waiting for the initial communication from the server. This timeout defaults to --slave-net-timeout=120, which causes include/master_gtid_wait.inc to time out first and fail the test. Work-around this problem by reducing the --slave-net-timeout for this test case. If this problem turns up in other tests, we can consider reducing the default value for all tests. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-20 13:41:08 +02:00
Sergei Golubchik	4a2e03453a	MDEV-33952 galera_create_table_as_select fails sporadically disable until fixed	2024-04-19 22:09:41 +02:00
Thirunarayanan Balathandayuthapani	8a3755cc29	MDEV-33934 Assertion `!check_foreigns' failed in bulk_insert_apply_for_table(dict_table_t*) This issue is caused by commit `188c5da72a` (MDEV-32453). trx_t::bulk_insert_apply_for_table(): Remove the assert check_unique_secondary and check_foreigns. InnoDB can apply the bulk insert operation even after disabling the check_foreigns and check_unique_secondary variable.	2024-04-19 11:05:44 +05:30
Marko Mäkelä	bb2e125d07	Merge 10.5 into 10.6 This excludes commit `040069f4ba` because it is specific to innodb_sync_debug, which had been removed in commit `ff5d306e29`.	2024-04-18 07:14:56 +03:00
Brandon Nesterenko	0ad52e4d6a	MDEV-27512: Assertion !thd->transaction_rollback_request failed in rows_event_stmt_cleanup If replicating an event in ROW format, and InnoDB detects a deadlock while searching for a row, the row event will error and rollback in InnoDB and indicate that the binlog cache also needs to be cleared, i.e. by marking thd->transaction_rollback_request. In the normal case, this will trigger an error in Rows_log_event::do_apply_event() and cause a rollback. During the Rows_log_event::do_apply_event() cleanup of a successful event application, there is a DBUG_ASSERT in log_event_server.cc::rows_event_stmt_cleanup(), which sets the expectation that thd->transaction_rollback_request cannot be set because the general rollback (i.e. not the InnoDB rollback) should have happened already. However, if the replica is configured to skip deadlock errors, the rows event logic will clear the error and continue on, as if no error happened. This results in thd->transaction_rollback_request being set while in rows_event_stmt_cleanup(), thereby triggering the assertion. This patch fixes this in the following ways: 1) The assertion is invalid, and thereby removed. 2) The rollback case is forced in rows_event_stmt_cleanup() if transaction_rollback_request is set. Note the differing behavior between transactions which are skipped due to deadlock errors and other errors. When a transaction is skipped due to an ignored deadlock error, the entire transaction is rolled back and skipped (though note MDEV-33930 which allows statements in the same transaction after the deadlock-inducing one to commit). When a transaction is skipped due to ignoring a different error, only the erroring statements are rolled-back and skipped - the rest of the transaction will execute as normal. The effect of this can be seen in the test results. 
The added test case to rpl_skip_error.test shows that only statements which are ignored due to non-deadlock errors are ignored in larger transactions. A diff between rpl_temporary_error2_skip_all.result and rpl_temporary_error2.result shows that all statements in the errored transaction are rolled back (diff pasted below): : diff rpl_temporary_error2.result rpl_temporary_error2_skip_all.result 49c49 < 2 1 --- > 2 NULL 51c51 < 4 1 --- > 4 NULL 53c53 < * There will be two rows in t2 due to the retry. --- > * There will be one row in t2 because the ignored deadlock does not retry. 57d56 < 1 59c58 < 1 --- > 0 Reviewed By: ============ Andrei Elkin <andrei.elkin@mariadb.com>	2024-04-17 11:14:21 -06:00
Vladislav Vaintroub	061adae9a2	MDEV-16944 Fix file sharing issues on Windows in mysqltest On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to conflicting share modes between processes accessing the same file can result in CreateFile failures. mysys' my_open() already incorporates a workaround by implementing wait/retry logic on Windows. But this does not help if files are opened using shell redirection like mysqltest traditionally did it, i.e via --echo exec "some text" > output_file In such cases, it is cmd.exe, that opens the output_file, and it won't do any sharing-violation retries. This commit addresses the issue by introducing a new built-in command, 'write_line', in mysqltest. This new command serves as a brief alternative to 'write_file', with a single line output, that also resolves variables like "exec" would. Internally, this command will use my_open(), and therefore retry-on-error logic. Hopefully this will eliminate the very sporadic "can't open file because it is used by another process" error on CI.	2024-04-17 16:52:37 +02:00
Vladislav Vaintroub	173847b76a	Do not run maria_recover_encrypted with embedded. It uses shutdown/restart etc., features not compatible with the embedded server. Also add have_debug.inc, since it uses the debug_dbug variable	2024-04-17 16:52:17 +02:00
Marko Mäkelä	829cb1a49c	Merge 10.5 into 10.6	2024-04-17 14:14:58 +03:00
mariadb-DebarunBanerjee	040069f4ba	MDEV-33431 Latching order violation reported fil_system.sys_space.latch and ibuf_pessimistic_insert_mutex Issue: ------ The actual order of acquisition of the IBUF pessimistic insert mutex (SYNC_IBUF_PESS_INSERT_MUTEX) and IBUF header page latch (SYNC_IBUF_HEADER) w.r.t space latch (SYNC_FSP) differs from the order defined in sync0types.h. It was not discovered earlier as the path to ibuf_remove_free_page was not covered by the mtr test. Ideal order and one defined in sync0types.h is as follows. SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX -> SYNC_FSP In ibuf_remove_free_page, we acquire space latch earlier and we have the order as follows resulting in the assert with innodb_sync_debug=on. SYNC_FSP -> SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX Fix: --- We do maintain this order in other places and there doesn't seem to be any real issue here. To reduce impact in GA versions, we avoid doing extensive changes in mutex ordering to match the current SYNC_IBUF_PESS_INSERT_MUTEX order. Instead we relax the ordering check for IBUF pessimistic insert mutex using SYNC_NO_ORDER_CHECK.	2024-04-17 15:16:50 +05:30
Marko Mäkelä	3a3fe3005d	Merge 10.4 into 10.5	2024-04-17 10:10:26 +03:00
Jan Lindström	4aeba2590b	MDEV-33895 : Galera test failure on galera_sr.MDEV-25718 Test was waiting INSERT-clause to make rollback but wait_condition was too tight. State could be Freeing items or Rollback. Fixed wait_condition to expect one of them.	2024-04-17 09:41:15 +03:00
Sergei Golubchik	41e7ceb0ac	MDEV-33889 Read only server throws error when running a create temporary table as select statement create_partitioning_metadata() should only mark transaction r/w if it actually did anything (that is, the table is partitioned). otherwise it's a no-op, called even for temporary tables and it shouldn't do anything at all	2024-04-16 20:43:31 +02:00
Oleksandr Byelkin	9b18275623	Merge branch '10.4' into 10.5	2024-04-16 11:04:14 +02:00
Kristian Nielsen	16aa4b5f59	Merge from 10.4 to 10.5 Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-15 17:46:49 +02:00
Daniel Black	ea810b04cb	MDEV-30676 rpl.parallel_backup* tests sometimes fail Raise innodb_lock_wait_timeout from 1 to 5	2024-04-15 15:45:03 +10:00
Sergei Golubchik	69b5fdf32a	galera/suite.pm: perl warning Unescaped left brace in regex is passed through in regex	2024-04-13 16:28:13 +02:00
Vlad Lesin	d7fc975cfe	MDEV-33802 Weird read view after ROLLBACK of other transactions. If some unique key fields are nullable, there can be several records with the same key fields in a unique index with at least one key field equal to NULL, as NULL != NULL. When a transaction is resumed after waiting on a record with at least one key field equal to NULL, and the record stored in the persistent cursor has been deleted, the persistent cursor can be restored to a record with all key fields equal to the stored ones, but with at least one field equal to NULL. Such a record is wrongly treated as a record with the same unique key as the one stored in the persistent cursor, which is wrong as NULL != NULL. The fix is to check if at least one unique field is NULL in the restored persistent cursor position, and, if so, then don't treat the record as one with the same unique key as in the stored record key. dict_index_t::nulls_equal was removed, as it was initially developed for "intrinsic tables", which never existed in MariaDB, and there is no code which would set it to "true". Reviewed by Marko Mäkelä.	2024-04-12 18:13:51 +03:00
Brandon Nesterenko	a6aecbb036	MDEV-10684: rpl.rpl_domain_id_filter_restart fails in buildbot The test failure in rpl.rpl_domain_id_filter_restart is caused by MDEV-33887. That is, the test uses master_pos_wait() (called indirectly by sync_slave_with_master) to try and wait for the replica to catch up to the master. However, the waited on transaction is ignored by the configured CHANGE MASTER TO IGNORE_DOMAIN_IDS=() As MDEV-33887 reports, due to the IO thread updating the binlog coordinates and the SQL thread updating the GTID state, if the replica is stopped in-between these updates, the replica state will be inconsistent. That is, the test expects that the GTID state will be updated, so upon restart, the replica will be up-to-date. However, if the replica is stopped before the SQL thread updates its GTID state, then upon restart, the replica will fetch the previously ignored event, which is no longer ignored upon restart, and execute it. This leads to the sporadic extra row in t2. This patch changes master_pos_wait() to use master_gtid_wait() to ensure the replica state is consistent with the master state.	2024-04-11 09:49:20 -06:00
Sergei Golubchik	340d93a8cc	cleanup: rpl.rpl_semi_sync_shutdown_await_ack avoid using multiple files with the same functionality.	2024-04-11 15:28:55 +02:00
Sergei Golubchik	e5c9904eba	make innodb.monitor test idempotent	2024-04-11 14:53:12 +02:00
Sergei Golubchik	41296a07c8	Merge branch '10.5' into 10.6	2024-04-11 13:58:22 +02:00
Thirunarayanan Balathandayuthapani	863f5996f2	MDEV-33868 Assertion `trx->bulk_insert' failed in innodb_prepare_commit_versioned - This issue is caused by commit `188c5da72a` (MDEV-32453). InnoDB fails to end the bulk insert for the table after applying the bulk insert operation. This leads to assertion during commit process.	2024-04-11 15:57:54 +05:30
Jan Lindström	cac0fc97cc	MDEV-32974 : Member fails to join due to old seqno in GTID Before MDEV-15158, wsrep xid information was stored in only one place: in the TRX_SYS page. Starting with 10.3, it is not stored there but in the rollback segment header pages, and the latest one is what matters. MDEV-19229 allows the undo tablespaces to be rebuilt when innodb_undo_tablespaces is changed on startup. Previously it was not possible to change that parameter. As a result of these changes, rollback segment header pages could contain several stored wsrep xids. When undo tablespaces were rebuilt, there was an effort to restore the wsrep xid back to the rollback segment header page, but because there were several of them, the latest wsrep xid was overwritten with an older one. trx_rseg_read_wsrep_checkpoint, trx_rseg_init_wsrep_xid: Return true if read xid is wsrep xid, false if not. trx_rseg_mem_restore: Try to read wsrep xid and if it is found copy it to trx_sys.recovered_wsrep_xid if read xid has larger seqno.	2024-04-11 10:18:20 +03:00
Sergei Golubchik	2d2172a5cf	sporadic failures of rpl.rpl_semi_sync_master_shutdown increase the MASTER_CONNECT_RETRY time under valgrind, otherwise the slave gives up retrying before the master is ready also, cosmetic cleanup of rpl_semi_sync_master_shutdown.test	2024-04-10 19:38:39 +02:00
Andrei	0da1653f1b	MDEV-31779 Server crash in Rows_log_event::update_sequence upon replaying binary log The crash when running mysqlbinlog on a binlog file containing a SEQUENCE was caused by the MDEV-29621 fixes, which did not check whether the slave or the binlog applier executes a block introduced there. The block is meaningful only for the parallel slave applier, so it's safe to fix this bug by identifying the actual applier and skipping the block when it's the mysqlbinlog one.	2024-04-10 19:31:39 +03:00
Marko Mäkelä	d824977598	MDEV-33512 Corrupted table after IMPORT TABLESPACE and restart In commit `d74d95961a` (MDEV-18543) there was an error that would cause the hidden metadata record to be deleted, and therefore cause the table to appear corrupted when it is reloaded into the data dictionary cache. PageConverter::update_records(): Do not delete the metadata record, but do validate it. RecIterator::open(): Make the API more similar to 10.6, to simplify merges.	2024-04-10 09:47:44 +03:00
Jan Lindström	0304dbc327	MDEV-25089 : Assertion `error.len > 0' failed in galera::ReplicatorSMM::handle_apply_error() Additional corrections after merge from 10.4 branch Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-10 05:17:54 +02:00
Alexander Barkov	9fb8881ef8	MDEV-28366 GLOBAL debug_dbug setting affected by collation_connection=utf16... When the system variable @@debug_dbug was assigned to some expression, Sys_debug_dbug::do_check() did not properly convert the value from the expression character set to utf8. So the value was erroneously reinterpreted as utf8 without conversion. In case of a tricky expression character set (e.g. utf16le), this led to unexpected results. Fix: Re-using Sys_var_charptr::do_string_check() in Sys_debug_dbug::do_check().	2024-04-10 06:09:45 +04:00
Brandon Nesterenko	952ab9a596	MDEV-30260: Slave crashed:reload_acl_and_cache during shutdown The signal handler thread can use various different runtime resources when processing a SIGHUP (e.g. master-info information) due to calling into reload_acl_and_cache(). Currently, the shutdown process waits for the termination of the signal thread after performing cleanup. However, this could cause resources actively used by the signal handler to be freed while reload_acl_and_cache() is processing. The specific resource that caused MDEV-30260 is a race condition for the hostname_cache, such that mysqld would delete it in clean_up()::hostname_cache_free(), before the signal handler would use it in reload_acl_and_cache()::hostname_cache_refresh(). Another similar resource is the active_mi/master_info_index. There was a race between its deletion by the main thread in end_slave(), and their usage by the Signal Handler as a part of Master_info_index::flush_all_relay_logs.read(active_mi) in reload_acl_and_cache(). This patch fixes these race conditions by relocating where server shutdown waits for the signal handler to die until after server-level threads have been killed (i.e., as a last step of close_connections()). With respect to the hostname_cache, active_mi and master_info_cache, this ensures that they cannot be destroyed while the signal handler is still active, and potentially using them. Additionally: 1) This requires that Events memory is still in place for SIGHUP handling's mysql_print_status(). So event deinitialization is moved into clean_up(), but the event scheduler still needs to be stopped in close_connections() at the same spot. 2) The function kill_server_thread is no longer used, so it is deleted 3) The timeout to wait for the death of the signal thread was not consistent with the comment. The comment mentioned up to 10 seconds, whereas it was actually 0.01s. The code has been fixed to wait up to 10 seconds. 
4) A warning has been added if the signal handler thread fails to exit in time. 5) Added pthread_join() to end of wait_for_signal_thread_to_end() if it hadn't ended in 10s with a warning. Note this also removes the pthread_detached attribute from the signal_thread to allow for the pthread_join(). Reviewed By: =========== Vladislav Vaintroub <wlad@mariadb.com> Andrei Elkin <andrei.elkin@mariadb.com>	2024-04-09 14:25:13 -06:00
Jan Lindström	33af5575a9	MDEV-25731 : Assertion `mode_ == m_local' failed in void wsrep::client_state::streaming_params(wsrep::streaming_context::fragment_unit, size_t) Problem was that if wsrep_load_data_splitting was used streaming replication (SR) parameters were set for MyISAM table. Galera does not currently support SR for MyISAM. Fix is to ignore wsrep_load_data_splitting setting (with warning) if table is not InnoDB table. This is 10.6+ case of fix. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-09 15:55:35 +02:00
Jan Lindström	7aa86eb1e1	MDEV-33828 : Transactional commit not supported by involved engine(s) The problem was an overly tight condition in ha_commit_trans that did not allow non-transactional storage engines to participate in 2PC in the Galera case. This is required because a transaction using e.g. procedures might read the mysql.proc table inside a transaction, and these tables currently use the Aria storage engine, which does not support 2PC. Fixed by allowing read-only transactions on storage engines that do not support two-phase commit to participate in a 2PC transaction. These will be committed separately later. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-04-09 12:21:53 +02:00
Marko Mäkelä	4aa92911c7	MDEV-33802 Weird read view after ROLLBACK of another transaction Even after commit `b8a6719889` there is an anomaly where a locking read could return inconsistent results. If a locking read would have to wait for a record lock, then by the definition of a read view, the modifications made by the current lock holder cannot be visible in the read view. This is because the read view must exclude any transactions that had not been committed at the time when the read view was created. lock_rec_convert_impl_to_expl_for_trx(), lock_rec_convert_impl_to_expl(): Return an unsafe-to-dereference pointer to a transaction that holds or held the lock, or nullptr if the lock was available. lock_clust_rec_modify_check_and_lock(), lock_sec_rec_read_check_and_lock(), lock_clust_rec_read_check_and_lock(): Return DB_RECORD_CHANGED if innodb_strict_isolation=ON and the lock was being held by another transaction. The test case, which is based on a bug report by Zhuang Liu, covers the function lock_sec_rec_read_check_and_lock(). Reviewed by: Vladislav Lesin	2024-04-09 12:50:24 +03:00
Kristian Nielsen	d90a2b44ad	MDEV-33668: More precise dependency tracking of XA XID in parallel replication Keep track of each recently active XID, recording which worker it was queued on. If an XID might still be active, choose the same worker to queue event groups that refer to the same XID to avoid conflicts. Otherwise, schedule the XID freely in the next round-robin slot. This way, XA PREPARE can normally be scheduled without restrictions (unless duplicate XID transactions come close together). This improves scheduling and parallelism over the old method, where the worker thread to schedule XA PREPARE on was fixed based on a hash value of the XID. XA COMMIT will normally be scheduled on the same worker as XA PREPARE, but can be a different one if the XA PREPARE is far back in the event history. Testcase and code for trimming dynamic array due to Andrei. Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-09 11:42:34 +03:00
Marko Mäkelä	0892e6d028	MDEV-33585 The maximum innodb_log_buffer_size is too large On Microsoft Windows, ReadFile() as well as WriteFile() limit the size of the request to DWORD, which is 32 bits (at most 4 GiB - 1) also on 64-bit systems. On FreeBSD, sysctl debug.iosize_max_clamp could limit the size of a write request to INT_MAX. The size of a read request is always limited to INT_MAX. This would allow the request size to be 4095 bytes more than the Linux limit (0x7ffff000 according to "man 2 read" and "man 2 write"). On OpenBSD, Solaris and possibly NetBSD, the read request size is limited to SSIZE_T_MAX, which would be half the current maximum innodb_log_buffer_size. This should be not much of an issue anyway, because on contemporary 64-bit platforms, the virtual addresses are limited to 48 bits. IBM AIX documentation mentions OFF_MAX which would apply when a 64-bit application is running on a 32-bit kernel. Let us declare innodb_log_buffer_size as 32-bit unsigned and make the maximum 0x7ffff000, to be compatible with the least common denominator (Linux). The maximum innodb_sort_buffer_size already was 64 MiB, which is not a problem. SyncFileIO::execute(): Assert that the size of a synchronous read or write request is limited to the maximum. Reviewed by: Vladislav Vaintroub	2024-04-09 09:32:47 +03:00
Sergei Golubchik	7e3090a8a0	fix perfschema.misc when previous tests used lots of threads	2024-04-08 20:52:14 +02:00
Sergei Golubchik	50803bc456	MDEV-25614 disable failing galera test	2024-04-08 19:13:14 +02:00
Brandon Nesterenko	89c907bd4f	MDEV-33672: Gtid_log_event Construction from File Should Ensure Event Length When Using Extra Flags A GTID event can have variable length, with contributing factors such as the variable length from the flags2 and optional extra flags fields. These fields are bitmaps, where each set bit indicates an additional value that should be appended to the event, e.g. multi-engine transactions append a number to indicate the number of additional engines a transaction uses. However, if a flags bit is set, and no additional fields are appended to the event, MDEV-33672 reports that the server can still try to read from memory as if it did exist. Note, however, in debug builds, this condition is asserted for FL_EXTRA_MULTI_ENGINE. This patch fixes this to check that the length of the event is aligned with the expectation set by the flags for FL_PREPARED_XA, FL_COMPLETED_XA, and FL_EXTRA_MULTI_ENGINE. Reviewed By ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-08 07:57:14 -06:00
Thirunarayanan Balathandayuthapani	188c5da72a	MDEV-32453 Bulk insert fails to apply when trigger does insert operation Reason: ======= - InnoDB fails to apply the buffered insert operation if the after insert trigger does change the same table. This behaviour leads to empty table for the subsequent insert operation and server abort. Solution: ======== - InnoDB should apply buffered insert operation if "after insert" trigger changes the same table.	2024-04-08 14:24:20 +05:30
Yuchen Pei	a73c3f1077	MDEV-21007 Do not assert auto_increment_value unless all parts open Commit `6dce6aeceb` breaks out of a loop in ha_partition::info when some partitions aren't opened, in which case auto_increment_value assertion will fail. This commit patches that hole.	2024-04-08 16:35:21 +10:00
Sergei Golubchik	e1825e39ca	increase performance-schema-max-thread-instances the value of 200 isn't enough for some tests anymore, this causes some random threads to become not instrumented and any table operations there are not reflected in the perfschema. If, say, a DROP TABLE doesn't change perfschema state, perfschema tables might show ghost tables that no longer exist in the server	2024-04-07 23:55:38 +02:00
Sergei Golubchik	54ad3b0e9e	MDEV-22949 perfschema.memory_aggregate_no_a_no_u fails sporadically in buildbot with wrong result 32-bit followup for `8bb8820df2`	2024-04-07 12:01:47 +02:00
Thirunarayanan Balathandayuthapani	9b5d711ac3	MDEV-20094 InnoDB blob allocation allocates extra extents - InnoDB reserves the free extents unnecessarily during blob page allocation even though btr_page_alloc() can handle reserving the extent when the existing ran out of pages to be used.	2024-04-05 19:55:57 +05:30
Sergei Golubchik	429fdb5bd6	MDEV-29171 disable failing galera test	2024-04-05 15:47:52 +02:00
Sergei Golubchik	96533bae54	suppress a transient galera warning these warnings are expected and are auto-resolved by galera	2024-04-05 12:40:49 +02:00
Sergei Golubchik	cb41757f02	cleanup: perfschema.threads_history improve debuggability	2024-04-05 12:40:49 +02:00
Sergei Golubchik	b067df3213	innodb.innodb_defrag_stats wait for the correct value failed on amd64-centos-stream8	2024-04-05 12:40:49 +02:00
Sergei Golubchik	a58a570c07	innodb.monitor test: wait for the correct value on a busy system it might take time for buffer_page_written_index_leaf to reach the correct value. Wait for it. also, tag identical statements to be different in the result file.	2024-04-05 12:40:49 +02:00
sjaakola	2fcf2ec229	MDEV-33749 hyphen in table name can cause galera certification failures Fix in this commit handles foreign key value appending into write set so that db and table names are converted from the filepath format to tablename format. This is compatible with key values appended from elsewhere in the code base There is a mtr test galera.galera_table_with_hyphen for regression testing Reviewer: monty@mariadb.com	2024-04-04 17:12:09 +03:00
Brandon Nesterenko	9a4991a089	MDEV-33799: mysql_manager_submit Segfault at Startup Still Possible During Recovery MDEV-26473 fixed a segmentation fault at startup between the handle manager thread and the binlog background thread, such that the binlog background thread could be started and submit a job to the handle manager, before it had initialized. Where MDEV-26473 made it so the handle manager would initialize before the main thread started the normal binary logs, it did not account for the recovery case. That is, there is still a possibility of a segmentation fault when a server is recovering using the binary logs such that it can open the binary logs, start the binlog background thread, and submit a job to the handle manager before it is initialized. This patch fixes this by moving the initialization of the mysql handler manager to happen prior to recovery. Reviewed By: ============ Andrei Elkin <andrei.elkin@mariadb.com>	2024-04-03 11:55:18 -06:00
Jan Lindström	baec63e304	MDEV-33787 : Fix Galera test failures on 10.11	2024-04-03 10:04:40 +03:00
joshhn	4987b5e3b1	MDEV-33803 Error 4162 "Operator does not exists" is incorrectly-worded "Operator does not exists" should rather read "Operator does not exist".	2024-04-03 10:03:02 +11:00
Aleksey Midenkov	c477697422	MDEV-29872 MSAN/Valgrind uninitialised value errors in TABLE::vers_switch_partition Delayed_insert has its own THD (initialized at mysql_insert()) and hence its own LEX. Delayed_insert initializes very few parameters for LEX, and 'duplicates' is not among them. Now we copy this missing parameter from the parser LEX (as well as sql_command).	2024-04-02 00:11:35 +03:00
Aleksey Midenkov	d966e55c0a	MDEV-31903 Server crashes in _ma_reset_history upon UNLOCK table with auto-create history partitions When INSERT does auto-create for t1 all its handler instances are closed by alter_close_table(). At this time down the stack maria_close() clears share->state_history. Later when we unlock the tables Aria transaction manager accesses old share instance (the one before t1 was closed) and tries to reset its state_history. The problem is maria_close() didn't remove table from transaction's list (used_tables). The fix does _ma_remove_table_from_trnman() which is triggered by HA_EXTRA_PREPARE_FOR_RENAME.	2024-04-02 00:11:34 +03:00
Marko Mäkelä	788953463d	Merge 10.6 into 10.11 Some fixes related to commit `f838b2d799` and Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row() for system-versioned tables were provided by Nikita Malyavin. This was required by test versioning.rpl,trx_id,row.	2024-03-28 09:16:57 +02:00
Sergei Golubchik	f50694c52b	remove pointless test	2024-03-27 16:14:55 +01:00
Sergei Golubchik	dc681953cf	events in perfschema tests: use ON COMPLETION NOT PRESERVE when the execution is very slow, under valgrind, the event might manage to fire more than once, making the test fail	2024-03-27 16:14:55 +01:00
Sergei Golubchik	8bb8820df2	MDEV-22949 perfschema.memory_aggregate_no_a_no_u fails sporadically in buildbot with wrong result perfschema aggregation, like SHOW STATUS, is only statistically correct. It doesn't use atomics for performance reasons and might miss individual increments, particularly when two connections are disconnecting at the same time. To have stable results tests should avoid doing it.	2024-03-27 16:14:55 +01:00
Marko Mäkelä	ccb7a1e9a1	Merge 10.5 into 10.6	2024-03-27 15:00:56 +02:00
Dave Gosselin	58df20974b	MDEV-33460 select '123' 'x'; unexpected result Queries that select concatenated constant strings now have colname and value that match. For example, SELECT '123' 'x'; will return a result where the column name and value both are '123x'. Review: Daniel Black	2024-03-27 15:51:26 +11:00
Daniele Sciascia	c71dc39529	MDEV-26499 Fix error "mysql_shutdown failed" during MTR tests - Fix to avoid mysqltest client getting killed abruptly during mysql_shutdown(). When Galera replication is shutdown, wait for THDs with `thd->stmt_da()->is_eof()` to disconnect (these are about to disconnect anyway). - Extract duplicate code from `wsrep_stop_replication()` and `wsrep_shutdown_replication()` in a new function. - No need to use a custom `shutdown_mysqld.inc` in galera suite. Delete it, so that the one in `mysql-test/include/` is used. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-03-27 04:31:45 +01:00
Denis Protivensky	7bf3c3124a	MDEV-33136: Properly BF-abort user transactions with explicit locks User transactions may acquire explicit MDL locks from InnoDB level when persistent statistics is re-read for a table. If such a transaction would be subject to BF-abort, it was improperly detected as a system transaction and wouldn't get aborted. The fix: Check if a transaction holding explicit MDL locks is a user transaction in the MDL conflict handling code. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-03-27 01:25:22 +01:00
Daniele Sciascia	e0c8165487	MDEV-33509 Failed to apply write set with flags=(rollback\|pa_unsafe) Fix function `remove_fragment()` in wsrep_schema so that no error is raised if the fragment to be removed is not found in the wsrep_streaming_log table. This is necessary to handle the case where streaming transaction in idle state is BF aborted. This may result in the case where the rollbacker thread successfully removes the transaction's fragments, followed by the applier's attempt to remove the same fragments. Causing the node to leave the cluster after reporting a "Failed to apply write set" error. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-03-26 05:56:37 +01:00
Marko Mäkelä	fa8a46eb68	MDEV-33613 InnoDB may still hang when temporarily running out of buffer pool By design, InnoDB has always hung when permanently running out of buffer pool, for example when several threads are waiting to allocate a block, and all of the buffer pool is buffer-fixed by the active threads. The hang that we are fixing here occurs when the buffer pool is only temporarily running out and the situation could be rescued by writing out some dirty pages or evicting some clean pages. buf_LRU_get_free_block(): Simplify the way how we wait for the buf_flush_page_cleaner thread. This fixes occasional hangs of the test encryption.innochecksum that were introduced by commit `a55b951e60` (MDEV-26827). To play it safe, we use a timed wait when waiting for the buf_flush_page_cleaner() thread to perform its job. Should that thread get stuck, we will invoke buf_pool.LRU_warn() in order to display a message that pages could not be freed, and keep trying to wake up the buf_flush_page_cleaner() thread. The INFORMATION_SCHEMA.INNODB_METRICS counters buffer_LRU_single_flush_failure_count and buffer_LRU_get_free_waits will be removed. The latter is represented by buffer_pool_wait_free. Also removed will be the message "InnoDB: Difficult to find free blocks in the buffer pool" because in `d34479dc66` we introduced a more precise message "InnoDB: Could not free any blocks in the buffer pool" in the buf_flush_page_cleaner thread. buf_pool_t::LRU_warn(): Issue the warning message that we could not free any blocks in the buffer pool. This may also be invoked by buf_LRU_get_free_block() if buf_flush_page_cleaner() appears to be stuck. buf_pool_t::n_flush_dec(): Remove. buf_pool_t::n_flush_dec_holding_mutex(): Rename to n_flush_dec(). buf_flush_LRU_list_batch(): Increment the eviction counter for blocks of temporary, discarded or dropped tablespaces. buf_flush_LRU(): Make static, and remove the constant parameter evict=false. 
The only caller will be the buf_flush_page_cleaner() thread. IORequest::is_LRU(): Remove. The only case of evicting pages on write completion will be when we are writing out pages of the temporary tablespace. Those pages are not in buf_pool.flush_list, only in buf_pool.LRU. buf_page_t::flush(): Remove the parameter evict. buf_page_t::write_complete(): Change the parameter "bool temporary" to "bool persistent" and add a parameter for an already read state(). Reviewed by: Debarun Banerjee	2024-03-22 14:17:39 +02:00
Marko Mäkelä	bf0b82d24b	MDEV-33515 log_sys.lsn_lock causes excessive context switching The log_sys.lsn_lock is a very contended resource with a small critical section in log_sys.append_prepare(). On many processor microarchitectures, replacing the system call based log_sys.lsn_lock with a pure spin lock would fare worse during high concurrency workloads, wasting a significant amount of CPU cycles in the spin loop. On other microarchitectures, we would see a significant amount of time being spent in native_queued_spin_lock_slowpath() in the Linux kernel, plus context switching between user and kernel address space. This was pointed out by Steve Shaw from Intel Corporation. Depending on the workload and the hardware implementation, it may be useful to use a pure spin lock in log_sys.append_prepare(). We will introduce a parameter. The statement SET GLOBAL INNODB_LOG_SPIN_WAIT_DELAY=50; would enable a spin lock that will execute that many MY_RELAX_CPU() operations (such as the x86 PAUSE instruction) between successive attempts of acquiring the spin lock. The use of a system call based log_sys.lsn_lock (which is the default setting) can be enabled by SET GLOBAL INNODB_LOG_SPIN_WAIT_DELAY=0; This patch will also introduce #ifdef LOG_LATCH_DEBUG (part of cmake -DWITH_INNODB_EXTRA_DEBUG=ON) for more accurate tracking of log_sys.latch ownership and reorganize the fields of log_sys to improve the locality of reference and to reduce the chances of false sharing. When a spin lock is being used, it will be maintained in the most significant bit of log_sys.buf_free. This is useful, because that is one of the fields that is covered by the lock. For IA-32 or AMD64, we implement the spin lock specially via log_t::lsn_lock_bts(), employing the i386 LOCK BTS instruction. A straightforward std::atomic::fetch_or() would translate into an inefficient loop around LOCK CMPXCHG. mtr_t::spin_wait_delay: The value of innodb_log_spin_wait_delay. 
mtr_t::finisher: Pointer to the currently used mtr_t::finish_write() implementation. This allows to avoid introducing conditional branches. We no longer invoke log_sys.is_pmem() at the mini-transaction level, but we would do that in log_write_up_to(). mtr_t::finisher_update(): Update finisher when spin_wait_delay is changed from or to 0 (the spin lock is changed to log_sys.lsn_lock or vice versa).	2024-03-22 12:29:01 +02:00
Brandon Nesterenko	75c7c6dc39	MDEV-33551: Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency When using semi-sync replication with rpl_semi_sync_master_wait_point=AFTER_COMMIT, the performance of the primary can significantly reduce compared to AFTER_SYNC's performance for workloads with many concurrent users executing transactions. This is because all connections on the primary share the same cond_wait variable/mutex pair, so any time an ACK is received from a replica, all waiting connections are awoken to check if the ACK was for itself, which is done in mutual exclusion. This patch changes this such that the waiting THD will use its own local condition variable, and the ACK receiver thread only signals connections which have been ACKed for wakeup. That is, the THD::LOCK_wakeup_ready condition variable is re-used for this purpose, and the Active_tranx queue nodes are extended to hold the waiting thread, so it can be signalled once ACKed. Additionally: 1) Removed part of MDEV-11853 additions, which allowed suspended connection threads awaiting their semi-sync ACKs to live until their ACKs had been received. This part, however, wasn't needed. That is, all that was needed was for the Ack_thread to survive. So now the connection threads are killed during phase 1. Thereby THD::is_awaiting_semisync_ack, and all its related code was removed. 2) COND_binlog_send is repurposed to signal on the condition when Active_tranx is emptied during clear_active_tranx_nodes. 3) At master shutdown (when waiting for slaves), instead of the main loop individually waiting for each ACK, await_slave_reply() (renamed await_all_slave_replies()) just waits once for the repurposed COND_binlog_send to signal it is empty. 
4) Test rpl_semi_sync_shutdown_await_ack is updated as follows: 4.1) Added a test case (adapted from Kristian Nielsen) to ensure that if a thread awaiting its ACK is killed while SHUTDOWN WAIT FOR ALL SLAVES is issued, the primary will still wait for the ACK from the killed thread. 4.2) As connections which bypassed phase 1 of thread killing are no longer delayed for kill until phase 2, we can no longer query yes/no tx after receiving an ACK/timeout. The check for these variables is removed. 4.3) Comment descriptions which mention that the connection is alive are updated and adjusted to refer to the Ack_thread. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-21 08:42:18 -06:00
Vladislav Vaintroub	a2dd4c14a3	post-fix `1c55b845e0` remove bunch of .rej and .orig files, checked in by mistake.	2024-03-20 16:03:15 +01:00
Marko Mäkelä	b8a6719889	MDEV-26642/MDEV-26643/MDEV-32898 Implement innodb_snapshot_isolation https://jepsen.io/analyses/mysql-8.0.34 highlights that the transaction isolation levels in the InnoDB storage engine do not correspond to any widely accepted definitions, such as "Generalized Isolation Level Definitions" https://pmg.csail.mit.edu/papers/icde00.pdf (PL-1 = READ UNCOMMITTED, PL-2 = READ COMMITTED, PL-2.99 = REPEATABLE READ, PL-3 = SERIALIZABLE). Only READ UNCOMMITTED in InnoDB seems to match the above definition. The issue is that InnoDB does not detect write/write conflicts (Section 4.4.3, Definition 6) in the above. It appears that as soon as we implement write/write conflict detection (SET SESSION innodb_snapshot_isolation=ON), the default isolation level (SET TRANSACTION ISOLATION LEVEL REPEATABLE READ) will become Snapshot Isolation (similar to Postgres), as defined in Section 4.2 of "A Critique of ANSI SQL Isolation Levels", MSR-TR-95-51, June 1995 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf Locking reads inside InnoDB used to read the latest committed version, ignoring what should actually be visible to the transaction. The added test innodb.lock_isolation illustrates this. The statement UPDATE t SET a=3 WHERE b=2; is executed in a transaction that was started before a read view or a snapshot of the current transaction was created, and committed before the current transaction attempts to execute UPDATE t SET b=3; If SET innodb_snapshot_isolation=ON is in effect when the second transaction was started, the second transaction will be aborted with the error ER_CHECKREAD. By default (innodb_snapshot_isolation=OFF), the second transaction would execute inconsistently, displaying an incorrect SELECT COUNT(*) FROM t in its read view. 
If innodb_snapshot_isolation=ON, if an attempt to acquire a lock on a record that does not exist in the current read view is made, an error DB_RECORD_CHANGED (HA_ERR_RECORD_CHANGED, ER_CHECKREAD) will be raised. This error will be treated in the same way as a deadlock: the transaction will be rolled back. lock_clust_rec_read_check_and_lock(): If the current transaction has a read view where the record is not visible and innodb_snapshot_isolation=ON, fail before trying to acquire the lock. row_sel_build_committed_vers_for_mysql(): If innodb_snapshot_isolation=ON, disable the "semi-consistent read" logic that had been implemented by myself on the directions of Heikki Tuuri in order to address https://bugs.mysql.com/bug.php?id=3300 that was motivated by a customer wanting UPDATE to skip locked rows that do not match the WHERE condition. It looks like my changes were included in the MySQL 5.1.5 commit ad126d90e019f223470e73e1b2b528f9007c4532; at that time, employees of Innobase Oy (a recent acquisition of Oracle) had lost write access to the repository. The only reason why we set innodb_snapshot_isolation=OFF by default is backward compatibility with applications, such as the one that motivated the implementation of "semi-consistent read" back in 2005. In a later major release, we can default to innodb_snapshot_isolation=ON. Thanks to Peter Alvaro, Kyle Kingsbury and Alexey Gotsman for their work on https://github.com/jepsen-io/ and to Kyle and Alexey for explanations and some testing of this fix. Thanks to Vladislav Lesin for the initial test for MDEV-26643, as well as reviewing these changes.	2024-03-20 09:48:03 +02:00
Brandon Nesterenko	ca07f62992	MDEV-33716: rpl.rpl_semi_sync_slave_enabled_consistent Fails with Error Condition Reached Though the test itself doesn't create any transactions directly, the added test suppressions are replicated, and when the SQL thread is stopped mid-execution, it is set into an error state because these are non-transactional events being aborted. This patch fixes the test by ensuring that the test suppressions are fully replicated before continuing	2024-03-19 14:16:07 -06:00
Thirunarayanan Balathandayuthapani	c3a6248bba	MDEV-33542 Inplace algorithm occupies more disk space compared to copy algorithm Problem: ======= - In case of large file size, InnoDB eagerly adds the new extent even though there are many existing unused pages of the segment. Reason is that in case of larger file size, threshold (1/8 of reserved pages) for adding new extent has been reached frequently. Solution: ========= - Try to utilise the unused pages in the segment before adding the new extent in the file segment. need_for_new_extent(): In case of larger file size, try to use the 4 * FSP_EXTENT_SIZE as threshold to allocate the new extent. fseg_alloc_free_page_low(): Rewrote the function to allocate the page in the following order. 1) Try to get the page from existing segment extent. 2) Check whether the segment needs new extent (need_for_new_extent()) and allocate the new extent, find the page. 3) Take individual page from the unused page from segment or tablespace. 4) Allocate a new extent and take first page from it. Removed FSEG_FILLFACTOR, FSEG_FRAG_LIMIT variable.	2024-03-19 18:42:45 +05:30
Marko Mäkelä	50715bd2ed	Merge 10.5 into 10.6	2024-03-18 17:07:32 +02:00
Kristian Nielsen	86a0b57689	MDEV-32976: Un-deprecate MASTER_USE_GTID=Current_Pos Remove incorrect deprecation. Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-15 18:18:42 +01:00
Kristian Nielsen	51abae5e46	MDEV-25923: Aria parallel repair MY_THREAD_SPECIFIC mismatch in realloc maria_repair_parallel() clears the MY_THREAD_SPECIFIC flag for allocations since it uses different threads. But it still did one _ma_alloc_buffer() call as thread-specific which would later assert if another thread needed to extend the buffer with realloc. This patch, due to Monty, removes the MY_THREAD_SPECIFIC flag for allocations that need to realloc in different threads, and preserves it for those that are allocated/freed in the user's thread. Also fixes MDEV-33562. Reviewed-by: Monty <monty@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-15 15:55:07 +01:00
Kristian Nielsen	1fb00f37ae	MDEV-33303: slave_parallel_mode=optimistic should not report the mode's specific temporary errors An earlier patch for MDEV-13577 fixed the most common instances of this, but missed one case for tables without primary key when the scan reaches the end of the table. This patch adds similar code to handle this case, converting the error to HA_ERR_RECORD_CHANGED when doing optimistic parallel apply. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-15 15:55:06 +01:00
Kristian Nielsen	fb774eb1eb	Fix occasional test failure of rpl.rpl_parallel_stop_slave Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-15 15:04:09 +01:00
Thirunarayanan Balathandayuthapani	f5df4482e0	MDEV-33214 Table is getting rebuild with ALTER TABLE ADD COLUMN Problem: ====== - InnoDB fails to perform an instant operation when adding a variable-length column. The problem is that InnoDB wrongly assumes that a variable-length character column can never be part of an externally stored page. Solution: ======== instant_alter_column_possible(): A variable-length character field can be stored as an externally stored page.	2024-03-15 14:04:59 +05:30
Sergei Golubchik	55cea0c2a6	Merge branch '10.5' into 10.6	2024-03-14 19:52:08 +01:00
Sergei Golubchik	bb451d2cad	fix galera tests after `9a132d423a`	2024-03-14 19:51:09 +01:00
Sergei Golubchik	7eb6d5aa21	update s3.partition result after `57ffcd686f`	2024-03-14 11:43:13 +01:00
Thirunarayanan Balathandayuthapani	967a148966	MDEV-33635 innodb.innodb-64k-crash - Found warnings/errors in server log file - Suppress the "Difficult to find free blocks" warning globally to avoid failures in many different test cases. - Demote the error information in validate_first_page() to a note, so that the first page can be recovered from the doublewrite buffer and an error can be thrown in case the page wasn't found in the doublewrite buffer.	2024-03-14 08:34:56 +05:30
Sergei Golubchik	f71d7f2f0f	Merge branch '10.5' into 10.6	2024-03-13 21:02:34 +01:00
Sergei Golubchik	bc46f1a7d9	cleanup: remove SEARCH_TYPE from search_pattern_in_file.inc	2024-03-13 18:27:18 +01:00
Kristian Nielsen	0a6f46965a	MDEV-33475: --gtid-ignore-duplicate can double-apply event in case of parallel replication retry When rolling back and retrying a transaction in parallel replication, don't release the domain ownership (for --gtid-ignore-duplicates) as part of the rollback. Otherwise another master connection could grab the ownership and double-apply the transaction in parallel with the retry. Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-13 16:59:10 +01:00
Marko Mäkelä	c3a00dfa53	Merge 10.5 into 10.6	2024-03-12 09:19:57 +02:00
mariadb-DebarunBanerjee	67abdb9f33	MDEV-33593 Auto increment deadlock error causes ASSERT in subsequent save point innodb.autoinc_debug: Correct the test case for predictable deadlock.	2024-03-11 15:59:56 +05:30
Marko Mäkelä	f703e72bd8	Merge 10.4 into 10.5	2024-03-11 10:08:20 +02:00
Monty	b3d507ff13	Suppressed new warning for rpl_get_lock on amd-freebsd and aarch64-macos	2024-03-09 12:20:58 +02:00
Kristian Nielsen	23c48474f7	MDEV-33212: mysqldump uses MASTER_LOG_POS with dump-slave The patch for MDEV-15530 incorrectly added a column in the middle of SHOW SLAVE STATUS output. This is wrong, as it breaks backwards compatibility with existing applications and scripts. In this case, it even broke mariadb-dump, which is included in the server source tree! Revert the incorrect change, putting the new Replicate_Rewrite_DB at the end of SHOW SLAVE STATUS output. Add a testcase for the mariadb-dump --dump-slave wrong output problem. Also add a testcase rpl.rpl_show_slave_status to hopefully prevent any future incorrect additions to SHOW SLAVE STATUS. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-03-08 15:23:42 +01:00
Monty	9a132d423a	MDEV-33620 Improve times and states in show processlist for replication This will make it easier to find out what replication workers are doing and what they are waiting for. Things changed in processlist: - Slave_SQL time was not consistent. Now time for state "Slave has read all relay log; waiting for more updates" shows how long it has waited for getting the next event. - Slave_worker threads often showed "Closing tables" for a long time. Now the state is reverted to the previous state after "Closing tables" is done. - Commit and Rollback states were not shown for replication (and some other threads). Now Commit and Rollback states are always shown and the state is reverted to the previous state when the Commit/Rollback has finished. Code changes: - Added thd->set_time_for_next_stage() for parallel replication when starting to wait for prior transactions to commit, group commit, FTWRL, and free space in the thread pool. Before, we reset the time only after the above events. - Moved THD_STAGE_INFO(stage_rollback) and THD_STAGE_INFO(stage_commit) from sql_parse.cc to transaction.cc to ensure this is done for all commits and not only 'normal connection queries'. Test case changes: - close_thread_tables() reverting the stage to the previous stage caused the counter in performance_schema to be increased. In many cases it is the 'sql/starting' stage that was affected. - We only change to the "Commit" stage if there is a need for a commit. This caused some "Commit" stages to disappear from perfschema reports. TODO in 11.#: - Slave_IO always shows "Waiting for master to send event" and the time is from SLAVE START. We should in 11.# change this to be the time since reading the last event.	2024-03-08 15:23:17 +02:00
mariadb-DebarunBanerjee	afe9632913	MDEV-33593 Auto increment deadlock error causes ASSERT in subsequent save point The issue here is ha_innobase::get_auto_increment() could cause a deadlock involving auto-increment lock and rollback the transaction implicitly. For such cases, storage engines usually call thd_mark_transaction_to_rollback() to inform SQL engine about it which in turn takes appropriate actions and close the transaction. In innodb, we call it while converting Innodb error code to MySQL. However, since ::innobase_get_autoinc() returns void, we skip the call for error code conversion and also miss marking the transaction for rollback for deadlock error. We assert eventually while releasing a savepoint as the transaction state is not active. Since convert_error_code_to_mysql() is handling some generic error handling part, like invoking the callback when needed, we should call that function in ha_innobase::get_auto_increment() even if we don't return the resulting mysql error code back.	2024-03-07 21:54:06 +05:30
Monty	0df4651c13	Fixed some mtr results found in Jenkins after MDEV-33582 push MDEV-33582 Add more warnings to be able to better diagnose network issues - Disabled "Semisync ack receiver got hangup" warning - One could get this warning from semisync if running mtr --mysqld=log-warnings=3 rpl.rpl_semi_sync_shutdown_await_ack - Fixed result file for engines/funcs/rpl_get_lock.test	2024-03-06 15:16:03 +02:00
Thirunarayanan Balathandayuthapani	6e5333fc8c	MDEV-32445 InnoDB may corrupt its log before upgrading it on startup Problem: ======== During upgrade, InnoDB writes redo log records for adjusting the tablespace size or tablespace flags even before the log has been upgraded to the configured format. This could lead to data inconsistency if a crash happened during the upgrade process. Fix: === srv_start(): Write the redo log for the tablespace flags adjustment and the increased tablespace size only after the redo log upgrade. log_write_low(), log_reserve_and_write_fast(): Check whether the redo log is in physical format.	2024-03-06 15:01:26 +05:30
Thirunarayanan Balathandayuthapani	738da4918d	MDEV-32346 Assertion failure sym_node->table != NULL in pars_retrieve_table_def on UPDATE - During an update operation, InnoDB should avoid initializing the FTS_DOC_ID of the foreign table if the foreign table is discarded	2024-03-06 14:04:49 +05:30
Thirunarayanan Balathandayuthapani	8532dd82f1	MDEV-13765 encryption.encrypt_and_grep failed in buildbot with wrong result - Adjust the test case to check whether all tablespaces are encrypted by comparing it with existing table count.	2024-03-06 11:57:09 +05:30
Monty	567c097359	MDEV-33582 Add more warnings to be able to better diagnose network issues Warnings are added to net_serv.cc when global_system_variables.log_warnings >= 4. When the above condition holds then: - All communication errors from net_serv.cc are also written to the error log. - In case of not being able to read or write a packet, a more detailed error is given. Other things: - Added detection of slaves that have hung up to Ack_receiver::run() - vio_close() is now first marking the socket closed before closing it. The reason for this is to ensure that the connection that gets a read error can check if the reason was that the socket was closed. - Add a new state to vio to be able to detect if vio is active, shutdown or closed. This is used to detect if the socket is closed by another thread. - Testing of the new warnings is done in rpl_get_lock.test - Suppress some of the new warnings in mtr to allow one to run some of the tests with --mysqld=--log-warnings=4. All tests in the 'rpl' suite can now be run with this option. - Ensure that global.log_warnings is restored at test end in a way that allows one to use mtr --mysqld=--log-warnings=4. Reviewed-by: <serg@mariadb.org>,<brandon.nesterenko@mariadb.com>	2024-03-05 20:19:49 +02:00
Alexey Botchkov	b93252a303	MDEV-32454 JSON test has problem in view protocol. Few Item_func_json_xxx::fix_length_and_dec() functions fixed.	2024-03-02 14:58:57 +04:00
Monty	ee27bf749b	Disable mariabackup.aria_backup with msan because of timeouts	2024-03-01 12:44:32 +02:00
Brandon Nesterenko	bd604add76	MDEV-33546: Rpl_semi_sync_slave_status is ON When Replication Is Not Configured If a server has a default configuration (e.g. in a my.cnf file) with rpl_semi_sync_slave_enabled set, on server start, the corresponding rpl_semi_sync_slave_status variable will also be ON initially, even if the slave was never configured/started. This is because the Repl_semi_sync_slave initialization logic (function init_object()) sets the running status to the enabled value during init_server_components(). This patch fixes this by removing the statement which sets the semi-sync slave running status from the initialization logic. An additional change needed from this is to semi-sync recovery: this status variable was used as a condition to determine binlog truncation during server recovery. This patch also switches this condition to reference the global rpl_semi_sync_slave_enabled variable. Though note, the semi-sync recovery condition is to be changed entirely with the MDEV-33424 agenda. Reviewed By: ============ Andrei Elkin <andrei.elkin@mariadb.com>	2024-02-29 07:38:55 -07:00
Jan Lindström	41b435fea9	MDEV-33211 : Galera SST on maria-backup causes donor node to be unresponsive If mariabackup with backup locks is used on SST, we do not pause and desync the galera provider at all. In the WSREP_MODE_BF_MARIABACKUP case, the provider is paused and desynced at the BLOCK_COMMIT phase. In other cases the provider is paused and desynced at the BLOCK_DDL phase.	2024-02-27 20:55:54 +02:00
Monty	1c55b845e0	MDEV-32932 Port backup features from ES Added support to BACKUP STAGE to maria-backup This is a port of the code from ES 10.6 See MDEV-5336 for backup stages description. The following old options are not supported by the new code: --rsync ; This is because rsync will not work on tables that are in use. --no-backup-locks ; This is disabled as mariadb-backup will always use backup locks for better performance.	2024-02-27 20:55:54 +02:00
Monty	0c079f4f76	Updated test cases result for s3.partition MDEV-21472 ALTER TABLE ... ANALYZE PARTITION ... with EITS reads and locks all rows	2024-02-27 14:55:47 +02:00
mariadb-DebarunBanerjee	969669767b	MDEV-33011 mariabackup --backup: FATAL ERROR: ... Can't open datafile cool_down/t3 The root cause is the WAL logging of a file operation when the actual operation fails afterwards. It creates a situation with a log entry for an operation that would always fail. I could simulate both the backup scenario error and the InnoDB recovery failure by exploiting the weakness. We are following WAL for the file rename operation, and once logged the operation must eventually complete successfully, or it is a major catastrophe. Right now, we fail the rename and handle it as a normal error, and that is the problem. I created a patch to address the RENAME operation to a non-existing schema where the destination schema directory is missing. The patch checks for the missing schema before logging in an attempt to avoid the failure after the WAL log is written/flushed. I also checked that the schema cannot be dropped and that there cannot be any race with another rename to the same file. This is protected by the MDL lock in SQL today. The patch should thus be a good improvement over the current situation and solves the issue at hand.	2024-02-27 17:59:20 +05:30
Thirunarayanan Balathandayuthapani	57cc8605eb	MDEV-19044 Alter table corrupts while applying the modification log Problem: ======== - InnoDB reads the length of the variable length field wrongly while applying the modification log of instant table. Solution: ======== rec_init_offsets_comp_ordinary(): For the temporary instant file record, InnoDB should read the length of the variable length field from the record itself.	2024-02-27 12:59:46 +05:30
Thirunarayanan Balathandayuthapani	2c5f3bbe78	MDEV-14193 innodb.log_file_name failed in buildbot with exception Problem: ======= - innodb.log_file_name fails in a few cases if it executes after innodb.lock_insert_into_empty. The innodb.lock_insert_into_empty test case failed to clean up the table t2. Rollback of create..select fails to remove the table when it fails to acquire the InnoDB statistics table. This causes rename table in the log_file_name test case to fail. Solution: ======== - Clean up the table t2 explicitly after resetting the innodb_lock_wait_timeout variable in the innodb.lock_insert_into_empty test case.	2024-02-26 19:04:14 +05:30
Alexander Barkov	e63311c2cf	MDEV-33496 Out of range error in AVG(YEAR(datetime)) due to a wrong data type Functions extracting non-negative datetime components: - YEAR(dt), EXTRACT(YEAR FROM dt) - QUARTER(dt), EXTRACT(QUARTER FROM dt) - MONTH(dt), EXTRACT(MONTH FROM dt) - WEEK(dt), EXTRACT(WEEK FROM dt) - HOUR(dt), - MINUTE(dt), - SECOND(dt), - MICROSECOND(dt), - DAYOFYEAR(dt) - EXTRACT(YEAR_MONTH FROM dt) did not set their max_length properly, so in the DECIMAL context they created a too small DECIMAL column, which led to the 'Out of range value' error. The problem is that most of these functions historically returned the signed INT data type. There were two simple ways to fix these functions: 1. Add +1 to max_length. But this would also change their size in the string context and create too long VARCHAR columns, with +1 excessive size. 2. Preserve max_length, but change the data type from INT to INT UNSIGNED. But this would break backward compatibility. Also, using UNSIGNED is generally not desirable, it's better to stay with signed when possible. This fix implements another solution, which makes all these functions work well in all contexts: int, decimal, string. Fix details: - Adding a new special class Type_handler_long_ge0 - the data type handler for expressions which: * should look like normal signed INT * but which are known not to return negative values Expressions handled by Type_handler_long_ge0 store in Item::max_length only the number of digits, without adding +1 for the sign. - Fixing Item_extract to use Type_handler_long_ge0 for non-negative datetime components: YEAR, YEAR_MONTH, QUARTER, MONTH, WEEK - Adding a new abstract class Item_long_ge0_func, for functions returning non-negative datetime components. Item_long_ge0_func uses Type_handler_long_ge0 as the type handler. 
The class hierarchy now looks as follows: Item_long_ge0_func Item_long_func_date_field Item_func_to_days Item_func_dayofmonth Item_func_dayofyear Item_func_quarter Item_func_year Item_long_func_time_field Item_func_hour Item_func_minute Item_func_second Item_func_microsecond - Cleanup: EXTRACT(QUARTER FROM dt) created an excessive VARCHAR column in string context. Changing its length from 2 to 1.	2024-02-23 18:30:06 +04:00
Thirunarayanan Balathandayuthapani	e66928ab28	MDEV-33462 Server aborts while altering an InnoDB statistics table Problem: ======= - When online alter of InnoDB statistics table happens, any transaction which updates the statistics table has to read the undo log and log the DML changes during transaction commit. Applying undo log (UndorecApplier::apply_undo_rec) requires a shared lock on dictionary cache but dict_stats_save() already holds write lock on dictionary cache. This leads to abort of server during commit of statistics table changes. Solution: ======== - Disallow LOCK=NONE operation for the InnoDB statistics table. The reasoning is that statistics tables are typically rather small, so any blocking would be rather short. Writes to the statistics tables should be a rare operation.	2024-02-22 16:57:04 +05:30
Marko Mäkelä	5b1406ff30	Merge 10.6 into 10.11	2024-02-21 13:08:23 +02:00
Brandon Nesterenko	b04c857596	MDEV-33500: rpl.rpl_parallel_sbm can fail on slow machines, e.g. MSAN/Valgrind builders In an addition to the test rpl.rpl_parallel_sbm made by MDEV-32265, the test uses sleep statements alone to test Seconds_Behind_Master with delayed replication. On slow running machines, the test can pass the intended MASTER_DELAY duration and Seconds_Behind_Master can become 0, when the test expects the transaction to still be actively in a delaying state. This can be consistently reproduced by adding a sleep statement before the call to --let = query_get_value(SHOW SLAVE STATUS, Seconds_Behind_Master, 1) to sleep past the delay end point. This patch fixes this by locking the table which the delayed transaction targets so Seconds_Behind_Master cannot be updated before the test reads it for validation.	2024-02-20 08:19:18 -07:00
Thirunarayanan Balathandayuthapani	903ae30069	MDEV-30655 IMPORT TABLESPACE fails with column count or index count mismatch Problem: ======== Currently import operation fails with schema mismatch when cfg file has hidden fts document id and hidden fts document index. Fix: ==== To fix this issue, simply add the fts doc id column, indexes in table definition and try to import the table. In case of success: 1) update the fts document id in sys columns. 2) update the number of columns in sys tables. 3) insert the new fts index entry in sys indexes table and sys fields. 4) Reload the table with new table definition	2024-02-20 19:48:25 +05:30
Kristian Nielsen	c73c6aea63	MDEV-33426: Aria temptables wrong thread-specific memory accounting in slave thread Aria temporary tables account allocated memory as specific to the current THD. But this fails for slave threads, where the temporary tables need to be detached from any specific THD. Introduce a new flag to mark temporary tables in replication as "global", and use that inside Aria to not account memory allocations as thread specific for such tables. Based on original suggestion by Monty. Reviewed-by: Monty <monty@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-02-16 12:48:30 +01:00
Marko Mäkelä	53c6c823dc	MDEV-33464 Crash when innodb_max_undo_log_size is set to innodb_page_size*4294967296 purge_sys_t::truncating_tablespace(): Clamp the innodb_max_undo_log_size to the maximum number of pages before converting the result into a 32-bit unsigned integer. This fixes up commit `f8c88d905b` (MDEV-33213). In later major versions, we would use 32-bit unsigned integer here due to commit `ca501ffb04` and the code would crash also on 64-bit processors. Reviewed by: Debarun Banerjee	2024-02-15 12:34:04 +02:00
Marko Mäkelä	64cce8d5bf	Merge 10.6 into 10.11	2024-02-14 16:12:53 +02:00
Alexey Botchkov	85517f609a	MDEV-33393 audit plugin do not report user did the action.. The '<replication_slave>' user is assigned to the slave replication thread so this name appears in the auditing logs.	2024-02-14 00:02:29 +04:00
Marko Mäkelä	691f923906	Merge 10.5 into 10.6	2024-02-13 20:42:59 +02:00
Marko Mäkelä	b770633e07	Merge 10.4 into 10.5	2024-02-13 14:25:21 +02:00
Brandon Nesterenko	4fbd2e8573	MDEV-31768: Alias MASTER_DEMOTE_TO_REPLICA for MASTER_DEMOTE_TO_SLAVE Per MDEV-20601, REPLICA should be an alias for SLAVE in SQL statements. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org> Andrei Elkin <andrei.elkin@mariadb.com>	2024-02-13 14:00:42 +11:00
Marko Mäkelä	ca88eac835	MDEV-30528 CREATE FULLTEXT INDEX assertion failure WITH SYSTEM VERSIONING ha_innobase::check_if_supported_inplace_alter(): Require ALGORITHM=COPY when creating a FULLTEXT INDEX on a versioned table. row_merge_buf_add(), row_merge_read_clustered_index(): Remove the parameter or local variable history_fts that had been added in the attempt to fix MDEV-25004. Reviewed by: Thirunarayanan Balathandayuthapani Tested by: Matthias Leich	2024-02-12 16:52:55 +01:00
Marko Mäkelä	92f87f2cf0	Cleanup: Remove changed_pages_bitmap The innodb_changed_pages plugin only was part of XtraDB, never InnoDB. It would be useful for incremental backups. We will remove the code from mariadb-backup for now, because it cannot serve any useful purpose until the server part has been implemented.	2024-02-12 17:01:35 +02:00
Monty	3907345e22	MDEV-33306 Optimizer choosing incorrect index in 10.6, 10.5 but not in 10.4 In MariaDB up to 10.11, the test_if_cheaper_ordering() code (that tries to optimize how GROUP BY is executed) assumes that if a table scan is used then if there is any index usable by GROUP BY it will be used. The reason MariaDB 10.4 provides a better plan is because of two differences: - Plans using 'ref' have a cost of 1/10 of what it should be (as a protection against table scans). This is why 'ref' is used in 10.4 and not in 10.5. - When 'ref' is used, then GROUP BY will not use an index for GROUP BY. In MariaDB 10.5 the chosen plan is a table scan (as it is calculated to be faster) but as 'ref' is not used, the test_if_cheaper_ordering() optimizer phase decides (as ref is not used) to use an index for GROUP BY, which has bad performance. Description of fix: - All new code is protected by the "optimizer_adjust_secondary_key_costs" variable, which is now a bit map, and is only executed if the option "disable_forced_index_in_group_by" is set. - Corrects GROUP BY handling in test_if_cheaper_ordering() by making the choice of using an index with GROUP BY cost based instead of rule based. - Adds TIME_FOR_COMPARE to all costs, when using group by, to make read_time, index_scan_time and range_cost comparable. Other things: - Made optimizer_adjust_secondary_key_costs a bit map (compatible with old code). Notes: Current code ignores costs for the algorithm used when doing GROUP BY on the first table: - Create an in-memory temporary table for handling group by and doing a filesort of the result file We can probably in 10.6 continue to ignore this cost. This patch should NOT be merged to 11.0 series (not needed in 11.0).	2024-02-12 16:43:00 +02:00
Brandon Nesterenko	03d1346e7f	MDEV-29369: rpl.rpl_semi_sync_shutdown_await_ack fails regularly with Result content mismatch This test was prone to failures for a few reasons, summarized below: 1) MDEV-32168 introduced “only_running_threads=1” to slave_stop.inc, which allowed the stop logic to bypass an attempting-to-reconnect IO thread. That is, the IO thread could realize the master shutdown in `read_event()`, and thereby call into `try_to_reconnect()`. This would leave the IO thread up when the test expected it to be stopped. Fixed by explicitly stopping the IO thread and allowing an error state, as the above case would lead to errno 2003. 2) On slow systems (or those running profiling tools, e.g. MSAN), the waiting-for-ack transaction can complete before the system processes the `SHUTDOWN WAIT FOR ALL SLAVES`. There was shutdown preparation logic in-between the transaction and shutdown itself, which contributes to this problem. This patch also moves this preparation logic before the transaction, so there is less to do in-between the calls. 3) Changed work-around for MDEV-28141 to use debug_sync instead of sleep delay, as it was still possible to hit the bug on very slow systems. 4) Masked MTR variable reset with disable/enable query log Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-02-12 05:48:18 -07:00
Brandon Nesterenko	ee89558312	MDEV-14357: rpl.rpl_domain_id_filter_io_crash failed in buildbot with wrong result A race condition with the SQL thread, where depending on if it was killed before or after it had executed the fake/generated IGN_GTIDS Gtid_list_log_event, may or may not update gtid_slave_pos with the position of the ignored events. Then, the slave would be restarted while resetting IGNORE_DOMAIN_IDS to be empty, which would result in the slave requesting different starting locations, depending on whether or not gtid_slave_pos was updated. And, because previously ignored events could now be requested and executed (no longer ignored), their presence would fail the test. This patch fixes this in two ways. First, to use GTID positions for synchronization rather than binlog file positions. Then second, to synchronize the SQL thread’s gtid_slave_pos with the ignored events before killing the SQL thread. To consistently reproduce the test failure, the following patch can be applied: diff --git a/sql/log_event_server.cc b/sql/log_event_server.cc index f51f5b7deec..de62233acff 100644 --- a/sql/log_event_server.cc +++ b/sql/log_event_server.cc @@ -3686,6 +3686,12 @@ Gtid_list_log_event::do_apply_event(rpl_group_info rgi) void hton= NULL; uint32 i; + sleep(1); + if (rli->sql_driver_thd->killed || rli->abort_slave) + { + return 0; + } + Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-02-12 05:48:18 -07:00
Marko Mäkelä	8ec12e0d6d	Merge 10.4 into 10.5	2024-02-12 11:38:13 +02:00
Oleksandr Byelkin	d816a5ca32	fix test	2024-02-09 10:26:46 +01:00
Marko Mäkelä	86c2c89743	Merge 10.6 into 10.11	2024-02-08 15:04:46 +02:00
Marko Mäkelä	77b4399545	MDEV-33421 innodb.corrupted_during_recovery fails due to error that the table is corrupted This fixes up the merge commit `7e39470e33` dict_table_open_on_name(): Report ER_TABLE_CORRUPT in a consistent fashion, with a pretty-printed table name.	2024-02-08 14:20:42 +02:00
Marko Mäkelä	466069b184	Merge 10.5 into 10.6	2024-02-08 10:38:53 +02:00
Marko Mäkelä	0381921e26	MDEV-33277 In-place upgrade causes invalid AUTO_INCREMENT values MDEV-33308 CHECK TABLE is modifying .frm file even if --read-only As noted in commit `d0ef1aaf61`, MySQL as well as older versions of MariaDB server would during ALTER TABLE ... IMPORT TABLESPACE write bogus values to the PAGE_MAX_TRX_ID field to pages of the clustered index, instead of letting that field remain 0. In commit `8777458a6e` this field was repurposed for PAGE_ROOT_AUTO_INC in the clustered index root page. To avoid trouble when upgrading from MySQL or older versions of MariaDB, we will try to detect and correct bogus values of PAGE_ROOT_AUTO_INC when opening a table for the first time from the SQL layer. btr_read_autoinc_with_fallback(): Add the parameters to mysql_version,max to indicate the TABLE_SHARE::mysql_version of the .frm file and the maximum value allowed for the type of the AUTO_INCREMENT column. In case the table was originally created in MySQL or an older version of MariaDB, read also the maximum value of the AUTO_INCREMENT column from the table and reset the PAGE_ROOT_AUTO_INC if it is above the limit. dict_table_t::get_index(const dict_col_t &) const: Find an index that starts with the specified column. ha_innobase::check_for_upgrade(): Return HA_ADMIN_FAILED if InnoDB needs upgrading but is in read-only mode. In this way, the call to update_frm_version() will be skipped. row_import_autoinc(): Adjust the AUTO_INCREMENT column at the end of ALTER TABLE...IMPORT TABLESPACE. This refinement was suggested by Debarun Banerjee. The changes outside InnoDB were developed by Michael 'Monty' Widenius: Added print_check_msg() service for easy reporting of check/repair messages in ENGINE=Aria and ENGINE=InnoDB. Fixed that CHECK TABLE do not update the .frm file under --read-only. Added 'handler_flags' to HA_CHECK_OPT as a way for storage engines to store state from handler::check_for_upgrade(). Reviewed by: Debarun Banerjee	2024-02-08 10:35:45 +02:00
Marko Mäkelä	85db534731	MDEV-33400 Adaptive hash index corruption after DISCARD TABLESPACE row_discard_tablespace(): Do not invoke dict_index_t::clear_instant_alter() because that would corrupt any adaptive hash index entries in the table. row_import_for_mysql(): Invoke dict_index_t::clear_instant_alter() after detaching any adaptive hash index entries.	2024-02-08 09:17:47 +01:00
Marko Mäkelä	b2654ba826	MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL lock_table_children(): A new function to lock all child tables of a table. We will only hold dict_sys.latch while traversing dict_table_t::referenced_set. To prevent a race condition with std::set::erase() we will copy the pointers to the child tables to a local vector. Once we have acquired MDL and references to all child tables, we can safely release dict_sys.latch, wait for the locks, and finally release the references. dict_acquire_mdl_shared(): A new variant that takes mdl_context as a parameter. lock_table_for_trx(): Assert that we are not holding dict_sys.latch. ha_innobase::truncate(): When foreign_key_checks=ON, assert that no child tables exist (other than the current table). In any case, we will invoke lock_table_children() so that the child table metadata can be safely updated. (It is possible that a child table is being created concurrently with TRUNCATE TABLE.) ha_innobase::delete_table(): Before and after acquiring exclusive locks on the current table as well as all child tables, check that FOREIGN KEY constraints will not be violated. In this way, we can reject impossible DROP TABLE without having to wait for locks first. This fixes up commit `2ca1123464` (MDEV-26217) and commit `c3c53926c4` (MDEV-26554).	2024-02-08 14:22:35 +11:00
mariadb-DebarunBanerjee	5e7047067e	MDEV-33274 The test encryption.innodb-redo-nokeys often fails If we fail to open a tablespace while looking for FILE_CHECKPOINT, we set the corruption flag. Specifically, if encryption key is missing, we would not be able to open an encrypted tablespace and the flag could be set. We miss checking for this flag and report "Missing FILE_CHECKPOINT" Address review comment to improve the test. Flush pages before starting no-checkpoint block. It should improve the number of cases where the test is skipped because some intermediate checkpoint is triggered.	2024-02-08 08:13:16 +05:30
mariadb-DebarunBanerjee	fb9da7f751	MDEV-33023 Crash in mariadb-backup --prepare --export after --prepare mariadb-backup with --prepare option could result in empty redo log file. When --prepare is followed by --prepare --export, we exit early in srv_start function without opening the ibdata1 tablespace. Later while trying to read rollback segment header page, we hit the debug assert which claims that the system space should already have been opened. There are two assert cases here. Issue-1: System tablespace object is not there in fil space hash i.e. srv_sys_space.open_or_create() is not called. Issue-2: The system tablespace data file ibdata1 is not opened i.e. fil_system.sys_space->open() is not called. Fix: For empty redo log and restore operation, open system tablespace before returning.	2024-02-07 23:12:15 +05:30
Marko Mäkelä	8d54d173d7	Cleanup: Remove ut_format_name() This follows up commit `383f77cd84` which simplified dict_table_schema_check(). Note: We can display quoted names like this: my_snprintf(buf, sizeof buf, "%`.*s.%`s", int(t->name.dblen()), t->name.m_name, t->name.basename());	2024-02-07 13:56:31 +02:00
Marko Mäkelä	91a2192bf2	Merge 10.5 into 10.6	2024-02-07 13:51:03 +02:00
Vlad Lesin	f5373db898	MDEV-33004 innodb.cursor-restore-locking test fails THE FIX MUST NOT BE MERGED TO 10.6+, BECAUSE 10.6+ IS NOT AFFECTED! The test is waiting for delete-marked record purging. But this does not happen under the following conditions: 1. "START TRANSACTION WITH CONSISTENT SNAPSHOT" - is active, has not been rolled back yet 2. "DELETE FROM t WHERE b = 20 # trx_1" - is committed 3. "INSERT INTO t VALUES(10, 20) # trx_2" - hanging on "ib_after_row_insert" sync point, waiting for "first_ins_cont" signal 4. "DELETE FROM t WHERE b = 20 # trx_3" - blocked on delete-marked by trx_1 record, waiting for trx_2 5. connection "default" is waiting on 'now WAIT_FOR row_purge_del_mark_finished' purge_coordinator_callback_low() sets purge_state.m_history_length= srv_do_purge(&n_total_purged); even if nothing was purged, like in our case. Nothing was purged because transaction with consistent snapshot was still alive during purging procedure. Then purge_coordinator_timer_callback() does not wake purge thread if the following condition is true: purge_state.m_history_length == trx_sys.rseg_history_len The above condition is true for our case, because we are waiting for delete-marked record purging, and trx_sys.rseg_history_len does not grow. Only 10.5 is affected, because there is no such condition in 10.6, i.e. purge thread is woken up even if history size was not changed during purge coordinator thread suspending. The easiest way to fix it is just to remove the test from 10.5.	2024-02-07 12:35:18 +02:00
Thirunarayanan Balathandayuthapani	c31b1ee26a	MDEV-33341 innodb.undo_space_dblwr test case fails with Unknown Storage Engine InnoDB - Failed to reset the innodb_fil_make_page_dirty_debug variable in innodb_saved_page_number_debug_basic test case.	2024-02-07 12:35:18 +02:00
Oleksandr Byelkin	8e7314992f	Merge branch '10.5' into mariadb-10.5.24	2024-02-06 18:29:14 +01:00
Oleksandr Byelkin	8adc759988	Merge branch '10.4' into mariadb-10.4.33	2024-02-06 15:58:12 +01:00
mariadb-DebarunBanerjee	66bb229e91	MDEV-18288 Transportable Tablespaces leave AUTO_INCREMENT in mismatched state, causing INSERT errors in newly imported tables when .cfg is not used. During import, if the cfg file is not specified, we don't update the autoinc field in the InnoDB dictionary object dict_table_t. The next insert tries to insert from the starting position of auto increment and fails. It can be observed that the issue is resolved once the server is restarted, as the persistent value is read correctly from PAGE_ROOT_AUTO_INC in the index root page. The patch fixes the issue by reading the auto increment value directly from PAGE_ROOT_AUTO_INC during import if the cfg file is not specified. Test Fix: 1. import_bugs.test: Embedded mode warning has absolute path. Regular expression replacement in test. 2. full_crc32_import.test: Table level auto increment mismatch after import. It was using the auto increment data from the table prior to discard and import, which is not right. This cached auto increment value is higher than the actual inserted value and the value stored in PAGE_ROOT_AUTO_INC. Updated the result file and added validation for checking the maximum value of the auto increment column.	2024-02-06 13:45:30 +05:30
Igor Babaev	6fadbf8ebf	MDEV-31361 Wrong result on 2nd execution of PS for query with derived table This bug led to wrong result sets returned by the second execution of prepared statements from selects using mergeable derived tables pushed into external engine. Such derived tables are always materialized. The decision that they have to be materialized is taken late in the function mysql_derived_optimized(). For regular derived tables this decision is usually taken at the prepare phase. However in some cases for some derived tables this decision is made in mysql_derived_optimized() too. It can be seen in the code of mysql_derived_fill() that for such a derived table it's critical to change its translation table to tune it to the fields of the temporary table used for materialization of the derived table and this must be done after each refill of the derived table. The same actions are needed for derived tables pushed into external engines. Approved by Oleksandr Byelkin <sanja@mariadb.com>	2024-02-04 12:05:36 -08:00
Sergei Golubchik	87e13722a9	Merge branch '10.6' into 10.11	2024-02-01 18:36:14 +01:00
Alexander Barkov	83aff675ce	MDEV-33355 Add a Galera-2-node-to-MariaDB replication MTR test cloning the slave with mariadb-backup Replication from a 2-node Galera cluster to a regular MariaDB server. Cloning the slave using mariadb-backup.	2024-02-01 18:28:32 +04:00
Alexander Barkov	8fbad58731	MDEV-33342 Add a replication MTR test cloning the slave with mariadb-backup	2024-02-01 12:28:00 +04:00
Brandon Nesterenko	dd95c58b58	MDEV-33331: IO Thread Relay Log Inconsistent Statistics After MDEV-32551 After MDEV-32551, in a master/slave setup, if the replica's IO thread quickly and successively reconnects (i.e quickly running STOP SLAVE IO_THREAD followed by START SLAVE IO_THREAD), the relay log rotation behavior changes. That is, MDEV-32551 changed the logic of the binlog_dump_thread on the primary, such that it can stop itself before sending any events if it sees a new connection has been created to a replica with the same server_id. Pre MDEV-32551, the connection would establish and it would send a "fake" rotate event to populate the log name. Post MDEV-32551, the connection stops itself, and a rotate event is not sent. This made the test rpl.rpl_mariadb_slave_capability unstable because it is reliant on the name of the relay logs (which is dependent on the number of rotates); and the pre-amble of the test would quickly start/stop the IO thread. There a binlog dump thread could end itself before sending a rotate event to the replica, thereby changing the name of the relay log. This patch fixes this by adding in a synchronization in-between IO thread restarts, such that it waits for the primary's binlog dump threads to sync up with the state of the replica.	2024-01-31 22:18:31 +01:00
Sergei Golubchik	3f6038bc51	Merge branch '10.5' into 10.6	2024-01-31 18:04:03 +01:00
Sergei Golubchik	01f6abd1d4	Merge branch '10.4' into 10.5	2024-01-31 17:32:53 +01:00
Sergei Golubchik	46e3a7658b	funcs_1.innodb_views times out in --ps	2024-01-31 17:07:46 +01:00
Nikita Malyavin	68c1fbfc17	MDEV-25370 Update for portion changes autoincrement key in bi-temp table According to the standard, the autoincrement column (i.e. identity column) should be advanced on each insert implicitly made by UPDATE/DELETE ... FOR PORTION. This is very inconvenient in several notable cases. Consider a WITHOUT OVERLAPS key with an autoinc column: id int auto_increment, unique(id, p without overlaps) An update or delete with FOR PORTION creates an expectation that id will remain unchanged in such a case. The standard's IDENTITY resembles MariaDB's AUTO_INCREMENT, however the generation rules differ in many ways. For example, there's also a notion of an autoincrement index, which is bound to the autoincrement field. We will define our own generation rule for the PORTION OF operations involving AUTO_INCREMENT: * If an autoincrement index contains a WITHOUT OVERLAPS specification, then a new value should not be generated, otherwise it should. Apart from WITHOUT OVERLAPS there is also another notable case, referred to by the reporter - a unique key that has an autoincrement column and a field from the period specification: id int auto_increment, unique(id, s), period for p(s, e) For this case, no exception is made, and the autoincrementing rules will be applied according to the standard (i.e. the value will be advanced on implicit inserts).	2024-01-31 16:03:38 +01:00
Thirunarayanan Balathandayuthapani	21f18bd9d7	MDEV-33341 innodb.undo_space_dblwr test case fails with Unknown Storage Engine InnoDB Reason: ====== undo_space_dblwr test case fails if the first page of the undo tablespace is not flushed before restarting the server. While restarting the server, InnoDB fails to detect the first page of the undo tablespace from the doublewrite buffer. Fix: === Use "ib_log_checkpoint_avoid_hard" debug sync point to avoid checkpoint and make sure to flush the dirtied page before killing the server. innodb_make_page_dirty(): Fails to set srv_fil_make_page_dirty_debug variable.	2024-01-31 15:55:09 +05:30
Denis Protivensky	f4ee7c110c	MDEV-22232 Fix test after changing behavior of ALTER DROP FOREIGN KEY Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-30 15:47:47 +01:00
Monty	57ffcd686f	MDEV-21472: ALTER TABLE ... ANALYZE PARTITION ... with EITS reads and locks all rows This was fixed in 10.2 in 2020 but merging the code to 10.3 caused the bug to come back.	2024-01-30 09:19:01 +02:00
Brandon Nesterenko	c75905cacb	MDEV-33327: rpl_seconds_behind_master_spike Sensitive to IO Thread Stop Position rpl.rpl_seconds_behind_master_spike uses the DEBUG_SYNC mechanism to count how many format descriptor events (FDEs) have been executed, to attempt to pause on a specific relay log FDE after executing transactions. However, depending on when the IO thread is stopped, it can send an extra FDE before sending the transactions, forcing the test to pause before executing any transactions, resulting in a table not existing, that is attempted to be read for COUNT. This patch fixes this by no longer counting FDEs, but rather by programmatically waiting until the SQL thread has executed the transaction and then automatically activating the DEBUG_SYNC point to trigger at the next relay log FDE.	2024-01-30 06:58:44 +01:00
Jan Lindström	f8fa3c55c6	MDEV-33173 : Galera test case galera_sr_kill_slave_before_apply unstable Add wait_condition to make sure tables are created before next operations. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-30 00:28:33 +01:00
Jan Lindström	ddb27a29b1	MDEV-33172 : Galera test case galera_mdl_race unstable Add wait_condition between debug sync SIGNAL points and other expected state conditions, and refactor the actual sync point to make it easier to use in the test case. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-30 00:27:37 +01:00
Jan Lindström	5b4456b38a	MDEV-33036 : Galera test case galera_3nodes.galera_ist_gcache_rollover has warning Correct used configuration and force server restarts before test case. Add wait condition instead of sleep to verify that all expected nodes are back to cluster. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-30 00:26:13 +01:00
Jan Lindström	49fa5f6b5f	MDEV-33138 : Galera test case MW-336 unstable Add more inserts before wsrep_slave_threads is set to 1 and add a wait_condition to wait until all of them are replicated before the wait_condition on the number of wsrep_slave_threads. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-30 00:22:42 +01:00
Jan Lindström	736e429320	MDEV-32635: galera_shutdown_nonprim: mysql_shutdown failed Add wait_condition after cluster membership change Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-30 00:22:23 +01:00
Brandon Nesterenko	e4f221a5f2	MDEV-33327: rpl_seconds_behind_master_spike Sensitive to IO Thread Stop Position rpl.rpl_seconds_behind_master_spike uses the DEBUG_SYNC mechanism to count how many format descriptor events (FDEs) have been executed, to attempt to pause on a specific relay log FDE after executing transactions. However, depending on when the IO thread is stopped, it can send an extra FDE before sending the transactions, forcing the test to pause before executing any transactions, resulting in a table not existing, that is attempted to be read for COUNT. This patch fixes this by no longer counting FDEs, but rather by programmatically waiting until the SQL thread has executed the transaction and then automatically activating the DEBUG_SYNC point to trigger at the next relay log FDE.	2024-01-29 15:17:57 -07:00
Jan Lindström	c768ac6208	MDEV-25731 : Assertion `mode_ == m_local' failed in wsrep::client_state::streaming_params() Problem was that if wsrep_load_data_splitting was used, streaming replication (SR) parameters were set for a MyISAM table. Galera does not currently support SR for MyISAM. Fix is to ignore the wsrep_load_data_splitting setting (with a warning) if the table is not an InnoDB table. This is the 10.4-10.5 version of the fix. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-29 06:34:46 +01:00
Jan Lindström	daaa16a47f	MDEV-25089 : Assertion `error.len > 0' failed in galera::ReplicatorSMM::handle_apply_error() Problem is that Galera starts TOI (total order isolation), i.e. it sends the query to all nodes. Later it is discovered that the used engine or other feature is not supported by Galera. Because TOI is executed in parallel on all nodes, appliers could execute the given TOI, ignore the error, and start inconsistency voting, causing the node to leave the cluster, or we might have a crash as reported. For example, the SEQUENCE engine does not support the GEOMETRY data type, causing either inconsistency between nodes (because some errors are ignored on the applier) or a crash. Fixed by adding a new function wsrep_check_support to check whether Galera can support the provided CREATE TABLE/SEQUENCE before TOI is started; if not, a clear error message is provided to the user. Currently, not supported cases: * CREATE TABLE ... AS SELECT when streaming replication is used * CREATE TABLE ... WITH SYSTEM VERSIONING AS SELECT * CREATE TABLE ... ENGINE=SEQUENCE * CREATE SEQUENCE ... ENGINE!=InnoDB * ALTER TABLE t ... ENGINE!=InnoDB where table t is SEQUENCE Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-01-29 06:34:46 +01:00

17892 commits