mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-28 17:54:16 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	98dbe3bfaf	Merge 10.5 into 10.6	2025-01-20 09:57:37 +02:00
Brandon Nesterenko	d8c841d0d4	MDEV-35096: History is stored in different partitions on different nodes when using SYSTEM VERSION Row-injection updates don’t correctly set the historical partition for tables with system versioning and system_time partitions. This results in inconsistencies between the master and slave when replicating transactions that target such tables (i.e. the primary server would correctly distribute archived rows amongst its partitions, whereas the replica would have all archived rows in a single partition). The function partition_info::vers_set_hist_part(THD) is used to set the partition; however, its initial check for vers_require_hist_part(THD) returns false, bypassing the rest of the function (which sets up the partition to use). This is because the actual check uses the LEX sql_command (via LEX::vers_history_generating()) to determine if the command is valid to generate history. Row injections don’t have sql_commands though. This patch provides a fix which extends the check in vers_history_generating() to additionally allow row injections to be history generating (via the function LEX::is_stmt_row_injection()). Special thanks to Jan Lindstrom <jan.lindstrom@galeracluster.com> for his work in reproducing the bug, and providing an initial test case. Reviewed By ============ Kristian Nielsen <knielsen@knielsen-hq.org> Aleksey Midenkov <midenok@mariadb.com>	2025-01-13 15:59:07 -07:00
Marko Mäkelä	b251cb6a4f	Merge 10.5 into 10.6	2025-01-08 08:48:21 +02:00
Sergei Golubchik	9508a44c37	enforce no trailing \n in Diagnostic_area messages that is in my_error(), push_warning(), etc	2025-01-07 16:31:39 +01:00
Sergei Golubchik	6abbfdef7a	sporadic failures of binlog_encryption.rpl_parallel_gco_wait_kill CURRENT_TEST: binlog_encryption.rpl_parallel_gco_wait_kill mysqltest: In included file "./suite/rpl/t/rpl_parallel_gco_wait_kill.test": included from /home/buildbot/amd64-ubuntu-2004-debug/build/mysql-test/suite/binlog_encryption/rpl_parallel_gco_wait_kill.test at line 2: At line 334: Can't initialize replace from 'replace_result $thd_id THD_ID' An sql thread can reach the "Slave has read all relay log" state and then start reading relay log again. Let's use a more generic pattern to retrieve the sql thread ID even if it's not in the "read all relay log" state.	2025-01-05 16:40:12 +02:00
Monty	88d9348dfc	Remove dates from all rdiff files	2025-01-05 16:40:11 +02:00
Monty	87ee1e75bc	MDEV-35643 Add support for MySQL 8.0 binlog events MDEV-29533 Crash when MariaDB is replica of MySQL 8.0 MySQL 8.0 has added the following new events in the MySQL binary log PARTIAL_UPDATE_ROWS_EVENT TRANSACTION_PAYLOAD_EVENT HEARTBEAT_LOG_EVENT_V2 - PARTIAL_UPDATE_ROWS_EVENT is used by MySQL to generate update statements using JSON_SET, JSON_REPLACE and JSON_REMOVE to make update of JSON columns more efficient. These events can be disabled by setting 'binlog-row-value-options=""' - TRANSACTION_PAYLOAD_EVENT is used by MySQL to signal that a row event is compressed. It can be disabled by setting 'binlog_transaction_compression=0'. - HEARTBEAT_LOG_EVENT_V2 is written to the binary log many times per second. It can be ignored by the server. What this patch does: - If PARTIAL_UPDATE_ROWS_EVENT or TRANSACTION_PAYLOAD_EVENT is found, the server will stop with an error message explaining how to stop the MySQL server from generating such events. - HEARTBEAT_LOG_EVENT_V2 events are ignored. - mariadb-binlog will write the name of the new events. - mariadb-binlog will stop if PARTIAL_UPDATE_ROWS_EVENT or TRANSACTION_PAYLOAD_EVENT is found, unless --force is given. - Fixes a crash in mariadb-binlog if a character set unknown to MariaDB is found. (MDEV-29533) From Kristian Nielsen: - Add test case for MySQL 8.0 to MariaDB replication and fixed a small typo in post_header_len initialization. Reviewer: knielsen@mariadb.org	2025-01-05 16:40:11 +02:00
Yuchen Pei	671f80c738	Merge branch '10.5' into 10.6	2024-12-17 11:06:09 +11:00
Andrei Elkin	bc6121819c	MDEV-35098 rpl.rpl_mysqldump_gtid_slave_pos fails in buildbot The test turns out to be sensitive to @@global.gtid_cleanup_batch_size. With a rather small default value of the latter, SELECTing from mysql.gtid_slave_pos may not be deterministic: tests that run before may increase a batch pending automatic deletion. The test is refined to set its own value for the batch size which is virtually unreachable. Thanks to Kristian Nielsen for the analysis.	2024-12-16 19:43:41 +02:00
Marko Mäkelä	ddd7d5d8e3	MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure Under unknown circumstances, the SQL layer may wrongly disregard an invocation of thd_mark_transaction_to_rollback() when an InnoDB transaction had been aborted (rolled back) due to one of the following errors: * HA_ERR_LOCK_DEADLOCK * HA_ERR_RECORD_CHANGED (if innodb_snapshot_isolation=ON) * HA_ERR_LOCK_WAIT_TIMEOUT (if innodb_rollback_on_timeout=ON) Such an error used to cause a crash of InnoDB during transaction commit. These changes aim to catch and report the error earlier, so that not only this crash can be avoided but also the original root cause be found and fixed more easily later. The idea of this fix is from Michael 'Monty' Widenius. HA_ERR_ROLLBACK: A new error code that will be translated into ER_ROLLBACK_ONLY, signalling that the current transaction has been aborted and the only allowed action is ROLLBACK. trx_t::state: Add TRX_STATE_ABORTED that is like TRX_STATE_NOT_STARTED, but noting that the transaction had been rolled back and aborted. trx_t::is_started(): Replaces trx_is_started(). ha_innobase: Check the transaction state in various places. Simplify the logic around SAVEPOINT. ha_innobase::is_valid_trx(): Replaces ha_innobase::is_read_only(). The InnoDB logic around transaction savepoints, commit, and rollback was unnecessarily complex and might have contributed to this inconsistency. So, we are simplifying that logic as well. trx_savept_t: Replace with const undo_no_t*. When we rollback to a savepoint, all we need to know is the number of undo log records that must survive. trx_named_savept_t, DB_NO_SAVEPOINT: Remove. We can store undo_no_t directly in the space allocated at innobase_hton->savepoint_offset. fts_trx_create(): Do not copy previous savepoints. fts_savepoint_rollback(): If a savepoint was not found, roll back everything after the default savepoint of fts_trx_create(). The test innodb_fts.savepoint is extended to cover this code. Reviewed by: Vladislav Lesin Tested by: Matthias Leich	2024-12-12 18:02:00 +02:00
Kristian Nielsen	d959acbbf8	MDEV-34049: Parallel access to temptable in different domain_id in parallel replication Disallow changing @@gtid_domain_id while a temporary table is open in STATEMENT or MIXED binlog mode. Otherwise, a slave may try to replicate events referring to the same temporary table in parallel, using domain-based out-of-order parallel replication. This is not valid: temporary tables are only available for use within a single thread at a time. One concrete consequence seen from this bug was a ROLLBACK on an InnoDB temporary table running in one domain in parallel with DROP TEMPORARY TABLE in another domain, causing an assertion inside InnoDB: InnoDB: Failing assertion: table->get_ref_count() == 0 in dict_sys_t::remove. Use an existing error code that's somewhat close to the real issue (ER_INSIDE_TRANSACTION_PREVENTS_SWITCH_GTID_DOMAIN_ID_SEQ_NO), to not add a new error code in a GA release. When this is merged to the next GA release, we could optionally introduce a new and more precise error code for an attempt to change the domain_id while temporary tables are open. Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-12-05 09:22:00 +01:00
Kristian Nielsen	0166c89e02	Merge 10.5 -> 10.6 Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-12-05 09:20:36 +01:00
Kristian Nielsen	b4fde50b1f	MDEV-5798: Wrong errorcode for missing partition after TRUNCATE PARTITION The partitioning error handling code was looking at thd->lex->alter_info.partition_flags in non-alter-table cases, in which cases the value is stale and contains whatever was set by any earlier ALTER TABLE. This could cause the wrong error code to be generated, which then in some cases can cause replication to break with "different errorcode" error. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-12-05 08:17:35 +01:00
Brandon Nesterenko	a06d81ff3f	MDEV-35477: rpl_semi_sync_no_missed_ack_after_add_slave fails after MDEV-35109 MTR test rpl_semi_sync_no_missed_ack_after_add_slave fails on buildbot after the preparatory commit for MDEV-35109 (`5290fa043b`) which changed a sleep to a debug_sync point. The problem is that the debug_sync point would time-out on a slave while waiting to enter the logic to send an ACK reply. More specifically, where the test config is a primary with two replicas, and the test waits on one of the replicas to start sending an ACK, if the other replica was able to receive the event and respond with an ACK before the binlog dump thread of the timing-out server would prepare to send event, it wouldn't set the SEMI_SYNC_NEED_ACK flag, and the replica wouldn't even try to respond with an ACK. Fix is to use debug_sync for both replicas such that both replicas are held before sending their ack, so one can’t temporarily disable semi-sync for the other before it receives the transaction.	2024-11-21 11:30:25 -07:00
Brandon Nesterenko	716ed2ce22	MDEV-35350: Consolidate MTR wait_for_pattern_in_file.inc and SEARCH_WAIT in search_pattern_in_file.inc Replace wait_for_pattern_in_file.inc and all of its uses to use search_pattern_in_file.inc with SEARCH_WAIT. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org> Sergei Golubchik <serg@mariadb.org>	2024-11-07 13:25:58 -07:00
Vladislav Vaintroub	faf9e755ba	MDEV-35109 fix test case rpl_semi_sync_after_sync_coord_consistency fails on release compilation	2024-11-05 22:38:55 +01:00
Brandon Nesterenko	b07258a0d5	MDEV-35109: Semi-sync Replication stalling Primary using wait point=AFTER_SYNC For a primary configured with wait_point=AFTER_SYNC, if two threads T1 (binlogging through MYSQL_BIN_LOG::write()) and T2 were binlogging at the same time, T1 could accidentally wait for its semi-sync ACK using the binlog coordinates of T2. Prior to MDEV-33551, this only resulted in delayed transactions, because all transactions shared the same condition variable for ACK signaling. However, with the MDEV-33551 changes, each thread has its own condition variable to signal. So T1 could wait indefinitely when either: 1) T1's ACK is received but not T2's when T1 goes into wait_after_sync(), because the ACK receiver thread has already notified about the T1 ACK, but T1 was _actually_ waiting on T2's ACK, and therefore tries to wait (in vain). 2) T1 goes to wait_after_sync() before any ACKs have arrived. When T1's ACK comes in, T1 is woken up; however, sees it needs to wait more (because it was actually waiting on T2's ACK), and goes to wait again (this time, in vain). Note that the actual cause of T1 waiting on T2's binlog coordinates is when MYSQL_BIN_LOG::write() would call Repl_semisync_master::wait_after_sync(), the binlog offset parameter was read as the end of MYSQL_BIN_LOG::log_file, which is shared among transactions. So if T2 had updated the binary log _after_ T1 had released LOCK_log, but not yet invoked wait_after_sync(), it would use the end of the binary log file as the binlog offset, which was that of T2 (or any future transaction). The fix in this patch ensures consistency between the binary log coordinates a transaction uses between report_binlog_update() and wait_after_sync(). Reviewed By ============ Kristian Nielsen <knielsen@knielsen-hq.org> Andrei Elkin <andrei.elkin@mariadb.com>	2024-11-04 10:45:58 -07:00
Brandon Nesterenko	5290fa043b	MDEV-35109 PREP: simulate_delay_semisync_slave_reply use debug_sync This is a preparatory commit for MDEV-35109 to make its testing code cleaner (and harden other tests too). The DEBUG_DBUG point simulate_delay_semisync_slave_reply up to this patch used my_sleep() to delay an ACK response, but sleeps are prone to test failures on machines that run tests when already having a heavy load (e.g. on buildbot). This patch changes this DEBUG_DBUG sleep to use DEBUG_SYNC to coordinate exactly when a slave should send its reply, which is safer and faster. As DEBUG_SYNC can't be used while a server is shutting down, to synchronize threads with SHUTDOWN WAIT FOR SLAVES logic, we use and extend wait_for_pattern_in_file.inc to wait for an informational error message in the logic to indicate that the shutdown process has reached the intended state (i.e. indicating that the shutdown has been delayed to await semi-sync ACKs). Specifically, the extensions are as follows: 1. wait_for_pattern_in_file.inc is extended with parameter wait_for_pattern_count as a number that indicates the number of times a pattern should occur in the file before returning control to the calling script. 2. search_for_pattern_in_file.inc is extended with parameter SEARCH_ABORT_IS_SUCCESS to invert the error/success logic, so the SEARCH_ABORT condition can be used to indicate success, rather than error.	2024-11-04 10:45:58 -07:00
Monty	066f920484	MDEV-35110 Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions This is an extension of MDEV-30423 "Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions" The original commit in MDEV-30423 was not complete as some usage in XA of MDL_BACKUP_COMMIT locks did not set thd->backup_commit_lock. This is required to be set when using parallel replication. Fixed by ensuring that all usage of BACKUP_COMMIT lock in XA is uniform and always sets thd->backup_commit_lock. I also changed all locks to be MDL_EXPLICIT to keep that part uniform as well. A regression test is added.	2024-10-28 13:29:21 +02:00
Brandon Nesterenko	1ed30e08af	MDEV-34122: Assertion `entry' failed in Active_tranx::assert_thd_is_waiter If semi-sync is switched off then on while a transaction is in-between binlogging and waiting for an ACK, the semi-sync state of the transaction is removed, leading to a debug assertion that indicates the transaction tried to wait, but cannot receive an ACK signal. More specifically, when semi-sync is switched off, the Active_tranx list is cleared (where a transaction adds an entry to this list during binlogging), and each entry in this list saves the thread which will wait for an ACK, and the thread has the COND variable to signal to wake itself. So if the entry is lost, the Ack_receiver thread won’t be able to find the thread to wake up when an ACK comes in. The fix is to ensure that the entry exists before awaiting the ACK, and if there is no entry, skip the wait. In debug builds, an informative message is written explaining that the transaction is skipping its wait. Additional debug-build only logic is added to ensure that the cause of the missing entry is due to semi-sync being turned off and on. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-10-21 15:35:54 -06:00
Sergei Golubchik	3a1cf2c85b	MDEV-34679 ER_BAD_FIELD uses non-localizable substrings	2024-10-17 21:37:37 +02:00
Sergei Golubchik	5ebda30ccc	Revert "MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2" This reverts commit `8ae462a220`.	2024-10-16 13:23:47 +02:00
Kristian Nielsen	8ae462a220	MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2 Implement variable legacy_xa_rollback_at_disconnect to support backwards compatibility for applications that rely on the pre-10.5 behavior for connection disconnect, which is to rollback the transaction (in violation of the XA specification). Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-10-16 10:18:36 +02:00
Marko Mäkelä	7e0afb1c73	Merge 10.5 into 10.6	2024-10-03 09:31:39 +03:00
Lena Startseva	0a5e4a0191	MDEV-31005: Make working cursor-protocol Updated tests: cases with bugs or which cannot be run with the cursor-protocol were excluded with "--disable_cursor_protocol"/"--enable_cursor_protocol" Fix for v.10.5	2024-09-18 18:39:26 +07:00
Brandon Nesterenko	68938d2b42	MDEV-33500 (part 2): rpl.rpl_parallel_sbm can still fail The failing test case validates Seconds_Behind_Master for a delayed slave, while STOP SLAVE is executed during a delay. The test fixes initially added to the test (commit `b04c857596`) added a table lock to ensure a transaction could not finish before validating the Seconds_Behind_Master field after SLAVE START, but did not address a possibility that the transaction could finish before running the STOP SLAVE command, which invalidates the validations for the rest of the test case. Specifically, this would result in 1) a timeout in “Waiting for table metadata lock” on the replica, which expects the transaction to retry after slave restart and hit a lock conflict on the locked tables (added in `b04c857596`), and 2) that Seconds_Behind_Master should have increased, but did not. The failure can be reproduced by synchronizing the slave to the master before the MDEV-32265 echo statement (i.e. before the SLAVE STOP). This patch fixes the test by adding a mechanism to use DEBUG_SYNC to synchronize a MASTER_DELAY, rather than continually increase the duration of the delay each time the test fails on buildbot. This is to ensure that on slow machines, a delay does not pass before the test gets a chance to validate results. Additionally, it decreases overall test time because the test can continue immediately after validation, thereby bypassing the remainder of a full delay for each transaction.	2024-09-17 06:29:20 -06:00
Marko Mäkelä	48becffd07	Merge 10.5 into 10.6	2024-08-27 08:52:10 +03:00
Kristian Nielsen	8642453ce6	Fix sporadic failure of test case rpl.rpl_start_stop_slave The test was expecting the I/O thread to be in a specific state, but thread scheduling may cause it to not yet have reached that state. So just have a loop that waits for the expected state to occur. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-08-26 14:39:24 +02:00
Kristian Nielsen	214e6c5b3d	Fix sporadic failure of test case rpl.rpl_old_master Remove the test for MDEV-14528. This is supposed to test that parallel replication from pre-10.0 master will update Seconds_Behind_Master. But after MDEV-12179 the SQL thread is blocked from even beginning to fetch events from the relay log due to FLUSH TABLES WITH READ LOCK, so the test case is no longer testing what it was intended to. And pre-10.0 versions are long since out of support, so it does not seem worthwhile to try to rewrite the test to work another way. The root cause of the test failure is MDEV-34778. Briefly, depending on exact timing during slave stop, the rli->sql_thread_caught_up flag may end up with different value. If it ends up as "true", this causes Seconds_Behind_Master to be 0 during next slave start; and this caused test case timeout as the test was waiting for Seconds_Behind_Master to become non-zero. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-08-26 14:39:24 +02:00
Kristian Nielsen	7dc4ea5649	Fix sporadic test failure in rpl.rpl_create_drop_event Depending on timing, an extra event run could start just when the event scheduler is shut down and delay running until after the table has been dropped; this would cause the test to fail with a "table does not exist" error in the log. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-08-26 14:39:24 +02:00
Kristian Nielsen	33854d7324	Restore skipping rpl.rpl_mdev6020 under Valgrind (Revert a change done by mistake when XtraDB was removed.) Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-08-26 14:39:24 +02:00
Oleksandr Byelkin	8f020508c8	Merge branch '10.5' into 10.6	2024-08-03 09:04:24 +02:00
Brandon Nesterenko	001608de7e	MDEV-15393: Fix rpl_mysqldump_gtid_slave_pos The slave would try to sync_with_master_gtid.inc, but the master never actually saved its gtid position so the test would move on too quickly.	2024-07-31 14:17:46 -06:00
Oleksandr Byelkin	a938503cfb	Merge branch '10.5' into 10.6	2024-07-20 08:12:42 +02:00
Andrei	b8f92ade57	MDEV-15393 gtid_slave_pos duplicate key errors after mysqldump restore When mysqldump is run to dump the `mysql` system database, it generates INSERT statements into the table `mysql.gtid_slave_pos`. After running the backup script those inserts did not produce the expected gtid state on slave. In particular the maximum of mysql.gtid_slave_pos.sub_id did not make it into rpl_global_gtid_slave_state.last_sub_id, an in-memory object that is supposed to match the current state of the table. And that was regardless of whether --gtid option was specified or not. Later when the backup recipient server starts as slave in non-gtid mode this desynchronization may lead to a duplicate key error. This effect is corrected for --gtid mode mysqldump/mariadb-dump only, as follows. The fixes ensure the insert block of the dump script is followed with a "summing-up" SET @global.gtid_slave_pos assignment. For the implementation part, note a deferred print-out of SET-gtid_slave_pos and associated comments is preferred over relocating of the entire blocks if (opt_master,slave_data && do_show_master,slave_status) ... because of compatibility concern. Namely an error inside do_show_*() is handled in the new code the same way, and as early, as before. A regression test can be run in how-to-reproduce mode as well. One affected mtr test observed. rpl_mysqldump_slave.result "mismatch" shows now the new deferring print of SET-gtid_slave_pos policy in action.	2024-07-19 21:44:12 +03:00
Oleksandr Byelkin	9af2caca33	Merge branch '10.5' into 10.6	2024-07-18 16:25:33 +02:00
Brandon Nesterenko	a061ae1079	MDEV-33921: Fix rpl_xa_empty_transaction.test The test was missing a save_master_gtid.inc on the master, leading to the slave thinking it was in sync after executing sync_with_master_gtid.inc, despite not having executed the latest transaction. This skipped transaction, XA COMMIT, was supposed to error-to-be-ignored because its XID could not be found, but be thrown out because the replication filters would filter out the target database. However, if the slave was able to stop before executing the transaction, then the replication filter is reset (to empty), and when the slave is later restarted, that transaction's error would no longer be ignored. Additionally, as the test cases added in MDEV-33921 rely on GTID synchronization, the test cases now force master_use_gtid=slave_pos for consistency.	2024-07-17 16:38:26 -06:00
Sergei Golubchik	d60f5c11ea	MDEV-34318 mariadb-dump SQL syntax error with MAX_STATEMENT_TIME against Percona MySQL server protect MariaDB conditional comments from a bug in Percona MySQL comment parser	2024-07-17 21:25:40 +02:00
Yuchen Pei	f071b7620b	Merge branch '10.5' into 10.6	2024-07-16 15:54:22 +08:00
Daniel Black	e8bcc4e455	MDEV-34568 rpl.rpl_mdev12179 - correct for Windows Simplify in an attempt to avoid: mysqltest: At line 275: File already exist: on the write_file lines. Using write_line as that's what a lot of other tests do for writing small bits to an expect file. Review thanks Vladislav Vaintroub	2024-07-12 12:55:28 +02:00
Brandon Nesterenko	ea9869504d	MDEV-33921: Replication breaks when filtering two-phase XA transactions There are two problems. First, replication fails when XA transactions are used where the slave has replicate_do_db set and the client has touched a different database when running DML such as inserts. This is because XA commands are not treated as keywords, and are thereby not exempt from the replication filter. The effect of this is that during an XA transaction, if its logged “use db” from the master is filtered out by the replication filter, then XA END will be ignored, yet its corresponding XA PREPARE will be executed in an invalid state, thereby breaking replication. Second, if the slave replicates an XA transaction which results in an empty transaction, the XA START through XA PREPARE first phase of the transaction won’t be binlogged, yet the XA COMMIT will be binlogged. This will break replication in chain configurations. The first problem is fixed by treating XA commands in Query_log_event as keywords, thus allowing them to bypass the replication filter. Note that Query_log_event::is_trans_keyword() is changed to accept a new parameter to define its mode, to either check for XA commands or regular transaction commands, but not both. In addition, mysqlbinlog is adapted to use this mode so its --database filter does not remove XA commands from its output. The second problem is fixed by overwriting the XA state in the XID cache to be XA_ROLLBACK_ONLY, so at commit time, the server knows to rollback the transaction and skip its binlogging. If the xid cache is cleared before an XA transaction receives its completion command (e.g. on server shutdown), then before reporting ER_XAER_NOTA when the completion command is executed, the filter is first checked if the database is ignored, and if so, the error is ignored. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org> Andrei Elkin <andrei.elkin@mariadb.com>	2024-07-10 14:37:39 -06:00
Julius Goryavsky	4026f04425	Merge branch 10.5 into 10.6	2024-07-09 11:56:47 +02:00
Brandon Nesterenko	744580d5a7	MDEV-32892: IO Thread Reports False Error When Stopped During Connecting to Primary The IO thread can report error code 2013 into the error log when it is stopped during the initial connection process to the primary, as well as when trying to read an event. However, because the IO thread is being stopped, its connection to the primary is force-killed by the signaling thread (see THD::awake_no_mutex()), and thereby these connection errors should be ignored. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-07-08 10:39:17 -06:00
Alexander Barkov	e56040fee8	Merge remote-tracking branch 'origin/10.5' into 10.6	2024-07-08 18:59:04 +04:00
Brandon Nesterenko	eb4458e993	MDEV-33465: an option to enable semisync recovery The current semi-sync binlog fail-over recovery process uses rpl_semi_sync_slave_enabled==TRUE as its condition to truncate a primary server’s binlog, as it is anticipating the server to re-join a replication topology as a replica. However, for servers configured with both rpl_semi_sync_master_enabled=1 and rpl_semi_sync_slave_enabled=1, if a primary is just re-started (i.e. retaining its role as master), it can truncate its binlog to drop transactions which its replica(s) has already received and executed. If this happens, when the replica reconnects, its gtid_slave_pos can be ahead of the recovered primary’s gtid_binlog_pos, resulting in an error state where the replica’s state is ahead of the primary’s. This patch changes the condition for semi-sync recovery to truncate the binlog to instead use the configuration variable --init-rpl-role, when set to SLAVE. This allows for both rpl_semi_sync_master_enabled and rpl_semi_sync_slave_enabled to be set for a primary that is restarted, and no transactions will be lost, so long as --init-rpl-role is not set to SLAVE. Reviewed By: ============ Sergei Golubchik <serg@mariadb.com>	2024-07-05 19:53:57 -06:00
Brandon Nesterenko	cbc1898e82	MDEV-25607: Auto-generated DELETE from HEAP table can break replication The special logic used by the memory storage engine to keep slaves in sync with the master on a restart can break replication. In particular, after a restart, the master writes DELETE statements in the binlog for each MEMORY-based table so the slave can empty its data. If the DELETE is not executable, e.g. due to invalid triggers, the slave will error and fail, whereas the master will never see the problem. Instead of DELETE statements, use TRUNCATE to keep slaves in-sync with the master, thereby bypassing triggers. Reviewed By: =========== Kristian Nielsen <knielsen@knielsen-hq.org> Andrei Elkin <andrei.elkin@mariadb.com>	2024-07-05 12:00:09 -06:00
Marko Mäkelä	0076eb3d4e	Merge 10.5 into 10.6	2024-06-24 13:09:47 +03:00
Monty	3541bd63f0	MDEV-33582 Add more warnings to be able to better diagnose network issues Changed the logged messages from errors to warnings Also changed 'remain' to 'read_length' in the warning to make it more readable.	2024-06-20 09:53:01 +03:00
Brandon Nesterenko	6cab2f75fe	MDEV-23857: replication master password length After MDEV-4013, the maximum length of replication passwords was extended to 96 ASCII characters. After a restart, however, slaves only read the first 41 characters of MASTER_PASSWORD from the master.info file. This led to slaves being unable to reconnect to the master after a restart. After a slave restart, if a master.info file is detected, use the full allowable length of the password rather than 41 characters. Reviewed By: ============ Sergei Golubchik <serg@mariadb.com>	2024-06-18 07:21:18 -06:00
Brandon Nesterenko	fcd21d3e40	MDEV-34355: rpl.rpl_semi_sync_no_missed_ack_after_add_slave ‘server_3 should have sent…’ The problem is that the test could query the status variable Rpl_semi_sync_slave_send_ack before the slave actually updated it. This would result in an immediate --die assertion killing the rest of the test. The bottom of this commit message has a small patch that can be applied to reproduce the test failure. This patch fixes the test failure by waiting for the variable to be updated before querying its value. diff --git a/sql/semisync_slave.cc b/sql/semisync_slave.cc index 9ddd4c5c8d7..60538079fce 100644 --- a/sql/semisync_slave.cc +++ b/sql/semisync_slave.cc @@ -303,7 +303,10 @@ int Repl_semi_sync_slave::slave_reply(Master_info *mi) reply_res= DBUG_EVALUATE_IF("semislave_failed_net_flush", 1, net_flush(net)); if (!reply_res) + { + sleep(1); rpl_semi_sync_slave_send_ack++; + } } DBUG_RETURN(reply_res); }	2024-06-10 12:27:20 -06:00

4220 commits