mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-03-31 12:25:38 +02:00

Author	SHA1	Message	Date
ParadoxV5	8d6ff0db25	MDEV-27669: Add `skip-slave-start` info message When a slave does not start up the slave threads on restart, but not reporting anything to the error log about startup failures either, this can be due to `skip-slave-start` being set in the config file(s) or on the command line (and most likely is). Reviewed-by: Sergei Golubchik <serg@mariadb.org>	2025-03-18 18:31:28 +01:00
Sergey Vojtovich	feb1cf9086	Corrections to parent "fix typos" commmit	2025-03-14 12:08:56 +04:00
Vasilii Lakhin	717c12de0e	Fix typos in C comments inside sql/	2025-03-14 12:08:56 +04:00
ParadoxV5	5091986cea	misc. `sql/slave.cc` & co. refactor * `get_master_version_and_clock()` de-duplicate label using fall-through * `io_slave_killed()` & `check_io_slave_killed()`: * reüse the result from the level lower * add distinguishing docs * `try_to_reconnect()`: extract `'` from `if`-`else` * `handle_slave_io()`: Both `while`s have the same condition; looks like the outer `while` can simply be an `if`. * `connect_to_master()`: * assume `mysql_errno()` is not 0 on connection error * utilize 0’s falsiness in the loop * extend docs * `sql/sql_show.cc`: refactor SHOW ALL REPLICAS filter’s condition * `sql/mysqld.cc`: init `master-retry-count` with `master_retry_count` Reviewed-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-02-26 20:37:53 -07:00
ParadoxV5	e2dbd9b6ac	MDEV-35304: Add `Connects_Tried` and `Master_Retry_Count` to SSS When the IO thread (re)connect to a primary, no updates are available besides unique errors that cause the failure. These new `Master_info` numbers supplement SHOW SLAVE STATUS’s (most- recent) ‘Connecting’ state with statistics on (re)connect attempts: * `Connects_Tried`: how many retries have been attempted so far This was previously a local variable that only counted re-attempts; it’s now meaningful even after the “Connecting” state concludes. * `Master_Retry_Count` (from MDEV-25674): out of how many configured Side-note: Some of the tests updated by this commit dump the entire SHOW SLAVE STATUS, which might include non-deterministic entries. Reviewed-by: Kristian Nielsen <knielsen@knielsen-hq.org> Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>	2025-02-26 20:37:53 -07:00
ParadoxV5	7094a75596	MDEV-25674: Add CHANGE MASTER TO master_retry_count This new CHANGE MASTER TO field specifies the `--master-retry-count` (global option: the number of Primary connection attempts) for each multi-source replica; i.e, per-channel `performance_schema.` `replication_connection_configuration.CONNECTION_RETRY_COUNT`. `--master-retry-count` remains the default for new `CHANGE MASTER TO`s. This new keyword and `master-info` entry matches those of pre-‘REPLICATION SOURCE’ MySQL.	2025-02-26 20:37:53 -07:00
ParadoxV5	66f52ba630	slave.cc `try_to_reconnect` remove `retry_counter` `try_to_reconnect()` wraps `safe_reconnect()` with logging, but the latter already loops reconnection attempts up to `master_retry_count` times with `mi->connect_retry`-msec sleeps inbetween. This means `try_to_reconnect()` has been counting disconnects of the replication session (since it doesn’t loop) while `safe_reconnect()` was counting actual tries (which may be multiple per disconnect). In practice, this outer counter’s only benefit was to override the edge case `--master-retry-count=0` that the inner loop doesn’t cover with 1.	2025-02-26 20:37:53 -07:00
Sergei Golubchik	ba01c2aaf0	Merge branch '11.4' into 11.7 * rpl.rpl_system_versioning_partitions updated for MDEV-32188 * innodb.row_size_error_log_warnings_3 changed error for MDEV-33658 (checks are done in a different order)	2025-02-06 16:46:36 +01:00
ParadoxV5	697b88bf75	SHOW REPLICA STATUS: mark columns as unsigned Update all integer columns of SHOW REPLICA STATUS (technically INFORMATION_SCHEMA.SLAVE_STATUS) to unsigned because, well, they are (:. Some `uint32` ones were accidentally using the `Field::store(double nr)` overload because they forgot the `, true` for `Field::store(longlong nr, bool unsigned_val)`. The mistake’s harmless, fortunately, as `double` supports over 15 significant decimal digits, well over `uint32`’s 9-and-a-half.	2025-01-31 20:56:41 -07:00
Sergei Golubchik	7d657fda64	Merge branch '10.11 into 11.4	2025-01-30 12:01:11 +01:00
Sergei Golubchik	e69f8cae1a	Merge branch '10.6' into 10.11	2025-01-30 11:55:13 +01:00
Kristian Nielsen	72e1cc8f52	MDEV-35806: Error in read_log_event() corrupts relay log writer, crashes server In Log_event::read_log_event(), don't use IO_CACHE::error of the relay log's IO_CACHE to signal an error back to the caller. When reading the active relay log, this flag is also being used by the IO thread, and setting it can randomly cause the IO thread to wrongly detect IO error on writing and permanently disable the relay log. This was seen sporadically in test case rpl.rpl_from_mysql80. The read error set by the SQL thread in the IO_CACHE would be interpreted as a write error by the IO thread, which would cause it to throw a fatal error and close the relay log. And this would later cause CHANGE MASTER to try to purge a closed relay log, resulting in nullptr crash. SQL thread is not able to parse an event read from the relay log. This can happen like here when replicating unknown events from a MySQL master, potentially also for other reasons. Also fix a mistake in my_b_flush_io_cache() introduced back in 2001 (`fa09f2cd7e`) where my_b_flush_io_cache() could wrongly return an error set in IO_CACHE::error, even if the flush operation itself succeeded. Also fix another sporadic failure in rpl.rpl_from_mysql80 where the outout of MASTER_POS_WAIT() depended on timing of SQL and IO thread. Reviewed-by: Monty <monty@mariadb.org> Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-01-24 09:15:20 +00:00
Marko Mäkelä	33907f9ec6	Merge 11.4 into 11.7	2024-12-02 17:51:17 +02:00
Marko Mäkelä	2719cc4925	Merge 10.11 into 11.4	2024-12-02 11:35:34 +02:00
Marko Mäkelä	3d23adb766	Merge 10.6 into 10.11	2024-11-29 13:43:17 +02:00
Marko Mäkelä	7d4077cc11	Merge 10.5 into 10.6	2024-11-29 12:37:46 +02:00
Brandon Nesterenko	dbfee9fc2b	MDEV-34348: Consolidate cmp function declarations Partial commit of the greater MDEV-34348 scope. MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict The functions queue_compare, qsort2_cmp, and qsort_cmp2 all had similar interfaces, and were used interchangable and unsafely cast to one another. This patch consolidates the functions all into the qsort_cmp2 interface. Reviewed By: ============ Marko Mäkelä <marko.makela@mariadb.com>	2024-11-23 08:14:22 -07:00
ParadoxV5	cf2d49ddcf	Extract some of #3360 fixes to 10.5.x That PR uncovered countless issues on `my_snprintf` uses. This commit backports a squashed subset of their fixes.	2024-11-21 22:43:56 +11:00
Thirunarayanan Balathandayuthapani	074831ec61	Merge branch 10.5 into 10.6	2024-11-08 18:17:15 +05:30
Oleksandr Byelkin	9e1fb104a3	MariaDB 11.4.4 release -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEF39AEP5WyjM2MAMF8WVvJMdM0dgFAmck77AACgkQ8WVvJMdM 0dgccQ/+Lls8fWt4D+gMPP7x+drJSO/IE/gZFt3ugbWF+/p3B2xXAs5AAE83wxEh QSbp4DCkb/9PnuakhLmzg0lFbxMUlh4rsJ1YyiuLB2J+YgKbAc36eQQf+rtYSipd DT5uRk36c9wOcOXo/mMv4APEvpPXBIBdIL4VvpKFbIOE7xT24Sp767zWXdXqrB1f JgOQdM2ct+bvSPC55oZ5p1kqyxwvd6K6+3RB3CIpwW9zrVSLg7enT3maLjj/761s jvlRae+Cv+r+Hit9XpmEH6n2FYVgIJ3o3WhdAHwN0kxKabXYTg7OCB7QxDZiUHI9 C/5goKmKaPB1PCQyuTQyLSyyK9a8nPfgn6tqw/p/ZKDQhKT9sWJv/5bSWecrVndx LLYifSTrFC/eXLzgPvCnNv/U8SjsZaAdMIKS681+qDJ0P5abghUIlGnMYTjYXuX1 1B6Vrr0bdrQ3V1CLB3tpkRjpUvicrsabtuAUAP65QnEG2G9UJXklOer+DE291Gsl f1I0o6C1zVGAOkUUD3QEYaHD8w7hlvyfKme5oXKUm3DOjaAar5UUKLdr6prxRZL4 ebhmGEy42Mf8fBYoeohIxmxgvv6h2Xd9xCukgPp8hFpqJGw8abg7JNZTTKH4h2IY J51RpD10h4eoi6WRn3opEcjexTGvZ+xNR7yYO5WxWw6VIre9IUA= =s+WW -----END PGP SIGNATURE----- Merge tag '11.4' into 11.6 MariaDB 11.4.4 release	2024-11-08 07:17:00 +01:00
Denis Protivensky	6d5fe9ed0d	MDEV-28378: Don't hang trying to peek log event past the end of log While applying CTAS log event, we peek the relay log to see if CTAS contains inserted rows or if it's empty. The peek function didn't check for end-of-file condition when tried to get the next event from the log, and thus it hanged. The fix includes checking for end-of-file while peeking for log events and considering returned XID_EVENT value as a sign of an empty CTAS. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-11-06 04:59:09 +01:00
Brandon Nesterenko	5290fa043b	MDEV-35109 PREP: simulate_delay_semisync_slave_reply use debug_sync This is a preparatory commit for MDEV-35109 to make its testing code cleaner (and harden other tests too). The DEBUG_DBUG point simulate_delay_semisync_slave_reply up to this patch used my_sleep() to delay an ACK response, but sleeps are prone to test failures on machines that run tests when already having a heavy load (e.g. on buildbot). This patch changes this DEBUG_DBUG sleep to use DEBUG_SYNC to coordinate exactly when a slave should send its reply, which is safer and faster. As DEBUG_SYNC can't be used while a server is shutting down, to synchronize threads with SHUTDOWN WAIT FOR SLAVES logic, we use and extend wait_for_pattern_in_file.inc to wait for an informational error message in the logic to indicate that the shutdown process has reached the intended state (i.e. indicating that the shutdown has been delayed to await semi-sync ACKs). Specifically, the extensions are as follows: 1. wait_for_pattern_in_file.inc is extended with parameter wait_for_pattern_count as a number that indicates the number of times a pattern should occur in the file before return control back to the calling script. 2. search_for_pattern_in_file.inc is extended with parameter SEARCH_ABORT_IS_SUCCESS to inverse the error/success logic, so the SEARCH_ABORT condition can be used to indicate success, rather than error.	2024-11-04 10:45:58 -07:00
Oleksandr Byelkin	c770bce898	Merge branch '11.2' into 11.4	2024-10-30 15:11:17 +01:00
Oleksandr Byelkin	69d033d165	Merge branch '10.11' into 11.2	2024-10-29 16:42:46 +01:00
Oleksandr Byelkin	3d0fb15028	Merge branch '10.6' into 10.11	2024-10-29 15:24:38 +01:00
Oleksandr Byelkin	f00711bba2	Merge branch '10.5' into 10.6	2024-10-29 14:20:03 +01:00
Monty	bddbef3573	MDEV-34533 asan error about stack overflow when writing record in Aria The problem was that when using clang + asan, we do not get a correct value for the thread stack as some local variables are not allocated at the normal stack. It looks like that for example clang 18.1.3, when compiling with -O2 -fsanitize=addressan it puts local variables and things allocated by alloca() in other areas than on the stack. The following code shows the issue Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection (connect=0x5080000027b8, put_in_cache=<optimized out>) at sql/sql_connect.cc:1399 THD thd; 1399 thd->thread_stack= (char) &thd; (gdb) p &thd (THD *) 0x7fffedee7060 (gdb) p $sp (void ) 0x7fffef4e7bc0 The address of thd is 24M away from the stack pointer (gdb) info reg ... rsp 0x7fffef4e7bc0 0x7fffef4e7bc0 ... r13 0x7fffedee7060 140737185214560 r13 is pointing to the address of the thd. Probably some kind of "local stack" used by the sanitizer I have verified this with gdb on a recursive call that calls alloca() in a loop. In this case all objects was stored in a local heap, not on the stack. To solve this issue in a portable way, I have added two functions: my_get_stack_pointer() returns the address of the current stack pointer. The code is using asm instructions for intel 32/64 bit, powerpc, arm 32/64 bit and sparc 32/64 bit. Supported compilers are gcc, clang and MSVC. For MSVC 64 bit we are using _AddressOfReturnAddress() As a fallback for other compilers/arch we use the address of a local variable. my_get_stack_bounds() that will return the address of the base stack and stack size using pthread_attr_getstack() or NtCurrentTed() with fallback to using the address of a local variable and user provided stack size. Server changes are: - Moving setting of thread_stack to THD::store_globals() using my_get_stack_bounds(). - Removing setting of thd->thread_stack, except in functions that allocates a lot on the stack before calling store_globals(). When using estimates for stack start, we reduce stack_size with MY_STACK_SAFE_MARGIN (8192) to take into account the stack used before calling store_globals(). I also added a unittest, stack_allocation-t, to verify the new code. Reviewed-by: Sergei Golubchik <serg@mariadb.org>	2024-10-16 17:24:46 +03:00
Marko Mäkelä	43465352b9	Merge 11.4 into 11.6	2024-10-03 16:09:56 +03:00
Marko Mäkelä	b53b81e937	Merge 11.2 into 11.4	2024-10-03 14:32:14 +03:00
Marko Mäkelä	12a91b57e2	Merge 10.11 into 11.2	2024-10-03 13:24:43 +03:00
Marko Mäkelä	63913ce5af	Merge 10.6 into 10.11	2024-10-03 10:55:08 +03:00
Marko Mäkelä	7e0afb1c73	Merge 10.5 into 10.6	2024-10-03 09:31:39 +03:00
Brandon Nesterenko	68938d2b42	MDEV-33500 (part 2): rpl.rpl_parallel_sbm can still fail The failing test case validates Seconds_Behind_Master for a delayed slave, while STOP SLAVE is executed during a delay. The test fixes initially added to the test (commit `b04c857596`) added a table lock to ensure a transaction could not finish before validating the Seconds_Behind_Master field after SLAVE START, but did not address a possibility that the transaction could finish before running the STOP SLAVE command, which invalidates the validations for the rest of the test case. Specifically, this would result in 1) a timeout in “Waiting for table metadata lock” on the replica, which expects the transaction to retry after slave restart and hit a lock conflict on the locked tables (added in `b04c857596`), and 2) that Seconds_Behind_Master should have increased, but did not. The failure can be reproduced by synchronizing the slave to the master before the MDEV-32265 echo statement (i.e. before the SLAVE STOP). This patch fixes the test by adding a mechanism to use DEBUG_SYNC to synchronize a MASTER_DELAY, rather than continually increase the duration of the delay each time the test fails on buildbot. This is to ensure that on slow machines, a delay does not pass before the test gets a chance to validate results. Additionally, it decreases overall test time because the test can continue immediately after validation, thereby bypassing the remainder of a full delay for each transaction.	2024-09-17 06:29:20 -06:00
Oleksandr Byelkin	d6444022ca	Merge branch 'bb-11.5-release' into bb-11.6-release	2024-08-06 17:28:38 +02:00
Oleksandr Byelkin	ea75a0b600	Merge branch '11.4' into 11.5	2024-08-05 17:50:18 +02:00
Oleksandr Byelkin	1640c9b06e	Merge branch '11.2' into 11.4	2024-08-04 17:27:48 +02:00
Oleksandr Byelkin	80abd847da	Merge branch '10.11' into 11.1	2024-08-03 09:32:42 +02:00
Monty	25b5c63905	MDEV-33856: Alternative Replication Lag Representation via Received/Executed Master Binlog Event Timestamps This commit adds 3 new status variables to 'show all slaves status': - Master_last_event_time ; timestamp of the last event read from the master by the IO thread. - Slave_last_event_time ; Master timestamp of the last event committed on the slave. - Master_Slave_time_diff: The difference of the above two timestamps. All the above variables are NULL until the slave has started and the slave has read one query event from the master that changes data. - Added information_schema.slave_status, which allows us to remove: - show_master_info(), show_master_info_get_fields(), send_show_master_info_data(), show_all_master_info() - class Sql_cmd_show_slave_status. - Protocol::store(I_List<i_string_pair>* str_list) as it is not used anymore. - Changed old SHOW SLAVE STATUS and SHOW ALL SLAVES STATUS to use the SELECT code path, as all other SHOW ... STATUS commands. Other things: - Xid_log_time is set to time of commit to allow slave that reads the binary log to calculate Master_last_event_time and Slave_last_event_time. This is needed as there is not 'exec_time' for row events. - Fixed that Load_log_event calculates exec_time identically to Query_event. - Updated RESET SLAVE to reset Master/Slave_last_event_time - Updated SQL thread's update on first transaction read-in to only update Slave_last_event_time on group events. - Fixed possible (unlikely) bugs in sql_show.cc ...old_format() functions if allocation of 'field' would fail. Reviewed By: Brandon Nesterenko <brandon.nesterenko@mariadb.com> Kristian Nielsen <knielsen@knielsen-hq.org>	2024-07-25 08:57:27 -06:00
Oleksandr Byelkin	0fe39d368a	Merge branch '10.6' into 10.11	2024-07-22 15:14:50 +02:00
Monty	e0cff1e72b	Fixed failure in rpl.rpl_change_master_demote : "IO thread should not be running..." The issue was that the test did not take into account that the IO thread could have been in COMMAND=Connecting state, which happens before the COMMANMD=Slave_IO state. The test is a bit fragile as it depends on the COMMAND state to be syncronised with the Slave_IO_State, which is not the case. I added a new proc state and some more information to the error output to be able to diagnose future failures more easily.	2024-07-11 11:15:47 +03:00
Vladislav Vaintroub	584fc85e21	MDEV-32537 Name threads to improve debugging experience and diagnostics. Use SetThreadDescription/pthread_setname_np to give threads a name.	2024-07-09 13:17:20 +02:00
Julius Goryavsky	4026f04425	Merge branch 10.5 into 10.6	2024-07-09 11:56:47 +02:00
Brandon Nesterenko	744580d5a7	MDEV-32892: IO Thread Reports False Error When Stopped During Connecting to Primary The IO thread can report error code 2013 into the error log when it is stopped during the initial connection process to the primary, as well as when trying to read an event. However, because the IO thread is being stopped, its connection to the primary is force-killed by the signaling thread (see THD::awake_no_mutex()), and thereby these connection errors should be ignored. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-07-08 10:39:17 -06:00
Vladislav Vaintroub	7c5fdc9b6a	MDEV-33748 get rid of pthread_(get_/set_)specific, use thread_local Apart from better performance when accessing thread local variables, we'll get rid of things that depend on initialization/cleanup of pthread_key_t variables. Where appropriate, use compiler-dependent pre-C++11 thread-local equivalents, where it makes sense, to avoid initialization check overhead that non-static thread_local can suffer from.	2024-06-21 13:46:41 +02:00
Monty	24c57165d5	ALTER TABLE and replication should convert old row_end timestamps to new timestamp range MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range - Added --update-history option to mariadb-dump to change 2038 row_end timestamp to 2106. - Updated ALTER TABLE ... to convert old row_end timestamps to 2106 timestamp for tables created before MariaDB 11.4.0. - Fixed bug in CHECK TABLE where we wrongly suggested to USE REPAIR TABLE when ALTER TABLE...FORCE is needed. - mariadb-check printed table names that where used with REPAIR TABLE but did not print table names used with ALTER TABLE or with name repair. Fixed by always printing a table that is fixed if --silent is not used. - Added TABLE::vers_fix_old_timestamp() that will change max-timestamp for versioned tables when replication from a pre-11.4.0 server. A few test cases changed. This is caused by: - CHECK TABLE now prints 'Please do ALTER TABLE... instead of 'Please do REPAIR TABLE' when there is a problem with the information in the .frm file (for example a very old frm file). - mariadb-check now prints repaired table names. - mariadb-check also now prints nicer error message in case ALTER TABLE is needed to repair a table.	2024-05-27 12:39:03 +02:00
Monty	2464ee758a	MDEV-33655 Remove alter_algorithm Remove alter_algorithm but keep the variable as no-op (with a warning). The reasons for removing alter_algorithm are: - alter_algorithm was introduced as a replacement for the old_alter_table that was used to force the usage of the original alter table algorithm (copy) in the cases where the new alter algorithm did not work. The new option was added as a way to force the usage of a specific algorithm when it should instead have made it possible to disable algorithms that would not work for some reason. - alter_algorithm introduced some cases where ALTER TABLE would not work without specifying the ALGORITHM=XXX option together with ALTER TABLE. - Having different values of alter_algorithm on master and slave could cause slave to stop unexpectedly. - ALTER TABLE FORCE, as used by mariadb-upgrade, would not always work if alter_algorithm was set for the server. - As part of the MDEV-33449 "improving repair of tables" it become clear that alter- algorithm made it harder to provide a better and more consistent ALTER TABLE FORCE and REPAIR TABLE and it would be better to remove it.	2024-05-27 12:39:03 +02:00
Monty	dfdedd46e4	MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range This patch extends the timestamp from 2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999 for 64 bit hardware and OS where 'long' is 64 bits. This is true for 64 bit Linux but not for Windows. This is done by treating the 32 bit stored int as unsigned instead of signed. This is safe as MariaDB has never accepted dates before the epoch (1970). The benefit of this approach that for normal timestamp the storage is compatible with earlier version. However for tables using system versioning we before stored a timestamp with the year 2038 as the 'max timestamp', which is used to detect current values. This patch stores the new 2106 year max value as the max timestamp. This means that old tables using system versioning needs to be updated with mariadb-upgrade when moving them to 11.4. That will be done in a separate commit.	2024-05-27 12:39:02 +02:00
Monty	ce6cce85d4	MDEV-33620 Improve times and states in show processlist for replication - Slave_IO thread time is now reset between reading events. Before this commit Slave_IO always showed "Waiting for master to send event" and the time was from SLAVE START. Now it shows time since reading last event.	2024-05-27 12:39:02 +02:00
Oleksandr Byelkin	dd7d9d7fb1	Merge branch '11.4' into 11.5	2024-05-23 17:01:43 +02:00
Oleksandr Byelkin	99b370e023	Merge branch '11.2' into 11.4	2024-05-21 19:38:51 +02:00

1 2 3 4 5 ...

2852 commits