Problem:
========
This patch addresses two issues.
First, if a CHANGE MASTER command is issued and an error happens
while locating the replica’s relay logs, the logs can be put into an
invalid state where future updates fail and future CHANGE MASTER
calls crash the server. More specifically, right before a replica
purges the relay logs (part of the `CHANGE MASTER TO` logic), the
relay log is temporarily closed with state LOG_TO_BE_OPENED. If the
server encounters an error between the temporary log closure and the purge,
i.e. inside the function find_log_pos, the log should be closed.
MDEV-25284 reveals that the log is not properly closed.
Second, upon issuing a RESET SLAVE ALL command, a slave's GTID
filters are not cleared (DO_DOMAIN_IDS, IGNORE_DOMAIN_IDS,
IGNORE_SERVER_IDS). MySQL had a similar bug report, Bug #18816897,
which was fixed in version 5.7 so that IGNORE_SERVER_IDS is cleared
after issuing RESET SLAVE ALL.
Solution:
=========
To fix the first problem, the CHANGE MASTER error handling logic was
extended to transition the relay log state to LOG_CLOSED from
LOG_TO_BE_OPENED.
To fix the second problem, the RESET SLAVE ALL logic is extended to
clear the domain_id filter and ignore_server_ids.
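A minimal sketch of the first fix's error path, using simplified stand-in
types rather than the actual CHANGE MASTER / relay log code:

  enum log_state_t { LOG_OPENED, LOG_CLOSED, LOG_TO_BE_OPENED };

  struct RelayLog
  {
    log_state_t state= LOG_CLOSED;
    bool simulate_error= false;       // stand-in for a find_log_pos() failure
    int find_log_pos() { return simulate_error ? 1 : 0; }
  };

  static int purge_relay_logs_sketch(RelayLog *log)
  {
    log->state= LOG_TO_BE_OPENED;     // log temporarily closed before the purge
    if (log->find_log_pos())          // error while locating the relay log
    {
      log->state= LOG_CLOSED;         // the fix: leave the log fully closed
      return 1;
    }
    // ... purge the relay log files and reopen the log ...
    log->state= LOG_OPENED;
    return 0;
  }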
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
Problem:
=======
There are two issues that are addressed in this patch:
1) SHOW BINARY LOGS uses caching to store the binary logs that exist
in the log directory; however, if new events are written to the logs,
the caching strategy is unaware of them. This is okay for users, as it is
acceptable for SHOW to return slightly stale data. The test, however, can
observe inconsistent results. It runs two connections concurrently,
where one shows the logs, and the other adds a new file. The output
of SHOW BINARY LOGS then depends on when the cache is built, with
respect to the time that the second connection rotates the logs.
2) There is a race condition between RESET MASTER and SHOW BINARY
LOGS. More specifically, while both need the binary log lock to
begin, SHOW BINARY LOGS only holds the lock to build its cache. If
RESET MASTER is issued after SHOW BINARY LOGS has built its cache and
before it has returned the results, the presented data may be
incorrect.
Solution:
========
1) As it is okay for users to see stale data, to make the test
consistent, use DEBUG_SYNC to force the race condition (problem 2) so
that SHOW BINARY LOGS builds its cache before RESET MASTER is called.
Then, use additional logic from the next part of the solution to
rebuild the cache.
2) Use an Atomic_counter to keep track of the number of times RESET
MASTER has been called. If the value of the counter changes after
building the cache, the cache should be rebuilt and the analysis
should be restarted.
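A minimal sketch of the counter idea in part 2, with std::atomic standing in
for the server's Atomic_counter and simplified stand-in helpers rather than
the actual SHOW BINARY LOGS code:

  #include <atomic>
  #include <cstdint>
  #include <string>
  #include <vector>

  // Bumped by every RESET MASTER (stand-in for the patch's Atomic_counter).
  std::atomic<uint64_t> reset_master_count{0};

  // Stand-in for building the binlog-name cache under the binlog lock.
  static std::vector<std::string> build_binlog_cache()
  {
    return {"master-bin.000001", "master-bin.000002"};
  }

  static std::vector<std::string> show_binary_logs_sketch()
  {
    for (;;)
    {
      uint64_t seen= reset_master_count.load();
      std::vector<std::string> cache= build_binlog_cache();
      if (reset_master_count.load() == seen)
        return cache;               // no RESET MASTER raced with us
      // A RESET MASTER ran in between: the cached names are stale, so the
      // cache is rebuilt and the analysis restarted.
    }
  }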
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
Problem:
========
180511 11:07:58 [ERROR] Slave I/O: Unexpected master's heartbeat data:
heartbeat is not compatible with local info;the event's data: log_file_name
mysql-bin.000009 log_pos 1054262041, Error_code: 1623
Analysis:
=========
In a replication setup, when the master server doesn't have any events to send
to the slave server, it sends a 'Heartbeat_log_event'. This event carries the
current binary log filename and offset details. The offset value is stored
within 4 bytes of the event header. When the size of the binary log exceeds
UINT32_MAX, the log_pos value does not fit in 4 bytes of memory. It overflows,
and hence the slave stops with an error.
Fix:
===
Since we cannot extend the common_header of the Log_event class, a
Log_event::log_pos value greater than 4GB is transported in a Heartbeat
event's sub-header. Log_event::log_pos is set to zero in that case to
indicate that the 8-byte sub-header is present in the event.
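A minimal sketch of the encoding idea, using a hypothetical helper rather
than the actual Heartbeat_log_event code (byte order simplified to the
little-endian host layout):

  #include <cstdint>
  #include <cstring>
  #include <string>
  #include <vector>

  static std::vector<unsigned char>
  encode_heartbeat_coords(const std::string &log_file_name, uint64_t log_pos)
  {
    std::vector<unsigned char> packet;
    bool needs_subheader= log_pos > UINT32_MAX;
    uint32_t header_pos= needs_subheader ? 0 : (uint32_t) log_pos; // 0 flags the sub-header

    unsigned char pos4[4];                  // 4-byte log_pos of the common header
    memcpy(pos4, &header_pos, 4);
    packet.insert(packet.end(), pos4, pos4 + 4);

    if (needs_subheader)
    {
      unsigned char pos8[8];                // 8-byte sub-header with the real offset
      memcpy(pos8, &log_pos, 8);
      packet.insert(packet.end(), pos8, pos8 + 8);
    }

    packet.insert(packet.end(), log_file_name.begin(), log_file_name.end());
    return packet;
  }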
In case of cross-version replication, the following behaviour is expected:
OLD - server without the fix
NEW - server with the fix
OLD<->NEW : works bidirectionally as long as the binlog offset is
(normally) within 4GB.
When log_pos > UINT32_MAX:
OLD->NEW : the 'log_pos' is bound to overflow and the NEW slave may report
an invalid event/incompatible heartbeat event error.
NEW->OLD : since the patched server sets log_pos=0 on overflow, the OLD slave
will report an invalid event error.
Problem:
========
Auto purge of relaylogs stops when relay-log-file is
'slave-relay-log.999999' and slave_parallel_threads is enabled.
Analysis:
=========
The problem is that in the Relay_log_info::inc_group_relay_log_pos() function,
when two log names are compared via strcmp(), the result is correct as long as
the log name sequence numbers have the same number of digits (6 digits). But
when the number grows to 7 digits, '999999' compares greater than '1000000',
which is wrong, hence the bug.
Fix:
====
Extract the numeric extension part of the file name, convert it into
unsigned long and compare.
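A minimal sketch of the comparison, as a stand-alone helper rather than the
actual Relay_log_info::inc_group_relay_log_pos() change:

  #include <cstdlib>
  #include <cstring>

  // Compare e.g. "slave-relay-log.999999" vs "slave-relay-log.1000000"
  // by the numeric extension instead of a plain strcmp().
  static int compare_log_names_by_extension(const char *a, const char *b)
  {
    const char *ext_a= strrchr(a, '.');
    const char *ext_b= strrchr(b, '.');
    unsigned long na= ext_a ? strtoul(ext_a + 1, nullptr, 10) : 0;
    unsigned long nb= ext_b ? strtoul(ext_b + 1, nullptr, 10) : 0;
    return (na < nb) ? -1 : (na > nb) ? 1 : 0;
  }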
Thanks to David Zhao for the contribution.
Problem:
=======
SHOW BINLOG EVENTS FROM <"random"-pos> caused a variety of failures, as
reported in MDEV-18046. They were fixed, but that approach is not future-proof,
nor is it optimal to add extra checks for each constructed event's parameters.
Analysis:
=========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that, read
event data, satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
For example: https://jira.mariadb.org/browse/MDEV-18046
In the bug description user executes CREATE TABLE/INSERT and ALTER SQL
commands.
When an arbitrary offset like "SHOW BINLOG EVENTS FROM 365" is provided, the
code assumes offset 365 is a valid event beginning, proceeds to
EVENT_LEN_OFFSET, reads some random length, and comes up with a bogus event
which doesn't exist in the binary log. In this quoted example scenario, the
event read at offset 365 is treated as an "Update_rows_log_event", which is
not present in the binary log. Since this is a random event, its validation
fails and the code results in an assert/segmentation fault, as shown below.
mysqld: /data/src/10.4/sql/log_event.cc:10863: Rows_log_event::Rows_log_event(
const char*, uint, const Format_description_log_event*):
Assertion `var_header_len >= 2' failed.
181220 15:27:02 [ERROR] mysqld got signal 6 ;
#7 0x00007fa0d96abee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8 0x000055e744ef82de in Rows_log_event::Rows_log_event (this=0x7fa05800d390,
buf=0x7fa05800d080 "", event_len=254, description_event=0x7fa058006d60) at
/data/src/10.4/sql/log_event.cc:10863
#9 0x000055e744f00cf8 in Update_rows_log_event::Update_rows_log_event
Since we are reading random data, repeating the same command (SHOW BINLOG
EVENTS FROM 365) produces different types of crashes with different events.
MDEV-18046 reported 10 such crashes.
In order to avoid such scenarios, the user-provided starting offset needs to
be validated for correctness. The best way of doing this is to make use of
checksums, if they are available. The MDEV-18046 fix introduced the
checksum-based validation.
The issue still remains in cases where binlog checksums are disabled. Please
find the following bug reports.
MDEV-22473: binlog.binlog_show_binlog_event_random_pos failed in buildbot,
server crashed in read_log_event
MDEV-22455: Server crashes in Table_map_log_event,
binlog.binlog_invalid_read_in_rotate failed in buildbot
Fix:
====
When the binlog checksum is disabled, perform a scan (reading event by event)
to validate the requested FROM <pos> offset. Starting from offset 4, read the
event_length of the next event in the binary log. Using that length, advance
the current offset to point to the next event. Repeat this process while the
current offset is less than the requested offset. If the current offset ends
up higher than the requested offset, i.e. the offset does not fall on an event
boundary, report an appropriate invalid-offset error.
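A minimal sketch of the scan, assuming the standard 19-byte event header with
the 4-byte event length at offset 9 (EVENT_LEN_OFFSET); a stand-alone
illustration, not the actual server code:

  #include <cstdint>
  #include <cstdio>

  static const uint64_t BINLOG_MAGIC_LEN= 4;   // first event starts after the magic
  static const int EVENT_LEN_OFFSET= 9;
  static const int HEADER_LEN= 19;

  // Returns true if 'wanted' lands exactly on an event boundary of 'binlog'.
  static bool offset_is_event_boundary(FILE *binlog, uint64_t wanted)
  {
    uint64_t cur= BINLOG_MAGIC_LEN;
    unsigned char header[HEADER_LEN];

    while (cur < wanted)
    {
      if (fseek(binlog, (long) cur, SEEK_SET) ||
          fread(header, sizeof(header), 1, binlog) != 1)
        return false;                          // reached EOF before 'wanted'

      uint32_t event_len= (uint32_t) header[EVENT_LEN_OFFSET] |
                          ((uint32_t) header[EVENT_LEN_OFFSET + 1] << 8) |
                          ((uint32_t) header[EVENT_LEN_OFFSET + 2] << 16) |
                          ((uint32_t) header[EVENT_LEN_OFFSET + 3] << 24);
      if (event_len == 0)
        return false;
      cur+= event_len;                         // advance to the next event
    }
    return cur == wanted;                      // overshooting means invalid offset
  }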
Problem:
=======
The "Start binlog_dump" message hasn't been updated to include the slave's
requested GTID position:
20:05:05 139836760311552 [Note] Start binlog_dump to slave_server(2), pos(, 4)
For diagnostic purposes, it would be helpful if the GTID position were
included.
Fix:
===
Improve the "Start binlog_dump" print message to include the "using_gtid"
and "GTID position" values requested by the slave.
Ex:
[Note] Start binlog_dump to slave_server(2), pos(, 4), using_gtid(1),
gtid('1-1-201,2-2-100')
[Note] Start binlog_dump to slave_server(3), pos('mariadb-bin.004142',
507988273), using_gtid(0), gtid('')
MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following assert when ASAN is
enabled:
uint32 binlog_get_uncompress_len(const char*):
Assertion `(buf[0] & 0xe0) == 0x80' failed
Fix:
===
**Part11: Converted debug assert to error handler code**
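A minimal sketch of the idea, not the actual binlog_get_uncompress_len()
implementation (the header-byte interpretation below is illustrative only):

  #include <cstddef>
  #include <cstdint>

  static uint32_t uncompress_len_sketch(const unsigned char *buf, size_t buf_len)
  {
    // A well-formed compressed event starts with 0x80 in the top three bits;
    // instead of asserting, return 0 and let the caller report an error.
    if (buf == nullptr || buf_len == 0 || (buf[0] & 0xe0) != 0x80)
      return 0;

    uint32_t lenlen= buf[0] & 0x07;          // assumed: low bits give length-byte count
    if (buf_len < 1 + lenlen)
      return 0;

    uint32_t len= 0;
    for (uint32_t i= 0; i < lenlen; i++)     // byte order shown for illustration
      len= (len << 8) | buf[1 + i];
    return len;
  }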
Problem:
========
SHOW BINLOG EVENTS FROM <pos> causes a variety of failures, some of which are
listed below. It is not a race condition issue, but there is some
non-determinism in it.
Analysis:
========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that read
event data satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
Fix:
====
The fix is split into different parts. Each part addresses either an ASAN
issue or an assert/crash.
**Part1: Checksum based position validation when checksum is enabled**
Using the checksum, validate the very first event read at the user-specified
position. If there is a checksum mismatch, report an appropriate error for the
invalid event.
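A minimal sketch of the check for the CRC32 checksum algorithm, using zlib's
crc32() as a stand-in for the server's checksum routine:

  #include <cstdint>
  #include <cstring>
  #include <zlib.h>

  // The last 4 bytes of an event hold the CRC32 of everything before them;
  // a mismatch means the supplied position was not a valid event boundary.
  static bool event_checksum_ok(const unsigned char *event, size_t event_len)
  {
    if (event_len <= 4)
      return false;
    uint32_t stored;
    memcpy(&stored, event + event_len - 4, 4);   // assumes a little-endian host
    uint32_t computed= (uint32_t) crc32(0L, event, (uInt) (event_len - 4));
    return stored == computed;
  }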
The assert indicates that the current transaction was caught uncleaned in the
semisync master's cache when it was signaled to proceed upon receiving its
ack.
The reason for the missed cleanup turns out to be a flaw in the gtid
connect mode.
The binlog file *name* of the last event received by the connecting slave, as
submitted by the slave, was adopted into
{{Repl_semi_sync_master::m_reply_file_name}} as part of the semisync
initialization.
Notice that the initialization does refine the position part of the
submitted binlog coordinates.
This master-side refinement of filename:pos is specific to the gtid connect
mode and serves to compute the latest binlog file from which to resume
feeding the slave.
Effectively, in the gtid connect mode the computed resumption filename:pos may
be smaller than the submitted one, in which case a new transaction committing
after the connect may be logged with a filename:pos that is also less than
the submitted coordinates, and that triggers the assert.
Fixed by making the semisync initialization use the refined filename:pos.
It is guaranteed to be less than the binlog:pos of any newly generated
transaction.
Analysis:
=========
Mysqlbinlog output for encrypted binary log
#Q> insert into tab1 values (3,'row 003')
#190912 17:36:35 server id 10221 end_log_pos 980 CRC32 0x53bcb3d3 Table_map: `test`.`tab1` mapped to number 19
# at 940
#190912 17:36:35 server id 10221 end_log_pos 1026 CRC32 0xf2ae5136 Write_rows: table id 19 flags: STMT_END_F
Here we can see that the Table_map_log_event ends at 980 but the next event
starts at 940. The reason is that we do not send the START_ENCRYPTION_EVENT
to the slave.
Solution:
=========
Send the Start_encryption_log_event as an Ignorable_log_event to the slave
(mysqlbinlog), so that mysqlbinlog can update its log_pos.
Since the slave can request multiple FORMAT_DESCRIPTION_EVENTs that the master
does not actually have, we only update the slave's master position when the
master actually has the FORMAT_DESCRIPTION_EVENT. Similar logic should be
applied for the START_ENCRYPTION_EVENT.
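A minimal sketch of marking an event ignorable before sending it; the flag
value and header offset are assumptions about the binlog format, and this is
not the actual dump-thread code:

  #include <cstdint>

  static const uint16_t LOG_EVENT_IGNORABLE_F_SKETCH= 0x80; // assumed flag value
  static const int FLAGS_OFFSET= 17;                        // flags field in the 19-byte header

  static void mark_event_ignorable(unsigned char *event_buf)
  {
    uint16_t flags= (uint16_t) (event_buf[FLAGS_OFFSET] |
                                (event_buf[FLAGS_OFFSET + 1] << 8));
    flags|= LOG_EVENT_IGNORABLE_F_SKETCH;       // receiver may skip the event
    event_buf[FLAGS_OFFSET]= (unsigned char) (flags & 0xff);
    event_buf[FLAGS_OFFSET + 1]= (unsigned char) (flags >> 8);
  }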
Also added a test case for when a new server reads data from an old server
which does not send the START_ENCRYPTION_EVENT to the slave.
Master/Slave upgrade scenario:
When the slave is updated first, the slave will have the extra logic for
handling the START_ENCRYPTION_EVENT, but the master will not be sending it,
so there will be no issue.
When the master is updated first, it will send the START_ENCRYPTION_EVENT to
the slave, but the slave will ignore this event in queue_event.
On some systems with 10,000+ binlogs, show binary logs could block
log rotation for more than 10 seconds.
This patch fixes this by first caching all binary log names and then
releasing all mutexes while calculating the sizes of the binary logs.
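A minimal sketch of the approach with simplified stand-in locking and data
structures, not the actual show_binlogs()/get_binlog_list() code:

  #include <mutex>
  #include <string>
  #include <sys/stat.h>
  #include <utility>
  #include <vector>

  std::mutex binlog_index_lock;            // stands in for the binlog mutexes
  std::vector<std::string> binlog_index;   // names of all binary logs

  static std::vector<std::pair<std::string, unsigned long long>>
  list_binlogs_with_sizes()
  {
    std::vector<std::string> names;
    {
      std::lock_guard<std::mutex> guard(binlog_index_lock);
      names= binlog_index;                 // cheap copy of the names while locked
    }                                      // mutex released here

    // The slow part (file system access) no longer blocks log rotation.
    std::vector<std::pair<std::string, unsigned long long>> result;
    for (const std::string &name : names)
    {
      struct stat st;
      unsigned long long size= (stat(name.c_str(), &st) == 0)
                               ? (unsigned long long) st.st_size : 0;
      result.emplace_back(name, size);
    }
    return result;
  }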
Other things:
- Ensure that reinit_io_cache() sets end_of_file when moving to read_cache.
  This ensures that external changes to the underlying file are known to
  the cache.
- get_binlog_list() is made more efficient and show_binlogs() is changed
  to call get_binlog_list().
Reviewed by Andrei Elkin
The patch features an optional shutdown behavior that holds on until
all connected slaves have been sent the last binlogged event.
A connected slave is one whose START SLAVE has been acknowledged and
which has not been stopped since, though it could technically be
reconnecting in the background.
The solution therefore disallows killing the dump thread until it has
found the EOF of the latest binlog file. It is up to the shutdown
requester (DBA) to set a sufficiently large shutdown timeout value
so that the shutdown waits patiently until slaves lagging behind have
been synchronized. On the other hand, if a specific slave needs to be
excluded from synchronization, the DBA has to stop it manually, which
terminates its dump thread.
`mysqladmin shutdown' is extended with a `--wait_for_all_slaves' option,
which translates to a `SHUTDOWN WAIT FOR ALL SLAVES' SQL query,
to enable the feature on the client side.
The patch also performs a small refactoring of the server shutdown
around close_connections() to introduce kill-thread phases, of which
there are currently two.
Temporarily disable WSREP while executing RESET MASTER. In a situation where
two nodes are both master/slave, first stop the slave on both and then reset
the master.
Enforce stricter causality check with wsrep_sync_wait.