Commit graph

195 commits

Author SHA1 Message Date
Kristian Nielsen
e7c6cdd842 Merge 10.6 -> 10.11
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-12-05 10:11:58 +01:00
Kristian Nielsen
0166c89e02 Merge 10.5 -> 10.6
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-12-05 09:20:36 +01:00
Teemu Ollakka
a2575a0703 MDEV-35465 Async replication stops working on Galera async replica node when parallel replication is enabled
Parallel slave failed to retry in retry_event_group() with error

    WSREP: Parallel slave worker failed at wsrep_before_command() hook

Fix wsrep transaction cleanup/restart in retry_event_group() to properly
clean up previous transaction by calling wsrep_after_statement().
Also move call to reset error after call to wsrep_after_statement()
to make sure that it remains effective.

Add a MTR test galera_as_slave_parallel_retry to reproduce the error
when the fix is not present.

Other issues which were detected when testing with sysbench:

Check if parallel slave is killed for retry before waiting for prior
commits in THD::wsrep_parallel_slave_wait_for_prior_commit(). This
is required with slave-parallel-mode=optimistic to avoid deadlock
when a slave later in commit order manages to reach prepare phase
before a lock conflict is detected.

Suppress wsrep applier specific warning for slave threads.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-12-03 15:05:32 +01:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Marko Mäkelä
7d4077cc11 Merge 10.5 into 10.6 2024-11-29 12:37:46 +02:00
ParadoxV5
cf2d49ddcf Extract some of #3360 fixes to 10.5.x
That PR uncovered countless issues on `my_snprintf` uses.
This commit backports a squashed subset of their fixes.
2024-11-21 22:43:56 +11:00
ParadoxV5
79cc0f9f78 MDEV-11675 fixup: Fill in the GTID TODO
Fill in the `todo:gtid` in `check_and_remove_stale_alter`
(Note that `Master_info::master_id` is a `ulong`,
 unlike `rpl_gtid::server_id`)

> We could have caught this before MDEV-11675 was
> published if we'd had this validation earlier 😇 .
> ⸺ Brandon, reply in #3360 (MDEV-21978)

Co-authored-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
2024-11-18 14:04:56 +11:00
ParadoxV5
687377633d Extract some of #3360 fixes to 10.11.x
That PR uncovered countless issues on `my_snprintf` uses.
This commit backports a squashed subset of their fixes.
(Excludes previous parts #3485 and #3493)
2024-11-18 14:04:56 +11:00
Thirunarayanan Balathandayuthapani
074831ec61 Merge branch 10.5 into 10.6 2024-11-08 18:17:15 +05:30
Jan Lindström
4b38af06a4 MDEV-35157 : wrong binlog timestamps on secondary nodes of a galera cluster
Problem was missing thd->set_time() before binlog event
execution in wsrep_apply_events.

Removed part of earlier commit 1363580 because it had
nothing to do with VERSIONED tables.

Note that this commit does not contain mtr-testcase
because actual timestamps on binlog file depends the
actual time when events are executed and how long
their execution takes.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-11-06 04:59:10 +01:00
Julius Goryavsky
d058be62b8 Merge branch '10.6' into '10.11' 2024-09-02 03:49:03 +02:00
Julius Goryavsky
bac0804d81 Merge branch '10.5' into '10.6' 2024-09-01 06:51:25 +02:00
Jan Lindström
b1d74b7e72 MDEV-33997 : Assertion `((WSREP_PROVIDER_EXISTS_ && this->variables.wsrep_on) && wsrep_emulate_bin_log) || mysql_bin_log.is_open()' failed in int THD::binlog_write_row(TABLE*, bool, const uchar*)
Problem was that we did not found that table was partitioned
and then we should find what is actual underlaying storage
engine.

We should not use RSU for !InnoDB tables.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-29 13:41:23 +02:00
Oleksandr Byelkin
0fe39d368a Merge branch '10.6' into 10.11 2024-07-22 15:14:50 +02:00
Yuchen Pei
f071b7620b
Merge branch '10.5' into 10.6 2024-07-16 15:54:22 +08:00
Brandon Nesterenko
ea9869504d MDEV-33921: Replication breaks when filtering two-phase XA transactions
There are two problems.

First, replication fails when XA transactions are used where the
slave has replicate_do_db set and the client has touched a different
database when running DML such as inserts. This is because XA
commands are not treated as keywords, and are thereby not exempt
from the replication filter. The effect of this is that during an XA
transaction, if its logged “use db” from the master is filtered out
by the replication filter, then XA END will be ignored, yet its
corresponding XA PREPARE will be executed in an invalid state,
thereby breaking replication.

Second, if the slave replicates an XA transaction which results in
an empty transaction, the XA START through XA PREPARE first phase of
the transaction won’t be binlogged, yet the XA COMMIT will be
binlogged. This will break replication in chain configurations.

The first problem is fixed by treating XA commands in
Query_log_event as keywords, thus allowing them to bypass the
replication filter. Note that Query_log_event::is_trans_keyword() is
changed to accept a new parameter to define its mode, to either
check for XA commands or regular transaction commands, but not both.
In addition, mysqlbinlog is adapted to use this mode so its
--database filter does not remove XA commands from its output.

The second problem fixed by overwriting the XA state in the XID
cache to be XA_ROLLBACK_ONLY, so at commit time, the server knows to
rollback the transaction and skip its binlogging. If the xid cache
is cleared before an XA transaction receives its completion command
(e.g. on server shutdown), then before reporting ER_XAER_NOTA when
the completion command is executed, the filter is first checked if
the database is ignored, and if so, the error is ignored.

Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Andrei Elkin <andrei.elkin@mariadb.com>
2024-07-10 14:37:39 -06:00
Marko Mäkelä
27a3366663 Merge 10.6 into 10.11 2024-06-27 10:26:09 +03:00
Marko Mäkelä
0076eb3d4e Merge 10.5 into 10.6 2024-06-24 13:09:47 +03:00
Dave Gosselin
db0c28eff8 MDEV-33746 Supply missing override markings
Find and fix missing virtual override markings.  Updates cmake
maintainer flags to include -Wsuggest-override and
-Winconsistent-missing-override.
2024-06-20 11:32:13 -04:00
Marko Mäkelä
b81d717387 Merge 10.6 into 10.11 2024-06-11 12:50:10 +03:00
Marko Mäkelä
a687cf8661 Merge 10.5 into 10.6 2024-06-07 10:03:51 +03:00
Jan Lindström
d328705a12 MDEV-34170 : table gtid_slave_pos entries never been deleted with wsrep_gtid_mode = 0
Problem was that updates to mysql.gtid_slave_pos table were
replicated even when they were newer used and because that
newer deleted. Avoid replication of mysql.gtid_slave_pos
table if wsrep_gtid_mode=OFF.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-06 19:19:34 +02:00
Sergei Golubchik
a6b2f820e0 Merge branch '10.6' into 10.11 2024-05-10 20:02:18 +02:00
Sergei Golubchik
7b53672c63 Merge branch '10.5' into 10.6 2024-05-08 20:06:00 +02:00
Kristian Nielsen
383ee364dc Merge 10.6 to 10.11 2024-05-07 08:45:31 +02:00
Julius Goryavsky
b88c20ce1b Merge branch 10.4 into 10.5 2024-05-06 13:55:42 +02:00
Kristian Nielsen
4b4db4a8e5 MDEV-34042: Deadlock kill of XA PREPARE can break replication / rpl.rpl_parallel_multi_domain_xa sporadic failure
Refinement of the original patch.

Move the code to reset the kill up into the parent class
Xid_apply_log_event, to also fix the similar issue for XA COMMIT.

Increase the number of slave retries in the test case
rpl.rpl_parallel_multi_domain_xa to fix some sporadic failures. The test
generates massive amounts of conflicting transactions in multiple
independent domains, which can cause multiple rollback+retry for a
transaction as it conflicts with transactions in other domains one-by-one.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-05-05 19:01:56 +02:00
Kristian Nielsen
596921dab8 MDEV-34042: Deadlock kill of XA PREPARE can break replication / rpl.rpl_parallel_multi_domain_xa sporadic failure
Clear any pending deadlock kill after completing XA PREPARE, and before
updating the mysql.gtid_slave_pos table in a separate transaction.

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-05-02 21:07:51 +02:00
Brandon Nesterenko
8c7992165b MDEV-33672: 10.11 Fix for Two Phase Alter Flags
Extends 89c907bd4f to account for
binlog_two_phase_alter flags in a Gtid log event. I.e., if the
FL_COMMIT_ALTER_E1 or FL_ROLLBACK_ALTER_E2 flags are set in the
event flags, yet the length of the event is too short to hold
the value, then set the event as invalid
2024-04-24 13:19:36 +02:00
Sergei Golubchik
018d537ec1 Merge branch '10.6' into 10.11 2024-04-22 15:23:10 +02:00
Marko Mäkelä
bb2e125d07 Merge 10.5 into 10.6
This excludes commit 040069f4ba
because it is specific to innodb_sync_debug, which had been removed
in commit ff5d306e29.
2024-04-18 07:14:56 +03:00
Brandon Nesterenko
0ad52e4d6a MDEV-27512: Assertion !thd->transaction_rollback_request failed in rows_event_stmt_cleanup
If replicating an event in ROW format, and InnoDB detects a deadlock
while searching for a row, the row event will error and rollback in
InnoDB and indicate that the binlog cache also needs to be cleared,
i.e. by marking thd->transaction_rollback_request. In the normal
case, this will trigger an error in Rows_log_event::do_apply_event()
and cause a rollback. During the Rows_log_event::do_apply_event()
cleanup of a successful event application, there is a DBUG_ASSERT in
log_event_server.cc::rows_event_stmt_cleanup(), which sets the
expectation that thd->transaction_rollback_request cannot be set
because the general rollback (i.e. not the InnoDB rollback) should
have happened already. However, if the replica is configured to skip
deadlock errors, the rows event logic will clear the error and
continue on, as if no error happened. This results in
thd->transaction_rollback_request being set while in
rows_event_stmt_cleanup(), thereby triggering the assertion.

This patch fixes this in the following ways:
 1) The assertion is invalid, and thereby removed.
 2) The rollback case is forced in rows_event_stmt_cleanup() if
transaction_rollback_request is set.

Note the differing behavior between transactions which are skipped
due to deadlock errors and other errors. When a transaction is
skipped due to an ignored deadlock error, the entire transaction is
rolled back and skipped (though note MDEV-33930 which allows
statements in the same transaction after the deadlock-inducing one
to commit). When a transaction is skipped due to ignoring a
different error, only the erroring statements are rolled-back and
skipped - the rest of the transaction will execute as normal. The
effect of this can be seen in the test results. The added test case
to rpl_skip_error.test shows that only statements which are ignored
due to non-deadlock errors are ignored in larger transactions. A
diff between rpl_temporary_error2_skip_all.result and
rpl_temporary_error2.result shows that all statements in the errored
transaction are rolled back (diff pasted below):

: diff rpl_temporary_error2.result rpl_temporary_error2_skip_all.result
49c49
< 2	1
---
> 2	NULL
51c51
< 4	1
---
> 4	NULL
53c53
< * There will be two rows in t2 due to the retry.
---
> * There will be one row in t2 because the ignored deadlock does not retry.
57d56
< 1
59c58
< 1
---
> 0

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2024-04-17 11:14:21 -06:00
Sergei Golubchik
41296a07c8 Merge branch '10.5' into 10.6 2024-04-11 13:58:22 +02:00
Andrei
0da1653f1b MDEV-31779 Server crash in Rows_log_event::update_sequence upon replaying binary log
The crash at running mysqlbinlog on a SEQUENCE containing binlog file
was caused MDEV-29621 fixes that did not check which of the slave
or binlog applier executes a block introduced there.

The block is meaningful only for the parallel slave applier, so
it's safe to fix this bug with identified the actual applier and
skipping the block when it's the mysqlbinlog one.
2024-04-10 19:31:39 +03:00
Kristian Nielsen
d90a2b44ad MDEV-33668: More precise dependency tracking of XA XID in parallel replication
Keep track of each recently active XID, recording which worker it was queued
on. If an XID might still be active, choose the same worker to queue event
groups that refer to the same XID to avoid conflicts.

Otherwise, schedule the XID freely in the next round-robin slot.

This way, XA PREPARE can normally be scheduled without restrictions (unless
duplicate XID transactions come close together). This improves scheduling
and parallelism over the old method, where the worker thread to schedule XA
PREPARE on was fixed based on a hash value of the XID.

XA COMMIT will normally be scheduled on the same worker as XA PREPARE, but
can be a different one if the XA PREPARE is far back in the event history.

Testcase and code for trimming dynamic array due to Andrei.

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-09 11:42:34 +03:00
Brandon Nesterenko
89c907bd4f MDEV-33672: Gtid_log_event Construction from File Should Ensure Event Length When Using Extra Flags
A GTID event can have variable length, with contributing factors
such as the variable length from the flags2 and optional extra flags
fields. These fields are bitmaps, where each set bit indicates an
additional value that should be appended to the event, e.g.
multi-engine transactions append a number to indicate the number of
additional engines a transaction uses. However, if a flags bit is
set, and no additional fields are appended to the event, MDEV-33672
reports that the server can still try to read from memory as if it
did exist. Note, however, in debug builds, this condition is
asserted for FL_EXTRA_MULTI_ENGINE.

This patch fixes this to check that the length of the event is
aligned with the expectation set by the flags for FL_PREPARED_XA,
FL_COMPLETED_XA, and FL_EXTRA_MULTI_ENGINE.

Reviewed By
============
Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-08 07:57:14 -06:00
Marko Mäkelä
788953463d Merge 10.6 into 10.11
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
2024-03-28 09:16:57 +02:00
Kristian Nielsen
1fb00f37ae MDEV-33303: slave_parallel_mode=optimistic should not report the mode's specific temporary errors
An earlier patch for MDEV-13577 fixed the most common instances of this, but
missed one case for tables without primary key when the scan reaches the end
of the table. This patch adds similar code to handle this case, converting
the error to HA_ERR_RECORD_CHANGED when doing optimistic parallel apply.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-03-15 15:55:06 +01:00
Monty
f838b2d799 MDEV-33623 Partitioning is broken on big endian architectures
MDEV-33502 Slowdown when running nested statement with many partitions
caused this error as I failed to take into account bigendian architectures.

This patch also introduces bitmap_import() and bitmap_export() to be used
when one wants to store bitmaps in files/logs in a portable way.

Reviewed-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-03-08 15:25:12 +02:00
Monty
b5d65fc105 Optimize performance of my_bitmap
MDEV-33502 Slowdown when running nested statement with many partitions

This change was triggered to help some MariaDB users with close to
10000 bits in their bitmaps.

- Change underlaying storage to be 64 bit instead of 32bit.
  - This reduses number of loops to scan bitmaps.
  - This can cause some bitmaps to be 4 byte large.
- Ensure that all not used top-bits are always 0 (simplifes code as
  the last 64 bit storage is not a special case anymore).
- Use my_find_first_bit() to find the first set bit which is much faster
  than scanning trough things byte by byte and then bit by bit.

Other things:
- Added a bool to remember if my_bitmap_init() did allocate the bitmap
  array. my_bitmap_free() will only free arrays it did allocate.
  This allowed me to remove setting 'bitmap=0' before calling
  my_bitmap_free() for cases where the bitmap's where allocated externally.
- my_bitmap_init() sets bitmap to 0 in case of failure.
- Added 'universal' asserts to most bitmap functions.
- Change all remaining calls to bitmap_init() to my_bitmap_init().
  - To finish the change from 2014.
- Changed all usage of uint32 in my_bitmap.h to my_bitmap_map.
- Updated bitmap_copy() to handle bitmaps of different size.
- Removed const from bitmap_exists_intersection() as this caused casts
  on all usage.
- Removed not used function bitmap_set_above().
- Renamed create_last_word_mask() to create_last_bit_mask() (to match
  name changes in my_bitmap.cc)
- Extended bitmap-t with test for more bitmap functions.
2024-02-27 14:51:33 +02:00
Sergei Golubchik
87e13722a9 Merge branch '10.6' into 10.11 2024-02-01 18:36:14 +01:00
Monty
e20693c167 Fixed some wrong printf() usage after changing m_table_id to ulonglong
This caused some crashes on 32 bit platforms.
2024-01-27 16:29:40 +02:00
Monty
26c86c39fc Fixed some mtr tests that failed on windows
Most things where wrong in the test suite.
The one thing that was a bug was that table_map_id was in some places
defined as ulong and in other places as ulonglong. On Linux 64 bit this
is not a problem as ulong == ulonglong, but on windows this caused failures.
Fixed by ensuring that all instances of table_map_id are ulonglong.
2024-01-23 13:03:12 +02:00
Daniel Black
4ef9c9bb75 Merge remote-tracking branch 10.5 into 10.6
Notably MDEV-33290, columnstore disable stays in 10.5.
2024-01-23 15:25:42 +11:00
Brandon Nesterenko
207c85783b MDEV-33283: Binlog Checksum is Zeroed by Zlib if Part of Event Data is Empty
An existing binlog checksum can be overridden to 0 if writing a NULL
payload when using Zlib for the computation. That is, calling into
Zlib's crc32 with empty data initializes an incremental CRC
computation to 0.

This patch changes the Log_event_writer::write_data() to exit
immediately if there is nothing to write, thereby bypassing the
checksum computation. This follows the pattern of
Log_event_writer::encrypt_and_write(), which also exits immediately
if there is no data to write.

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2024-01-22 07:15:42 -07:00
Marko Mäkelä
9d20853c74 Merge 10.6 into 10.11 2024-01-18 19:22:23 +02:00
Marko Mäkelä
ad13fb36bf Merge 10.6 into 10.11 2024-01-17 17:37:15 +02:00
Marko Mäkelä
3a96eba25f Merge 10.5 into 10.6 2024-01-17 13:35:05 +02:00
Alexander Barkov
fa3171df08 MDEV-27666 User variable not parsed as geometry variable in geometry function
Adding GEOMETRY type user variables.
2024-01-16 18:53:23 +04:00
Yuchen Pei
d06b6de305
Merge branch '10.5' into 10.6 2024-01-11 12:59:22 +11:00