The crash at running mysqlbinlog on a SEQUENCE containing binlog file
was caused MDEV-29621 fixes that did not check which of the slave
or binlog applier executes a block introduced there.
The block is meaningful only for the parallel slave applier, so
it's safe to fix this bug with identified the actual applier and
skipping the block when it's the mysqlbinlog one.
Keep track of each recently active XID, recording which worker it was queued
on. If an XID might still be active, choose the same worker to queue event
groups that refer to the same XID to avoid conflicts.
Otherwise, schedule the XID freely in the next round-robin slot.
This way, XA PREPARE can normally be scheduled without restrictions (unless
duplicate XID transactions come close together). This improves scheduling
and parallelism over the old method, where the worker thread to schedule XA
PREPARE on was fixed based on a hash value of the XID.
XA COMMIT will normally be scheduled on the same worker as XA PREPARE, but
can be a different one if the XA PREPARE is far back in the event history.
Testcase and code for trimming dynamic array due to Andrei.
Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
A GTID event can have variable length, with contributing factors
such as the variable length from the flags2 and optional extra flags
fields. These fields are bitmaps, where each set bit indicates an
additional value that should be appended to the event, e.g.
multi-engine transactions append a number to indicate the number of
additional engines a transaction uses. However, if a flags bit is
set, and no additional fields are appended to the event, MDEV-33672
reports that the server can still try to read from memory as if it
did exist. Note, however, in debug builds, this condition is
asserted for FL_EXTRA_MULTI_ENGINE.
This patch fixes this to check that the length of the event is
aligned with the expectation set by the flags for FL_PREPARED_XA,
FL_COMPLETED_XA, and FL_EXTRA_MULTI_ENGINE.
Reviewed By
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
An earlier patch for MDEV-13577 fixed the most common instances of this, but
missed one case for tables without primary key when the scan reaches the end
of the table. This patch adds similar code to handle this case, converting
the error to HA_ERR_RECORD_CHANGED when doing optimistic parallel apply.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
MDEV-33502 Slowdown when running nested statement with many partitions
caused this error as I failed to take into account bigendian architectures.
This patch also introduces bitmap_import() and bitmap_export() to be used
when one wants to store bitmaps in files/logs in a portable way.
Reviewed-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Under terms of MDEV 27490 we'll add support for non-BMP identifiers
and upgrade casefolding information to Unicode version 14.0.0.
In Unicode-14.0.0 conversion to lower and upper cases can increase octet length
of the string, so conversion won't be possible in-place any more.
This patch removes virtual functions performing in-place casefolding:
- my_charset_handler_st::casedn_str()
- my_charset_handler_st::caseup_str()
and fixes the code to use the non-inplace functions instead:
- my_charset_handler_st::casedn()
- my_charset_handler_st::caseup()
MDEV-33502 Slowdown when running nested statement with many partitions
This change was triggered to help some MariaDB users with close to
10000 bits in their bitmaps.
- Change underlaying storage to be 64 bit instead of 32bit.
- This reduses number of loops to scan bitmaps.
- This can cause some bitmaps to be 4 byte large.
- Ensure that all not used top-bits are always 0 (simplifes code as
the last 64 bit storage is not a special case anymore).
- Use my_find_first_bit() to find the first set bit which is much faster
than scanning trough things byte by byte and then bit by bit.
Other things:
- Added a bool to remember if my_bitmap_init() did allocate the bitmap
array. my_bitmap_free() will only free arrays it did allocate.
This allowed me to remove setting 'bitmap=0' before calling
my_bitmap_free() for cases where the bitmap's where allocated externally.
- my_bitmap_init() sets bitmap to 0 in case of failure.
- Added 'universal' asserts to most bitmap functions.
- Change all remaining calls to bitmap_init() to my_bitmap_init().
- To finish the change from 2014.
- Changed all usage of uint32 in my_bitmap.h to my_bitmap_map.
- Updated bitmap_copy() to handle bitmaps of different size.
- Removed const from bitmap_exists_intersection() as this caused casts
on all usage.
- Removed not used function bitmap_set_above().
- Renamed create_last_word_mask() to create_last_bit_mask() (to match
name changes in my_bitmap.cc)
- Extended bitmap-t with test for more bitmap functions.
This reverts commit c37b2087b4.
In c37b20887, when re-binlogging a GTID event on a replica,
it will overwrite the thread_id from the primary to be the
value of the slave applier (SQL thread or parallel worker).
This should be the value of the original thread_id on the
master connection though, to both help track temporary
tables, and be consistent with Query_log_event.
Reverting the commit to re-target 11.5, so we can re-test
with the corrected thread_id.
Most things where wrong in the test suite.
The one thing that was a bug was that table_map_id was in some places
defined as ulong and in other places as ulonglong. On Linux 64 bit this
is not a problem as ulong == ulonglong, but on windows this caused failures.
Fixed by ensuring that all instances of table_map_id are ulonglong.
This patch augments Gtid_log_event with the user thread-id.
In particular that compensates for the loss of this info in
Rows_log_events.
Gtid_log_event::thread_id gets visible in mysqlbinlog output like
#231025 16:21:45 server id 1 end_log_pos 537 CRC32 0x1cf1d963 GTID 0-1-2 ddl thread_id=10
as 64 bit unsigned integer.
While the size of Gtid event has grown by 8-9 bytes
replication from OLD <-> NEW is not affected by it.
This work was started by the late Sujatha Sivakumar.
Brandon Nesterenko took it over, reviewed initial patches and extended
the work.
Reviewed-by: <andrei.elkin@mariadb.com>
An existing binlog checksum can be overridden to 0 if writing a NULL
payload when using Zlib for the computation. That is, calling into
Zlib's crc32 with empty data initializes an incremental CRC
computation to 0.
This patch changes the Log_event_writer::write_data() to exit
immediately if there is nothing to write, thereby bypassing the
checksum computation. This follows the pattern of
Log_event_writer::encrypt_and_write(), which also exits immediately
if there is no data to write.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
Compute binlog checksums (when enabled) already when writing events
into the statement or transaction caches, where before it was done
when the caches are copied to the real binlog file. This moves the
checksum computation outside of holding LOCK_log, improving
scalabitily.
At stmt/trx cache write time, the final end_log_pos values are not
known, so with this patch these will be set to 0. Events that are
written directly to the binlog file (not through stmt/trx cache) keep
the correct end_log_pos value. The GTID and COMMIT/XID events at the
start and end of event groups are written directly, so the zero
end_log_pos is only for events in the middle of event groups, which
do not negatively affect replication.
An option --binlog-legacy-event-pos, off by default, is provided to
disable this behavior to provide backwards compatibility with any
external applications that might rely on end_log_pos in events in the
middle of event groups.
Checksums cannot be pre-computed when binlog encryption is enabled, as
encryption relies on correct end_log_pos to provide part of the
nonce/IV.
Checksum pre-computation is also disabled for WSREP/Galera, as it uses
events differently in its write-sets and so on. Extending pre-computation of
checksums to Galera where it makes sense could be added in a future patch.
The current --binlog-checksum configuration is saved in
binlog_cache_data at transaction start and used to pre-compute
checksums in cache, if applicable. When the cache is later copied to
the binlog, a check is made if the saved value still matches the
configured global value; if so, the events are block-copied directly
into the binlog file. If --binlog-checksum was changed during the
transaction, events are re-written to the binlog file one-by-one and
the checksums recomputed/discarded as appropriate.
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
This is a preparatory commit for pre-computing checksums outside of
holding LOCK_log, no functional changes.
Which checksum algorithm is used (if any) when writing an event does not
belong in the event, it is a property of the log being written to.
Instead decide the checksum algorithm when constructing the
Log_event_writer object, and store it there.
Introduce a client-only Log_event::read_checksum_alg to be able to
print the checksum read, and a
Format_description_log_event::source_checksum_alg which is the
checksum algorithm (if any) to use when reading events from a log.
Also eliminate some redundant `enum` keywords on the enum_binlog_checksum_alg
type.
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
This is a preparatory patch for precomputing binlog checksums outside
of holding LOCK_log, no functional changes.
Replace Log_event::writer with just passing the writer object as a
function parameter to Log_event::write().
This is mainly for code clarity. Having to set ev->writer before every
call to ev->write() is error-prone (what if it's forgotten in some
code place?), while passing it as parameter as usual makes it explicit
how the dataflow is.
As a minor point, it also improves the code, as the compiler now can
save the function parameter in a register across nested calls (when it
is a class member, compiler needs to reload across nested calls in
case the object would be modified during the call).
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
The MDEV-29693 conflict resolution is from Monty, as well as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.
Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue
Disabled test:
- spider/bugfix.mdev_27239 because we started to get
+Error 1429 Unable to connect to foreign data source: localhost
-Error 1158 Got an error reading communication packets
- main.delayed
- Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
This part is disabled for now as it fails randomly with different
warnings/errors (no corruption).