Attempting to set a SAVEPOINT when one of the involved storage engines
does not support savepoints, raises an error, and results in statement
rollback. If Galera is enabled with binlog emulation, the above
scenario was not handled correctly, and resulted in cluster wide
inconsistency.
The problem was in wsrep_register_binlog_handler(), which is called
towards the beginning of SAVEPOINT execution. This function is
supposed to mark the beginning of statement position in trx cache
through `set_prev_position()`. However, it did so only on condition
that `get_prev_position()` returns `MY_OFF_T_UNDEF`.
This before statement position is typically reset to undefined at the
end of statement in `binlog_commit()` / `binlog_rollback()`.
However that's not the case with Galera and binlog emulation, for
which binlog commit / rollback hooks are not called due to the
optimization that avoids internal 2PC (MDEV-16509).
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
This patch fixes cases where a transaction caused empty writeset to be
replicated. This could happen in the case where a transaction executes
a statement that initially manages to modify some data and therefore
appended keys some for certification. The statement is however rolled
back at some later stage due to some error (for example, a duplicate
key error). After statement rollback the transaction is still alive,
has no other changes. When committing such transaction, an empty
writeset was replicated through Galera.
The fix is to avoid calling into commit hook only when transaction
has appended one or keys for certification *and* has some data in
binlog cache to replicate. Otherwise, the commit is considered empty,
and goes through usual empty commit path.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Partial revert of this commit:
commit 6b685ea7b0
Author: Sergei Golubchik <serg@mariadb.org>
Date: Wed Sep 28 18:55:15 2022 +0200
Don't hold LOCK_thd_data over run_commit_ordered(). Holding the mutex
is unnecessary and will deadlock if any code in a commit_ordered
handlerton call tries to take the mutex to change THD local data.
Instead, set the current_thd for the duration of the call to keep
asserts happy around LOCK_thd_data.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
This commit fixes several bugs in error handling around disk full when
writing the statement/transaction binlog caches:
1. If the error occurs during a non-transactional statement, the code
attempts to binlog the partially executed statement (as it cannot roll
back). The stmt_cache->error was still set from the disk full error. This
caused MYSQL_BIN_LOG::write_cache() to get an error while trying to read the
cache to copy it to the binlog. This was then wrongly interpreted as a disk
full error writing to the binlog file. As a result, a partial event group
containing just a GTID event (no query or commit) was binlogged. Fixed by
checking if an error is set in the statement cache, and if so binlog an
INCIDENT event instead of a corrupt event group, as for other errors.
2. For LOAD DATA LOCAL INFILE, if a disk full error occured while writing to
the statement cache, the code would attempt to abort and read-and-discard
any remaining data sent by the client. The discard code would however
continue trying to write data to the statement cache, and wrongly interpret
another disk full error as end-of-file from the client. This left the client
connection with extra data which corrupts the communication for the next
command, as well as again causing an corrupt/incomplete event to be
binlogged. Fixed by restoring the default read function before reading any
remaining data from the client connection.
Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
At the moment we cannot support
wsrep_forced_binlog_format=[MIXED|STATEMENT]
during CREATE TABLE AS SELECT.
Statement will use ROW instead and give
a warning.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Revert the old work-around for buggy fdatasync() on Linux ext3. This bug was
fixed in Linux > 10 years ago back to kernel version at least 3.0.
Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
sql/log.cc:11101:56: runtime error: downcast of address 0x7f9dc801e9c8 which does not point to an object of type 'Gtid_list_log_event'
sql/sql_repl.cc:1429:12: runtime error: member call on address 0x7f1ca401ea48 which does not point to an object of type 'Gtid_list_log_event'
Problem:
========
A master can segfault if it can't set up decryption for its binary
log during a binlog dump with Using_Gtid=Slave_Pos. If slave
connects using GTID mode, the master will call into
log.cc::get_gtid_list_event(), which iterate through binlog events
looking for a Gtid_list_log_event. On an encrypted binlog that the
master cannot decrypt, the first event will be a
START_ENCRYPTION_EVENT which will call into the following decryption branch
if (fdle->start_decryption((Start_encryption_log_event*) ev))
errormsg= ‘Could not set up decryption for binlog.’;
The event iteration however, does not stop in spite of this error.
The master will try to read the next event, but segfault while
trying to decrypt it because decryption failed to initialize.
Solution:
========
Break the event iteration if decryption cannot be set up.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".
The user XA commit execution branch was caught not have been covered
with MDEV-21953 fixes.
The XA involved deadlock is resolved now to apply the former fixes
pattern.
Along the fixes the following changes have been implemented.
- MDL lock attribute correction
- dissociation of the externally completed XA from the current
thread's xid_state in the error branches
- cleanup_context() preseves the prepared XA
- wait_for_prior_commit() is relocated to satisfy both
the binlog ON (log-slave-updates and skip-log-bin)
and OFF slave execution branches.
Fixing a few problems relealed by UBSAN in type_float.test
- multiplication overflow in dtoa.c
- uninitialized Field::geom_type (and Field::srid as well)
- Wrong call-back function types used in combination with SHOW_FUNC.
Changes in the mysql_show_var_func data type definition were not
properly addressed all around the code by the following commits:
b4ff64568c18feb62fee0ee879ff8a
Adding a helper SHOW_FUNC_ENTRY() function and replacing
all mysql_show_var_func declarations using SHOW_FUNC
to SHOW_FUNC_ENTRY, to catch mysql_show_var_func in the future
at compilation time.
the only query of the XA transaction is on a non-transactional table
errors out:
XA BEGIN 'x';
--error ER_DUP_ENTRY
INSERT INTO t1 VALUES (1),(1);
XA END 'x';
XA PREPARE 'x';
The binlogging pattern is correctly started as expected with
the errored-out Query or its ROW format events, but there is
no empty XA_prepare_log_event group.
The following
XA COMMIT 'x';
therefore should not be logged either, but it does.
The bug is fixed with proper maintaining of a read-write binlog hton
property and use it to enforce correct binlogging decisions.
Specifically in the bug description case XA COMMIT won't be binlogged
in both when given in the same connection and externally after disconnect.
The same continue to apply to an empty XA that do not change any data in all
transactional engines involved.
thd_get_ha_data() can be used without a lock, but only from the
current thd thread, when calling from anoher thread it *must*
be protected by thd->LOCK_thd_data
* fix group commit code to take thd->LOCK_thd_data
* remove innobase_close_connection() from the innodb background thread,
it's not needed after 87775402cd and was failing the assert with
current_thd==0
There are separate flags DBUG_OFF for disabling the DBUG facility
and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility.
Let us allow debug builds without DEBUG_SYNC.
Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to
define ENABLED_DEBUG_SYNC.
The shutdown time assert was caused by untimely deactivation of
the binlog background thread and related structs destruction.
It could specifically occur when a transaction is replication unsafe
and has to be completed with a ROLLBACK event in binlog.
This gets fixed with the binlog background thread stop relocation
to a point and user transactions have been completed.
A test case is added to binlog.binlog_checkpoint which
also receives as a bonus a minor correction to reactivate a MDEV-4322 test
case that originally required a shutdown phase (that ceased to do).
The hang may be caused by a 1pc branch that was fixed by MDEV-26031 in
10.6 and up. That commit did not look relevant in 10.5 and below
so was not pushed to the low branches.
To possibly tackle the reported issue
the MDEV-26031 is backported now with a test that
unlike 10.6 does not expose the former bug in 10.5.
It is only needed for checking a refined logics
inside MYSQL_BIN_LOG::write_transaction_to_binlog.
The latter is made to do away with xid-unlogging (which is suspected
to have been at fault) for xid-less transaction.
Problem:
=======
This patch addresses two issues:
1. An incident event can be incorrectly reported for transactions
which are rolled back successfully. That is, an incident event
should only be generated for failed “non-transactional transactions”
(i.e., those which modify non-transactional tables) because they
cannot be rolled back.
2. When the mariadb slave (error) stops at receiving the incident
event there's no description of what led to it. Neither in the event
nor in the master's error log.
Solution:
========
Before reporting an incident event for a transaction, first validate
that it is “non-transactional” (i.e. cannot be safely rolled back).
To determine if a transaction is non-transactional,
lex->stmt_accessed_table(LEX::STMT_WRITES_NON_TRANS_TABLE)
is used because it is set previously in
THD::decide_logging_format().
Additionally, when an incident event is written, write an error
message to the server’s error log to indicate the underlying issue.
Reviewed by:
===========
Andrei Elkin <andrei.elkin@mariadb.com>
Sequence storage engine is not transactionl so cache will be written in
stmt_cache that is not replicated in cluster. To fix this replicate
what is available in both trans_cache and stmt_cache.
Sequences will only work when NOCACHE keyword is used when sequnce is
created. If WSREP is enabled and we don't have this keyword report error
indicting that sequence will not work correctly in cluster.
When binlog is enabled statement cache will be cleared in transaction
before COMMIT so cache generated from sequence will not be replicated.
We need to keep cache until replication.
Tests are re-recorded because of replication changes that were
introducted with this PR.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
don't initialize error_log_handler_list in set_handlers()
* error_log_handler_list is initialized to LOG_FILE early, in init_base()
* set_handlers always reinitializes it to LOG_FILE, so it's pointless
* after init_base() concurrent threads start using sql_log_warning,
so following set_handlers() shouldn't modify error_log_handler_list
without some protection
For GTID consistenty, GTID events was artificialy added before
replication happned. This event should not contain CHECKSUM calculated.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Problem:
========
A slave’s relay log format description event is used when
calculating Seconds_Behind_Master (SBM). This forces the SBM
value to spike when processing these events, as their creation
date is set to the timestamp that the IO thread begins.
Solution:
========
When the slave generates a format description event, mark the
event as a relay log event so it does not update the
rli->last_master_timestamp variable.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
This could cause out of order wsrep checkpoints due wsrep specific leader
code not being executed in `MYSQL_BIN_LOG::write_transaction_to_binlog_events`.
Move original result assignment to before wsrep logic to prevent that.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>