If the async slave thread (the slave SQL handler) becomes a BF victim, it may
occasionally happen that the rollbacker thread is used to carry out the
rollback instead of the async slave thread itself.
This can happen if the async slave thread has flagged the "idle" state at the
moment the BF thread tries to figure out how to kill the victim.
The issue could be tested by using a Galera cluster as a slave of an external
master and issuing a high load of conflicting writes both through async
replication and directly against the Galera cluster nodes.
However, a deterministic mtr test for this "conflict window" has not yet been
written.
The fix in this patch makes sure that the async slave thread state is never
set to IDLE. This prevents the rollbacker thread from intervening.
The wsrep_query_state change was refactored into a dedicated function so that
the idle state transition is controlled in one place.
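As a hedged sketch of that idea (names are illustrative; the actual function
added by the patch may differ):

  /* Hedged sketch, not the actual patch: funnel every query-state change
     through one setter so the "slave thread never goes idle" rule lives
     in a single place. */
  void wsrep_set_query_state(THD *thd, enum wsrep_query_state new_state)
  {
    /* An async slave (applier) thread must never be flagged IDLE; otherwise
       a BF aborter may hand the rollback over to the rollbacker thread. */
    if (new_state == QUERY_IDLE && thd->slave_thread)
      return;
    thd->wsrep_query_state= new_state;
  }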
This bug could cause a crash for any query that used a derived table/view/CTE
whose specification was a SELECT with a GROUP BY clause and a FROM list
containing 32 or more table references.
The problem appeared only in cases where the splitting optimization
could be applied to such a derived table/view/CTE.
The method Item_func_in::build_clone() that builds a clone item for an
Item_func_in item first calls the generic method Item_func::build_clone()
that builds the clones for the arguments of the Item_func_in item
to be cloned, creates a copy of the Item_func_in object and attaches the
clones for the arguments to this copy. Then the method Item_func_in::build_clone()
makes the copy fully independent of the copied object in order to
guarantee a proper destruction of the clone. The point is that the copy of the
Item_func_in object is registered like any other item object and must be
destructed like any other item object.
If the method Item_func::build_clone() fails to build a clone of an argument
then it returns 0. In this case no copy of the Item_func_in object should
be created. Otherwise the finalizing actions for this copy would not be
performed and the copy would remain in a state that would prevent its
proper destruction.
The code of Item_func_in::build_clone() before this patch created the copy
of the Item_func_in object before cloning the argument items. If this
cloning failed, the server crashed when trying to destruct the copy item.
The code of Item_row::build_clone() was changed in the same way, though
that code could not cause any problems.
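The shape of the fix, as a hedged sketch (simplified, with hypothetical
helper names; the exact signatures in the server source differ): clone the
arguments first, and only create and finalize the copy once all of them
have been built successfully.

  /* Hedged sketch only, not the actual server code. */
  Item *Item_func_in::build_clone(THD *thd)
  {
    Item **arg_clones= clone_arguments(thd);      /* hypothetical helper */
    if (!arg_clones)
      return 0;               /* no half-initialized copy is left behind */
    Item_func_in *copy= (Item_func_in *) get_copy(thd);
    if (!copy)
      return 0;
    copy->attach_arguments(arg_clones);           /* hypothetical helper */
    /* Finalizing actions that make the copy independent of the original,
       so that it can be destructed like any other registered item. */
    copy->make_copy_independent(thd);             /* hypothetical helper */
    return copy;
  }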
The problem happened in these lines:
uval0= (ulonglong) (val0_negative ? -val0 : val0);
uval1= (ulonglong) (val1_negative ? -val1 : val1);
return check_integer_overflow(val0_negative ? -(longlong) res : res,
                              !val0_negative);
when unary minus was performed on -9223372036854775808 (LONGLONG_MIN).
Negating this value is undefined behavior in C/C++.
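As a standalone illustration (not the server code), negating in the unsigned
domain is the usual way to avoid this, because unsigned wraparound is well
defined:

  #include <cassert>
  #include <climits>

  /* -v is undefined when v == LLONG_MIN because the result does not fit in
     long long; 0ULL - (unsigned long long) v is defined for every input and
     yields the mathematical |v| for negative v. */
  static unsigned long long ull_abs(long long v)
  {
    return v < 0 ? 0ULL - (unsigned long long) v : (unsigned long long) v;
  }

  int main()
  {
    assert(ull_abs(LLONG_MIN) == 9223372036854775808ULL);
    assert(ull_abs(-5) == 5ULL);
    return 0;
  }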
Rows_log_event::change_to_flashback_event(): Reduce the scope
of the variable swap_buff2, and do not duplicate conditions.
GCC 9.3.0 flagged a -Wmaybe-uninitialized warning when compiling the
10.5 branch using cmake -DWITH_ASAN=ON -DCMAKE_CXX_FLAGS=-O2.
This bug could manifest itself in very rare cases when the optimizer
chose an execution plan by which a joined table was accessed by a table
scan and the optimizer was checking whether ranges checked for each record
could improve this plan. In such cases the optimizer evaluates range
conditions over a table that depend on other tables. For such conditions
the constructed SEL_ARG trees are marked as MAYBE_KEY. If a SEL_ARG object
constructed for a sargable condition and marked as KEY_RANGE had the same
first key part as a MAYBE_KEY SEL_ARG object, and the key_and() function
was called for this pair of SEL_ARG objects, then an invalid SEL_ARG
object could be constructed that ultimately could lead to a crash before
the execution phase.
'index_merge_sort_union=off'
When index_merge_sort_union is set to 'off' and index_merge_union is set
to 'on', any evaluated index merge scan must consist only of ROR scans,
and the cheapest of such index merges must be chosen. Note that this index
merge might not be the cheapest index merge overall.
_ma_fetch_keypage(): Correct an assertion that used to always hold.
Thanks to clang -Wint-in-bool-context for flagging this.
double_to_datetime_with_warn(): Suppress -Wimplicit-int-float-conversion
by adding a cast. LONGLONG_MAX converted to double will actually be
LONGLONG_MAX+1.
It was:
implicit conversion from 'ha_rows' (aka 'unsigned long long') to 'double'
changes value from 18446744073709551615 to 18446744073709551616
Follow what JOIN::get_examined_rows() does for similar code.
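As a standalone illustration (not the server code) of the rounding this
warning is about and of the explicit cast that documents the intent:

  #include <cstdio>

  int main()
  {
    /* 2^64 - 1 is not representable as a double (53-bit mantissa), so the
       conversion rounds it up to 2^64; clang warns unless the cast is
       written explicitly. */
    unsigned long long rows= 18446744073709551615ULL;  /* like ~(ha_rows) 0 */
    double d= (double) rows;                            /* explicit cast */
    printf("%llu -> %.0f\n", rows, d);
    /* prints: 18446744073709551615 -> 18446744073709551616 */
    return 0;
  }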
join_cache_level=6+
The patch fixes two similar bugs in the commit 8eeb689e9f
that added multi_range_read support to partitions. The commit opened
the possibility of joining a partitioned table using BKA+MRR. However, in some
cases this could lead to wrong results or even crashes.
This could happen when
- index condition pushdown was used to join the table, or
- the joined table was an inner table of an outer join and the 'not exists'
optimization was applied, or
- the joined table was the inner table of a semi-join and the first match
optimization was applied.
The bugs were in the code of the call-back functions
- partition_multi_range_key_skip_record() and
- partition_multi_range_key_skip_index_tuple().
Each of these functions consists only of an invocation of another function,
yet a wrong parameter was passed in this invocation.
The fix was suggested by Sergey Petrunia and it is apparently in line
with the original design.
The corresponding comprehensive test cases demonstrating the problems
caused by the bugs were constructed by me.
If an async replication slave thread conflicts with cluster replication,
then the async slave transaction should be BF aborted and, depending on the
state of the async slave transaction's execution, potentially also replayed.
There were problems in this BF abort implementation, and the replaying was not
started.
This pull request contains fixes which make sure that if the async slave
thread is marked for abort and replay, it will carry out the rollback and
release all locks and resources before starting the replaying. After replaying,
the async slave transaction is treated as successful, so the slave thread will
continue as usual, handling the next replication event.
There is also a new mtr test, galera.galera_slave_replay, which stresses both a
certification failure for the async slave thread and a successful BF abort
followed by replaying.
(Backport to 10.3)
The partitioning storage engine now supports MRR but doesn't support Index Condition
Pushdown (aka ICP). This causes counter-intuitive query plans for queries
that use BKA and conditions that depend on index fields:
- If the condition refers to other tables, BKA's variant of ICP is used
to handle it.
- If the condition depends on this table only, the optimizer will try to
use regular ICP for it, which will fail because the storage engine
doesn't support ICP.
Make the optimizer smarter in the second case: if we were not able to
use regular ICP, use BKA's variant of ICP.
The context involves semicolon batching, and the error first appears in 10.2.
No reproducible example has been produced yet, but a TCP trace suggests
multiple packets that are "squeezed" together (e.g. an overlong OK packet
that has a trailer which belongs to another packet).
Remove thd->get_stmt_da()->set_skip_flush() when processing a batch.
skip_flush stems from the COM_MULTI code, which was checked in during
10.2 (and is never used)
The fix is confirmed to work when evaluated by one of the bug reporters.
We never reproduced the problem locally despite multiple tries,
so the root cause analysis is still missing.
Do not materialize a semi-join nest if it contains a materialized derived
table/view that can potentially be subject to the split optimization.
Splitting the materialization of such a nest would help, but currently there
is no code to support this technique.
Fixed a bug introduced in MDEV-11345: the server did not start if
non-English error messages were set in the startup parameters.
Added the lc_messages=de_DE option to an existing test case.
Problem:
-------
Accessing a member within the 'xid_count_per_binlog' structure results in
the following error when 'UBSAN' is enabled.
member access within address 0xXXX which does not point to an object of type
'xid_count_per_binlog'
Analysis:
---------
The problem appears to be that no constructor for 'xid_count_per_binlog' is
being called, and thus the vtable will not be initialized.
Fix:
---
Defined a parameterized constructor for the 'xid_count_per_binlog' class.
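A hedged sketch of the idea (member names are illustrative and the base
class providing the vtable is assumed; see the binlog code for the real
declaration):

  /* Hedged sketch, not the exact server class: give the class a constructor
     that initializes every member, so that objects created with 'new' are
     fully constructed (including the vtable inherited from the list-link
     base class) before any member is accessed. */
  class xid_count_per_binlog : public ilink
  {
  public:
    char *binlog_name;
    unsigned int binlog_name_len;
    unsigned long binlog_id;
    long xid_count;

    xid_count_per_binlog(char *name, unsigned int name_len)
      : binlog_name(name), binlog_name_len(name_len),
        binlog_id(0), xid_count(0)
    {}
  };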