join_cache_level=6+
The patch fixes two similar bugs in the commit 8eeb689e9f
that added multi_range_read support to partitions. The commit opened
a possibility to join a partition table using BKA+MRR. However in some
cases it could lead to wrong results or even crashes.
This could happened when
- index condition pushdown was used to join the table or
- the joined table was an inner table of an outer join and 'not exist'
optimization was applied or
- the join table was the inner table of a semi-join and the first match
optimization was applied
The bugs were in the code of the call-back functions
- partition_multi_range_key_skip_record() and
- partition_multi_range_key_skip_index_tuple().
Each of this function consist only of an invocation of another function.
Yet a wrong parameter was passed at this invocation.
The fix was suggested by Sergey Petrunia and it is apparently in line
with original design.
The corresponding comprehensive test cases demonstrating the problems
caused by the bugs were constructed by me.
If async replication slave thread conflicts with cluster replication,
then the async slave transaction should be BF aborted, and depending on the
state of async slave transaction execution, potentially also replayed.
There were problems in such BF abort implementation and the replaying was not
started.
This pull request contains fixes which make sure that if async slave thread is
marked to abort and replay, it will complete carry out the rollback and
release all locks and resources before starting the replaying. After replaying,
async slave transactions is treated as successful, so the slave thread will
continue as usual, handling next replication event.
There is also new mtr test: galera.galera_slave_replay, which stresses both a
certification failure for async slave thread and a successful BF abort
followed by replaying.
(Backport to 10.3)
Partitioning storage now supports MRR but doesn't support Index Condition
Pushdown (aka ICP). This causes counter-intuitive query plans for queries
that use BKA and conditions that depend on index fields:
- If the condition refers to other tables, BKA's variant of ICP is used
to handle it.
- If the condition depends on this table only, the optimizer will try to
use regular ICP for it, which will fail because the storage engine
doesn't support ICP.
Make the optimizer be smarter in the second case: if we were not able to
use regular ICP, use BKA's variant of ICP..
Context involves semicolon batching, and the error starts 10.2
No reproducible examples were made yet, but TCP trace suggests
multiple packets that are "squeezed" together (e.g overlong OK packet
that has a trailer which is belongs to another packet)
Remove thd->get_stmt_da()->set_skip_flush() when processing a batch.
skip_flush stems from the COM_MULTI code, which was checked in during
10.2 (and is never used)
The fix is confirmed to work, when evaluated by bug reporter (one of them)
We never reproduced it locally, with multiple tries
thus the root cause analysis is still missing.
Do not materialize a semi-join nest if it contains a materialized derived
table /view that potentially can be subject to the split optimization.
Splitting of materialization of such nest would help, but currently there
is no code to support this technique.
Fixed a bug introduced in MDEV-11345, server did not start if
non-english error messages were set in startup parameters.
Added lc_messages=de_DE option into an existing test case.
Problem:
-------
Accessing a member within 'xid_count_per_binlog' structure results in
following error when 'UBSAN' is enabled.
member access within address 0xXXX which does not point to an object of type
'xid_count_per_binlog'
Analysis:
---------
The problem appears to be that no constructor for 'xid_count_per_binlog' is
being called, and thus the vtable will not be initialized.
Fix:
---
Defined a parameterized constructor for 'xid_count_per_binlog' class.
[Variant 2 of the fix: collect the attached conditions]
Problem:
make_join_select() has a section of code which starts with
"We plan to scan all rows. Check again if we should use an index."
the code in that section will [unnecessarily] re-run the range
optimizer using this condition:
condition_attached_to_current_table AND current_table's_ON_expr
Note that the original invocation of range optimizer in
make_join_statistics was done using the whole select's WHERE condition.
Taking the whole select's WHERE condition and using multiple-equalities
allowed the range optimizer to infer more range restrictions.
The fix:
- Do range optimization using a condition that is an AND of this table's
condition and all of the previous tables' conditions.
- Also, fix the range optimizer to prefer SEL_ARGs with type=KEY_RANGE
over SEL_ARGS with type=MAYBE_KEY, regardless of the key part.
Computing
key_and(
SEL_ARG(type=MAYBE_KEY key_part=1),
SEL_ARG(type=KEY_RANGE, key_part=2)
)
will now produce the SEL_ARG with type=KEY_RANGE.
Problem:
=======
P1) Conditional jump or move depends on uninitialised value(s)
sql_ex_info::init(char const*, char const*, bool) (log_event.cc:3083)
code: All the following variables are not initialized.
----
return ((cached_new_format != -1) ? cached_new_format :
(cached_new_format=(field_term_len > 1 || enclosed_len > 1 ||
line_term_len > 1 || line_start_len > 1 || escaped_len > 1)));
P2) Conditional jump or move depends on uninitialised value(s)
Rows_log_event::Rows_log_event(char const*, unsigned
int, Format_description_log_event const*) (log_event.cc:9571)
Code: Uninitialized values is reported for 'var_header_len' variable.
----
if (var_header_len < 2 || event_len < static_cast<unsigned
int>(var_header_len + (post_start - buf)))
P3) Conditional jump or move depends on uninitialised value(s)
Table_map_log_event::pack_info(Protocol*) (log_event.cc:11553)
code:'m_table_id' is uninitialized.
----
void Table_map_log_event::pack_info(Protocol *protocol)
...
size_t bytes= my_snprintf(buf, sizeof(buf), "table_id: %lu (%s.%s)",
m_table_id, m_dbnam, m_tblnam);
Fix:
===
P1 - Fix)
Initialize cached_new_format,field_term_len, enclosed_len, line_term_len,
line_start_len, escaped_len members in default constructor.
P2 - Fix)
"var_header_len" is initialized by reading the event buffer. In case of an
invalid event the buffer will contain invalid data. Hence added a check to
validate the event data. If event_len is smaller than valid header length
return immediately.
P3 - Fix)
'm_table_id' within Table_map_log_event is initialized by reading data from
the event buffer. Use 'VALIDATE_BYTES_READ' macro to validate the current
state of the buffer. If it is invalid return immediately.
If the initialization of the wsrep provider failed, in some
cases the internal variable wrep_inited indicating that the
initialization has already been completed is still set to
"1", which then leads to confusion in the initialization
status. To solve the problem, we should set this variable
to "1" only if the wsrep provider initialization really
completed successfully.
An earlier issue has already been fixed for branch 10.4,
and this patch contains a fix for earlier versions (where
Galera 3.x is used).
* size represents the size of an element in the Unique class
* full_size is used when the Unique class counts the number of
duplicates stored per element. This requires additional space per Unique
element.
Item_cond inherits from Item_args but doesn't store its arguments
as function arguments, which means it has zero arguments.
Don't call memcpy in this case.
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 requires extra work due to sp_package, will commit a separate
patch for it)
sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.
Fixed by providing static sp_head::create() and sp_head::destroy() methods.
The string doesn't appear to be null-terminated when binlog checksums are
enabled. This causes a corrupt binlog name in the error message when a
slave is ahead of the master.
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 version of the fix, with handling for class sp_package)
sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.
Fixed by providing static sp_head::create() and sp_head::destroy() methods.
In this scenario:
- There is a possible range access for table T
- And there is a ref access on the same index which uses fewer key parts
- The join optimizer picks the ref access (because it is cheaper)
- make_join_select applies this heuristic to switch to range:
/* Range uses longer key; Use this instead of ref on key */
Join buffer will be used without having called
JOIN_TAB::make_scan_filter(). This means, conditions that should be
checked when reading table T will be checked after T is joined with the
contents of the join buffer, instead.
Fixed this by adding a make_scan_filter() check.
(updated patch after backport to 10.3)
(Fix testcase on Windows)