Rows_log_event::change_to_flashback_event(): Reduce the scope
of the variable swap_buff2, and do not duplicate conditions.
GCC 9.3.0 flagged the -Wmaybe-uninitialized when compiling the
10.5 branch using cmake -DWITH_ASAN=ON -DCMAKE_CXX_FLAGS=-O2
_ma_fetch_keypage(): Correct an assertion that used to always hold.
Thanks to clang -Wint-in-bool-context for flagging this.
double_to_datetime_with_warn(): Suppress -Wimplicit-int-float-conversion
by adding a cast. LONGLONG_MAX converted to double will actually be
LONGLONG_MAX+1.
It was:
implicit conversion from 'ha_rows' (aka 'unsigned long long') to 'double'
changes value from 18446744073709551615 to 18446744073709551616
Follow what JOIN::get_examined_rows() does for similar code.
If async replication slave thread conflicts with cluster replication,
then the async slave transaction should be BF aborted, and depending on the
state of async slave transaction execution, potentially also replayed.
There were problems in such BF abort implementation and the replaying was not
started.
This pull request contains fixes which make sure that if async slave thread is
marked to abort and replay, it will complete carry out the rollback and
release all locks and resources before starting the replaying. After replaying,
async slave transactions is treated as successful, so the slave thread will
continue as usual, handling next replication event.
There is also new mtr test: galera.galera_slave_replay, which stresses both a
certification failure for async slave thread and a successful BF abort
followed by replaying.
Context involves semicolon batching, and the error starts 10.2
No reproducible examples were made yet, but TCP trace suggests
multiple packets that are "squeezed" together (e.g overlong OK packet
that has a trailer which is belongs to another packet)
Remove thd->get_stmt_da()->set_skip_flush() when processing a batch.
skip_flush stems from the COM_MULTI code, which was checked in during
10.2 (and is never used)
The fix is confirmed to work, when evaluated by bug reporter (one of them)
We never reproduced it locally, with multiple tries
thus the root cause analysis is still missing.
Fixed a bug introduced in MDEV-11345, server did not start if
non-english error messages were set in startup parameters.
Added lc_messages=de_DE option into an existing test case.
Problem:
-------
Accessing a member within 'xid_count_per_binlog' structure results in
following error when 'UBSAN' is enabled.
member access within address 0xXXX which does not point to an object of type
'xid_count_per_binlog'
Analysis:
---------
The problem appears to be that no constructor for 'xid_count_per_binlog' is
being called, and thus the vtable will not be initialized.
Fix:
---
Defined a parameterized constructor for 'xid_count_per_binlog' class.
Problem:
=======
P1) Conditional jump or move depends on uninitialised value(s)
sql_ex_info::init(char const*, char const*, bool) (log_event.cc:3083)
code: All the following variables are not initialized.
----
return ((cached_new_format != -1) ? cached_new_format :
(cached_new_format=(field_term_len > 1 || enclosed_len > 1 ||
line_term_len > 1 || line_start_len > 1 || escaped_len > 1)));
P2) Conditional jump or move depends on uninitialised value(s)
Rows_log_event::Rows_log_event(char const*, unsigned
int, Format_description_log_event const*) (log_event.cc:9571)
Code: Uninitialized values is reported for 'var_header_len' variable.
----
if (var_header_len < 2 || event_len < static_cast<unsigned
int>(var_header_len + (post_start - buf)))
P3) Conditional jump or move depends on uninitialised value(s)
Table_map_log_event::pack_info(Protocol*) (log_event.cc:11553)
code:'m_table_id' is uninitialized.
----
void Table_map_log_event::pack_info(Protocol *protocol)
...
size_t bytes= my_snprintf(buf, sizeof(buf), "table_id: %lu (%s.%s)",
m_table_id, m_dbnam, m_tblnam);
Fix:
===
P1 - Fix)
Initialize cached_new_format,field_term_len, enclosed_len, line_term_len,
line_start_len, escaped_len members in default constructor.
P2 - Fix)
"var_header_len" is initialized by reading the event buffer. In case of an
invalid event the buffer will contain invalid data. Hence added a check to
validate the event data. If event_len is smaller than valid header length
return immediately.
P3 - Fix)
'm_table_id' within Table_map_log_event is initialized by reading data from
the event buffer. Use 'VALIDATE_BYTES_READ' macro to validate the current
state of the buffer. If it is invalid return immediately.
If the initialization of the wsrep provider failed, in some
cases the internal variable wrep_inited indicating that the
initialization has already been completed is still set to
"1", which then leads to confusion in the initialization
status. To solve the problem, we should set this variable
to "1" only if the wsrep provider initialization really
completed successfully.
An earlier issue has already been fixed for branch 10.4,
and this patch contains a fix for earlier versions (where
Galera 3.x is used).
* size represents the size of an element in the Unique class
* full_size is used when the Unique class counts the number of
duplicates stored per element. This requires additional space per Unique
element.
Item_cond inherits from Item_args but doesn't store its arguments
as function arguments, which means it has zero arguments.
Don't call memcpy in this case.
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 requires extra work due to sp_package, will commit a separate
patch for it)
sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.
Fixed by providing static sp_head::create() and sp_head::destroy() methods.
The string doesn't appear to be null-terminated when binlog checksums are
enabled. This causes a corrupt binlog name in the error message when a
slave is ahead of the master.
Analysis:
========
'max_binlog_cache_size' is configured and a huge transaction is executed. When
the transaction specific events size exceeds 'max_binlog_cache_size' the event
cannot be written to the binary log cache and cache write error is raised.
Upon cache write error the statement is rolled back and the transaction cache
should be truncated to a previous statement specific position. The truncate
operation should reset the cache to earlier valid positions and flush the new
changes. Even though the flush is successful the cache write error is still in
marked state. The truncate code interprets the cache write error as cache flush
failure and returns abruptly without modifying the write cache parameters.
Hence cache is in a invalid state. When a COMMIT statement is executed in this
session it tries to flush the contents of transaction cache to binary log.
Since cache has partial events the cache write operation will report
'writer.remains' assert.
Fix:
===
Binlog truncate function resets the cache to a specified size. As a first step
of truncation, clear the cache write error flag that was raised during earlier
execution. With this new errors that surface during cache truncation can be
clearly identified.
MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following assert when ASAN is enabled.
uint32 binlog_get_uncompress_len(const char*):
Assertion `(buf[0] & 0xe0) == 0x80' failed
Fix:
===
**Part11: Converted debug assert to error handler code**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error.
AddressSanitizer: heap-buffer-overflow on address
READ of size 1 at 0x60e00009cf71 thread T28
#0 0x55e37e034ae2 in net_field_length
Fix:
===
**Part10: Avoid reading out of buffer**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
AddressSanitizer: SEGV on unknown address
The signal is caused by a READ memory access.
User_var_log_event::User_var_log_event(char const*, unsigned int,
Format_description_log_event const*)
Implemented part of upstream patch.
commit: mysql/mysql-server@a3a497ccf7
Fix:
===
**Part8: added checks to avoid reading out of buffer limits**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
"heap-buffer-overflow on address" and some times it asserts.
Table_map_log_event::Table_map_log_event(const char*, uint,
const Format_description_log_event*)
Assertion `m_field_metadata_size <= (m_colcnt * 2)' failed.
Fix:
===
**Part7: Avoid reading out of buffer**
Converted debug assert to error handler code.
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
AddressSanitizer: heap-buffer-overflow on address 0x60400002acb8
Load_log_event::copy_log_event(char const*, unsigned long, int,
Format_description_log_event const*)
Fix:
===
**Part6: Moved the event_len validation to the begin of copy_log_event function**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
AddressSanitizer: heap-buffer-overflow on address
String::append(char const*, unsigned int)
Query_log_event::pack_info(Protocol*)
Fix:
===
**Part5: Added check to catch buffer overflow**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
heap-buffer-overflow within "my_strndup" in Rotate_log_event
my_strndup /mysys/my_malloc.c:254
Rotate_log_event::Rotate_log_event(char const*, unsigned int,
Format_description_log_event const*)
Fix:
===
**Part4: Improved the check for event_len validation**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following crash when ASAN is enabled.
SEGV on unknown address
in inline_mysql_mutex_destroy
in my_bitmap_free
in Update_rows_log_event::~Update_rows_log_event()
Fix:
===
**Part3: Initialize m_cols_ai.bitmap to NULL**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following assert when ASAN is enabled.
Rows_log_event::Rows_log_event(const char*, uint,
const Format_description_log_event*):
Assertion `var_header_len >= 2'
Implemented part of upstream patch.
commit: mysql/mysql-server@a3a497ccf7
Fix:
===
**Part2: Avoid reading out of buffer limits**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> causes a variety of failures, some of which are
listed below. It is not a race condition issue, but there is some
non-determinism in it.
Analysis:
========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that read
event data satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
Fix:
====
The fix is split into different parts. Each part addresses either an ASAN
issue or an assert/crash.
**Part1: Checksum based position validation when checksum is enabled**
Using checksum validate the very first event read at the user specified
position. If there is a checksum mismatch report an appropriate error for the
invalid event.
Set read_set bitmap for view from the JOIN::all_fields list instead of JOIN::fields_list
as split_sum_func would have added items to the all_fields list.
The issue here is for degenerate joins we should execute the window
function but it is not getting executed in all the cases.
To get the window function values window function needs to be executed
always. This currently does not happen in few cases
where the join would return 0 or 1 row like
1) IMPOSSIBLE WHERE
2) MIN/MAX optimization
3) EMPTY CONST TABLE
The fix is to make sure that window functions get executed
and the temporary table is setup for the execution of window functions