- Fixed a warning visible in optimized build related to calling
memcpy with length parameters larger than ptrdiff_t max.
rb#23333 approved by Annamalai Gurusami <annamalai.gurusami@oracle.com>
IndexPurge::next(): Replace btr_pcur_move_to_next_user_rec()
with some equivalent code that performs sanity checks without
killing the server. Perform some additional sanity checks as well.
This change is motivated by
mysql/mysql-server@48de4d74f4
which unnecessarily introduces storage overhead to btr_pcur_t
and uses a test case that injects a fault somewhere else,
not in the code path that was modified.
MySQL 5.7.29 includes the following fix:
Bug #30287668 INNODB: A LONG SEMAPHORE WAIT
mysql/mysql-server@5cdbb22b51
There is no test case. It seems that the problem could occur when
a spatial index is large and peculiar enough so that multiple R-tree
leaf pages will have the exactly same maximum bounding rectangle (MBR).
The commit message suggests that the hang can occur when R-tree
non-leaf pages are being merged, which should only be possible
during transaction rollback or the purge of transaction history,
when the R-tree index is at least 2 levels high and very many records
are being deleted. The message says that a comparison result that two
spatial index node pointer records are equal will cause an infinite loop
in rtr_page_copy_rec_list_end_no_locks(). Hence, we must include the
child page number in the comparison to be consistent with
mysql/mysql-server@2e11fe0e15.
We fix this bug in a simpler way, involving fewer code changes.
cmp_rec_rec(): Renamed from cmp_rec_rec_with_match().
Assert that rec2 always resides in an index page.
Treat non-leaf spatial index pages specially.
Now that we will be invoking dtuple_get_n_ext() instead of
letting btr_push_update_extern_fields() update an already
calculated value, it is unnecessary to calculate the n_ext
upfront.
row_rec_to_index_entry(), row_rec_to_index_entry_low():
Remove the output parameter n_ext.
During update, rollback, or MVCC read, we may miscalculate
the number of off-page columns, and thus the size of the
clustered index record. The function btr_push_update_extern_fields()
is mostly redundant, because the off-page columns would also be
moved by row_upd_index_replace_new_col_val(), which is invoked
via row_upd_index_replace_new_col_vals().
btr_push_update_extern_fields(): Remove.
This is based on
mysql/mysql-server@1fa475b85d
which refines a fix for a recovery bug fix
mysql/mysql-server@ce0a1e85e2
in MySQL 5.7.5.
No test case was provided by Oracle.
Some of the changed code is being covered by the existing test
innodb.blob-crash.
WL#6326 in MariaDB 10.2.2 introduced a potential hang on purge or rollback
when an index tree is being shrunk by multiple levels.
This fix is based on
mysql/mysql-server@f2c5852630
with the main difference that our version of the test case uses
DEBUG_SYNC instrumentation on ROLLBACK, not on purge.
btr_cur_will_modify_tree(): Simplify the check further.
This is the actual bug fix.
row_undo_mod_remove_clust_low(), row_undo_mod_clust(): Add DEBUG_SYNC
instrumentation for the test case.
The write-heavy test innodb_zip.wl6501_scale_1 timed out on
10.2 60d7011c5f for me.
Out of os_aio_n_segments=6, 5 are waiting for an event in
os_aio_simulated_handler(). One thread is waiting for a
write to complete in buf_dblwr_add_to_batch(), but that
would never happen, because nothing is waking up the simulated AIO
handler threads.
This hang appears to have been introduced in MySQL 5.6.12
in mysql/mysql-server@26cfde776c.
Item_cond inherits from Item_args but doesn't store its arguments
as function arguments, which means it has zero arguments.
Don't call memcpy in this case.
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 requires extra work due to sp_package, will commit a separate
patch for it)
sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.
Fixed by providing static sp_head::create() and sp_head::destroy() methods.
The long semaphore wait appeared to be the caused by the following
pattern in the MTR test:
```
SET DEBUG_SYNC = "now SIGNAL wsrep_after_certification_continue";
SET DEBUG_SYNC = "now SIGNAL signal.wsrep_apply_cb;
```
Raising two signals, one right after another, caused one signal to
overwrite the other, before the signal was consumed by the thread.
This caused one thread to be stuck until the debug sync point would
timeout.
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 version of the fix, with handling for class sp_package)
sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.
Fixed by providing static sp_head::create() and sp_head::destroy() methods.
In this scenario:
- There is a possible range access for table T
- And there is a ref access on the same index which uses fewer key parts
- The join optimizer picks the ref access (because it is cheaper)
- make_join_select applies this heuristic to switch to range:
/* Range uses longer key; Use this instead of ref on key */
Join buffer will be used without having called
JOIN_TAB::make_scan_filter(). This means, conditions that should be
checked when reading table T will be checked after T is joined with the
contents of the join buffer, instead.
Fixed this by adding a make_scan_filter() check.
(updated patch after backport to 10.3)
(Fix testcase on Windows)
Analysis:
========
'max_binlog_cache_size' is configured and a huge transaction is executed. When
the transaction specific events size exceeds 'max_binlog_cache_size' the event
cannot be written to the binary log cache and cache write error is raised.
Upon cache write error the statement is rolled back and the transaction cache
should be truncated to a previous statement specific position. The truncate
operation should reset the cache to earlier valid positions and flush the new
changes. Even though the flush is successful the cache write error is still in
marked state. The truncate code interprets the cache write error as cache flush
failure and returns abruptly without modifying the write cache parameters.
Hence cache is in a invalid state. When a COMMIT statement is executed in this
session it tries to flush the contents of transaction cache to binary log.
Since cache has partial events the cache write operation will report
'writer.remains' assert.
Fix:
===
Binlog truncate function resets the cache to a specified size. As a first step
of truncation, clear the cache write error flag that was raised during earlier
execution. With this new errors that surface during cache truncation can be
clearly identified.
MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following assert when ASAN is enabled.
uint32 binlog_get_uncompress_len(const char*):
Assertion `(buf[0] & 0xe0) == 0x80' failed
Fix:
===
**Part11: Converted debug assert to error handler code**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error.
AddressSanitizer: heap-buffer-overflow on address
READ of size 1 at 0x60e00009cf71 thread T28
#0 0x55e37e034ae2 in net_field_length
Fix:
===
**Part10: Avoid reading out of buffer**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
AddressSanitizer: SEGV on unknown address
The signal is caused by a READ memory access.
User_var_log_event::User_var_log_event(char const*, unsigned int,
Format_description_log_event const*)
Implemented part of upstream patch.
commit: mysql/mysql-server@a3a497ccf7
Fix:
===
**Part8: added checks to avoid reading out of buffer limits**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
"heap-buffer-overflow on address" and some times it asserts.
Table_map_log_event::Table_map_log_event(const char*, uint,
const Format_description_log_event*)
Assertion `m_field_metadata_size <= (m_colcnt * 2)' failed.
Fix:
===
**Part7: Avoid reading out of buffer**
Converted debug assert to error handler code.
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
AddressSanitizer: heap-buffer-overflow on address 0x60400002acb8
Load_log_event::copy_log_event(char const*, unsigned long, int,
Format_description_log_event const*)
Fix:
===
**Part6: Moved the event_len validation to the begin of copy_log_event function**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
AddressSanitizer: heap-buffer-overflow on address
String::append(char const*, unsigned int)
Query_log_event::pack_info(Protocol*)
Fix:
===
**Part5: Added check to catch buffer overflow**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following ASAN error
heap-buffer-overflow within "my_strndup" in Rotate_log_event
my_strndup /mysys/my_malloc.c:254
Rotate_log_event::Rotate_log_event(char const*, unsigned int,
Format_description_log_event const*)
Fix:
===
**Part4: Improved the check for event_len validation**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following crash when ASAN is enabled.
SEGV on unknown address
in inline_mysql_mutex_destroy
in my_bitmap_free
in Update_rows_log_event::~Update_rows_log_event()
Fix:
===
**Part3: Initialize m_cols_ai.bitmap to NULL**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following assert when ASAN is enabled.
Rows_log_event::Rows_log_event(const char*, uint,
const Format_description_log_event*):
Assertion `var_header_len >= 2'
Implemented part of upstream patch.
commit: mysql/mysql-server@a3a497ccf7
Fix:
===
**Part2: Avoid reading out of buffer limits**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> causes a variety of failures, some of which are
listed below. It is not a race condition issue, but there is some
non-determinism in it.
Analysis:
========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that read
event data satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
Fix:
====
The fix is split into different parts. Each part addresses either an ASAN
issue or an assert/crash.
**Part1: Checksum based position validation when checksum is enabled**
Using checksum validate the very first event read at the user specified
position. If there is a checksum mismatch report an appropriate error for the
invalid event.
INNOBASE_ALTER_NOVALIDATE: Remove the set of operations
INNOBASE_ONLINE_CREATE that was accidentally included in the
definition.
In the merge of 82187a1221 to 10.3
(in commit eda719793a) the flags
were defined correctly.
This bug was caught by the test innodb_zip.index_large_prefix.
By default (innodb_strict_mode=ON), InnoDB attempts to guarantee
at DDL time that any INSERT to the table can succeed.
MDEV-19292 recently revised the "row size too large" check in InnoDB.
The check still is somewhat inaccurate;
that should be addressed in MDEV-20194.
Note: If a table contains multiple long string columns so that each column
is part of a column prefix index, then an UPDATE that attempts to modify
all those columns at once may fail, because the undo log record might
not fit in a single undo log page (of innodb_page_size). In the worst case,
the undo log record would grow by about 3KiB of for each updated column.
The DDL-time check (since the InnoDB Plugin for MySQL 5.1) is optional
in the sense that when the maximum B-tree record size or undo log
record size would be exceeded, the DML operation will fail and the
transaction will be properly rolled back.
create_table_info_t::row_size_is_acceptable(): Add the parameter
'bool strict' so that innodb_strict_mode=ON can be overridden during
TRUNCATE, OPTIMIZE and ALTER TABLE...FORCE (when the storage format
is not changing).
create_table_info_t::create_table(): Perform a sloppy check for
TRUNCATE TABLE (create_fk=false).
prepare_inplace_alter_table_dict(): Perform a sloppy check for
simple operations.
trx_is_strict(): Remove. The function became unused in
commit 98694ab0cb (MDEV-20949).
Features:
* STL-like interface
* Fast modification: no branches on insertion or deletion
* Fast iteration: one pointer dereference and one pointer comparison
* Your class can be a part of several lists
Modeled after std::list<T> but currently has fewer methods (not complete yet)
For even more performance it's possible to customize list with templates so
it won't have size counter variable or won't NULLify unlinked node.
How existing lists differ?
No existing lists support STL-like interface.
I_List:
* slower iteration (one more branch on iteration)
* element can't be a part of two lists simultaneously
I_P_List:
* slower modification (branches, except for the fastest push_back() case)
* slower iteration (one more branch on iteration)
UT_LIST_BASE_NODE_T:
* slower modification (branches)
Three UT_LISTs were replaced: two in fil_system_t and one in dyn_buf_t.