Remove incorrect BF (brute force) handling from lock_rec_has_to_wait_in_queue
and move condition to correct callers. Add a function to report
BF lock waits and assert if incorrect BF-BF lock wait happens.
wsrep_report_bf_lock_wait
Add a new function to report BF lock wait.
wsrep_assert_no_bf_bf_wait
Add a new function to check do we have a
BF-BF wait and if we have report this case
and assert as it is a bug.
lock_rec_has_to_wait
Use new wsrep_assert_bf_wait to check BF-BF wait.
lock_rec_create_low
lock_table_create
Use new function to report BF lock waits.
lock_rec_insert_by_trx_age
lock_grant_and_move_on_page
lock_grant_and_move_on_rec
Assert that trx is not Galera as VATS is not compatible
with Galera.
lock_rec_add_to_queue
If there is conflicting lock in a queue make sure that
transaction is BF.
lock_rec_has_to_wait_in_queue
Remove incorrect BF handling. If there is conflicting
locks in a queue all transactions must wait.
lock_rec_dequeue_from_page
lock_rec_unlock
If there is conflicting lock make sure it is not
BF-BF case.
lock_rec_queue_validate
Add Galera record locking rules comment and use
new function to report BF lock waits.
All attempts to reproduce the original assertion have been
failed. Therefore, there is no test case on this commit.
fts_drop_orphaned_tables() takes long time to remove the orphaned
FTS tables. In order to reduce the time, do the following:
- Traverse fil_system.space_list and construct a set of
table_id,index_id of all FTS_*.ibd tablespaces.
- Traverse the sys_indexes table and ignore the entry
from the above collection if it exist.
- Existing elements in the collection can be considered as
orphaned fts tables. construct the table name from
(table_id,index_id) and invoke fts_drop_tables().
- Removed DICT_TF2_FTS_AUX_HEX_NAME flag usage from upgrade.
- is_aux_table() in dict_table_t to check whether the given name
is fts auxiliary table
fts_space_set_t is a structure to store set of parent table id
and index id
- Remove unused FTS function in fts0fts.cc
- Remove the fulltext index in row_format_redundant test case.
Because it deals with the condition that SYS_TABLES does have
corrupted entry and valid entry exist in SYS_INDEXES.
dict_foreign_qualify_index(): Reject corrupted or garbage indexes.
For index stubs that are created on virtual columns, no
dict_field_t::col would be assign. Instead, the entire table
definition would be reloaded on a successful operation.
Before commit 05fa4558e0 (MDEV-22110)
we have slot->type == MTR_MEMO_MODIFY that are unrelated to
incrementing the buffer-fix count.
FindBlock::operator(): In debug builds, skip MTR_MEMO_MODIFY entries.
Also, simplify the code a little.
This fixes an infinite loop in the tests
innodb.innodb_defragment and innodb.innodb_wl6326_big.
buf_page_create() is invoked when page is initialized. So that
previous contents of the page ignored. In few cases, it calls
buf_page_get_gen() is called to fetch the page from buffer pool.
It should take x-latch on the page. If other thread uses the block
or block io state is different from BUF_IO_NONE then release the
mutex and check the state and buffer fix count again. For compressed
page, use the existing free block from LRU list to create new page.
Retry to fetch the compressed page if it is in flush list
fseg_create(), fseg_create_general(): Introduce block as a parameter
where segment header is placed. It is used to avoid repetitive
x-latch on the same page
Change the assert to check whether the page has SX latch and
X latch in all callee function of buf_page_create()
mtr_t::get_fix_count(): Get the buffer fix count of the given
block added by the mtr
FindBlock is added to find the buffer fix count of the given
block acquired by the mini-transaction
An unsafe optimization was introduced by
commit 2347ffd843 (MDEV-20301)
which is based on
mysql/mysql-server@3f3136188f or
mysql/mysql-server@647a3814a9
in MySQL 8.0.12 or MySQL 8.0.13
(which in turn is based on the contribution in MySQL Bug #84958).
Row_sel_get_clust_rec_for_mysql::operator(): In addition to checking
that the pointer to the record matches, also check the latest
modification of the page (FIL_PAGE_LSN) as well as the page identifier.
Only if all three match, it is safe to reuse cached_old_vers.
Row_sel_get_clust_rec_for_mysql::check_eq(): Assert that the PRIMARY KEY
of the cached old version of the record corresponds to the latest version.
We got a test case where CHECK TABLE, UPDATE and purge would be
hammering on the same table (with only 6 rows) and a pointer that
was originally pointing to a record pk=2 would match a cached_clust_rec
that was pointing to a record pk=1. In the diagnosed `rr replay` trace,
we would wrongly return an old cached version of the pk=1 record,
instead of retrieving the correct version of the pk=2 record. Because
of this, CHECK TABLE would fail to count one of the records in a
secondary index, and report failure.
This bug appears to affect MVCC reads via secondary indexes only.
The purge of history in secondary indexes uses a different code path,
and so do checks for implicit record locks.
build_template_field(): Initialize templ->rec_field_is_prefix
also for indexes on virtual columns. This was caught on 10.5 by
MemorySanitizer as use-of-uninitialized-value in
row_search_with_covering_prefix() when running the test
main.fast_prefix_index_fetch_innodb.
Passing a null pointer to a nonnull argument is not only undefined
behaviour, but it also grants the compiler the permission to optimize
away further checks whether the pointer is null. GCC -O2 at least
starting with version 8 may do that, potentially causing SIGSEGV.
These problems were caught in a WITH_UBSAN=ON build with the
Bug#7024 test in main.view.
Passing a null pointer to the "%s" argument of a printf-like
function is undefined behaviour. In the GNU libc implementation
of the printf() family of functions, it happens to work.
GCC 10.2.0 would diagnose this with -Wformat-overflow -Og.
In -fsanitize=undefined (WITH_UBSAN=ON) builds, a runtime error
would be generated. In some other builds, GCC 8 or later might infer
that the parameter is nonnull and optimize away further checks whether
the parameter is null, leading to SIGSEGV.
During insertion of clustered index, InnoDB does the check for foreign key
constraints. Problem is that it uses the clustered index entry to search
indexes of referenced tables and it could lead to unexpected result
when there is no foreign index.
Solution:
========
Rebuild the tuple based on foreign column names before searching it
on reference index when there is no foreign index.
Build failure was:
storage/innobase/os/os0proc.cc:144:3: error: use of undeclared identifier 'MEM_UNDEFINED'
MEM_UNDEFINED(ptr, size);
Assumed to be introduced in MDEV-20377
commit: c36834c832
The InnoDB index fields store bytes, not characters.
Remove some unnecessary conversions from characters to bytes.
This also fixes MDEV-20422 and the wrong-result bug MDEV-12486.
Add a proper error handling of innobase_get_computed_value results in
row_upd_store_row/row_upd_store_v_row.
Also add an assertion in row_vers_build_clust_v_col to fail during row
purge.
Add one more assertion in row_sel_sec_rec_is_for_clust_rec for possible
future catches.
The problem was in improper error handling behavior in
`row_upd_build_difference_binary`:
`innobase_free_row_for_vcol` wasn't called.
To eliminate this problem in all potential places, a refactoring has been
made:
* class ib_vcol_row is added. It owns VCOL_STORAGE and heap and maintains
it in RAII manner
* all innobase_allocate_row_for_vcol/innobase_free_row_for_vcol pairs are
substituted
with ib_vcol_row usage
* row_merge_buf_add is only left untouched because it doesn't own vheap
passed as an argument
* innobase_allocate_row_for_vcol does not allocate VCOL_STORAGE anymore and
accepts it as an argument -- this reduces a number of memory allocations
* move rec_printer out of `#ifndef DBUG_OFF` and mark it cold
(This commit is exclusively for 10.1 branch, do not merge it to upper ones)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
Extra: mysqlbinlog_row_minimal refined to not produce mutable numeric values into the result file.
(This commit is exclusively for 10.2 branch. Do not merge it to 10.3)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
and to Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
This commit contains a fix and extended test case for a ASAN failure
reported during galera.fk mtr testing.
The reported heap buffer overflow happens in test case where a cascading
foreign key constraint is defined for a column of varchar type, and
galera.fk.test has such vulnerable test scenario.
Troubleshoting revealed that erlier fix for MDEV-19660 has made a fix
for cascading delete handling to append wsrep keys from pcur->old_rec,
in row_ins_foreign_check_on_constraint(). And, the ASAN failuer comes from
later scanning of this old_rec reference.
The fix in this commit, moves the call for wsrep_append_foreign_key() to happen
somewhat earlier, and inside ongoing mtr, and using clust_rec which is set
earlier in the same mtr for both update and delete cascade operations.
for wsrep key populating, it does not matter when the keys are populated,
all keys just have to be appended before wsrep transaction replicates.
Note that I also tried similar fix for earlier wsrep key append, but using
the old implementation with pcur->old_rec (instead of clust_rec), and same
ASAN failure was reported. So it appears that pcur->old_rec is not properly
set, to be used for wsrep key appending.
galera.galera_fk_cascade_delete test has been extended by two new test scenarios:
* FK cascade on varchar column.
This test case reproduces same scenario as galera.fk, and this test scenario
will also trigger ASAN failure with non fixed MariaDB versions.
* multi-master conflict with FK cascading.
this scenario causes a conflict between a replicated FK cascading transaction
and local transaction trying to modify the cascaded child table row.
Local transaction should be aborted and get deadlock error.
This test scenario is passing both with old MariaDB version and with this
commit as well.
The issue here was that the query was using ORDER BY LIMIT optimzation where
the access method was changed from EQ_REF access to an index scan (index that would
resolve the ORDER BY clause).
But the parameter READ_RECORD::unlock_row was not reset to rr_unlock_row, which is
used when the access method is not EQ_REF access.
for internal temporary tables: don't use realpath(),
and let them overwrite whatever orphan temp files might've
left in the tmpdir (see main.error_simulation test).
for user created temporary tables: we have to use realpath(),
(see 3a726ab6e2, remember DATA/INDEX DIRECTORY). don't allow
them to overwrite existing files.
This bug was reported by RACK911 LABS
This bug was originally repeated on 10.4 after defining a UNIQUE KEY
on a TEXT column, which is implemented by MDEV-371 by creating the
index on a hidden virtual column.
While row_vers_vc_matches_cluster() is executing in a purge thread
to find out if an index entry may be removed in a secondary index
that comprises a virtual column, another purge thread may process
the undo log record that this check is interested in, and write
a null BLOB pointer in that record. This would trip the assertion.
To prevent this from occurring, we must propagate the 'missing BLOB'
error up the call stack.
row_upd_ext_fetch(): Return NULL when the error occurs.
row_upd_index_replace_new_col_val(): Return whether the previous
version was built successfully.
row_upd_index_replace_new_col_vals_index_pos(): Check the error
result. Yes, we would intentionally crash on this error if it
occurs outside the purge thread.
row_upd_index_replace_new_col_vals(): Check for the error condition,
and simplify the logic.
trx_undo_prev_version_build(): Check for the error condition.