Commit graph

7149 commits

Author SHA1 Message Date
Thirunarayanan Balathandayuthapani
8984d77910 MDEV-21329 InnoDB: Failing assertion: lock->lock_word.load(std::memory_order_relaxed) == X_LOCK_DECR upon server shutdown
Problem is that dropping of fts table and sync of fts table
happens concurrently during fts optimize thread shutdown.
fts_optimize_remove_table() is executed by a user thread that
performs DDL, and that the fts_optimize_wq may be removed if
fts_optimize_shutdown() has started executing.

fts_optimize_remove_table() doesn't remove the table from the queue
and it leads to above scenario. While removing the table from
fts_optimize_wq, if the table can't be removed then wait till
fts_optimize_thread shuts down.

Reviewed-by: Marko Mäkelä
2020-10-08 21:54:11 +05:30
Thirunarayanan Balathandayuthapani
6504d3d229 MDEV-23722 InnoDB: Assertion: result != FTS_INVALID in fts_trx_row_get_new_state
Marking of deletion of row in fts index happens twice in
self-referential foreign key relation. So while performing
referential checks of foreign key, InnoDB can avoid updating
of fts index if the foreign key has self-referential relationship.

Reviewed-by: Marko Mäkelä
2020-10-08 19:41:03 +05:30
Marko Mäkelä
1595189250 MDEV-23897 SIGSEGV on commit with innodb_lock_schedule_algorithm=VATS
This regression for debug builds was introduced by
MDEV-23101 (commit 224c950462).

Due to MDEV-16664, the parameter
innodb_lock_schedule_algorithm=VATS
is not enabled by default.

The purpose of the added assertions was to enforce the invariant that
Galera replication cannot be enabled together with VATS due to MDEV-12837.
However, upon closer inspection, it is obvious that the variable 'lock'
may be assigned to the null pointer if no match is found in the
previous->hash list.

lock_grant_and_move_on_page(), lock_grant_and_move_on_rec():
Assert !lock->trx->is_wsrep() only after ensuring that lock
is not a null pointer.
2020-10-06 22:35:43 +03:00
Marko Mäkelä
577c61e8be MDEV-23888: Potential server hang on replication with InnoDB
In MDEV-21452, SAFE_MUTEX flagged an ordering problem that involved
trx_t::mutex, LOCK_global_system_variables, and LOCK_commit_ordered
when running
./mtr --no-reorder\
 binlog.binlog_checksum,mix binlog.binlog_commit_wait,mix

Because LOCK_commit_ordered is acquired by replication code before
innobase_commit_ordered() is invoked, and because LOCK_commit_ordered
should be below LOCK_global_system_variables in the global latching
order, it turns out that we must avoid acquiring
LOCK_global_system_variables in any low-level code.

It also turns out that lock_rec_lock() acquires lock_sys_t::mutex
and then carries on to call lock_rec_enqueue_waiting(), which may
invoke THDVAR() via thd_lock_wait_timeout(). This call is problematic
if THDVAR() had never been invoked in that thread earlier.

innobase_trx_init(): Let us invoke THDVAR() at the start of an InnoDB
transaction so that future invocations of THDVAR() will avoid
LOCK_global_system_variables acquisition on the same THD. Because
the first call to intern_sys_var_ptr() will initialize all session
variables by not passing the offset to sync_dynamic_session_variables(),
this will indeed make any future THDVAR() invocation mutex-free.

There are some THDVAR() calls in other code (related to indexed virtual
columns, fulltext indexes, and DDL operations). No SAFE_MUTEX warning
was known for those, but there does not appear to be any replication
test coverage for indexed virtual columns or fulltext indexes. DDL should
be covered, and perhaps DDL code paths were already invoking THDVAR()
while not holding any InnoDB mutex.

Side note: MySQL should avoid this type of deadlocks since
mysql/mysql-server@4d275c8995.
MariaDB never defined alloc_and_copy_thd_dynamic_variables(),
because we prefer to avoid overhead during connection creation.

An important part of the deadlock could be the current handling of
SET GLOBAL binlog_checksum=NONE; and similar assignments.
In binlog_checksum_update(), we would hold LOCK_global_system_variables
while potentially acquiring LOCK_commit_ordered in MYSQL_BIN_LOG::open().
Even if that code was changed later to release
LOCK_global_system_variables during the write to mysql_bin_log,
it could be a good idea for performance to avoid invoking the
expensive code path of THDVAR() while holding any InnoDB mutexes,
such as lock_sys.mutex in lock_rec_enqueue_waiting().

Thanks to Andrei Elkin for debugging the SAFE_MUTEX issue, and to
Sergei Golubchik for the suggestion to invoke THDVAR() early.
2020-10-06 07:47:11 +03:00
Marko Mäkelä
295e2d500b MDEV-16664: Add deprecation warning for innodb_lock_schedule_algorithm=VATS
The setting innodb_lock_schedule_algorithm=VATS that was introduced
in MDEV-11039 (commit 021212b525)
causes conflicting exclusive locks to be incorrectly granted to
two transactions. Specifically, in lock_rec_insert_by_trx_age()
the predicate !lock_rec_has_to_wait_in_queue(in_lock) would hold even
though an active transaction is already holding an exclusive lock.
This was observed between two DELETE of the same clustered index record.
The HASH_DELETE invocation in lock_rec_enqueue_waiting() may be related.

Due to lack of progress in diagnosing the problem, we will deprecate the
option and issue a warning that using it may corrupt data. The unsafe
option was enabled between
commit 0c15d1a6ff (MariaDB 10.2.3)
and the parent of
commit 1cc1d0429d (MariaDB 10.2.17, 10.3.9).
2020-10-05 09:34:27 +03:00
Marko Mäkelä
199bc67144 Cleanup: Remove unused SYNC_REC_LOCK
SYNC_REC_LOCK was never used in the public history of InnoDB,
starting with commit 132e667b0b.
2020-10-05 09:12:12 +03:00
Marko Mäkelä
b8b1aef6b1 Cleanup: Orphan que_thr_mutex declaration
The orphan declaration was added in MySQL 5.6.2
mysql/mysql-server@2915417e02
and MariaDB commit 1d0f70c2f8.
2020-10-02 08:40:06 +03:00
Marko Mäkelä
46890349bf Cleanup: Remove fts_t::bg_threads_mutex, fts_t::bg_threads
The unused fts_t::bg_threads was added in
mysql/mysql-server@4b1049625c.

Any usage of fts_t::bg_threads_mutex was removed in
mysql/mysql-server@33c2404b39.
2020-10-02 08:36:50 +03:00
Marko Mäkelä
d99f787244 Merge 10.2 into 10.3 2020-10-01 13:33:51 +03:00
Marko Mäkelä
323500bfa9 Merge 10.2 into 10.3 2020-09-30 16:25:06 +03:00
Thirunarayanan Balathandayuthapani
d6b33ea237 MDEV-23856 fts_optimize_wq accessed after shutdown of FTS Optimize thread
In fts_optimize_remove_table(), InnoDB tries to access the
fts_optimize_wq after shutting down the fts optimize thread.
This issue caused by the commit a41d429765.
Fix should check for fts optimize thread shutdown state
before checking fts_optimize_wq.
2020-09-30 18:02:29 +05:30
Marko Mäkelä
cd5f4d2a59 Cleanup: Remove unused fts_cache_t::optimize_lock
This has been unused from the very beginning
(mysql/mysql-server@d5e512ae7e).
2020-09-30 13:26:46 +03:00
Marko Mäkelä
aa2f263e59 Cleanup: Remove constant parameters async=false, index_name=NULL 2020-09-29 11:07:34 +03:00
Marko Mäkelä
74bd3683ca MDEV-23839 innodb_fast_shutdown=0 hang on change buffer merge
ibuf_merge_or_delete_for_page(): Do not attempt to invoke
ibuf_delete_recs() on a page of the change buffer itself.
The caller could already be holding ibuf->index->lock,
and an attempt to acquire it in S mode would hang the release server
or cause an assertion failure in rw_lock_s_lock_func() in a debug
server.

This problem was reproducible on 1 out of 2 runs of the following:
./mtr --no-reorder \
innodb.innodb-page_compression_default \
innodb.innodb-page_compression_snappy \
innodb.innodb-page_compression_zip \
innodb.innodb_wl6326_big innodb.xa_recovery
2020-09-29 11:04:12 +03:00
Marko Mäkelä
a441b06489 Merge 10.1 into 10.2 2020-09-29 10:04:37 +03:00
Marko Mäkelä
83a520dbbb Cleanup: Remove unused rw_lock_t::writer_is_wait_ex
This was missed in commit 2c252ba96b
(MySQL 5.5.42, MariaDB 5.5.42).
2020-09-29 09:39:08 +03:00
Sujatha
6cbbd6bd96 Merge branch '10.2' into 10.3 2020-09-28 17:27:42 +05:30
Thirunarayanan Balathandayuthapani
fdb3c64e42 MDEV-22277 LeakSanitizer: detected memory leaks in mem_heap_create_block_func after attempt to create foreign key
- During online DDL, prepare phase error handler fails to remove
the memory allocated for newly created foreign keys.
2020-09-28 17:04:02 +05:30
Marko Mäkelä
af40a2b43e Fix GCC 10.2.0 -Og -fsanitize=undefined -Wmaybe-uninitialized
For some reason, adding -fsanitize=undefined (cmake -DWITH_UBSAN=ON)
to the compilation flags will cause even more warnings to be emitted.
The warnings do look bogus, but the code can be simplified.
2020-09-23 12:27:56 +03:00
Marko Mäkelä
d9d9c30b70 Merge 10.2 into 10.3 2020-09-22 21:12:48 +03:00
Marko Mäkelä
9d0ee2dcb7 Merge 10.1 into 10.2 2020-09-22 15:21:43 +03:00
Marko Mäkelä
fde3d895d9 Merge 10.2 into 10.3 2020-09-22 14:33:03 +03:00
Marko Mäkelä
78efa10930 MDEV-22939: Restore an AUTO_INCREMENT check
It turns out that we must check for DISCARD TABLESPACE both
when the table is being rebuilt and when the AUTO_INCREMENT
value of the table is being added.

This was caught by the test innodb.alter_missing_tablespace.
Somehow I failed to run all tests. Sorry!
2020-09-22 13:55:36 +03:00
Marko Mäkelä
3eb81136e1 MDEV-22939 Server crashes in row_make_new_pathname()
The statement ALTER TABLE...DISCARD TABLESPACE is problematic,
because its designed purpose is to break the referential integrity
of the data dictionary and make a table point to nowhere.

ha_innobase::commit_inplace_alter_table(): Check whether the
table has been discarded. (This is a bit late to check it, right
before committing the change.) Previously, we performed this check
only in a specific branch of the function commit_set_autoinc().

Note: We intentionally allow non-rebuilding ALTER TABLE even if
the tablespace has been discarded, to remain compatible with MySQL.
(See the various tests with "wl5522" in the name, such as
innodb.innodb-wl5522.)

The test case would crash starting with 10.3 only, but it does not hurt
to minimize the code and test difference between 10.2 and 10.3.
2020-09-22 13:40:05 +03:00
Marko Mäkelä
e5e83daf32 Make DISCARD TABLESPACE more robust
dict_load_table_low(): Copy the 'discarded' flag to file_unreadable.
This allows to avoid a potentially harmful call to dict_stats_init()
in ha_innobase::open().
2020-09-22 13:09:51 +03:00
Marko Mäkelä
2af8f712de MDEV-23776: Re-apply the fix and make the test more robust
The test that was added in commit e05650e686
would break a subsequent run of a test encryption.innodb-bad-key-change
because some pages in the system tablespace would be encrypted with
a different key.

The failure was repeatable with the following invocation:

./mtr --no-reorder \
encryption.create_or_replace,cbc \
encryption.innodb-bad-key-change,cbc

Because the crash was unrelated to the code changes that we reverted
in commit eb38b1f703
we can safely re-apply those fixes.
2020-09-22 13:08:09 +03:00
Marko Mäkelä
732cd7fd53 MDEV-23705 Assertion 'table->data_dir_path || !space'
After DISCARD TABLESPACE, the tablespace of a table will no longer
exist, and dict_get_and_save_data_dir_path() would invoke
dict_get_first_path() to read an entry from SYS_DATAFILES.
For some reason, DISCARD TABLESPACE would not to remove the entry
from there.

dict_get_and_save_data_dir_path(): If the tablespace has been
discarded, do not bother trying to read the name.

Side note: The tables SYS_TABLESPACES and SYS_DATAFILES are
redundant and subject to removal in MDEV-22343.
2020-09-22 11:13:51 +03:00
Marko Mäkelä
eb38b1f703 Revert "MDEV-23776 Test encryption.create_or_replace fails with a warning"
This reverts commit e33f7b6faa.
The change seems to have introduced intermittent failures of the test
encryption.innodb-bad-key-change on many platforms.

The failure that we were trying to address was not reproduced on 10.2.
It could be related to commit a7dd7c8993
(MDEV-23651) or de942c9f61 (MDEV-15983)
or other changes that reduced contention on fil_system.mutex in 10.3.

The fix that we are hereby reverting from 10.2 seems to work fine
on 10.3 and 10.4.
2020-09-22 10:41:06 +03:00
Marko Mäkelä
2cf489d430 Merge 10.2 into 10.3 2020-09-21 16:39:23 +03:00
Marko Mäkelä
e33f7b6faa MDEV-23776 Test encryption.create_or_replace fails with a warning
The test encryption.create_or_replace would occasionally fail with
a warning message from fil_check_pending_ops().

fil_crypt_find_space_to_rotate(): While waiting for available
I/O capacity, check fil_space_t::is_stopping() and release a
handle if necessary.

fil_space_crypt_close_tablespace(): Wake up the waiters in
fil_crypt_find_space_to_rotate().
2020-09-21 15:55:27 +03:00
Vlad Lesin
6c4c88dbb8 MDEV-18867: Remove an orphan function 2020-09-21 12:52:44 +03:00
Vlad Lesin
0a224edc3e MDEV-23711 make mariabackup innodb redo log read error message more clear
log_group_read_log_seg() returns error when:

1) Calculated log block number does not correspond to read log block
number. This can be caused by:
  a) Garbage or an incompletely written log block. We can exclude this
  case by checking log block checksum if it's enabled(see innodb-log-checksums,
  encrypted log block contains checksum always).
  b) The log block is overwritten. In this case checksum will be correct and
  read log block number will be greater then requested one.

2) When log block length is wrong. In this case recv_sys->found_corrupt_log
is set.

3) When redo log block checksum is wrong. In this case innodb code
writes messages to error log with the following prefix: "Invalid log
block checksum."

The fix processes all the cases above.
2020-09-21 12:29:52 +03:00
Marko Mäkelä
cbcb4ecabb Merge 10.2 into 10.3 2020-09-21 11:04:04 +03:00
Nikita Malyavin
ae8ff3a067 MDEV-20396 Server crashes after DELETE with SEL NULL Foreign key and a
virtual column in index

Problem:
row_ins_foreign_fill_virtual was unconditionally set virtual fields to NULL
even though the field is not a part of a foreign key
(but a part of an index)

Solution:
The new virtual value should be computed with regard to cascade updates.
2020-09-14 18:08:56 +10:00
Thirunarayanan Balathandayuthapani
bfd1ed5a13 MDEV-18867 Long Time to Stop and Start
- Addressing ASAN failure in fts_check_aux_table()
2020-09-11 12:24:29 +05:30
Jan Lindström
224c950462 MDEV-23101 : SIGSEGV in lock_rec_unlock() when Galera is enabled
Remove incorrect BF (brute force) handling from lock_rec_has_to_wait_in_queue
and move condition to correct callers. Add a function to report
BF lock waits and assert if incorrect BF-BF lock wait happens.

wsrep_report_bf_lock_wait
	Add a new function to report BF lock wait.

wsrep_assert_no_bf_bf_wait
	Add a new function to check do we have a
	BF-BF wait and if we have report this case
	and assert as it is a bug.

lock_rec_has_to_wait
	Use new wsrep_assert_bf_wait to check BF-BF wait.

lock_rec_create_low
lock_table_create
	Use new function to report BF lock waits.

lock_rec_insert_by_trx_age
lock_grant_and_move_on_page
lock_grant_and_move_on_rec
	Assert that trx is not Galera as VATS is not compatible
	with Galera.

lock_rec_add_to_queue
	If there is conflicting lock in a queue make sure that
	transaction is BF.

lock_rec_has_to_wait_in_queue
	Remove incorrect BF handling. If there is conflicting
	locks in a queue all transactions must wait.

lock_rec_dequeue_from_page
lock_rec_unlock
	If there is conflicting lock make sure it is not
	BF-BF case.

lock_rec_queue_validate
	Add Galera record locking rules comment and use
	new function to report BF lock waits.

All attempts to reproduce the original assertion have been
failed. Therefore, there is no test case on this commit.
2020-09-10 13:18:12 +03:00
Thirunarayanan Balathandayuthapani
75e82f71f1 MDEV-18867 Long Time to Stop and Start
fts_drop_orphaned_tables() takes long time to remove the orphaned
FTS tables. In order to reduce the time, do the following:

- Traverse fil_system.space_list and construct a set of
table_id,index_id of all FTS_*.ibd tablespaces.
- Traverse the sys_indexes table and ignore the entry
from the above collection if it exist.
- Existing elements in the collection can be considered as
orphaned fts tables. construct the table name from
(table_id,index_id) and invoke fts_drop_tables().
- Removed DICT_TF2_FTS_AUX_HEX_NAME flag usage from upgrade.
- is_aux_table() in dict_table_t to check whether the given name
is fts auxiliary table
fts_space_set_t is a structure to store set of parent table id
and index id
- Remove unused FTS function in fts0fts.cc
- Remove the fulltext index in row_format_redundant test case.
Because it deals with the condition that SYS_TABLES does have
corrupted entry and valid entry exist in SYS_INDEXES.
2020-09-10 14:10:26 +05:30
Marko Mäkelä
7e07e38cf6 Merge 10.2 into 10.3 2020-09-09 13:06:46 +03:00
Marko Mäkelä
0eb38243ce MDEV-23456 fixup: Simplify a comparison 2020-09-09 13:04:11 +03:00
Marko Mäkelä
040ae4c59b MDEV-22924 fixup: Replace C++11 auto 2020-09-09 13:02:25 +03:00
Marko Mäkelä
d44c0f46c5 MDEV-22924 fixup: Replace C++11 nullptr
Only starting with MariaDB Server 10.4 we may depend on C++11.
2020-09-09 12:26:51 +03:00
Marko Mäkelä
64c8fa58a8 MDEV-23685 SIGSEGV on ADD FOREIGN KEY after failed ADD KEY
dict_foreign_qualify_index(): Reject corrupted or garbage indexes.
For index stubs that are created on virtual columns, no
dict_field_t::col would be assign. Instead, the entire table
definition would be reloaded on a successful operation.
2020-09-09 12:04:42 +03:00
Marko Mäkelä
c26eae0cc0 MDEV-23456 fixup: Fix mtr_t::get_fix_count()
Before commit 05fa4558e0 (MDEV-22110)
we have slot->type == MTR_MEMO_MODIFY that are unrelated to
incrementing the buffer-fix count.

FindBlock::operator(): In debug builds, skip MTR_MEMO_MODIFY entries.

Also, simplify the code a little.

This fixes an infinite loop in the tests
innodb.innodb_defragment and innodb.innodb_wl6326_big.
2020-09-09 12:01:03 +03:00
Thirunarayanan Balathandayuthapani
b1009ae5c1 MDEV-23456 fil_space_crypt_t::write_page0() is accessing an uninitialized page
buf_page_create() is invoked when page is initialized. So that
previous contents of the page ignored. In few cases, it calls
buf_page_get_gen() is called to fetch the page from buffer pool.
It should take x-latch on the page. If other thread uses the block
or block io state is different from BUF_IO_NONE then release the
mutex and check the state and buffer fix count again. For compressed
page, use the existing free block from LRU list to create new page.
Retry to fetch the compressed page if it is in flush list

fseg_create(), fseg_create_general(): Introduce block as a parameter
where segment header is placed. It is used to avoid repetitive
x-latch on the same page

Change the assert to check whether the page has SX latch and
X latch in all callee function of buf_page_create()

mtr_t::get_fix_count(): Get the buffer fix count of the given
block added by the mtr

FindBlock is added to find the buffer fix count of the given
block acquired by the mini-transaction
2020-09-09 11:58:15 +05:30
Marko Mäkelä
f99cace77f MDEV-22924 Corruption in MVCC read via secondary index
An unsafe optimization was introduced by
commit 2347ffd843 (MDEV-20301)
which is based on
mysql/mysql-server@3f3136188f or
mysql/mysql-server@647a3814a9
in MySQL 8.0.12 or MySQL 8.0.13
(which in turn is based on the contribution in MySQL Bug #84958).

Row_sel_get_clust_rec_for_mysql::operator(): In addition to checking
that the pointer to the record matches, also check the latest
modification of the page (FIL_PAGE_LSN) as well as the page identifier.
Only if all three match, it is safe to reuse cached_old_vers.

Row_sel_get_clust_rec_for_mysql::check_eq(): Assert that the PRIMARY KEY
of the cached old version of the record corresponds to the latest version.

We got a test case where CHECK TABLE, UPDATE and purge would be
hammering on the same table (with only 6 rows) and a pointer that
was originally pointing to a record pk=2 would match a cached_clust_rec
that was pointing to a record pk=1. In the diagnosed `rr replay` trace,
we would wrongly return an old cached version of the pk=1 record,
instead of retrieving the correct version of the pk=2 record. Because
of this, CHECK TABLE would fail to count one of the records in a
secondary index, and report failure.

This bug appears to affect MVCC reads via secondary indexes only.
The purge of history in secondary indexes uses a different code path,
and so do checks for implicit record locks.
2020-09-07 15:31:54 +03:00
xdavidwu
3ee2422624 InnoDB, XtraDB: handle EOPNOTSUPP from posix_fallocate()
On some libc (like musl[1]), posix_fallocate() is a fallocate() syscall
wrapper, and does not include fallback code like glibc does. In that
case, EOPNOTSUPP is returned if underlying filesystem does not
support fallocate() with mode = 0.

This patch enables falling back to writing zeros when EOPNOTSUPP, fixes
some cases like running on filesystem without proper fallocate support
on Alpine.

[1]: https://git.musl-libc.org/cgit/musl/tree/src/fcntl/posix_fallocate.c?h=v1.2.1
2020-09-04 13:07:45 +03:00
Marko Mäkelä
c5cb59ce77 Merge 10.2 into 10.3 2020-09-04 12:31:58 +03:00
Marko Mäkelä
1a3ce7e77c MDEV-23651: Fix the Windows build
In the Microsoft environment, my_atomic requires int32.
2020-09-04 12:17:35 +03:00
Marko Mäkelä
c029d45623 MDEV-23600 follow-up: uninitialized rec_field_is_prefix
build_template_field(): Initialize templ->rec_field_is_prefix
also for indexes on virtual columns. This was caught on 10.5 by
MemorySanitizer as use-of-uninitialized-value in
row_search_with_covering_prefix() when running the test
main.fast_prefix_index_fetch_innodb.
2020-09-04 12:14:24 +03:00
Marko Mäkelä
a7dd7c8993 MDEV-23651 InnoDB: Failing assertion: !space->referenced()
commit de942c9f61 (MDEV-15983)
introduced a race condition that we inadequately fixed in
commit 93b69825ad (MDEV-16169).

Because fil_space_t::release() or fil_space_t::acquire() are
not protected by fil_system.mutex like their predecessors,
it is possible that stop_new_ops was set between the time
a thread checked fil_space_t::is_stopping() and invoked
fil_space_t::acquire().

In an execution trace, this happened in fil_system_t::keyrotate_next(),
causing an assertion failure in fil_delete_tablespace()
in the other thread that seeked to stop new operations.

We fix this bug by merging the flag fil_space_t::stop_new_ops
and the reference count fil_space_t::n_pending_ops into a
single word that is only being accessed by atomic memory operations.

fil_space_t::set_stopping(): Accessor for changing the state of
the former stop_new_ops flag.

fil_space_t::acquire(): Return whether the acquisition succeeded.
It would fail between set_stopping(true) and set_stopping(false).
2020-09-03 14:49:11 +03:00