This essentially reverts commit b393e2cb0c.
The leak might have been fixed, but because the
DEBUG_SYNC instrumentation for InnoDB purge threads was reverted
in 10.5 commit 5e62b6a5e0
as part of introducing a thread pool, it is easiest to revert
the entire change.
The DECIMAL data type branch in Item_func_int_val::fix_length_and_dec()
incorrectly used DOUBLE-style length calculation, which resulted in
a smaller data type than the actual result of FLOOR()/CEIL() needs.
innobase_get_charset(), innobase_get_stmt_safe(): Remove.
It is more efficient and readable to invoke thd_charset()
and thd_query_safe() directly, without a non-inlined wrapper function.
The function thd_query_safe() is used in the implementation of the
following INFORMATION_SCHEMA views:
information_schema.innodb_trx
information_schema.innodb_locks
information_schema.innodb_lock_waits
information_schema.rocksdb_trx
The implementation of the InnoDB views is in trx_i_s_common_fill_table().
This function invokes trx_i_s_possibly_fetch_data_into_cache(),
which will acquire lock_sys->mutex and trx_sys->mutex in order to
protect the set of active transactions and explicit locks.
While holding those mutexes, it will traverse the collection of
InnoDB transactions. For each transaction, thd_query_safe() will be
invoked.
When called via trx_i_s_common_fill_table(), thd_query_safe()
is acquiring THD::LOCK_thd_data while holding the InnoDB locks.
This will cause a deadlock with THD::awake() (such as executing
KILL QUERY), because THD::awake() could invoke lock_trx_handle_wait(),
which attempts to acquire lock_sys->mutex while already holding
THD::lock_thd_data.
thd_query_safe(): Invoke mysql_mutex_trylock() instead of
mysql_mutex_lock(). Return the empty string if the mutex
cannot be acquired without waiting.
As part of the SPATIAL INDEX implementation in InnoDB,
dict_index_t was expanded by a rtr_ssn_t field. There are only
3 operations for this field, all protected by rtr_ssn_t::mutex:
* btr_cur_search_to_nth_level() stores the least significant 32 bits
of the 64-bit value that is stored in the index root page.
(This would better be done when the table is opened for the
very first time.)
* rtr_get_new_ssn_id() increments the value by 1.
* rtr_get_current_ssn_id() reads the current value.
All these operations can be implemented equally safely by using
atomic memory access operations.
Let us limit the maximum value of the debug parameter
innodb_data_file_size to 256 MiB. It is only being used
in the test innodb.log_data_file_size, and the size
of the system tablespace should never exceed some 70 MiB
in ./mtr. Thus, 256 MiB should be a reasonable limit.
The fact that negative values that are passed to unsigned parameters
wrap around to the maximum value appears to be a regression due to
commit 18ef02b04d
and has been filed as bug MDEV-22219.
The test was incompatible with ./mtr --repeat=2 until
commit 2d6719d7ee
fixed that.
It turns out that the failing assertion that we disabled in
commit 3db94d2403
is bogus and can fail when the change buffer is emptied
during the last batch of crash recovery. The reason for this
is the condition around the page_create_empty() call in
page_cur_delete_rec(). The condition was removed in MariaDB
Server 10.5 as part of MDEV-12353, in
commit 7ae21b18a6 and
commit f8a9f90667.
The bug that the assertion aimed to catch is MDEV-22497, which
was fixed in commit 26aab96ecf.
The InnoDB insert buffer was upgraded in MySQL 5.5 into a change
buffer that also covers delete-mark and delete (purge) operations.
There is an important constraint for delete operations: a B-tree
leaf page must not become empty unless the entire tree becomes empty,
consisting of an empty root page. Because change buffer merges only
occur on a single leaf page at a time, delete operations must not be
buffered if it is possible that the last record of the page could be
deleted. (In that case, we would refuse to use the change buffer, and
if we really delete the last record, we would shrink the index tree.)
The function ibuf_get_volume_buffered_hash() is part of our insurance
that the page would not become empty. It is supposed to map each
buffered INSERT or DELETE_MARK record payload into a hash value.
We will only count each such record as a distinct key if there is no
hash collision. DELETE operations will always decrement the predicted
number fo records in the page.
Due to a bug in the function, we would actually compute the hash value
not only on the record payload, but also on some following bytes,
in case the record contains NULL values. In MySQL Bug #61104, we had
some examples of this dating back to 2012. But back then, we failed to
reproduce the bug, and in commit d84c95579b
we simply demoted the hard assertion to a message printout and a debug
assertion failure.
ibuf_get_volume_buffered_hash(): Correctly compute the hash value
of the payload bytes only. Note: we will consider
('foo','bar'),(NULL,'foobar'),('foob','ar') to be equal, but this
is not a problem, because in case of a hash collision, we could
also consider ('boo','far') to be equal, and underestimate the number
of records in the page, leading to refusing to buffer a DELETE.
only MDL-prelock but do not open FK child tables for read-only (RESTRICT)
FK actions.
Tables still needs to be opened for CASCADE actions, see 9180e8666b
Fixing a race condition while collecting the engine independent statistics.
Thread1>
1) start running "ANALYZE TABLE t PERISTENT FOR COLUMNS (..) INDEXES ($list)"
2) Walk through $list and save it in TABLE::keys_in_use_for_query
3) Close/re-open tables
Thread2>
1) Make some use of table t. This involves taking table t from
the table cache, and putting it back (with TABLE::keys_in_use_for_query reset to 0)
Thread1>
continue collecting EITS stats. Since TABLE::keys_in_use_for_query is set to 0 we
will not collect statistics for indexes in $list.
This is a partial backport of
commit 5e7e7153b4 from 10.4.
assert_trx_is_free(): Assert !is_wsrep().
trx_init(): Do not initialize trx->wsrep, because it must have been
initialized already.
trx_commit_in_memory(): Invoke wsrep_commit_ordered(). This call
was being skipped, because the transaction object had already been
freed to the pool.
trx_rollback_for_mysql(), innobase_commit_low(),
innobase_rollback_trx(): Always reset trx->wsrep.