The issue here is that end_of_file for encrypted temporary IO_CACHE (used by filesort) is updated
using lseek.
Encryption adds storage overhead and hides it from the caller by recalculating offsets and lengths.
Two different IO_CACHE cannot possibly modify the same file
because the encryption key is randomly generated and stored in the IO_CACHE.
So when the tempfiles are encrypted DO NOT use lseek to change end_of_file.
Further observations about updating end_of_file using lseek
1) The end_of_file update is only used for binlog index files
2) The whole point is to update file length when the file was modified via a different file descriptor.
3) The temporary IO_CACHE files can never be modified via a different file descriptor.
4) For encrypted temporary IO_CACHE, end_of_file should not be updated with lseek
The rw_lock_stats were incorrectly updated.
While global statistics have limited usefulness, we cannot
remove them from a GA version. This contribution is slightly
improving performance in write workloads.
If the InnoDB buffer pool contains many pages for a table or index
that is being dropped or rebuilt, and if many of such pages are
pointed to by the adaptive hash index, dropping the adaptive hash index
may consume a lot of time.
The time-consuming operation of dropping the adaptive hash index entries
is being executed while the InnoDB data dictionary cache dict_sys is
exclusively locked.
It is not actually necessary to drop all adaptive hash index entries
at the time a table or index is being dropped or rebuilt. We can let
the LRU replacement policy of the buffer pool take care of this gradually.
For this to work, we must detach the dict_table_t and dict_index_t
objects from the main dict_sys cache, and once the last
adaptive hash index entry for the detached table is removed
(when the garbage page is evicted from the buffer pool) we can free
the dict_table_t and dict_index_t object.
Related to this, in MDEV-16283, we made ALTER TABLE...DISCARD TABLESPACE
skip both the buffer pool eviction and the drop of the adaptive hash index.
We shifted the burden to ALTER TABLE...IMPORT TABLESPACE or DROP TABLE.
We can remove the eviction from DROP TABLE. We must retain the eviction
in the ALTER TABLE...IMPORT TABLESPACE code path, so that in case the
discarded table is being re-imported with the same tablespace identifier,
the fresh data from the imported tablespace will replace any stale pages
in the buffer pool.
rpl.rpl_failed_drop_tbl_binlog: Remove the test. DROP TABLE can
no longer be interrupted inside InnoDB.
fseg_free_page(), fseg_free_step(), fseg_free_step_not_header(),
fseg_free_page_low(), fseg_free_extent(): Remove the parameter
that specifies whether the adaptive hash index should be dropped.
btr_search_lazy_free(): Lazily free an index when the last
reference to it is dropped from the adaptive hash index.
buf_pool_clear_hash_index(): Declare static, and move to the
same compilation unit with the bulk of the adaptive hash index
code.
dict_index_t::clone(), dict_index_t::clone_if_needed():
Clone an index that is being rebuilt while adaptive hash index
entries exist. The original index will be inserted into
dict_table_t::freed_indexes and dict_index_t::set_freed()
will be called.
dict_index_t::set_freed(), dict_index_t::freed(): Note that
or check whether the index has been freed. We will use the
impossible page number 1 to denote this condition.
dict_index_t::n_ahi_pages(): Replaces btr_search_info_get_ref_count().
dict_index_t::detach_columns(): Move the assignment n_fields=0
to ha_innobase_inplace_ctx::clear_added_indexes().
We must have access to the columns when freeing the
adaptive hash index. Note: dict_table_t::v_cols[] will remain
valid. If virtual columns are dropped or added, the table
definition will be reloaded in ha_innobase::commit_inplace_alter_table().
buf_page_mtr_lock(): Drop a stale adaptive hash index if needed.
We will also reduce the number of btr_get_search_latch() calls
and enclose some more code inside #ifdef BTR_CUR_HASH_ADAPT
in order to benefit cmake -DWITH_INNODB_AHI=OFF.
When neither MSAN nor Valgrind are enabled, declare
Field::mark_unused_memory_as_defined() as an empty inline function,
instead of declaring it as a virtual function.
Same array instance in two Item_func_in instances. First Item_func_in
instance is freed on table close. Second one is freed on
cleanup_after_query().
get_copy() depends on copy ctor for copying an item and hence does
shallow copy for default copy ctor. Use build_clone() for deep copy of
Item_func_in.
MDEV-22073 MSAN use-of-uninitialized-value in
collect_statistics_for_table()
Other things:
innodb.analyze_table was changed to mainly test statistic
collection. Was discussed with Marko.
The issue here was that when the schema was changed the value for the THD::server_status
is ored with SERVER_SESSION_STATE_CHANGED.
For custom aggregate functions, currently we check if the server_status is equal to
SERVER_STATUS_LAST_ROW_SENT then we should terminate the execution of the custom
aggregate function as there are no more rows to fetch.
So the check should be that if the server status has the bit set for
SERVER_STATUS_LAST_ROW_SENT then we should terminate the execution of the
custom aggregate function.
Problem was that trx->lock.was_chosen_as_wsrep_victim variable was
not set back to false after it was set true.
wsrep_thd_bf_abort
Add assertions for correct mutex status and take necessary
mutexes before calling thd->awake_no_mutex().
innobase_rollback_trx()
Reset trx->lock.was_chosen_as_wsrep_victim
wsrep_abort_slave_trx()
Removed unused function.
wsrep_innobase_kill_one_trx()
Added function comment, removed unnecessary parameters
and added debug assertions to enforce correct usage. Added
more debug output to help out on error analysis.
wsrep_abort_transaction()
Added debug assertions and removed unused variables.
trx0trx.h
Removed assert_trx_is_free macro and replaced it with
assert_freed() member function.
trx_create()
Use above assert_free() and initialize wsrep variables.
trx_free()
Use assert_free()
trx_t::commit_in_memory()
Reset lock.was_chosen_as_wsrep_victim
trx_rollback_for_mysql()
Reset trx->lock.was_chosen_as_wsrep_victim
Add test case galera_bf_kill
For the case when the optimizer does the IN-EXISTS transformation,
the equality condition is injected in the WHERE OR HAVING clause of
the subquery. If the select list of the subquery has a reference to
the parent select make sure to use the reference and not the original
item.
On a checksum failure of a ROW_FORMAT=COMPRESSED page,
buf_LRU_free_one_page() would invoke buf_LRU_block_remove_hashed()
which will read the uncompressed page frame, although it would not
be initialized. With bad enough luck, fil_page_get_type(page)
could return an unrecognized value and cause the server to abort.
buf_page_io_complete(): On the corruption of a ROW_FORMAT=COMPRESSED
page, zerofill the uncompressed page frame.
When server is compiled with recent VS2019, then executables,
have dependency on vcruntime140_1.dll
While we include the VC redistributable merge modules into our MSI package,
those merge modules were stale (taken from older VS version, 2017)
Since VS2019 brough new DLL dependency by introducing new exception handling
https://devblogs.microsoft.com/cppblog/making-cpp-exception-handling-smaller-x64
thus the old MSMs were not enough.
The fix is to change logic in win/packaging/CMakeLists.txt to look up for
the correct, new MSMs.
The bug only affects 10.4,as we compile with static CRT before 10.4,
and partly-statically(just vcruntime stub is statically linked, but not UCRT)
after 10.4
For the fix to work, it required also some changes on the build machine
(vs_installer, modify VS2019 installation, add Individual Component
"C++ 2019 Redistributable MSMs")
This essentially reverts commit b393e2cb0c.
The leak might have been fixed, but because the
DEBUG_SYNC instrumentation for InnoDB purge threads was reverted
in 10.5 commit 5e62b6a5e0
as part of introducing a thread pool, it is easiest to revert
the entire change.
- There are multiple inconsistency and incorrect way in which rw-lock
stats are calculated.
- shared rw-lock stats:
"rounds" counter is incremented only once for N rounds done
in spin-cycle.
- all rw-lock stats:
If the spin-cycle is short-circuited then attempts are re-counted.
[If spin-cycle is interrupted, before it completes
srv_n_spin_wait_rounds (default 30) rounds, spin_count is incremented
to consider this. If thread resumes spin-cycle (due to unavailability
of the locks) and is again interrupted or completed, spin_count
is again incremented with the total count, failing to adjust the
previous attempt increment].
- s/x rw-lock stats:
spin_loop counter is not incremented at-all instead it is projected
as 0 (in show engine output) and division to calculate spin-round per
spin-loop is adjusted.
As per the original semantics spin_loop counter should be incremented
once per spin_loop execution.
- sx rw-lock stats:
sx locks increments spin_loop counter but instead of incrementing it
once for a spin_loop invocation it does it multiple times based on how
many time spin_loop flow is repeated for same instance post os-wait.
The DECIMAL data type branch in Item_func_int_val::fix_length_and_dec()
incorrectly used DOUBLE-style length calculation, which resulted in
a smaller data type than the actual result of FLOOR()/CEIL() needs.
Type_handler_xxx::Item_const_eq() can handle only non-NULL values.
The code in Item_basic_value::eq() did not take this into account.
Adding a test to detect three different combinations:
- Both values are NULLs, return true.
- Only one value is NULL, return false.
- Both values are not NULL, call Type_handler::Item_const_eq()
to check equality.
innobase_get_charset(), innobase_get_stmt_safe(): Remove.
It is more efficient and readable to invoke thd_charset()
and thd_query_safe() directly, without a non-inlined wrapper function.
The function thd_query_safe() is used in the implementation of the
following INFORMATION_SCHEMA views:
information_schema.innodb_trx
information_schema.innodb_locks
information_schema.innodb_lock_waits
information_schema.rocksdb_trx
The implementation of the InnoDB views is in trx_i_s_common_fill_table().
This function invokes trx_i_s_possibly_fetch_data_into_cache(),
which will acquire lock_sys->mutex and trx_sys->mutex in order to
protect the set of active transactions and explicit locks.
While holding those mutexes, it will traverse the collection of
InnoDB transactions. For each transaction, thd_query_safe() will be
invoked.
When called via trx_i_s_common_fill_table(), thd_query_safe()
is acquiring THD::LOCK_thd_data while holding the InnoDB locks.
This will cause a deadlock with THD::awake() (such as executing
KILL QUERY), because THD::awake() could invoke lock_trx_handle_wait(),
which attempts to acquire lock_sys->mutex while already holding
THD::lock_thd_data.
thd_query_safe(): Invoke mysql_mutex_trylock() instead of
mysql_mutex_lock(). Return the empty string if the mutex
cannot be acquired without waiting.