Introduce the -DFAST_BUILD parameter for a slightly faster build or test run.
If set,
- do not compile with /d2OptimizeHugeFunctions; that flag makes compilation
of bison output much slower on optimized builds
- do not use runtime checks on debug builds (RTC1). These slow down tests
considerably
We will disable some optimizations, because the function
row_ins_clust_index_entry_low() would fail to compile
ever since commit a73eedbf3f
changed the definition of srw_mutex::wr_unlock() to use
fetch_sub() instead of fetch_and().
For some reason, applying this work-around does not fix the
"could not split insn" error for that commit,
while it does work for
commit 277ba134ad.
Typically, index_lock and fil_space_t::latch will be held for a longer
time than the spin loop in latch acquisition would be waiting for.
Let us avoid spin loops for those as well as dict_sys.latch, which
could be held in exclusive mode for a longer time (while loading
metadata into the buffer pool and the dictionary cache).
Performance testing on a dual Intel Xeon E5-2630 v4 (2 NUMA nodes)
suggests that the buffer pool page latch (block_lock) benefits from a
spin loop in both read-only and read-write workloads where the working
set is slightly larger than the buffer pool. Presumably, most contention
would occur on leaf page latches. Contention on upper level pages in
the buffer pool should intuitively last longer.
We introduce srw_spin_lock and srw_spin_mutex to allow users of
srw_lock or srw_mutex to opt in for the spin loop.
On Microsoft Windows, a spin loop variant was not and will not be available;
srw_mutex and srw_lock will simply wrap SRWLOCK.
That is, on Microsoft Windows, the parameters innodb_sync_spin_loops
and innodb_spin_wait_delay will only affect block_lock.
Invoking ut_delay(srv_spin_wait_delay) inside a spin loop would
cause a read of 2 global variables as well as multiplication.
Let us loop around MY_RELAX_CPU() using a precomputed loop count
to keep the loops simpler, to help them scale better.
We also tried precomputing the delay into a global variable,
but that appeared to result in slightly worse throughput.
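A minimal sketch of the idea, not the actual InnoDB code: the delay is computed once into a local variable, and the inner loop only counts down and relaxes the CPU. The names relax_cpu(), pause_delay() and spin_until() are hypothetical, and relax_cpu() is only a compiler barrier standing in for MY_RELAX_CPU().

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MY_RELAX_CPU(): only a compiler barrier here.
// A real implementation would issue a CPU pause or yield instruction.
inline void relax_cpu() { std::atomic_signal_fence(std::memory_order_seq_cst); }

// Hypothetical helper: combine the two settings once, before the loop.
inline uint32_t pause_delay(uint32_t spin_wait_delay, uint32_t multiplier)
{
  return spin_wait_delay * multiplier;
}

// Poll `ready` up to `rounds` times, pausing `delay` relax steps in between.
inline bool spin_until(const std::atomic<bool> &ready, uint32_t rounds,
                       uint32_t delay)
{
  for (uint32_t i = 0; i < rounds; i++)
  {
    if (ready.load(std::memory_order_acquire))
      return true;
    // The inner loop touches only a local counter, no global variables.
    for (uint32_t j = delay; j--; )
      relax_cpu();
  }
  return false;
}

// Usage: the delay is computed once into a local and reused in every round.
inline bool spin_example()
{
  std::atomic<bool> ready{true};
  const uint32_t delay = pause_delay(4, 10);
  assert(delay == 40);
  return spin_until(ready, 30, delay);
}
```

The multiplication happens exactly once per lock acquisition attempt, so the hot loop stays free of loads from shared configuration variables.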
srw_mutex::wait_and_lock(): In the spin loop, we will try to poll
for non-conflicting lock word state by reads, avoiding any writes.
We invoke explicit std::atomic_thread_fence(std::memory_order_acquire)
before returning. The individual operations on the lock word
can use memory_order_relaxed.
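The polling pattern can be sketched as follows. This is a simplified illustration under stated assumptions, not the srw_mutex implementation: the class spin_mutex_sketch and its one-bit lock word are hypothetical, and the real lock word carries more state.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical minimal lock: 0 means free, 1 means held.
class spin_mutex_sketch
{
  std::atomic<uint32_t> word{0};
public:
  // Try to acquire within `rounds` polls; return whether we succeeded.
  bool spin_lock(unsigned rounds)
  {
    while (rounds--)
    {
      // Poll with a plain read first. As long as the lock is held,
      // we issue no writes to the shared cache line.
      if (word.load(std::memory_order_relaxed) != 0)
        continue;
      uint32_t expected = 0;
      if (word.compare_exchange_strong(expected, 1,
                                       std::memory_order_relaxed,
                                       std::memory_order_relaxed))
      {
        // A single explicit fence before returning orders the critical
        // section after the relaxed acquisition above.
        std::atomic_thread_fence(std::memory_order_acquire);
        return true;
      }
    }
    return false;
  }
  void unlock() { word.store(0, std::memory_order_release); }
  bool is_locked() const { return word.load(std::memory_order_relaxed) != 0; }
};

// Usage: the second attempt only reads, because the lock is already held.
inline bool spin_mutex_example()
{
  spin_mutex_sketch m;
  const bool first = m.spin_lock(1);
  const bool second = m.spin_lock(5);
  m.unlock();
  assert(!m.is_locked());
  return first && !second;
}
```

The acquire fence pairs with the release store in unlock(), which is why the individual operations on the lock word may be relaxed.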
srw_mutex::lock: Document that the value for a single writer is
HOLDER+1 instead of HOLDER.
srw_mutex::wr_lock_try(), srw_mutex::wr_unlock(): Adjust the value
of the lock word of a single writer from HOLDER to HOLDER+1.
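One way to read the HOLDER+1 convention is sketched below. The layout is an assumption made for illustration: the most significant bit flags a holder, and the low bits count the threads that hold or wait for the lock. The class lock_word_sketch is hypothetical and omits the waiting and wake-up logic.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed layout: top bit = "someone holds the lock",
// low bits = number of threads holding or waiting.
constexpr uint32_t HOLDER = 1U << 31;

class lock_word_sketch
{
  std::atomic<uint32_t> lock{0};
public:
  bool wr_lock_try()
  {
    uint32_t expected = 0;
    // A single writer without waiters is HOLDER + 1: flag plus a count of 1.
    return lock.compare_exchange_strong(expected, HOLDER + 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
  }
  // Returns true if other threads remain counted and would need a wake-up.
  bool wr_unlock()
  {
    const uint32_t old = lock.fetch_sub(HOLDER + 1, std::memory_order_release);
    assert(old & HOLDER);  // we must have been the holder
    return old != HOLDER + 1;
  }
  uint32_t value() const { return lock.load(std::memory_order_relaxed); }
};
```

With this encoding, releasing the lock is one fetch_sub() that clears the flag and the holder's own count together.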
buf_read_page_background(): Remove the parameter "bool sync"
and always actually initiate a page read in the background.
buf_load(): Always submit asynchronous reads. This allows
page checksums to be verified in concurrent threads as
soon as the reads are completed.
The server crashed when the SPIDER_DIRECT_SQL UDF was called with
a nonexistent temporary table.
The bug was introduced by 91ffdc8. That commit removed
the check from THD::open_temporary_table() which ensured that
the target temporary tables exist.
We can fix the bug by adding the check before the call of
THD::open_temporary_table().
if the Ptr="abc", then str_length=3, and for a C ptr it needs Ptr[3]=0;
but it passes str_length+1 (=4) to realloc, and realloc allocates
arg_length+1 bytes (that is 5) and does Ptr[arg_length]= 0; (Ptr[4]=0)
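The arithmetic above can be checked at compile time. The two helper functions are hypothetical; they only model a realloc routine that already reserves one extra byte for the terminator, as described.

```cpp
#include <cstddef>

// Hypothetical model of the realloc helper: it allocates arg_length + 1 bytes
constexpr std::size_t bytes_allocated(std::size_t arg_length)
{ return arg_length + 1; }
// and writes the terminating zero at index arg_length.
constexpr std::size_t terminator_index(std::size_t arg_length)
{ return arg_length; }

constexpr std::size_t str_length = 3;  // length of "abc"

// Passing str_length would be enough: 4 bytes, terminator at Ptr[3].
static_assert(bytes_allocated(str_length) == 4, "");
static_assert(terminator_index(str_length) == 3, "");
// Passing str_length + 1 allocates 5 bytes and puts the zero at Ptr[4].
static_assert(bytes_allocated(str_length + 1) == 5, "");
static_assert(terminator_index(str_length + 1) == 4, "");
```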
dict_table_close(): Fix a race condition around dict_stats_deinit().
This was not observed; it should have been caught by an assertion.
dict_stats_deinit(): Slightly simplify the code.
ha_innobase::info_low(): If the table is unreadable,
initialize some dummy statistics.
This is a side effect of the introduction of my_large_malloc() in MDEV-18851.
It removed a cast to size_t of the variable 'blocks' in the
multiplication blocks * keycache->key_cache_block_size, creating a ulong value
instead of the correct size_t.
Replaced a couple of ulongs with the appropriate data type, which is size_t.
Also, fixed casts to ulong in crash handler messages, so that people would
not be confused by that, too.
Interestingly, Aria did not expose the same problem even though it contains
copied and pasted code in ma_pagecache, because Aria had some ulongs removed
when fixing a similar problem in MDEV-9256.
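A small illustration of why the missing cast matters on platforms where unsigned long is 32 bits wide, such as 64-bit Windows. The function names are hypothetical; uint32_t stands in for the 32-bit ulong and uint64_t for a 64-bit size_t, and unsigned int is assumed to be 32 bits.

```cpp
#include <cassert>
#include <cstdint>

inline uint64_t cache_bytes_narrow(uint32_t blocks, uint32_t block_size)
{
  // Multiplied in 32 bits and only then widened: the product may wrap.
  return blocks * block_size;
}

inline uint64_t cache_bytes_wide(uint32_t blocks, uint32_t block_size)
{
  // Widened before multiplying: the full product is kept.
  return static_cast<uint64_t>(blocks) * block_size;
}

// 2^20 blocks of 2^14 bytes are 2^34 bytes, which does not fit in 32 bits.
inline void cache_size_example()
{
  assert(cache_bytes_narrow(1U << 20, 1U << 14) == 0);
  assert(cache_bytes_wide(1U << 20, 1U << 14) == (uint64_t{1} << 34));
}
```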
Allow the caller to have current_thd. Also do not store
PSI_CALL_get_thread() in the new THD, it is a thread local storage variable
that can become invalid any time, we do not control the lifetime of the
caller's thread.
A performance regression was observed after
commit 82b7c561b7
because purge tasks would end up waiting more elsewhere,
most notably trx_purge_get_next_rec() and trx_purge_truncate_history().
row_purge_parse_undo_rec(): Prevent the performance regression by
unnecessarily acquiring dict_sys.latch in exclusive mode after the
table lookup.
Other things:
- Don't allocate an IO_CACHE for scanning tables of type BLOCK
(It was never used in this case)
- Fixed a bug in the page cache that caused a hang when trying to read a
nonexistent S3 block.
These options were needed in some cases, like when using minio, which requires
the port option, to be able to connect to the S3 storage.
The symptom was that one could get the error
"Table t1.MAI doesn't exist in s3"
even if the table did exist.
Other things:
- Improved the error message for nonexistent S3 files
This essentially reverts commit 4e89ec6692
and only disables InnoDB persistent statistics for tests where it is
desirable. By design, InnoDB persistent statistics will not be updated
except by ANALYZE TABLE or by STATS_AUTO_RECALC.
The internal transactions that update persistent InnoDB statistics
in background tasks (with innodb_stats_auto_recalc=ON) may cause
nondeterministic query plans or interfere with some tests that deal
with other InnoDB internals, such as the purge of transaction history.
The purpose of dict_table_t::stats_bg_flag was to prevent
race conditions between DDL operations and a background thread
that updates persistent statistics for InnoDB tables.
Now that, as of the parent commit, we acquire a
shared meta-data lock (MDL) on the InnoDB persistent statistics tables
in background tasks that access them, we may easily acquire MDL
on the table for which the statistics are being updated. This will by
design prevent race conditions with any DDL operations on that table,
and the stats_bg_flag may be removed.
dict_stats_process_entry_from_recalc_pool(): Complete rewrite.
During the processing, retain the entry in recalc_pool, so
that dict_stats_recalc_pool_del() will be able to request
deletion of the entry, or delete the entry if its caller is
holding MDL_EXCLUSIVE while we are waiting for MDL.
recalc_pool: In addition to the table ID, store a state for
inter-thread communication, so that dict_stats_recalc_pool_del()
can wait until all processing is finished.
Reviewed by: Thirunarayanan Balathandayuthapani
In commit 1bd681c8b3 (MDEV-25506 part 3)
we introduced a "fake instant timeout" when a transaction would wait
for a table or record lock while holding dict_sys.latch. This prevented
a deadlock of the server but could cause bogus errors for operations
on the InnoDB persistent statistics tables.
A better fix is to ensure that whenever a transaction is being
executed in the InnoDB internal SQL parser (which will for now
require dict_sys.latch to be held), it will already have acquired
all locks that could be required for the execution. So, we will
acquire the following locks upfront, before acquiring dict_sys.latch:
(1) MDL on the affected user table (acquired by the SQL layer)
(2) If applicable (not for RENAME TABLE): InnoDB table lock
(3) If persistent statistics are going to be modified:
(3.a) MDL_SHARED on mysql.innodb_table_stats, mysql.innodb_index_stats
(3.b) exclusive table locks on the statistics tables
(4) Exclusive table locks on the InnoDB data dictionary tables
(not needed in ANALYZE TABLE and the like)
Note: Acquiring exclusive locks on the statistics tables may cause
more locking conflicts between concurrent DDL operations.
Notably, RENAME TABLE will lock the statistics tables
even if no persistent statistics are enabled for the table.
DROP DATABASE will only acquire locks on statistics tables if
persistent statistics are enabled for the tables on which the
SQL layer is invoking ha_innobase::delete_table().
For any "garbage collection" in innodb_drop_database(), a timeout
while acquiring locks on the statistics tables will result in any
statistics not being deleted for any tables that the SQL layer
did not know about.
If innodb_defragment=ON, information may be written to the statistics
tables even for tables for which InnoDB persistent statistics are
disabled. But, DROP TABLE will no longer attempt to delete that
information if persistent statistics are not enabled for the table.
This change should also fix the hangs related to InnoDB persistent
statistics and STATS_AUTO_RECALC (MDEV-15020) as well as
a bug that running ALTER TABLE on the statistics tables
concurrently with running ALTER TABLE on InnoDB tables could
cause trouble.
lock_rec_enqueue_waiting(), lock_table_enqueue_waiting():
Do not issue a fake instant timeout error when the transaction
is holding dict_sys.latch. Instead, assert that the dict_sys.latch
is never being held here.
lock_sys_tables(): A new function to acquire exclusive locks on all
dictionary tables, in case DROP TABLE or similar operation is
being executed. Locking non-hard-coded tables is optional to avoid
a crash in row_merge_drop_temp_indexes(). The SYS_VIRTUAL table was
introduced in MySQL 5.7 and MariaDB Server 10.2. Normally, we require
all these dictionary tables to exist before executing any DDL, but
the function row_merge_drop_temp_indexes() is an exception.
When upgrading from MariaDB Server 10.1 or MySQL 5.6 or earlier,
the table SYS_VIRTUAL would not exist at this point.
ha_innobase::commit_inplace_alter_table(): Invoke
log_write_up_to() while not holding dict_sys.latch.
dict_sys_t::remove(), dict_table_close(): No longer try to
drop index stubs that were left behind by aborted online ADD INDEX.
Such indexes should be dropped from the InnoDB data dictionary by
row_merge_drop_indexes() as part of the failed DDL operation.
Stubs for aborted indexes may only be left behind in the
data dictionary cache.
dict_stats_fetch_from_ps(): Use a normal read-only transaction.
ha_innobase::delete_table(), ha_innobase::truncate(), fts_lock_table():
While waiting for purge to stop using the table,
do not hold dict_sys.latch.
ha_innobase::delete_table(): Implement a work-around for the rollback
of ALTER TABLE...ADD PARTITION. MDL_EXCLUSIVE would not be held if
ALTER TABLE hits lock_wait_timeout while trying to upgrade the MDL
due to a conflicting LOCK TABLES, such as in the first ALTER TABLE
in the test case of Bug#53676 in parts.partition_special_innodb.
Therefore, we must explicitly stop purge, because it would not be
stopped by MDL.
dict_stats_func(), btr_defragment_chunk(): Allocate a THD so that
we can acquire MDL on the InnoDB persistent statistics tables.
mysqltest_embedded: Invoke ha_pre_shutdown() before free_used_memory()
in order to avoid ASAN heap-use-after-free related to acquire_thd().
trx_t::dict_operation_lock_mode: Changed the type to bool.
row_mysql_lock_data_dictionary(), row_mysql_unlock_data_dictionary():
Implemented as macros.
rollback_inplace_alter_table(): Apply an infinite timeout to lock waits.
innodb_thd_increment_pending_ops(): Wrapper for
thd_increment_pending_ops(). Never attempt async operation for
InnoDB background threads, such as the trx_t::commit() in
dict_stats_process_entry_from_recalc_pool().
lock_sys_t::cancel(trx_t*): Make dictionary transactions immune to KILL.
lock_wait(): Make dictionary transactions immune to KILL, and to
lock wait timeout when waiting for locks on dictionary tables.
parts.partition_special_innodb: Use lock_wait_timeout=0 to instantly
get ER_LOCK_WAIT_TIMEOUT.
main.mdl: Filter out MDL on InnoDB persistent statistics tables
Reviewed by: Thirunarayanan Balathandayuthapani
que_eval_sql(): Remove the parameter lock_dict. The only caller
with lock_dict=true was dict_stats_exec_sql(), which will now
explicitly invoke dict_sys.lock() and dict_sys.unlock() by itself.
row_import_cleanup(): Do not unnecessarily lock the dictionary.
Concurrent access to the table during ALTER TABLE...IMPORT TABLESPACE
is prevented by MDL and the fact that there cannot exist any
undo log or change buffer records that would refer to the table
or tablespace.
row_import_for_mysql(): Do not unnecessarily lock the dictionary
while accessing fil_system. Thanks to MDL_EXCLUSIVE that was acquired
by the SQL layer, only one IMPORT may be in effect for the table name.
row_quiesce_set_state(): Do not unnecessarily lock the dictionary.
The dict_table_t::quiesce state is documented to be protected by
all index latches, which we are acquiring.
dict_table_close(): Introduce a simpler variant with fewer parameters.
dict_table_close(): Reduce the number of calls.
We can simply invoke dict_table_t::release() on startup or
in DDL operations, or when the table is inaccessible.
In none of these cases is there a need to invalidate the
InnoDB persistent statistics.
pars_info_t::graph_owns_us: Remove (unused).
pars_info_free(): Define inline.
fts_delete(), trx_t::evict_table(), row_prebuilt_free(),
row_rename_table_for_mysql(): Simplify.
row_mysql_lock_data_dictionary(): Remove some references;
use dict_sys.lock() and dict_sys.unlock() instead.
row_mysql_lock_table(): Remove. Use lock_table_for_trx() instead.
ha_innobase::check_if_supported_inplace_alter(),
row_create_table_for_mysql(): Simply assert dict_sys.sys_tables_exist().
In commit 49e2c8f0a6 and
commit 1bd681c8b3 srv_start()
actually guarantees that the system tables will exist,
or the server is in read-only mode, or startup will fail.
Reviewed by: Thirunarayanan Balathandayuthapani
sym_tab_free_private(): Do not call dict_table_close(), but
simply invoke dict_table_t::release(), which we can do without
locking the whole dictionary cache. (Note: On user tables it
may still be necessary to invoke dict_table_close(), so that
InnoDB persistent statistics will be deinitialized as expected.)
fts_check_corrupt(), row_fts_merge_insert(): Invoke
aux_table->release() to simplify the code. This is never a user table.
fts_que_graph_free(), fts_que_graph_free_check_lock(): Replaced with
que_graph_free().
Reviewed by: Thirunarayanan Balathandayuthapani
In the parent commit, dict_sys.latch could theoretically have been
replaced with a mutex. But, we can do better and merge dict_sys.mutex
into dict_sys.latch. Generally, every occurrence of dict_sys.mutex_lock()
will be replaced with dict_sys.lock().
The PERFORMANCE_SCHEMA instrumentation for dict_sys_mutex
will be removed along with dict_sys.mutex. The dict_sys.latch
will remain instrumented as dict_operation_lock.
Some use of dict_sys.lock() will be replaced with dict_sys.freeze(),
which we will reintroduce for the new shared mode. Most notably,
concurrent table lookups are possible as long as the tables are present
in the dict_sys cache. In particular, this will allow more concurrency
among InnoDB purge workers.
Because dict_sys.mutex will no longer 'throttle' the threads that purge
InnoDB transaction history, a performance degradation may be observed
unless innodb_purge_threads=1.
The table cache eviction policy will become FIFO-like,
similar to what happened to fil_system.LRU
in commit 45ed9dd957.
The name of the list dict_sys.table_LRU will become somewhat misleading;
that list contains tables that may be evicted, even though the
eviction policy no longer is least-recently-used but first-in-first-out.
(Note: Tables can never be evicted as long as locks exist on them or
the tables are in use by some thread.)
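The difference between the two policies can be illustrated with a toy list. The class fifo_eviction_sketch and its member names are hypothetical and much simpler than dict_sys.table_LRU; the point is only that a lookup does not reorder anything.

```cpp
#include <cassert>
#include <deque>

// Hypothetical eviction list: entries are appended when they become
// evictable and are evicted from the front, in insertion order.
class fifo_eviction_sketch
{
  std::deque<unsigned> evictable;
public:
  void make_evictable(unsigned id)
  {
    assert(id != 0);  // 0 is reserved for "nothing to evict"
    evictable.push_back(id);
  }
  // A lookup deliberately does nothing to the list. An LRU policy would
  // have to move the entry here, which requires exclusive access.
  void touch(unsigned) {}
  // Evict the oldest entry; return its ID, or 0 if the list is empty.
  unsigned evict_one()
  {
    if (evictable.empty())
      return 0;
    const unsigned id = evictable.front();
    evictable.pop_front();
    return id;
  }
};
```

After adding 1, 2, 3 and touching 1, this list still evicts 1 first, whereas a least-recently-used list would evict 2.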
As demonstrated by the test perfschema.sxlock_func, there
will be less contention on dict_sys.latch, because some previous
use of exclusive latches will be replaced with shared latches.
fts_parse_sql_no_dict_lock(): Replaced with pars_sql().
fts_get_table_name_prefix(): Merged to fts_optimize_create().
dict_stats_update_transient_for_index(): Deduplicated some code.
ha_innobase::info_low(), dict_stats_stop_bg(): Use a combination
of dict_sys.latch and table->stats_mutex_lock() to cover the
changes of BG_STAT_SHOULD_QUIT, because the flag is being read
in dict_stats_update_persistent() while not holding dict_sys.latch.
row_discard_tablespace_for_mysql(): Protect stats_bg_flag by
exclusive dict_sys.latch, like most other code does.
row_quiesce_table_has_fts_index(): Remove unnecessary mutex
acquisition. FLUSH TABLES...FOR EXPORT is protected by MDL.
row_import::set_root_by_heuristic(): Remove unnecessary mutex
acquisition. ALTER TABLE...IMPORT TABLESPACE is protected by MDL.
row_ins_sec_index_entry_low(): Replace a call
to dict_set_corrupted_index_cache_only(). Reads of index->type
were not really protected by dict_sys.mutex, and writes
(flagging an index corrupted) should be extremely rare.
dict_stats_process_entry_from_defrag_pool(): Only freeze the dictionary,
do not lock it exclusively.
dict_stats_wait_bg_to_stop_using_table(), DICT_BG_YIELD: Remove trx.
We can simply invoke dict_sys.unlock() and dict_sys.lock() directly.
dict_acquire_mdl_shared()<trylock=false>: Assert that dict_sys.latch is
only held in shared mode, not exclusive mode. Only acquire it in
exclusive mode if the table needs to be loaded to the cache.
dict_sys_t::acquire(): Remove. Relocating elements in dict_sys.table_LRU
would require holding an exclusive latch, which we want to avoid
for performance reasons.
dict_sys_t::allow_eviction(): Add the table first to dict_sys.table_LRU,
to compensate for the removal of dict_sys_t::acquire(). This function
is only invoked by INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS.
dict_table_open_on_id(), dict_table_open_on_name(): If dict_locked=false,
try to acquire dict_sys.latch in shared mode. Only acquire the latch in
exclusive mode if the table is not found in the cache.
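The lookup pattern can be sketched with standard C++17 primitives. This is an illustration under assumptions, not the InnoDB code: std::shared_mutex stands in for dict_sys.latch, and dict_cache_sketch, open_on_id() and load_from_storage() are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

class dict_cache_sketch
{
  std::shared_mutex latch;
  std::unordered_map<unsigned, std::string> tables;
  unsigned loads = 0;

  // Stand-in for loading table metadata from persistent storage.
  static std::string load_from_storage(unsigned id)
  { return "table#" + std::to_string(id); }
public:
  std::string open_on_id(unsigned id)
  {
    {
      // Fast path: concurrent lookups only need the latch in shared mode.
      std::shared_lock<std::shared_mutex> freeze(latch);
      auto it = tables.find(id);
      if (it != tables.end())
        return it->second;
    }
    // Slow path: the cache will be modified, so latch it exclusively.
    std::unique_lock<std::shared_mutex> lock(latch);
    // Look again: another thread may have loaded the table in between.
    auto it = tables.find(id);
    if (it == tables.end())
    {
      it = tables.emplace(id, load_from_storage(id)).first;
      loads++;
    }
    assert(it != tables.end());
    return it->second;
  }
  unsigned load_count()
  {
    std::shared_lock<std::shared_mutex> freeze(latch);
    return loads;
  }
};
```

The second lookup under the exclusive latch is needed because the shared latch is released before the exclusive one is acquired.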
Reviewed by: Thirunarayanan Balathandayuthapani
This will essentially make dict_sys.latch a mutex
(it is only acquired in exclusive mode).
The subsequent commit will merge dict_sys.mutex into dict_sys.latch
and reintroduce dict_sys.freeze() for those cases where we currently
acquire only dict_sys.latch but not dict_sys.mutex. The case where
both are acquired will be mapped to dict_sys.lock().
i_s_sys_tables_fill_table_stats(): Invoke dict_sys.prevent_eviction()
and the new function dict_sys.allow_eviction() to avoid table eviction
while a row in INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS is being
produced.
Reviewed by: Thirunarayanan Balathandayuthapani
row_purge_remove_clust_if_poss_low(): When dict_table_open_on_id()
is being invoked with the data dictionary locked, it will not
actually acquire MDL. Remove the MDL that became dead code in
commit c366845a0b.
THD::copy_db_to(): Always return true if the output parameter
was left uninitialized. This fixes a regression that was caused
by commit 7d0d934ca6 (MDEV-16473).
MariaDB Server 10.3 and later were unaffected by this bug
thanks to commit a7e352b54d.
Possibly this bug only affects mysql_list_fields()
in the Embedded Server (libmysqld).
This bug was found by GCC 11.2.0 in CMAKE_BUILD_TYPE=RelWithDebInfo.
init_mutex_v1_t: Stop lying that the mutex parameter is const.
GCC 11.2.0 assumes that it is and could complain about any mysql_mutex_t
being uninitialized even after mysql_mutex_init() as long as
PLUGIN_PERFSCHEMA is enabled.
init_rwlock_v1_t, init_cond_v1_t: Remove untruthful const qualifiers.
Note: init_socket_v1_t is expecting that the socket fd has already
been created before PSI_SOCKET_CALL(init_socket), and therefore that
parameter really is being treated as a pointer to const.
If all options from a combination in the combinations file are already
present in the server's list of options, then do not try to run tests
in other combinations from this file.
The old behavior was: if at least one option from a combination is
already present in the list...
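The change of rule amounts to replacing "any" with "all". The test runner itself is not written in C++; the following is only a sketch of the two predicates, with hypothetical type and function names.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

typedef std::vector<std::string> combination;
typedef std::set<std::string> option_set;

// New rule: the combination counts as already selected only if
// every one of its options is in the server's option list.
inline bool all_present(const combination &c, const option_set &opts)
{
  return std::all_of(c.begin(), c.end(), [&opts](const std::string &o)
                     { return opts.count(o) != 0; });
}

// Old rule: a single matching option was enough.
inline bool any_present(const combination &c, const option_set &opts)
{
  return std::any_of(c.begin(), c.end(), [&opts](const std::string &o)
                     { return opts.count(o) != 0; });
}

// Usage: only one of the two options is given on the command line.
inline void combination_example()
{
  const option_set opts{"--binlog-format=row"};
  const combination partial{"--binlog-format=row", "--log-bin"};
  assert(any_present(partial, opts));   // old rule: treated as selected
  assert(!all_present(partial, opts));  // new rule: not selected
}
```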
replaced CPACK_RPM_PACKAGE_VERSION with SERVER_VERSION.
CPACK_* variables are empty and can't be used until INCLUDE(CPack) is
called.
SERVER_VERSION is the safest option because other variables may be
overwritten from submodules
Thanks to Theodore Brockman on Zulip for noticing
on an OSX ARM64 and testing this patch.
Per https://github.com/google/cpu_features/pull/150/files
CMAKE_SYSTEM_PROCESSOR is arm64 on Apple.
Without this, there is a compilation error:
[ 80%] Building CXX object storage/rocksdb/CMakeFiles/rocksdblib.dir/rocksdb/util/crc32c.cc.o
/mariadb/storage/rocksdb/rocksdb/util/crc32c.cc:500:18: error: use of undeclared identifier 'isSSE42'
has_fast_crc = isSSE42();
^
/mariadb/storage/rocksdb/rocksdb/util/crc32c.cc:1230:7: error: use of undeclared identifier 'isSSE42'
if (isSSE42()) {
^
/mariadb/storage/rocksdb/rocksdb/util/crc32c.cc:1231:9: error: use of undeclared identifier 'isPCLMULQDQ'
if (isPCLMULQDQ()) {
^
This can be reverted when the RocksDB submodule is updated.
ee4bd4780b