The background drop table queue in InnoDB is a work-around for
cases where the SQL layer is requesting DDL on tables on which
transactional locks exist.
One such case are XA transactions. Our test case exploits the
fact that the recovery of XA PREPARE transactions will
only resurrect InnoDB table locks, but not MDL that should
block any concurrent DDL.
srv_shutdown_t: Introduce the srv_shutdown_state=SRV_SHUTDOWN_INITIATED
for the initial part of shutdown, to wait for the background drop
table queue to be emptied.
srv_shutdown_bg_undo_sources(): Assign
srv_shutdown_state=SRV_SHUTDOWN_INITIATED
before waiting for the background drop table queue to be emptied.
row_drop_tables_for_mysql_in_background(): On slow shutdown, if
no active transactions exist (excluding ones that are in
XA PREPARE state), skip any tables on which locks exist.
row_drop_table_for_mysql(): Do not unnecessarily attempt to
drop InnoDB persistent statistics for tables that have
already been added to the background drop table queue.
row_mysql_close(): Relax an assertion, and free all memory
even if innodb_force_recovery=2 would prevent the background
drop table queue from being emptied.
Introduce a new ATTRIBUTE_NOINLINE to
ib::logger member functions, and add UNIV_UNLIKELY hints to callers.
Also, remove some crash reporting output. If needed, the
information will be available using debugging tools.
Furthermore, remove some fts_enable_diag_print output that included
indexed words in raw form. The code seemed to assume that words are
NUL-terminated byte strings. It is not clear whether a NUL terminator
is always guaranteed to be present. Also, UCS2 or UTF-16 strings would
typically contain many NUL bytes.
Several MYSQL_SYSVAR_STR parameters that employ both a validate
function callback fail to copy the string for saving the
validated value. The affected variables include the following:
innodb_ft_aux_table
innodb_ft_server_stopword_table
innodb_ft_user_stopword_table
innodb_buffer_pool_filename
The test case is an enhanced version of
mysql/mysql-server@0b0c30641f
and the code changes are inspired by their fixes.
We are also importing and adjusting the test innodb_fts.stopword
to get coverage for the variable innodb_ft_user_stopword_table.
buf_dump(), buf_load(): Protect srv_buf_dump_filename with
LOCK_global_system_variables.
fts_load_user_stopword(): Minor cleanup
fts_load_stopword(): Remove the parameter global_stopword_table.
innobase_fts_load_stopword(): Protect innodb_server_stopword_table
against concurrent SET GLOBAL.
Problem:
=======
The problem is that InnoDB doesn't add the table in fts slots if drop table fails. InnoDB marks the table is in fts slots while processing sync message. So the consecutive alter statement assumes that table is in queue and tries to remove it. But InnoDB can't find the table in fts_slots.
Solution:
=========
i) Removal of in_queue in fts_t while processing the fts sync message.
ii) Add the table to fts_slots when drop table fails.
offset_t: this is a type which represents one record offset.
It's unsigned short int.
a lot of functions: replace ulint with offset_t
btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
allocate record offsets on the stack instead of waiting for rec_get_offsets()
to allocate it from mem_heap_t. So, reducing memory allocations.
RECORD_OFFSET, INDEX_OFFSET:
now it's less convenient to store pointers in offset_t*
array. One pointer occupies now several offset_t. And those constant are start
indexes into array to places where to store pointer values
REC_OFFS_HEADER_SIZE: adjusted for the new reality
REC_OFFS_NORMAL_SIZE:
increase size from 100 to 300 which means less heap allocations.
And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which
is smaller than previous 800 bytes.
REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality
rem0rec.h, rem0rec.ic, rem0rec.cc:
various arguments, return values and local variables types were changed to
fix numerous integer conversions issues.
enum field_type_t:
offset types concept was introduces which replaces old offset flags stuff.
Like in earlier version, 2 upper bits are used to store offset type.
And this enum represents those types.
REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed
get_type(), set_type(), get_value(), combine():
these are convenience functions to work with offsets and it's types
rec_offs_base()[0]:
still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL
rec_offs_base()[i]:
these have type offset_t now. Two upper bits contains type.
- fts_optimize_thread() uses dict_table_t object instead of table id.
So that it doesn't acquire dict_sys->mutex. It leads to remove the
hang of dict_sys->mutex between fts_optimize_thread() and other threads.
- in_queue to indicate whether the table is in fts_optimize_queue. It
is protected by fts_optimize_wq->mutex to avoid any race condition.
- fts_optimize_init() adds the fts table to the fts_optimize_wq
InnoDB stores synced_doc_id + 1 value in FTS_CONFIG table. But
while reading the synced doc id from FTS_CONFIG table after restart,
InnoDB should read synced_doc_id - 1 to get the actual synced
doc id value.
Problem:
=======
During dropping of fts index, InnoDB waits for fts_optimize_remove_table()
and it holds dict_sys->mutex and dict_operaiton_lock even though the
table id is not present in the queue. But fts_optimize_thread does wait
for dict_sys->mutex to process the unrelated table id from the slot.
Solution:
========
Whenever table is added to fts_optimize_wq, update the fts_status
of in-memory fts subsystem to TABLE_IN_QUEUE. Whenever drop index
wants to remove table from the queue, it can check the fts_status
to decide whether it should send the MSG_DELETE_TABLE to the queue.
Removed the following functions because these are all deadcode.
dict_table_wait_for_bg_threads_to_exit(),
fts_wait_for_background_thread_to_start(),fts_start_shutdown(), fts_shudown().
fts_sync(): Remove the constant parameter has_dict=false.
fts_sync_table(): Remove the constant parameter has_dict=false,
and the redundant parameter unlock_cache = !wait.
Make wait=true the default parameter.
The FTS optimizer thread made a false assumption that time(NULL)
is monotonic. The system clock can be adjusted to the past,
for example if the hardware clock was drifting to the future,
and it was adjusted by NTP.
fts_slot_t::interval_time: Replace with the constant
FTS_OPTIMIZE_INTERVAL_IN_SECS.
fts_slot_t::last_run, fts_slot_t::completed: Clarify the
documentation.
fts_optimize_get_time_limit(): Remove a type cast, and
add a FIXME comment about domain mismatch.
fts_optimize_compact(), fts_optimize_words(): Limit the time
also when the current time has been moved to the past.
fts_optimize_table_bk(): Check for wrap-around.
fts_optimize_how_many(): Check for wrap-around, and remove the
failing assertions.
fts_is_sync_needed(): Remove a redundant call to time(NULL).
Try to fix the race conditions between
SET GLOBAL innodb_ft_aux_table = ...;
and access to the INFORMATION_SCHEMA tables that depend on
this variable.
innodb_ft_aux_table: Replaces
fts_internal_tbl_name,fts_internal_tbl_name2. Just store the
user-specified parameter as is.
innodb_ft_aux_table_id: The table_id corresponding to
SET GLOBAL innodb_ft_aux_table, or 0 if the table does not exist
or does not contain FULLTEXT INDEX. If the table is renamed later,
the INFORMATION_SCHEMA tables will continue to refer to the table.
If the table is dropped or rebuilt, the INFORMATION_SCHEMA tables
will not find the table.
With SET GLOBAL innodb_optimize_fulltext_only=1
in effect, OPTIMIZE TABLE would output words from the fulltext index
to the server error log, even in non-debug builds.
fts_optimize_words(): Remove the unwanted output.
fts_get_table_name(): Output to a caller-allocated buffer.
fts_get_table_name_prefix(): Use the lower-overhead allocation
ut_malloc() instead of mem_alloc().
This is based on mysql/mysql-server@d1584b9f38
in MySQL 5.7.4.
fts_table_t::parent: Remove the redundant field. Refer to
table->name.m_name instead.
fts_update_sync_doc_id(), fts_update_next_doc_id(): Remove
the redundant parameter table_name.
fts_get_table_name_prefix(): Access the dict_table_t::name.
FIXME: Ensure that this access is always covered by
dict_sys->mutex.
fts_state_t, fts_slot_t::state: Remove. Replaced by fts_slot_t::running
and fts_slot_t::table_id as follows.
FTS_STATE_SUSPENDED: Removed (unused).
FTS_STATE_EMPTY: Removed. table_id=0 will denote empty slots.
FTS_STATE_RUNNING: Equivalent to running=true.
FTS_STATE_LOADED, FTS_STATE_DONE: Equivalent to running=false.
fts_slot_t::table: Remove. Tables will be identified by table_id.
After opening a table, we will check fil_table_accessible() before
accessing the data.
fts_optimize_new_table(), fts_optimize_del_table(),
fts_optimize_how_many(), fts_is_sync_needed():
Remove the parameter tables, and use the static variable fts_slots
(which was introduced in MariaDB 10.2) instead.
The accessor dtuple_get_nth_v_field() was defined differently between
debug and release builds in MySQL 5.7.8 in
mysql/mysql-server@c47e1751b7
and a debug assertion to document or enforce the questionable assumption
tuple->v_fields == &tuple->fields[tuple->n_fields] was missing.
This was apparently no problem until MDEV-11369 introduced instant
ADD COLUMN to MariaDB Server 10.3. With that work present, in one
test case, trx_undo_report_insert_virtual() could in release builds
fetch the wrong value for a virtual column.
We replace many of the dtuple_t accessors with const-preserving
inline functions, and fix missing or misleadingly applied const
qualifiers accordingly.
InnoDB includes 3 parsers, which use 3 lexical analyzers that
are generated with flex. Flex versions before 2.6 emitted
the keyword "register", which is deprecated in C++17.
The lexical analyzers were regenerated as follows:
for s in storage/innobase storage/xtradb
do
(cd "$s"/pars; ./make_flex.sh)
touch "$s"/fts/*.l
make -C "$s"/fts -f Makefile.query
done
I know no test case for this bug in 10.1. So a test case will be
committed separately in 10.2
fts_reset_get_doc(): properly initialize fts_get_doc_t::cache
fts_fetch_index_words(): Restore the initialization len=0.
The test innodb_fts.create in 10.2 would end up in an infinite loop
if this assignment is removed, because a following iteration of the
while() loop would assign zip->zp->avail_in=len with the original value
instead of the 0 that was reset in the previous iteration.