Try to fix the race conditions between
SET GLOBAL innodb_ft_aux_table = ...;
and access to the INFORMATION_SCHEMA tables that depend on
this variable.
innodb_ft_aux_table: Replaces fts_internal_tbl_name and
fts_internal_tbl_name2. Just store the user-specified parameter as is.
innodb_ft_aux_table_id: The table_id corresponding to
SET GLOBAL innodb_ft_aux_table, or 0 if the table does not exist
or does not contain FULLTEXT INDEX. If the table is renamed later,
the INFORMATION_SCHEMA tables will continue to refer to the table.
If the table is dropped or rebuilt, the INFORMATION_SCHEMA tables
will not find the table.
With SET GLOBAL innodb_optimize_fulltext_only=1
in effect, OPTIMIZE TABLE would output words from the fulltext index
to the server error log, even in non-debug builds.
fts_optimize_words(): Remove the unwanted output.
fts_get_table_name(): Output to a caller-allocated buffer.
fts_get_table_name_prefix(): Use the lower-overhead allocation
ut_malloc() instead of mem_alloc().
This is based on mysql/mysql-server@d1584b9f38
in MySQL 5.7.4.
fts_table_t::parent: Remove the redundant field. Refer to
table->name.m_name instead.
fts_update_sync_doc_id(), fts_update_next_doc_id(): Remove
the redundant parameter table_name.
fts_get_table_name_prefix(): Access the dict_table_t::name.
FIXME: Ensure that this access is always covered by
dict_sys->mutex.
fts_state_t, fts_slot_t::state: Remove. Replaced by fts_slot_t::running
and fts_slot_t::table_id as follows.
FTS_STATE_SUSPENDED: Removed (unused).
FTS_STATE_EMPTY: Removed. table_id=0 will denote empty slots.
FTS_STATE_RUNNING: Equivalent to running=true.
FTS_STATE_LOADED, FTS_STATE_DONE: Equivalent to running=false.
fts_slot_t::table: Remove. Tables will be identified by table_id.
After opening a table, we will check fil_table_accessible() before
accessing the data.
fts_optimize_new_table(), fts_optimize_del_table(),
fts_optimize_how_many(), fts_is_sync_needed():
Remove the parameter tables, and use the static variable fts_slots
(which was introduced in MariaDB 10.2) instead.
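The replacement of fts_state_t by fts_slot_t::running and
fts_slot_t::table_id could look roughly like the following minimal sketch.
Only the names running and table_id come from the notes above; slot_t,
add_table() and del_table() are hypothetical stand-ins, not the actual
InnoDB definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a slot; only the two field names are taken
// from the notes above.
struct slot_t {
    std::uint64_t table_id = 0;  // 0 denotes an empty slot
    bool running = false;        // true while the table is being processed

    bool is_empty() const { return table_id == 0; }
};

// Register a table: reuse the first empty slot, or append a new one.
// Returns the index of the slot that was used.
std::size_t add_table(std::vector<slot_t>& slots, std::uint64_t table_id) {
    assert(table_id != 0);
    for (std::size_t i = 0; i < slots.size(); i++) {
        if (slots[i].is_empty()) {
            slots[i].table_id = table_id;
            slots[i].running = false;
            return i;
        }
    }
    slot_t s;
    s.table_id = table_id;
    slots.push_back(s);
    return slots.size() - 1;
}

// Unregister a table by marking its slot empty again.
// Returns false if the table was not registered.
bool del_table(std::vector<slot_t>& slots, std::uint64_t table_id) {
    if (table_id == 0) {
        return false;  // 0 is reserved for empty slots
    }
    for (slot_t& s : slots) {
        if (s.table_id == table_id) {
            s.table_id = 0;
            s.running = false;
            return true;
        }
    }
    return false;
}
```

Because a slot stores only an identifier, nothing in it can become a
dangling pointer when the table is dropped; the table has to be looked up
and checked again before each access.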
The accessor dtuple_get_nth_v_field() was defined differently between
debug and release builds in MySQL 5.7.8 in
mysql/mysql-server@c47e1751b7
and a debug assertion to document or enforce the questionable assumption
tuple->v_fields == &tuple->fields[tuple->n_fields] was missing.
This was apparently no problem until MDEV-11369 introduced instant
ADD COLUMN to MariaDB Server 10.3. With that work present, in one
test case, trx_undo_report_insert_virtual() could in release builds
fetch the wrong value for a virtual column.
We replace many of the dtuple_t accessors with const-preserving
inline functions, and fix missing or misleadingly applied const
qualifiers accordingly.
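To illustrate how two accessor definitions can silently disagree, here is
a simplified sketch. The types tuple_t and field_t and both accessor names
are hypothetical; the notes above do not state which definition was used
in which build. The first accessor pair is const-preserving and reads
through v_fields; the second is only correct when
v_fields == &fields[n_fields] holds.

```cpp
#include <cassert>
#include <cstddef>

struct field_t { int value; };

// Hypothetical, simplified tuple: n_fields regular fields plus
// n_v_fields virtual-column fields reachable through v_fields.
struct tuple_t {
    std::size_t n_fields;
    std::size_t n_v_fields;
    field_t* fields;
    field_t* v_fields;
};

// Const-preserving accessor: a const tuple yields a const field.
inline const field_t* tuple_get_nth_v_field(const tuple_t* t, std::size_t n) {
    assert(n < t->n_v_fields);
    return &t->v_fields[n];
}

// Non-const overload, implemented in terms of the const one.
inline field_t* tuple_get_nth_v_field(tuple_t* t, std::size_t n) {
    return const_cast<field_t*>(
        tuple_get_nth_v_field(const_cast<const tuple_t*>(t), n));
}

// Alternative definition that is only valid under the assumption
// v_fields == &fields[n_fields]; the assertion documents and enforces it.
inline const field_t* tuple_get_nth_v_field_contiguous(const tuple_t* t,
                                                       std::size_t n) {
    assert(t->v_fields == &t->fields[t->n_fields]);
    assert(n < t->n_v_fields);
    return &t->fields[t->n_fields + n];
}
```

With a single accessor definition that is shared by all build types, the
layout assumption is no longer needed at all.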
InnoDB includes 3 parsers, which use 3 lexical analyzers that
are generated with flex. Flex versions before 2.6 emitted
the keyword "register", which is deprecated in C++17.
The lexical analyzers were regenerated as follows:
for s in storage/innobase storage/xtradb
do
  (cd "$s"/pars; ./make_flex.sh)
  touch "$s"/fts/*.l
  make -C "$s"/fts -f Makefile.query
done
I know of no test case for this bug in 10.1, so a test case will be
committed separately in 10.2.
fts_reset_get_doc(): properly initialize fts_get_doc_t::cache
fts_fetch_index_words(): Restore the initialization len=0.
The test innodb_fts.create in 10.2 would end up in an infinite loop
if this assignment is removed, because a following iteration of the
while() loop would assign zip->zp->avail_in=len with the original value
instead of the 0 that was reset in the previous iteration.
Fix the warnings issued by GCC 8 -Wstringop-truncation
and -Wstringop-overflow in InnoDB and XtraDB.
This work is motivated by Jan Lindström. The patch mainly differs
from his original one as follows:
(1) We remove explicit initialization of stack-allocated string buffers.
The minimum amount of initialization that is needed is a terminating
NUL character.
(2) GCC issues a warning for invoking strncpy(dest, src, sizeof dest)
because if strlen(src) >= sizeof dest, there would be no terminating
NUL byte in dest. We avoid this problem by invoking strncpy() with
a limit that is 1 less than the buffer size, and by always writing
NUL to the last byte of the buffer.
(3) We replace strncpy() with memcpy() or strcpy() in those cases
when the result is functionally equivalent.
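The pattern in item (2) can be sketched as follows. The helper name
copy_truncated() is hypothetical and only serves to illustrate the idea;
it is not a function from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy src into a fixed-size buffer, truncating if necessary, and always
// leave the result NUL-terminated.
template <std::size_t N>
void copy_truncated(char (&dest)[N], const char* src) {
    static_assert(N > 0, "destination needs room for the terminating NUL");
    std::strncpy(dest, src, N - 1);  // limit is 1 less than the buffer size
    dest[N - 1] = '\0';              // always write NUL to the last byte
}
```

A short usage example: with char buf[4], copy_truncated(buf, "abcdefg")
leaves "abc" in buf, and copy_truncated(buf, "hi") leaves "hi".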
Note: fts_fetch_index_words() never deals with len==UNIV_SQL_NULL.
This was enforced by an assertion that limits the maximum length
to FTS_MAX_WORD_LEN. Also, the encoding that InnoDB uses for
the compressed fulltext index is not byte-order agnostic, that is,
InnoDB data files that use FULLTEXT INDEX are not portable between
big-endian and little-endian systems.
There was a race condition in the error handling of ALTER TABLE when
the table contains FULLTEXT INDEX.
During the error handling of an erroneous ALTER TABLE statement,
when InnoDB would drop the internally created tables for FULLTEXT INDEX,
it could happen that one of the hidden tables was being concurrently
accessed by a background thread. Because of this, InnoDB would defer
the drop operation to the background.
However, related to MDEV-13564 backup-safe TRUNCATE TABLE and its
prerequisite MDEV-14585, we had to make the background drop table queue
crash-safe by renaming the table to a temporary name before enqueueing it.
This renaming was introduced in a follow-up of the MDEV-13407 fix.
As part of this rename operation, we were unnecessarily parsing the
current SQL statement, because the same rename operation could also be
executed as part of ALTER TABLE via ha_innobase::rename_table().
If an ALTER TABLE statement was being refused due to incorrectly formed
FOREIGN KEY constraint, then it could happen that the renaming of the hidden
internal tables for FULLTEXT INDEX could also fail, triggering a host of
error log messages, and causing a subsequent table-rebuilding ALTER TABLE
operation to fail due to the tablespace already existing.
innobase_rename_table(), row_rename_table_for_mysql(): Add the parameter
use_fk for suppressing the parsing of FOREIGN KEY constraints. It
will only be passed as use_fk=true by ha_innobase::rename_table(),
which can be invoked as part of ALTER TABLE...ALGORITHM=COPY.
The error handling in the MDEV-13564 TRUNCATE TABLE was broken
when an error occurred during table creation.
row_create_index_for_mysql(): Do not drop the table on error.
fts_create_one_common_table(), fts_create_one_index_table():
Do drop the table on error.
create_index(), create_table_info_t::create_table():
Let the caller handle the index creation errors.
ha_innobase::create(): If create_table_info_t::create_table()
fails, drop the incomplete table, roll back the transaction,
and finally return an error to the caller.
The relevant InnoDB/XtraDB fixes up to 5.6.42 had already
been applied to MariaDB in commit 30c3d6db32.
Revert some changes that appeared in
the merge commit 87d852f102.
- Backported the MYSQL_SYSVAR_SIZE_T to 10.0
- The parameter innodb_ft_result_cache_limit was only 32 bits wide
also on 64-bit systems. Make it size_t, so that it will be 64 bits
on 64-bit systems.
- Added a test case that shows how the innodb_ft_result_cache_limit variable
behaves on 32-bit and 64-bit systems.
When converting table identifiers to a new format,
some tables can be renamed twice, which subsequently
leads to the appearance of "false" auxiliary tables
belonging to another main (parent) table (which does
not actually have auxiliary tables).
This is because the table number is repeatedly added
to the aux_tables_to_rename vector inside the function
fts_check_and_drop_orphaned_tables.
To correct this error, we must add a check for the
occurrence of the table number in the aux_tables_to_rename
vector before adding a new element.
https://jira.mariadb.org/browse/MDEV-16656
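The check described above amounts to a membership test before the append.
A minimal sketch, assuming that the vector holds plain index numbers; the
helper name enqueue_once() is hypothetical and the real element type in
fts_check_and_drop_orphaned_tables may differ:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Append idx only if it is not already queued for renaming.
// Returns true if the element was added.
bool enqueue_once(std::vector<std::size_t>& to_rename, std::size_t idx) {
    if (std::find(to_rename.begin(), to_rename.end(), idx)
        != to_rename.end()) {
        return false;  // already present: do not rename the table twice
    }
    to_rename.push_back(idx);
    return true;
}
```

The linear search is adequate as long as the vector stays small; a set
would be an alternative for larger inputs.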
Implement undo tablespace truncation via normal redo logging.
Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.
Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.
In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.
ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.
rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.
== TRUNCATE TABLE ==
WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.
In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.
A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.
ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.
ha_innobase::delete_table(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.
ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.
create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.
row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.
row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().
dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.
row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.
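The RENAME, CREATE, DROP sequence of the new TRUNCATE, including the
explicit rename-back on failure, can be sketched on a toy in-memory
dictionary. Everything here (catalog_t, rename_table(), truncate_table(),
the create_ok switch and the temporary name) is hypothetical and only
models the control flow, not transactions or files.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy "data dictionary": table name -> number of rows.
using catalog_t = std::map<std::string, int>;

// Rename a table; fails if the source is missing or the target exists.
bool rename_table(catalog_t& c, const std::string& from,
                  const std::string& to) {
    auto it = c.find(from);
    if (it == c.end() || c.count(to) != 0) {
        return false;
    }
    const int rows = it->second;
    c.erase(it);
    c[to] = rows;
    return true;
}

// TRUNCATE as RENAME to a temporary name, CREATE of an empty table, and
// DROP of the renamed table. create_ok simulates the outcome of CREATE.
bool truncate_table(catalog_t& c, const std::string& name, bool create_ok) {
    const std::string tmp = "#sql-ib-" + name;  // placeholder temporary name
    if (!rename_table(c, name, tmp)) {
        return false;
    }
    if (!create_ok) {
        rename_table(c, tmp, name);  // explicit rename-back on failure
        return false;
    }
    c[name] = 0;   // CREATE: the new table starts out empty
    c.erase(tmp);  // DROP: remove the old table
    return true;
}
```

If the CREATE step fails, the original table is back under its original
name with its contents intact.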
The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.
We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.
== Undo tablespace truncation ==
MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similarly to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.
We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.
recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.
namespace undo: Remove some unnecessary declarations.
fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.
fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.
buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.
fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.
fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.
os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.
fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].
recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.
trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.
trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.
recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.
recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.
buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).
trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
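The trimming step of recv_addr_trim() can be illustrated with a simplified
model of the buffered redo records. The container layout, the names
log_rec_t, page_recs_t and trim(), and the exact boundary comparisons are
assumptions made for this sketch; the real recovery data structures differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct log_rec_t { std::uint64_t lsn; };

// Page number -> redo log records buffered for that page.
using page_recs_t = std::map<std::uint32_t, std::vector<log_rec_t>>;

// For pages at or beyond the new end of the file, discard the records
// that were logged before the truncation LSN. Pages whose record list
// becomes empty are removed. Returns the number of pages removed.
std::size_t trim(page_recs_t& recs, std::uint32_t new_size_pages,
                 std::uint64_t truncate_lsn) {
    std::size_t removed_pages = 0;
    for (auto it = recs.lower_bound(new_size_pages); it != recs.end();) {
        std::vector<log_rec_t>& list = it->second;
        list.erase(std::remove_if(list.begin(), list.end(),
                                  [truncate_lsn](const log_rec_t& r) {
                                      return r.lsn < truncate_lsn;
                                  }),
                   list.end());
        if (list.empty()) {
            it = recs.erase(it);  // nothing left to apply for this page
            ++removed_pages;
        } else {
            ++it;
        }
    }
    return removed_pages;
}
```

Records for pages below the new size are left untouched, as are records
that were written after the truncation.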
fts_query(): Remove a redundant condition (result will never be NULL),
and instead check if *result is NULL, to prevent SIGSEGV in
fts_query_free_result().
The functions fts_ast_visit() and fts_query() inside
InnoDB FULLTEXT INDEX query processing are not checking
for THD::killed (trx_is_interrupted()), like anything
that potentially takes a long time should do.
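The general pattern is to poll a cancellation flag inside every
potentially long loop. A minimal sketch, in which the atomic flag killed
stands in for THD::killed (trx_is_interrupted()) and visit_all() is a
hypothetical stand-in for the traversal:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

enum class status { success, interrupted };

// Visit all nodes, checking the cancellation flag before each one.
// visited reports how many nodes were processed before returning.
status visit_all(const std::vector<int>& nodes,
                 const std::atomic<bool>& killed,
                 std::size_t& visited) {
    visited = 0;
    for (int node : nodes) {
        if (killed.load(std::memory_order_relaxed)) {
            return status::interrupted;  // stop promptly when killed
        }
        (void) node;  // the real per-node work would go here
        ++visited;
    }
    return status::success;
}
```

The check is cheap, so performing it once per iteration bounds the delay
between a kill request and the point where the query actually stops.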
This is a port of the following change from MySQL 5.7.23,
with a completely rewritten test case.
commit c58c6f8f66ddd0357ecd0c99646aa6bf1dae49c8
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date: Fri May 4 15:53:13 2018 +0530
Bug #27155294 MAX_EXECUTION_TIME NOT INTERUPTED WITH FULLTEXT SEARCH USING MECAB
This is a backport of the following fix from MySQL 5.7.23.
Some code refactoring has been omitted, and the test case has
been adapted to MariaDB.
commit 7a689acaa65e9d602575f7aa53fe36a64a07460f
Author: Krzysztof Kapuścik <krzysztof.kapuscik@oracle.com>
Date: Tue Mar 13 12:34:03 2018 +0100
Bug#27082268 Invalid FTS sync synchronization
The fix closes two issues:
Bug #27082268 - INNODB: FAILING ASSERTION: SYM_NODE->TABLE != NULL DURING FTS SYNC
Bug #27095935 - DEADLOCK BETWEEN FTS_DROP_INDEX AND FTS_OPTIMIZE_SYNC_TABLE
Both issues were related to an FTS cache sync being done during
operations that performed DDL actions on internal FTS tables
(ALTER TABLE, TRUNCATE). In some cases, the FTS tables and/or
internal cache structures could get removed while still being
used to perform FTS synchronization, leading to crashes. In other
cases, the sync operation could not finish, because it was waiting for
the dict lock, which was held by a thread waiting for the background
sync to finish.
The changes include:
- Stopping background operations during ALTER TABLE and TRUNCATE.
- Removal of unused code in FTS.
- Cleanup of FTS sync related code to make it more readable and
easier to maintain.
RB#18262
Problem:
As part of the fix for bug #24938374, fts_optimize_thread no longer
takes dict_operation_lock while syncing the FTS cache.
Due to this change, an ALTER query is able to update SYS_TABLES rows
concurrently. When fts_optimize_thread then tries to open the index table,
it fails to open it if the record corresponding to that table is
marked with REC_INFO_DELETED_FLAG in SYS_TABLES, and hits an assertion.
Fix:
If an FTS sync is already in progress, the ALTER query will wait for the
sync to complete before renaming the table.
RB: #19604
Reviewed by : Jimmy.Yang@oracle.com
PROBLEM
-------
Whenever an FTS table is created, it registers itself in a queue that
is processed by a background thread whose job is to optimize the
FTS tables. Additionally, we place these FTS tables in the
non-LRU list so that they cannot be evicted from the cache. But when
a node is brought up that already has FTS tables, we first load the
FTS tables into the dictionary but skip adding them to the background
queue and to the non-LRU list, because the background thread has not
been created yet. So these tables are loaded, but they can be evicted
from the cache. The deadlock scenario is as follows:
1. A server background thread is trying to evict a table from the cache
because the cache is full, so it scans the LRU list for tables that it can
evict. It finds that the FTS table can be evicted (for the reason explained
above), takes dict_sys->mutex (a system-wide mutex), submits a request to
the background thread to remove this table from the queue,
and waits for the request to be completed.
2. In the meantime, fts_optimize_thread() is processing another job
in the queue and needs dict_sys->mutex for a short time,
but it cannot get it because the mutex is held by the first background thread.
So thread 1 is waiting for its job to be completed by thread 2, whereas thread 2
is waiting for dict_sys->mutex held by thread 1, causing the deadlock.
FIX
If creating a secondary index fails (typically, ADD UNIQUE INDEX fails
due to duplicate key), it is possible that concurrently running UPDATE
or DELETE will access the index stub and hit the debug assertion.
It does not make any sense to keep updating an uncommitted index whose
creation has failed.
dict_index_t::is_corrupted(): Replaces dict_index_is_corrupted().
Also take online_status into account.
Replace some calls to dict_index_is_clust() with calls to
dict_index_t::is_primary().
fts_sync(): If the dict_table_t::to_be_dropped flag is set,
do not "goto begin_sync".
Also, clean up how dict_index_t::index_fts_syncing is cleared.
It looks like this regression was introduced by merging
Oracle Bug #24938374 MYSQL CRASHED AFTER LONG WAIT ON DICT OPERATION LOCK
WHILE SYNCING FTS INDEX
068f8261d4
from MySQL 5.6.38 into MariaDB 10.0.33, 10.1.29, 10.2.10.
The same hang is present in MySQL 5.7.20.
fts_cmp_set_sync_doc_id(), fts_load_stopword(): Start the transaction
in read-only mode if innodb_read_only is set.
fts_update_sync_doc_id(), fts_commit_table(), fts_sync(),
fts_optimize_table(): Return DB_READ_ONLY if innodb_read_only is set.
fts_doc_fetch_by_doc_id(), fts_table_fetch_doc_ids():
Remove the code to start an internal transaction or to roll back,
because this is a read-only operation.
If CREATE TABLE...FULLTEXT INDEX was initiated right before shutdown,
then the function fts_load_stopword() could commit modifications
after shutdown was initiated, causing an assertion failure in
the function trx_purge_add_update_undo_to_history().
Mark as internal all the read/write transactions that
modify fulltext indexes, so that they will be ignored by
the assertion that guards against transaction commits
after shutdown has been initiated.
fts_optimize_free(): Invoke trx_commit_for_mysql() just in case,
because in fts_optimize_create() we started the transaction as
internal, and fts_free_for_background() would assert that the
flag is clear. Transaction commit would clear the flag.