- During shutdown, InnoDB fts fails to update synced doc id
when there is only one doc id about to sync. While starting
the server, InnoDB fetches the already synced doc id from
config table. In the subsequent sync operation, InnoDB fails
with DB_DUPLICATE_KEY error.
In commit 244fdc435d (MDEV-29438)
we made sure that if the preceding record is the page infimum record,
no more than 8 bytes will be read from it. But, if the data payload of
the being-inserted record is less than 8 bytes (this can happen in
secondary indexes), we must not compare all 8 bytes.
This was caught by a failure of the test gcol.innodb_virtual_basic
under MemorySanitizer and some builds with AddressSanitizer.
- This is caused by commit 1bd681c8b3c5213ce1f7976940a7dc38b48a0d39(MDEV-25506)
InnoDB removes the index from the table object before freeing the
leaf and non-leaf segments.
This bug was found in MariaDB Server 10.6 thanks to the
OPT_PAGE_CHECKSUM record that was implemented
in commit 4179f93d28 for catching
this type of recovery failures.
page_cur_insert_rec_low(): If the previous record is the page infimum,
correctly limit the end of the record. We do not want to copy data from
the header of the page supremum. This omission caused the incorrect
recovery of DB_TRX_ID in an instant ALTER TABLE metadata record, because
part of the DB_TRX_ID was incorrectly copied from the n_owned of the
page supremum, which in recovery would be updated after the copying,
but in normal operation would already have been updated at the time the
common prefix was being determined.
log_phys_t::apply(): If a data page is found to be corrupted, do not
flag the log corrupted but instead return a new status APPLIED_CORRUPTED
so that the caller may discard all log for this page. We do not want
the recovery of unrelated pages to fail in recv_recover_page().
No test case is included, because the known test case would only work
in 10.6, and even after this fix, it would trigger another bug in
instant ALTER TABLE crash recovery.
These files are currently not being used nor compiled in MariaDB. The
use of large lists of 'case' statements in these source files are also
not a great way to represent translated strings. This git history can
be referred to when a better translation interface can be implemented
in the future.
Therefore, these files can be removed to cleanup the MariaDB codebase.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
Spider converts HA_READ_KEY_EXACT to the equality (=) in the
function spider_db_append_key_where_internal() but the conversion
is not necessarily correct for tables with prefix indices.
We fix the bug by converting HA_READ_KEY_EXACT to 'LIKE "foo%"' when
a target key is a prefix key. The fix is partly inspired by FEDERATED.
See ha_federated::create_where_from_key() for more details.
If multiple threads invoke buf_page_get_low() on a ROW_FORMAT=COMPRESSED
page that does not reside in the buffer pool, then one of the threads
will end up acquiring an exclusive page latch (the "if" statement
right before the new wait_for_unzip: label) and other threads will
end up waiting for a shared latch while holding a buffer-fix.
The exclusive latch holder would then wait for the buffer-fixes to
be released while the buffer-fix holders are waiting for the shared latch.
buf_page_get_low(): Prevent the hang that was introduced
in commit 9436c778c3 (MDEV-27058),
by releasing the buffer-fix, sleeping some time, and retrying the
page lookup.
recv_sys_t::free_corrupted_page(): Identify the corrupted page in
an error or warning message.
buf_page_free(): Just in case, register the page as modified.
This should already have been done in mtr_t::free() as part of
fseg_free_page_low().
mtr_t::memo_push(): Simplify a condition, so that when invoked
with MTR_MEMO_PAGE_X_MODIFY, we will do the right thing.
fseg_free_page_low(): Remove an accidentally added return statement
that prevented mtr_t::free() from being called. This fixes a regression
that was introduced in
commit 0b47c126e3 (MDEV-13542).
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
dict_table_rename_in_cache(), dict_table_get_highest_foreign_id():
Reserve sufficient space for the fkid[] buffer, and ensure that the
fkid[] will be NUL-terminated.
The fkid[] must accommodate both the database name (which is already
encoded in my_charset_filename) and the constraint name
(which must be converted to my_charset_filename) so that we can check
if it is in the format databasename/tablename_ibfk_1 (all encoded in
my_charset_filename).
Modern software (including text editors, static analysis software,
and web-based code review interfaces) often requires source code files
to be interpretable via a consistent character encoding, with UTF-8 or
ASCII (a strict subset of UTF-8) as the default. Several of the MariaDB
source files contain bytes that are not valid in either the UTF-8 or
ASCII encodings, but instead represent strings encoded in the
ISO-8859-1/Latin-1 or ISO-8859-2/Latin-2 encodings.
These inconsistent encodings may prevent software from correctly
presenting or processing such files. Converting all source files to
valid UTF8 characters will ensure correct handling.
Comments written in Czech were replaced with lightly-corrected
translations from Google Translate. Additionally, comments describing
the proper handling of special characters were changed so that the
comments are now purely UTF8.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
Co-authored-by: Andrew Hutchings <andrew@linuxjedi.co.uk>
trx_undo_page_report_rename(): Use the correct maximum length of
a table name. Both the database name and the table name can be up to
NAME_CHAR_LEN (64 characters) times 5 bytes per character in the
my_charset_filename encoding. They are not encoded in UTF-8!
fil_op_write_log(): Reserve the correct amount of log buffer for
a rename operation. The file name will be appended by
mlog_catenate_string().
rename_file_ext(): Reserve a large enough buffer for the file names.
In commit 0b47c126e3 (MDEV-13542)
a few calls to mtr_t::memo_push() were moved before a write latch
on the page was acquired. This introduced a race condition:
1. is_block_dirtied() returned false to mtr_t::memo_push()
2. buf_page_t::write_complete() was executed, the block marked clean,
and a page latch released
3. The page latch was acquired by the caller of mtr_t::memo_push(),
and mtr_t::m_made_dirty was not set even though the block is in
a clean state.
The impact of this race condition is that crash recovery and backups
may fail.
btr_cur_latch_leaves(), btr_store_big_rec_extern_fields(),
btr_free_externally_stored_field(), trx_purge_free_segment():
Acquire the page latch before invoking mtr_t::memo_push().
This fixes the regression caused by MDEV-13542.
Side note: It would suffice to set mtr_t::m_made_dirty at the time
we set the MTR_MEMO_MODIFY flag for a block. Currently that flag is
unnecessarily set if a mini-transaction acquires a page latch on
a page that is in a clean state, and will not actually modify the block.
This may cause unnecessary acquisitions of log_sys.flush_order_mutex
on mtr_t::commit().
mtr_t::free(): If the block had been exclusively latched in this
mini-transaction, set the m_made_dirty flag so that the flush order mutex
will be acquired during mtr_t::commit(). This should have been part of
commit 4179f93d28 (MDEV-18976).
It was necessary to change mtr_t::free() so that
WriteOPT_PAGE_CHECKSUM::operator() would be able to avoid writing
checksums for freed pages.
buf_defer_drop_ahi(): Remove. Ever since
commit c7f8cfc9e7 (MDEV-27700)
it is safe to invoke btr_search_drop_page_hash_index(block, true)
to remove an orphan adaptive hash index.
Any attempt to upgrade page latches is prone to deadlocks. Recently,
we observed a few hangs that involved nothing more than a small table
consisting of one clustered index page, one secondary index page and
some undo pages.
The issue is that trx_t::lock.was_chosen_as_deadlock_victim can be reset
before the transaction check it and set trx_t::error_state.
The fix is to reset trx_t::lock.was_chosen_as_deadlock_victim only in
trx_t::commit_in_memory(), which is invoked on full rollback. There is
also no need to have separate bit in
trx_t::lock.was_chosen_as_deadlock_victim to flag transaction it was
chosen as a victim of Galera conflict resolution, the same variable can be
used for both cases except debug build. For debug build we need to
distinguish deadlock and Galera's abort victims for debug checks. Also
there is no need to check for deadlock in lock_table_enqueue_waiting() for
Galera as the coresponding check presents in lock_wait().
Local variable "error_state" in lock_wait() was replaced with
trx->error_state, because before the replace
lock_sys_t::cancel<false>(trx, lock) and lock_sys.deadlock_check() could
change trx->error_state, which then could be overwritten with the local
"error_state" variable value.
The lock_wait_suspend_thread_enter DEBUG_SYNC point name is misleading,
because lock_wait_suspend_thread was eliminated in e71e613. It was renamed
to lock_wait_start.
Reviewed by: Marko Mäkelä, Jan Lindström.
Removed use std::vector's ba push_back(), pop_back() to make it more
obvious that memory in the vectors won't be reallocated.
Also, "borrowed" elements can be debugged a little better now,
they are put into the start of the m_cache vector.
- InnoDB fts table initially added to LRU table cache
while creating the table. Later, table was marked
as non-evicted when we add the table to fts optimizer
list. Before marking the table as non-evicted, master
thread can try to evict the fts table.
- Race condition between fsp_get_available_space_in_free_extents()
and fsp_try_extend_data_file() while accessing space.free_limit.
Before calling fsp_get_available_space_in_free_extents(), take
shared lock on space->latch.
- Newly created InnoDB fulltext index does have only one column
and doesn't associate with primary key fields during DDL.
init_change_cols() has strict assertion that number of fields
should be greater than number of collation change columns.
This reverts part of commit 212994f704
(MDEV-28974) and implements a better fix that works in that special case
while avoiding other failures.
fil_name_process(): Do not rename the tablespace in deferred_spaces;
it already contains the latest name for the space id.
deferred_spaces::create(): In mariadb-backup --prepare,
replace absolute data directory file path with short name
relative to the backup directory and store it as filename while
deferring tablespace file creation.
The index root page contains the fields BTR_SEG_TOP and BTR_SEG_LEAF
which keep track of allocated pages in the index tree. These fields
are normally protected by an Update latch, so that concurrent read
access to other parts of the page will be possible.
When the index root page is already exclusively latched in the
mini-transaction, we must not try to acquire a lower-grade Update latch.
In fact, when the root page is already X or U latched in the
mini-transaction, there is no point to acquire another latch.
Moreover, after a U latch was acquired on top of an X-latch,
mtr_t::defer_drop_ahi() would trigger an assertion failure or
lock corruption in block->page.lock.u_x_upgrade() because X locks
already exist on the block.
This problem may have been introduced in
commit 03ca6495df (MDEV-24142).
btr_page_alloc_low(), btr_page_free(): Initially buffer-fix the root page.
If it is already U or X latched, release the buffer-fix. Else, upgrade
the buffer-fix to a U latch.
mtr_t::u_lock_register(): Upgrade a buffer-fix to U latch.
mtr_t::have_u_or_x_latch(): Check if U or X latches are already
registered in the mini-transaction.
Reason:
=======
Race condition between btr_search_drop_hash_index() and
btr_search_lazy_free(). One thread does resizing of buffer pool
and clears the ahi on all pages in the buffer pool, frees the
index and table while removing the last reference. At the same time,
other thread access index->heap in btr_search_drop_hash_index().
Solution:
=========
Acquire the respective ahi latch before checking index->freed()
btr_search_drop_page_hash_index(): Added new parameter to indicate
that drop ahi entries only if the index is marked as freed
btr_search_check_marked_free_index(): Acquire all ahi latches and
return true if the index was freed