The InnoDB buffer pool and locking were heavily refactored in
MariaDB Server 10.6. Among other things, dict_sys.mutex was removed,
and the contended lock_sys.mutex was replaced with a combination of
lock_sys.latch and distributed latches in hash tables. Also, a
default value was changed to innodb_flush_method=O_DIRECT to improve
performance in write-heavy workloads.
One thing where an adjustment was missing is around the parameters
innodb_max_purge_lag (number of committed transactions waiting to
be purged), and innodb_max_purge_lag_delay
(maximum number of microseconds to delay a DML operation).
purge_coordinator_state::do_purge(): Pass the history_size to trx_purge()
and reset srv_dml_needed_delay if the history is empty.
Keep executing the loop non-stop as long as srv_dml_needed_delay is set.
trx_purge_dml_delay(): Made part of trx_purge().
Set srv_dml_needed_delay=0 when nothing can be purged (!n_pages_handled).
row_mysql_delay_if_needed(): Mimic the logic of
innodb_max_purge_lag_wait_update().
Reviewed by: Thirunarayanan Balathandayuthapani
log_free_check(): Assert that the caller must not hold
exclusive lock_sys.latch. This was the case for calls from
ibuf_delete_for_discarded_space(). This caused a deadlock with
another thread that would be holding a latch on a dirty page
that would need to be written so that the checkpoint would advance
and log_free_check() could return. That other thread was waiting
for a shared lock_sys.latch.
fil_delete_tablespace(): Do not invoke ibuf_delete_for_discarded_space()
because in DDL operations, we will be holding exclusive lock_sys.latch.
trx_t::commit(std::vector<pfs_os_file_t>&), innodb_drop_database(),
row_purge_remove_clust_if_poss_low(), row_undo_ins_remove_clust_rec(),
row_discard_tablespace_for_mysql():
Invoke ibuf_delete_for_discarded_space() on the deleted tablespaces after
releasing all latches.
os_aio_wait_until_no_pending_reads(), os_aio_wait_until_pending_writes():
Add a Boolean parameter to indicate whether the wait should be declared
in the thread pool.
buf_flush_wait(): The callers have already declared a wait, so let us
avoid doing that again, just call os_aio_wait_until_pending_writes(false).
buf_flush_wait_flushed(): Do not declare a wait in the rare case that
the buf_flush_page_cleaner thread has been shut down already.
buf_flush_page_cleaner(), buf_flush_buffer_pool(): In the code that runs
during shutdown, do not declare waits.
buf_flush_buffer_pool(): Remove a debug assertion that might fail.
What really matters here is buf_pool.flush_list.count==0.
buf_read_recv_pages(), srv_prepare_to_delete_redo_log_file():
Do not declare waits during InnoDB startup.
stored externally
row_merge_buf_add(): Has strict assert that fixed length mismatch
shouldn't happen while rebuilding the redundant row format table
btr_index_rec_validate(): Fixed size column can be stored externally.
So sum of inline stored length and external stored length of the
column should be equal to total column length
Sequence objects are implemented using special tables.
These tables do not have primary key and only one row.
NEXTVAL is basically update from existing value to new
value. In Galera this could mean that two write-sets
from different nodes do not conflict and this could
lead situation where write-sets are executed
concurrently and possibly in wrong seqno order.
This is fixed by using table-level exclusive key for
SEQUENCE updates. Note that this naturally works
correctly only if InnoDB storage engine is used for
sequence.
This fix does not contain a test case because while
it is possible to syncronize appliers using dbug_sync
it was too hard to syncronize MDL-lock requests to
exact objects. Testing done for this fix is documented
on MDEV.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
handle_slave_io(), handle_slave_sql(), os_thread_exit():
Remove a redundant pthread_exit(nullptr) call, because it
would cause SIGSEGV.
mysql_print_status(): Add MEM_MAKE_DEFINED() to work around
some missing instrumentation around mallinfo2().
que_graph_free_stat_list(): Invoke que_node_get_next(node) before
que_graph_free_recursive(node). That is the logical and
MSAN_OPTIONS=poison_in_dtor=1 compatible way of freeing memory.
ins_node_t::~ins_node_t(): Invoke mem_heap_free(entry_sys_heap).
que_graph_free_recursive(): Rely on ins_node_t::~ins_node_t().
fts_t::~fts_t(): Invoke mem_heap_free(fts_heap).
fts_free(): Replace with direct calls to fts_t::~fts_t().
The failures in free_root() due to MSAN_OPTIONS=poison_in_dtor=1
will be covered in MDEV-30942.
Problem:
========
- InnoDB replace statement returns can't find record as result during
bulk insert operation. InnoDB returns DB_END_OF_INDEX blindly when
bulk transaction is visible to current transaction even though
the search tuple is inserted as a part of current replace statement.
Solution:
=========
row_search_mvcc(): InnoDB should allow the transaction to read
all the rows when innodb intends to do any locking on the
record even though bulk insert transaction changes are
visible to the current transaction
row_upd_rec_in_place(): Avoid calling page_zip_write_rec() if we
are not modifying any fields that are stored in compressed format.
btr_cur_update_in_place_zip_check(): New function to check if a
ROW_FORMAT=COMPRESSED record can actually be updated in place.
btr_cur_pessimistic_update(): If the BTR_KEEP_POS_FLAG is not set
(we are in a ROLLBACK and cannot write any BLOBs), ignore the potential
overflow and let page_zip_reorganize() or page_zip_compress() handle it.
This avoids a failure when an attempted UPDATE of an NULL column to 0 is
rolled back. During the ROLLBACK, we would try to move a non-updated
long column to off-page storage in order to avoid a compression failure
of the ROW_FORMAT=COMPRESSED page.
page_zip_write_trx_id_and_roll_ptr(): Remove an assertion that would fail
in row_upd_rec_in_place() because the uncompressed page would already
have been modified there.
Thanks to Jean-François Gagné for providing a copy of a page that
triggered these bugs on the ROLLBACK of UPDATE and DELETE.
A 10.6 version of this was tested by Matthias Leich using
cmake -DWITH_INNODB_EXTRA_DEBUG=ON a.k.a. UNIV_ZIP_DEBUG.
This is a follow-up to
commit de4030e4d4 (MDEV-30400),
which fixed some hangs related to B-tree split or merge.
btr_root_block_get(): Use and update the root page guess. This is just
a minor performance optimization, not affecting correctness.
btr_validate_level(): Remove the parameter "lockout", and always
acquire an exclusive dict_index_t::lock in CHECK TABLE without QUICK.
This is needed in order to avoid latching order violation in
btr_page_get_father_node_ptr_for_validate().
btr_cur_need_opposite_intention(): Return true in case
btr_cur_compress_recommendation() would hold later during the
mini-transaction, or if a page underflow or overflow is possible.
If we return true, our caller will escalate to aqcuiring an exclusive
dict_index_t::lock, to prevent a latching order violation and deadlock
during btr_compress() or btr_page_split_and_insert().
btr_cur_t::search_leaf(), btr_cur_t::open_leaf():
Also invoke btr_cur_need_opposite_intention() on the leaf page.
btr_cur_t::open_leaf(): When escalating to exclusive index locking,
acquire exclusive latches on all pages as well.
innobase_instant_try(): Return an error code if the root page cannot
be retrieved.
In addition to the normal stress testing with Random Query Generator (RQG)
this has been tested with
./mtr --mysqld=--loose-innodb-limit-optimistic-insert-debug=2
but with the injection in btr_cur_optimistic_insert() for non-leaf pages
adjusted so that it would use the value 3. (Otherwise, infinite page
splits could occur in some mtr tests.)
Tested by: Matthias Leich
- InnoDB tries to build the previous version of the record for
the virtual index, but the undo log record doesn't contain
virtual column information. This leads to assert failure while
building the tuple.
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".
This also fixes part of MDEV-29835 Partial server freeze
which is caused by violations of the latching order that was
defined in https://dev.mysql.com/worklog/task/?id=6326
(WL#6326: InnoDB: fix index->lock contention). Unless the
current thread is holding an exclusive dict_index_t::lock,
it must acquire page latches in a strict parent-to-child,
left-to-right order. Not all cases of MDEV-29835 are fixed yet.
Failure to follow the correct latching order will cause deadlocks
of threads due to lock order inversion.
As part of these changes, the BTR_MODIFY_TREE mode is modified
so that an Update latch (U a.k.a. SX) will be acquired on the
root page, and eXclusive latches (X) will be acquired on all pages
leading to the leaf page, as well as any left and right siblings
of the pages along the path. The DEBUG_SYNC test innodb.innodb_wl6326
will be removed, because at the time the DEBUG_SYNC point is hit,
the thread is actually holding several page latches that will be
blocking a concurrent SELECT statement.
We also remove double bookkeeping that was caused due to excessive
information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
store information of latched pages, and ensure that
mtr_memo_slot_t::object is never a null pointer.
The tree_blocks[] and tree_savepoints[] were redundant.
buf_page_get_low(): If innodb_change_buffering_debug=1, to avoid
a hang, do not try to evict blocks if we are holding a latch on
a modified page. The test innodb.innodb-change-buffer-recovery
will be removed, because change buffering may no longer be forced
by debug injection when the change buffer comprises multiple pages.
Remove a debug assertion that could fail when
innodb_change_buffering_debug=1 fails to evict a page.
For other cases, the assertion is redundant, because we already
checked that right after the got_block: label. The test
innodb.innodb-change-buffering-recovery will be removed, because
due to this change, we will be unable to evict the desired page.
mtr_t::lock_register(): Register a change of a page latch
on an unmodified buffer-fixed block.
mtr_t::x_latch_at_savepoint(), mtr_t::sx_latch_at_savepoint():
Replaced by the use of mtr_t::upgrade_buffer_fix(), which now
also handles RW_S_LATCH.
mtr_t::set_modified(): For temporary tables, invoke
buf_page_t::set_modified() here and not in mtr_t::commit().
We will never set the MTR_MEMO_MODIFY flag on other than
persistent data pages, nor set mtr_t::m_modifications when
temporary data pages are modified.
mtr_t::commit(): Only invoke the buf_flush_note_modification() loop
if persistent data pages were modified.
mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
This avoids many redundant entries in mtr_t::m_memo, as well as
redundant calls to buf_page_get_gen() for blocks that had already
been looked up in a mini-transaction.
btr_get_latched_root(): Return a pointer to an already latched root page.
This replaces btr_root_block_get() in cases where the mini-transaction
has already latched the root page.
btr_page_get_parent(): Fetch a parent page that was already latched
in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
If needed, upgrade the root page U latch to X.
This avoids bloating mtr_t::m_memo as well as performing redundant
buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().
btr_cur_search_to_nth_level(): This will only be used for non-leaf
(level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
removed altogether, or retained for the case of
CHECK TABLE without QUICK.
btr_cur_t::left_block: Remove. btr_pcur_move_backward_from_page()
can retrieve the left sibling from the end of mtr_t::m_memo.
btr_cur_t::open_leaf(): Some clean-up.
btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
for searches to level=0 (the leaf level). We will never release
parent page latches before acquiring leaf page latches. If we need to
temporarily release the level=1 page latch in the BTR_SEARCH_PREV or
BTR_MODIFY_PREV latch_mode, we will reposition the cursor on the
child node pointer so that we will land on the correct leaf page.
btr_cur_t::pessimistic_search_leaf(): Implement new BTR_MODIFY_TREE
latching logic in the case that page splits or merges will be needed.
The parent pages (and their siblings) should already be latched on
the first dive to the leaf and be present in mtr_t::m_memo; there
should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost
suffices; it must be revised in MDEV-29835 and work-arounds removed
for cases where mtr_t::get_already_latched() fails to find a block.
rtr_search_to_nth_level(): A SPATIAL INDEX version of
btr_search_to_nth_level() that can search to any level
(including the leaf level).
rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
rtr_search_to_nth_level().
rtr_search(): Replaces rtr_pcur_open().
rtr_latch_leaves(): Replaces btr_cur_latch_leaves(). Note that unlike
in the B-tree code, there is no error handling in case the sibling
pages are corrupted.
rtr_cur_restore_position(): Remove an unused constant parameter.
btr_pcur_open_on_user_rec(): Remove the constant parameter
mode=PAGE_CUR_GE.
row_ins_clust_index_entry_low(): Use a new
mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.
BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.
BTR_CONT_MODIFY_TREE: Note that this is only used by
rtr_search_to_nth_level().
btr_pcur_optimistic_latch_leaves(): Replaces
btr_cur_optimistic_latch_leaves().
ibuf_delete_rec(): Acquire exclusive ibuf.index->lock in order
to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).
btr_blob_log_check_t(): Acquire a U latch on the root page,
so that btr_page_alloc() in btr_store_big_rec_extern_fields()
will avoid a deadlock.
btr_store_big_rec_extern_fields(): Assert that the root page latch
is being held.
Tested by: Matthias Leich
Reviewed by: Vladislav Lesin
- introduce table key construction function in wsrep service interface
- don't add row keys when replicating bulk insert
- don't start bulk insert on applier or when transaction is not active
- don't start bulk insert on system versioned tables
- implement actual bulk insert table-level key replication
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
This also fixes part of MDEV-29835 Partial server freeze
which is caused by violations of the latching order that was
defined in https://dev.mysql.com/worklog/task/?id=6326
(WL#6326: InnoDB: fix index->lock contention). Unless the
current thread is holding an exclusive dict_index_t::lock,
it must acquire page latches in a strict parent-to-child,
left-to-right order. Not all cases are fixed yet. Failure to
follow the correct latching order will cause deadlocks of threads
due to lock order inversion.
As part of these changes, the BTR_MODIFY_TREE mode is modified
so that an Update latch (U a.k.a. SX) will be acquired on the
root page, and eXclusive latches (X) will be acquired on all pages
leading to the leaf page, as well as any left and right siblings
of the pages along the path. The test innodb.innodb_wl6326
will be removed, because at the time the DEBUG_SYNC point is hit,
the thread is actually holding several page latches that will be
blocking a concurrent SELECT statement.
We also remove double bookkeeping that was caused due to excessive
information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
store information of latched pages, and ensure that
mtr_memo_slot_t::object is never a null pointer.
The tree_blocks[] and tree_savepoints[] were redundant.
mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
This avoids many redundant entries in mtr_t::m_memo, as well as
redundant calls to buf_page_get_gen() for blocks that had already
been looked up in a mini-transaction.
btr_get_latched_root(): Return a pointer to an already latched root page.
This replaces btr_root_block_get() in cases where the mini-transaction
has already latched the root page.
btr_page_get_parent(): Fetch a parent page that was already latched
in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
If needed, upgrade the root page U latch to X.
This avoids bloating mtr_t::m_memo as well as redundant
buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().
btr_cur_search_to_nth_level(): This will only be used for non-leaf
(level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
removed altogether, or retained for the case of
CHECK TABLE without QUICK.
btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
for searches to level=0 (the leaf level).
btr_cur_t::pessimistic_search_leaf(): Implement the new
BTR_MODIFY_TREE latching logic in the case that page splits
or merges will be needed. The parent pages (and their siblings)
should already be latched on the first dive to the leaf and be
present in mtr_t::m_memo; there should be no need for
BTR_CONT_MODIFY_TREE. This pre-latching almost suffices;
MDEV-29835 will have to revise it and remove work-arounds where
mtr_t::get_already_latched() fails to find a block.
rtr_search_to_nth_level(): A SPATIAL INDEX version of
btr_search_to_nth_level() that can search to any level
(including the leaf level).
rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
rtr_search_to_nth_level().
rtr_search(): Replaces rtr_pcur_open().
rtr_cur_restore_position(): Remove an unused constant parameter.
btr_pcur_open_on_user_rec(): Remove the constant parameter
mode=PAGE_CUR_GE.
btr_cur_latch_leaves(): Update a pre-existing mtr_t::m_memo entry
for the current leaf page.
row_ins_clust_index_entry_low(): Use a new
mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.
btr_cur_t::open_leaf(): Some clean-up.
mtr_t::lock_register(): Register a page latch on a buffer-fixed block.
BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.
BTR_CONT_MODIFY_TREE: Note that this is only used by
rtr_search_to_nth_level().
btr_pcur_optimistic_latch_leaves(): Replaces
btr_cur_optimistic_latch_leaves().
ibuf_delete_rec(): Acquire ibuf.index->lock.u_lock() in order
to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).
Tested by: Matthias Leich
1. In case of system-versioned table add row_end into FTS_DOC_ID index
in fts_create_common_tables() and innobase_create_key_defs().
fts_n_uniq() returns 1 or 2 depending on whether the table is
system-versioned.
After this patch recreate of FTS_DOC_ID index is required for
existing system-versioned tables. If you see this message in error
log or server warnings: "InnoDB: Table db/t1 contains 2 indexes
inside InnoDB, which is different from the number of indexes 1
defined in the MariaDB" use this command to fix the table:
ALTER TABLE db.t1 FORCE;
2. Fix duplicate history for secondary unique index like it was done
in MDEV-23644 for clustered index (932ec586aa). In case of
existing history row which conflicts with currently inseted row we
check in row_ins_scan_sec_index_for_duplicate() whether that row
was inserted as part of current transaction. In that case we
indicate with DB_FOREIGN_DUPLICATE_KEY that new history row is not
needed and should be silently skipped.
3. Some parts of MDEV-21138 (7410ff436e) reverted. Skipping of
FTS_DOC_ID index for history rows made problems with purge
system. Now this is fixed differently by p.2.
4. wait_all_purged.inc checks that we didn't affect non-history rows
so they are deleted and purged correctly.
Additional FTS fixes
fts_init_get_doc_id(): exclude history rows from max_doc_id
calculation. fts_init_get_doc_id() callback is used only for crash
recovery.
fts_add_doc_by_id(): set max value for row_end field.
fts_read_stopword(): stopwords table can be system-versioned too. We
now read stopwords only for current data.
row_insert_for_mysql(): exclude history rows from doc_id validation.
row_merge_read_clustered_index(): exclude history_rows from doc_id
processing.
fts_load_user_stopword(): for versioned table retrieve row_end field
and skip history rows. For non-versioned table we retrieve 'value'
field twice (just for uniformity).
FTS tests for System Versioning now include maybe_versioning.inc which
adds 3 combinations:
'vers' for debug build sets sysvers_force and
sysvers_hide. sysvers_force makes every created table
system-versioned, sysvers_hide hides WITH SYSTEM VERSIONING
for SHOW CREATE.
Note: basic.test, stopword.test and versioning.test do not
require debug for 'vers' combination. This is controlled by
$modify_create_table in maybe_versioning.inc and these
tests run WITH SYSTEM VERSIONING explicitly which allows to
test 'vers' combination on non-debug builds.
'vers_trx' like 'vers' sets sysvers_force_trx and sysvers_hide. That
tests FTS with trx_id-based System Versioning.
'orig' works like before: no System Versioning is added, no debug is
required.
Upgrade/downgrade test for System Versioning is done by
innodb_fts.versioning. It has 2 combinations:
'prepare' makes binaries in std_data (requires old server and OLD_BINDIR).
It tests upgrade/downgrade against old server as well.
'upgrade' tests upgrade against binaries in std_data.
Cleanups:
Removed innodb-fts-stopword.test as it duplicates stopword.test
Works like vers_force but forces trx_id-based system-versioned tables
if the storage supports it (currently InnoDB-only). Otherwise creates
timestamp-based system-versioned table.
Before the fix next-key lock was requested only if a record was
delete-marked for locking unique search in RR isolation level.
There can be several delete-marked records for the same unique key,
that's why InnoDB scans the records until eighter non-delete-marked record
is reached or all delete-marked records with the same unique key are
scanned.
For range scan next-key locks are used for RR to protect scanned range from
inserting new records by other transactions. And this is the reason of why
next-key locks are used for delete-marked records for unique searches.
If a record is not delete-marked, the requested lock type was "not-gap".
When a record is not delete-marked during lock request by trx 1, and
some other transaction holds conflicting lock, trx 1 creates waiting
not-gap lock on the record and suspends. During trx 1 suspending the
record can be delete-marked. And when the lock is granted on conflicting
transaction commit or rollback, its type is still "not-gap". So we have
"not-gap" lock on delete-marked record for RR. And this let some other
transaction to insert some record with the same unique key when trx 1 is
not committed, what can cause isolation level violation.
The fix is to set next-key locks for both delete-marked and
non-delete-marked records for unique search in RR.
mysql_discard_or_import_tablespace(): On successful
ALTER TABLE...DISCARD TABLESPACE, evict the table handle from the
table definition cache, so that ha_innobase::close() will be invoked,
like InnoDB expects to be the case. This will avoid an assertion failure
ut_a(table->get_ref_count() == 0) during IMPORT TABLESPACE.
ha_innobase::open(): Do not issue any ER_TABLESPACE_DISCARDED warning.
Member functions for DML will do that.
ha_innobase::truncate(), ha_innobase::check_if_supported_inplace_alter():
Issue ER_TABLESPACE_DISCARDED warnings, to compensate for the removal of
the warning in ha_innobase::open().
row_quiesce_write_indexes(): Only write information about committed
indexes. The ALTER TABLE t NOWAIT ADD INDEX(c) in the nondeterministic
test case will most of the time fail due to a metadata lock (MDL) timeout
and leave behind an uncommitted index.
Reviewed by: Sergei Golubchik
os_file_read(): Merged with os_file_read_no_error_handling().
Crashing on a partial page read is as unhelpful as crashing on a
corrupted page read (commit 0b47c126e3).
Report the file name if it is available via IORequest.
btr_cur_t: Zero-initialize all fields in the default constructor.
btr_cur_t::index: Remove; it duplicated page_cur.index.
Many functions: Remove arguments that were duplicating
page_cur_t::index and page_cur_t::block.
page_cur_open_level(), btr_pcur_open_level(): Replaces
btr_cur_open_at_index_side() for dict_stats_analyze_index().
At the end, release all latches except the dict_index_t::lock
and the buf_page_t::lock on the requested page.
dict_stats_analyze_index(): Rely on mtr_t::rollback_to_savepoint()
to release all uninteresting page latches.
btr_search_guess_on_hash(): Simplify the logic, and invoke
mtr_t::rollback_to_savepoint().
We will use plain C++ std::vector<mtr_memo_slot_t> for mtr_t::m_memo.
In this way, we can avoid setting mtr_memo_slot_t::object to nullptr
and instead just remove garbage from m_memo.
mtr_t::rollback_to_savepoint(): Shrink the vector. We will be needing this
in dict_stats_analyze_index(), where we will release page latches and
only retain the index->lock in mtr_t::m_memo.
mtr_t::release_last_page(): Release the last acquired page latch.
Replaces btr_leaf_page_release().
mtr_t::release(const buf_block_t&): Release a single page latch.
Used in btr_pcur_move_backward_from_page().
mtr_t::memo_release(): Replaced with mtr_t::release().
mtr_t::upgrade_buffer_fix(): Acquire a latch for a buffer-fixed page.
This replaces the double bookkeeping in btr_cur_t::open_leaf().
Reviewed by: Vladislav Lesin
btr_cur_t::open_leaf(): Replaces btr_cur_open_at_index_side() for
most calls, except dict_stats_analyze_index(), which is the only
place where we need to open a page at the non-leaf level.
Use btr_block_get() for better error handling.
Also, use the enumeration type btr_latch_mode wherever possible.
Reviewed by: Vladislav Lesin
row_check_index(): Treat secondary indexes of temporary tables as if
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
is in effect. That is, only consider the delete-mark and nothing else.
Per fsp0types.h, SDI is on tablespace flags position 14 where MariaDB
stores its pagesize. Flag at position 13, also in MariaDB pagesize
flags, is a MySQL encryption flag.
These are checked only if fsp_flags_is_valid fails, so valid MariaDB
pages sizes don't become errors.
The error message "Cannot reset LSNs in table" was rather specific and
not always true to replaced with more generic error.
ALTER TABLE tbl IMPORT TABLESPACE now reports Unsupported on MySQL
tablespace (rather than index corrupted) along with a server error
message.
MySQL innodb Errors are with with UNSUPPORTED rather than CORRUPTED
to avoid user anxiety.
Reviewer: Marko Mäkelä