mtr_t::Impl, mtr_t::Command: Merge to mtr_t.
MTR_MAGIC_N: Remove.
MTR_STATE_COMMITTING: Remove. This state was only being set
internally during mtr_t::commit().
mtr_t::Command::m_locks_released: Remove (set-and-never-read member).
mtr_t::Command::m_start_lsn: Replaced with the return value of
finish_write() and a parameter to release_blocks().
mtr_t::Command::m_end_lsn: Removed as a duplicate of mtr_t::m_commit_lsn.
mtr_t::Command::prepare_write(): Replace a switch () with a
comparison against 0. Only 2 m_log_mode are allowed.
The XDES_CLEAN_BIT is always set for every element of
the page allocation bitmap in the extent descriptor pages.
Do not bother touching it, to avoid redundant writes.
page_rec_write_field(): Remove.
dict_create_index_tree_step(): If the SYS_INDEXES.PAGE does not change,
do not update it in the data dictionary. Typically, all index page numbers
would be unchanged before and after IMPORT TABLESPACE, except if some
secondary indexes were created after loading some data.
btr_root_fseg_adjust_on_import(): Remove the redundant mtr_t* parameter.
Redo logging is disabled during the page adjustments that IMPORT TABLESPACE
is performing.
fsp_alloc_seg_inode_page(): Ever since
commit 3926673ce7
all newly allocated pages are zero-initialized.
Assert that this is the case for the FSEG_ID fields.
btr_store_big_rec_extern_fields(): Remove the redundant initialization
of the most significant 32 bits of BTR_EXTERN_LEN. InnoDB never supported
BLOBs that are longer than 4GiB. In fact, dtuple_convert_big_rec()
would write emit an error message if a clustered index record tuple would
exceed 1,000,000,000 bytes in length.
The BTR_EXTERN_LEN in the BLOB pointers in clustered index leaf page
records is zero-initialized at least since
commit 41bb3537ba
dict_index_add_to_cache(): Make the 'index' a reference to a pointer,
so that the caller will avoid the expensive call to
dict_index_get_if_in_cache_low().
Due to a data corruption bug that may have occurred a long time earlier
(possibly involving physical backup and MySQL Bug #69122, which was
addressed in commit f166ec71b7)
it seems possible that the InnoDB change buffer might end up containing
entries, while no buffered changes exist according to the change buffer
bitmap pages in the .ibd files.
ibuf_delete_recs(): New function, to be invoked on slow shutdown only.
Remove all buffered changes for a specific page.
ibuf_merge_or_delete_for_page(): If the change buffer bitmap is clean
and a slow shutdown is in progress, invoke ibuf_delete_recs().
We do not want to do that during normal operation, due to the additional
overhead that is involved. The bitmap page should be consistent with
the change buffer in the first place.
InnoDB: Assertion failure in file .../dict/dict0dict.cc line ...
InnoDB: Failing assertion: table->can_be_evicted
This fixes a regression that was caused by the fix of MDEV-20621
(commit a41d429765).
MySQL 5.6 (and MariaDB 10.0) introduced eviction of tables from
the InnoDB data dictionary cache. Tables that are connected to
FOREIGN KEY constraints or FULLTEXT INDEX are exempt of the eviction.
With the problematic change, a table that would already be exempt
from eviction due to FOREIGN KEY would cause the problem if there
also was a FULLTEXT INDEX defined on it.
dict_load_table(): Only prevent eviction if table->can_be_evicted holds.
dict_table_rename_in_cache(): Use strcpy() instead of strncpy(),
because they are known to be equivalent in this case (the length
of old_name was already validated).
mariabackup: Invoke strncpy() with one less than the buffer size,
and explicitly add NUL as the last byte of the buffer.
Apply the changes to InnoDB and XtraDB that had been
inadvertently skipped in the merge
commit ae476868a5
That merge failure sabotaged part of MDEV-20127:
>Revert a problematic auto_increment_increment 'fix' from 2014.
>This involves replacing the MDEV-8827 fix and in 10.1,
>removing some WSREP instrumentation.
The code changes were re-merged manually by executing the following:
# Get the parent of the problematic merge.
git checkout ae476868a5394041a00e75a29c7d45917e8dfae8^
# Perform the merge again.
git merge ae476868a5394041a00e75a29c7d45917e8dfae8^2
# Get the conflict resolution from that merge.
git checkout ae476868a5 .
# Note: Any changes to these files were removed (empty diff)!
git diff HEAD storage/{innobase,xtradb}/handler/ha_innodb.cc
# Apply the code changes:
git diff cf40393471b10ca68cc1d2804c22ab9203900978^2..MERGE_HEAD \
storage/{innobase,xtradb}/handler/ha_innodb.cc|
patch -p1
Lock wait can happen on secondary index when doing FK checks for wsrep.
We should just return error to upper layer and applier will retry
operation when needed.
ut_strlcpy(): Replace with the standard function strncpy().
ut_strlcpy_rev(): Define in the same compilation unit where
the only caller resides. Avoid unnecessary definition
in non-debug builds.
mem_heap_dup(): Avoid mem_heap_alloc() and memcpy() of data=NULL, len=0.
trx_undo_report_insert_virtual(), trx_undo_page_report_insert(),
trx_undo_page_report_modify(): Avoid memcpy(ptr, NULL, 0).
dfield_data_is_binary_equal(): Correctly handle data=NULL, len=0.
This clean-up was motivated by WITH_UBSAN, and no bug related to this
was observed in the wild. It should be noted that undefined behaviour
such as memcpy(ptr, NULL, 0) could allow compilers to perform unsafe
optimizations, like it was the case in
commit fc168c3a5e (MDEV-15587).
Ignore GetDiskFreeSpace() errors in os_file_get_status_win32
The call is only used to calculate filesystem block size, and this in
turn is only shown in information_schema.sys_tablespaces.FS_BLOCK_SIZE.
There is no other use of this field, it does not affect any Innodb
functionality
- fts_optimize_thread() uses dict_table_t object instead of table id.
So that it doesn't acquire dict_sys->mutex. It leads to remove the
hang of dict_sys->mutex between fts_optimize_thread() and other threads.
- in_queue to indicate whether the table is in fts_optimize_queue. It
is protected by fts_optimize_wq->mutex to avoid any race condition.
- fts_optimize_init() adds the fts table to the fts_optimize_wq
InnoDB stores synced_doc_id + 1 value in FTS_CONFIG table. But
while reading the synced doc id from FTS_CONFIG table after restart,
InnoDB should read synced_doc_id - 1 to get the actual synced
doc id value.
To diagnose a hang in slow shutdown (innodb_fast_shutdown=0),
let us introduce a Boolean startup option in debug builds
that will cause the contents of the InnoDB change buffer
to be dumped to the server error log at startup.
Reduce the scope of some variables, remove a goto and a redundant
assertion.
For B-tree secondary indexes, this function can remove a delete-marked
purgeable record, in case a row rollback of the INSERT was initiated
due to an error in an earlier secondary index.
The BtrBulk class, which was introduced in MySQL 5.7, is by design
the exclusive writer to an index. It is therefore unnecessary to
acquire the dict_index_t::lock in that code.
Holding the dict_index_t::lock would unnecessarily block other threads
(SQL connections and the InnoDB purge threads) from buffering concurrent
modifications to being-created secondary indexes.
This fix is motivated by a change in MySQL 5.7.28:
Bug #29008298 MYSQLD CRASHES ITSELF WHEN CREATING INDEX
mysql/mysql-server@f9fb96c20f
PageBulk::init(), PageBulk::latch(): Never acquire m_index->lock.
PageBulk::storeExt(): Remove some pointer indirection, and improve
a debug assertion that seems to prove that some code is redundant.
BtrBulk::pageCommit(): Assert that m_index->lock is not being held.
btr_blob_log_check_t: Do not acquire m_index->lock if
m_op == BTR_STORE_INSERT_BULK. Add UNIV_UNLIKELY hints around
that condition.
btr_store_big_rec_extern_fields(): Allow index->lock not to be held
while op == BTR_STORE_INSERT_BULK. Add UNIV_UNLIKELY hints around
that condition.
fil_crypt_rotate_page(): Skip the key rotation for pages that carry 0
in FIL_PAGE_TYPE. This avoids not only unnecessary writes, but also
failures of the recently added debug assertion in
buf_flush_init_for_writing() that the FIL_PAGE_TYPE should be nonzero.
Note: the debug assertion can fail if the file was originally created
before MySQL 5.5. In old InnoDB versions, FIL_PAGE_TYPE was only
initialized for B-tree pages, to FIL_PAGE_INDEX. For any other pages,
the field could be garbage, including FIL_PAGE_INDEX. In MariaDB 10.2
and later, buf_flush_init_for_writing() would initialize the
FIL_PAGE_TYPE on such old pages, but only after passing the debug
assertion that insists that pages have a nonzero FIL_PAGE_TYPE.
Thus, the debug assertion at the start of buf_flush_init_for_writing()
can fail when upgrading from very old debug files. This assertion is
only present in debug builds, not release builds.
Old InnoDB/XtraDB versions only initialized FIL_PAGE_TYPE for
B-tree pages (to FIL_PAGE_INDEX), and left it uninitialized
(possibly containing FIL_PAGE_INDEX) for others. In MySQL
or MariaDB 5.5, the field is initialized on almost all pages,
but still not all of them.
In MariaDB 10.2 and later, buf_flush_init_for_writing() would
initialize the FIL_PAGE_TYPE on such old pages, but only after
passing the debug assertion that we are now removing from 10.1.
There, we will be able to modify fil_crypt_rotate_page() so
that it will skip the key rotation for pages that contain 0
in FIL_PAGE_TYPE.
In MariaDB 10.1, there is no logic that would initialize
FIL_PAGE_TYPE on data pages in old data files after an update.
So, encryption key rotation may routinely cause page flushes
on pages that contain 0 in FIL_PAGE_TYPE.
The assertion that was added in
commit c0c003beb4
to augment the fix of MDEV-20805 turns out to be invalid when
innodb_immediate_scrub_data_uncompressed is enabled.
In this mode, fsp_init_file_page() will be invoked on data pages
that have been freed, causing writes of almost-all-zero pages.
btr_page_free(): Adjust the comment.
buf_flush_init_for_writing(): Disable the assertion with a note
that it should be re-enabled in MDEV-15528.
This is another follow-up fix to
commit b393e2cb0c
which turned out to be still broken.
Replace the C++11 keyword 'constexpr' with #define.
debug_sync_t::str: Remove the zero-length array.
Replace sync->str with reinterpret_cast<char*>(&sync[1]).
Remove unused variables and type mismatch that was introduced
in commit b393e2cb0c
Also, fix a typo in the documentation of the parameter, and
update the test.
buf_flush_init_for_writing(): Assert that FIL_PAGE_TYPE is set
except when creating a new data file with a dummy first page.
buf_dblwr_create(): Ensure that FIL_PAGE_TYPE on all pages
will be initialized. Reset buf_dblwr_being_created at the end.
In the function recv_parse_or_apply_log_rec_body() there are debug checks
for validating the state of the page when redo log records are being
applied. Most notably, FIL_PAGE_TYPE should be set before anything else
is being written to the page.
ibuf_add_free_page(): Set FIL_PAGE_TYPE before performing any other changes.
In MDEV-11369 (instant ADD COLUMN) in MariaDB Server 10.3,
we introduced the hidden metadata record that must be the
first record in the clustered index if and only if
index->is_instant() holds.
To catch MDEV-19783, in
commit ed0793e096 and
commit 99dc40d6ac
we added some assertions to find cases where
the metadata record is missing while it should not be, or a
record exists when it should not. Those assertions were invalid
when traversing the PAGE_FREE list. That list can contain anything;
we must only be able to determine the successor and the size of
each garbage record in it.
page_validate(), page_simple_validate_old(), page_simple_validate_new():
Do not invoke page_rec_get_next_const() for traversing the PAGE_FREE
list, but instead use a lower-level accessor that does not attempt to
validate the REC_INFO_MIN_REC_FLAG.
page_copy_rec_list_end_no_locks(),
page_copy_rec_list_start(), page_delete_rec_list_start():
Add assertions.
btr_page_get_split_rec_to_left(): Remove a redundant return value,
and make the output parameter the return value.
btr_page_get_split_rec_to_right(), btr_page_split_and_insert(): Clean up.