Such CMake check is not relevant as the oldest supported gcc is 4.8 and
compiler issue was solved in gcc-4.6 already.
https://godbolt.org/z/7G64qo
cmp_data(): simplify code
- Moved the recv_sys->heap memory condition inside recv_parse_log_recs().
So that, InnoDB can mark the status as STORE_NO earlier.
- InnoDB uses one third of buffer pool chunk size for reading the redo
log records. In that case, we can avoid the scenario where buffer ran
out of memory issue during recovery.
in 10.1+ one should use
MY_CHECK_AND_SET_COMPILER_FLAG("-Wno-address-of-packed-member")
and it's already done in storage/tokudb/PerconaFT/CMakeLists.txt
A debug assertion that was added in
commit ed0793e096
turns out to be too strict. In the test innodb_gis.rtree_compress,4k
the function is sometimes being invoked by purge for a
spatial index root page that is not a leaf page (PAGE_LEVEL is 1).
In commit befde6e97e
a bogus condition was added, aiming to avoid debug assertion failures
during ALTER TABLE...IMPORT TABLESPACE.
page_cur_delete_rec(): Remove the bogus condition, and replace the
use of buf_block_modify_clock_inc() with a lower-level operation
that skips the debug checks that would fail during IMPORT.
The debug instrumentation with the MLOG_LSN pseudo-record has not been
used for debugging for years. Let us remove this code now.
It would have to be removed as part of MDEV-12353 or MDEV-14425 anyway,
when implementing a new redo log file format.
Pass buf_block_t* to more functions that write redo log.
page_zip_write_node_ptr(), page_zip_write_blob_ptr(),
page_zip_compress_write_log_no_data():
Take buf_block_t* as parameter, and do not tolerate mtr=NULL.
page_zip_compress(): Do not tolerate mtr=NULL.
page_zip_dir_insert(): Take page_cur_t* as parameter.
mlog_write_initial_log_record(): Remove. This function was unused.
RecIterator::remove(): Remove the redundant page_zip parameter.
PageConverter::m_page_zip_ptr: Remove.
page_cur_delete_rec(): Do not tolerate mtr=NULL.
page_delete_rec(): Merge with the only caller, RecIterator::remove().
RecIterator::m_mtr: New data member: a dummy mini-transaction.
offset_t: this is a type which represents one record offset.
It's unsigned short int.
a lot of functions: replace ulint with offset_t
btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
allocate record offsets on the stack instead of waiting for rec_get_offsets()
to allocate it from mem_heap_t. So, reducing memory allocations.
RECORD_OFFSET, INDEX_OFFSET:
now it's less convenient to store pointers in offset_t*
array. One pointer occupies now several offset_t. And those constant are start
indexes into array to places where to store pointer values
REC_OFFS_HEADER_SIZE: adjusted for the new reality
REC_OFFS_NORMAL_SIZE:
increase size from 100 to 300 which means less heap allocations.
And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which
is smaller than previous 800 bytes.
REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality
rem0rec.h, rem0rec.ic, rem0rec.cc:
various arguments, return values and local variables types were changed to
fix numerous integer conversions issues.
enum field_type_t:
offset types concept was introduces which replaces old offset flags stuff.
Like in earlier version, 2 upper bits are used to store offset type.
And this enum represents those types.
REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed
get_type(), set_type(), get_value(), combine():
these are convenience functions to work with offsets and it's types
rec_offs_base()[0]:
still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL
rec_offs_base()[i]:
these have type offset_t now. Two upper bits contains type.
recv_dblwr_t::list is used for appending to the beginning and iterating
through its elements. std::deque fits better for that purpose because
it does less allocations than std::forward_list and provides better memory
locality.
fil_node_t::find_metadata() tries to find out whether file
is on an SSD, and the disk sector size.
On Windows, it opens the corresponding volume for finding this data.
This does not go well, if datadir is on network path/UNC. The volume name
is invalid, CreateFile() function fails, and a cryptic (from the end user
perspective) error is reported. Like this
[ERROR] InnoDB: File \\.\\\workpc\work: 'CreateFile()' returned OS error 203.
The fix is not to report error if open volume failed, and the path was not
on fixed disk, i.e not on HDD or SSD. This is not a fatal error, there is
a fallback anyway.
Use std::queue backed by std::deque instead of list because it does less
allocations which still having O(1) push and pop operations.
Also store trx_purge_rec_t directly, because its only 16 bytes and allocating
it is to wasteful. This should be faster.
purge_node_t::purge_node_t: container is already empty after creation
purge_node_t::end(): replace clearing container with assertion that it's clear
InnoDB RNG maintains global state, causing otherwise unnecessary bus
traffic. Even worse, this is cross-mutex traffic. That is, different
mutexes suffer from contention.
Fixed delay of 4 was verified to give best throughput by OLTP update
index and read-write benchmarks on Intel Broadwell (2/20/40) and
ARM (1/46/46).
This is a backport of ce04790065 from
MariaDB Server 10.3.
We should not need anywhere near 32 bits of entropy, so we might
just limit ourselves to a 32-bit random number generator.
Also, it might be cheaper to use exclusive-or, bit shifting and
conditional jumps, instead of multiplication and addition.
We use relaxed atomic operations on the global random number generator
state in order in an attempt to silence any warnings about race conditions.
There is an obvious race condition between the load and store in
ut_rnd_gen(), but we do not think that it matters much that the
state of the random number generator could 'stutter'.
This change seems makes the 'uncompress_ops' nondeterministic
in innodb_zip.cmp_per_index after the restart. It looks like
there is an inherent race condition in the test, because the
table could be opened for InnoDB statistics recalculation
already before innodb_cmp_per_index_enabled was set. We might
end up having uncompress_ops anywhere between 0 and 9, or perhaps
even more. Let us remove that part of the test.
ut_rnd_interval(): Remove the first parameter, which was mostly
passed as 0. Implement as a simple wrapper around ut_rnd_gen().
Trivially return 0 if the size of the interval is smaller than 2.
ut_rnd_ulint_counter, ut_rnd_gen_next_ulint(), ut_rnd_gen_ulint(): Remove.
ut_rnd_set_seed(): Unused function; remove.
ut_rnd_gen(): Renamed from page_cur_lcg_prng().
ut_rnd_current: The internal state of ut_rnd_gen().
page_cur_open_on_rnd_user_rec(): Replace linear search with
page_rec_get_nth().
row_drop_table_for_mysql(): If a #sql2 table is open in another
thread (purge) while we attempting to drop it, rename it to #sql-ib
name to hide it from the SQL layer.
Adjust tests accordingly to hide #sql-ib tables, which can
continue to exist until the background DROP TABLE completes.
This is joint work with Thirunarayanan Balathandayuthapani.
The MDL interface between InnoDB and the rest of the server
(in storage/innobase/dict/dict0dict.cc and in include/)
is my work, while most everything else is Thiru's.
The collection of InnoDB persistent statistics and the
defragmentation were not refactored to use MDL. They will
keep relying on lower-level interlocking with
fil_check_pending_operations().
The purge of transaction history and the background operations on
fulltext indexes will use MDL. We will revert
commit 2c4844c9e7
(MDEV-17813) because thanks to MDL, purge cannot conflict
with DDL operations anymore. For a similar reason, we will remove
the MDEV-16222 test case from gcol.innodb_virtual_debug_purge.
Purge is essentially replacing all use of the global dict_sys.latch
with MDL. Purge will skip the undo log records for tables whose names
start with #sql-ib or #sql2. Theoretically, such tables might
be renamed back to visible table names if TRUNCATE fails to
create a new table, or the final rename in ALTER TABLE...ALGORITHM=COPY
fails. In that case, purge could permanently leave some garbage
in the table. Such garbage will be tolerated; the table would not
be considered corrupted.
To avoid repeated MDL releases and acquisitions,
trx_purge_attach_undo_recs() will sort undo log records by table_id,
and purge_node_t will keep the MDL and table handle open for multiple
successive undo log records.
get_purge_table(): A new accessor, used during the purge of
history for indexed virtual columns. This interface should ideally
not exist at all.
thd_mdl_context(): Accessor of THD::mdl_context.
Wrapped in a new thd_mdl_service.
dict_get_db_name_len(): Define inline.
dict_acquire_mdl_shared(): Acquire explicit shared MDL on a table name
if needed.
dict_table_open_on_id(): Return MDL_ticket, if requested.
dict_table_close(): Release MDL ticket, if requested.
dict_fts_index_syncing(), dict_index_t::index_fts_syncing: Remove.
row_drop_table_for_mysql() no longer needs to check these, because
MDL guarantees that a fulltext index sync will not be in progress
while MDL_EXCLUSIVE is protecting a DDL operation.
dict_table_t::parse_name(): Parse the table name for acquiring MDL.
purge_node_t::undo_recs: Change the type to std::list<trx_purge_rec_t*>
(different container, and storing also roll_ptr).
purge_node_t: Add mdl_ticket, last_table_id, purge_thd, mdl_hold_recs
for acquiring MDL and for keeping the table open across multiple
undo log records.
purge_vcol_info_t, row_purge_store_vsec_cur(), row_purge_restore_vsec_cur():
Remove. We will acquire the MDL earlier.
purge_sys_t::heap: Added, for reading undo log records.
fts_sync_during_ddl(): Invoked during ALGORITHM=INPLACE operations
to ensure that fts_sync_table() will not conflict with MDL_EXCLUSIVE.
Uses fts_t::sync_message for bookkeeping.
- wait notification, tpool_wait_begin/tpool_wait_end - to notify the
threadpool that current thread is going to wait
Use it to wait for IOs to complete and also when purge waits for workers.