page_create(): Create normal B-tree pages. Callers that create
R-tree pages will set FIL_PAGE_TYPE and reset the split
sequence number afterwards.
The creation of ROW_FORMAT=COMPRESSED pages is unaffected;
they will be logged as compressed page images.
page_create_low(): Take const buf_block_t* as a parameter.
Let the callers invoke buf_block_modify_clock_inc().
btr_cur_upd_rec_sys(): Replaces row_upd_rec_sys_fields() and
implements redo logging.
row_upd_rec_sys_fields_in_recovery(): Remove, and merge into the
only remaining caller, btr_cur_parse_update_in_place().
btr_cur_del_mark_set_clust_rec_log(),
btr_cur_del_mark_set_sec_rec_log(),
btr_cur_set_deleted_flag_for_ibuf():
Remove, and replace with btr_rec_set_deleted<bool>().
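A minimal sketch of the template replacement (simplified; the real
function also writes redo log through the mini-transaction, and the
helper name is illustrative):

  #include <cstdint>

  /* The delete-mark bit in the record header info bits. */
  constexpr uint8_t REC_INFO_DELETED_FLAG = 0x20;

  /* One template<bool> subsumes the separate set/clear variants. */
  template<bool deleted>
  inline void rec_set_deleted_sketch(uint8_t &info_bits)
  {
    if (deleted)
      info_bits |= REC_INFO_DELETED_FLAG;
    else
      info_bits &= uint8_t(~REC_INFO_DELETED_FLAG);
  }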
page_zip_rec_set_deleted(): Add the parameter mtr, and write a
MLOG_ZIP_WRITE_STRING record to the log.
Log the low-level operations for ROW_FORMAT=COMPRESSED index pages
using a new record, MLOG_ZIP_WRITE_STRING. We will still use
MLOG_1BYTE, ..., MLOG_8BYTES or MLOG_WRITE_STRING for operations
on pages other than index pages (such as the page allocation bitmap pages).
We will stop writing the record MLOG_ZIP_PAGE_COMPRESS later, after
replacing all MLOG_REC_ and MLOG_COMP_REC_ records that update index pages.
Log page reorganize as a series of insert operations.
This will make the redo log volume proportional to the page payload size.
btr_page_reorganize_low(): Add template <bool recovery=false>
btr_page_reorganize_block(): Remove the parameter 'bool recovery'
Instead of writing the high-level redo log records
MLOG_LIST_END_COPY_CREATED and MLOG_COMP_LIST_END_COPY_CREATED,
write a log record for each individual insert of a record.
page_copy_rec_list_end_to_created_page(): Remove.
This will improve the fill factor of some pages.
Adjust some tests accordingly.
PageBulk::init(), PageBulk::finish(): Avoid setting bogus limits
to PAGE_HEAP_TOP and PAGE_N_DIR_SLOTS. Avoid accessor functions
that would enforce these limits before the correct ones are set
at the end of PageBulk::finish().
page_zip_reorganize(): Restore the page on failure.
In callers, omit now-redundant calls to page_zip_decompress().
btr_page_reorganize_low(): Define in static scope only, and
remove the z_level parameter. Assert that ROW_FORMAT is not COMPRESSED.
btr_page_reorganize_block(), btr_page_reorganize(): Invoke
page_zip_reorganize() for ROW_FORMAT=COMPRESSED.
page_zip_compress_write_log_no_data(): Remove.
We no longer write the MLOG_ZIP_PAGE_COMPRESS_NO_DATA record.
Instead, we will write MLOG_ZIP_PAGE_COMPRESS records.
For compatibility with diagnostic software, let us
return a dummy buffer pool identifier 0 and restore
the columns that were initially deleted in
commit 1a6f708ec5:
information_schema.innodb_buffer_page.pool_id
information_schema.innodb_buffer_page_lru.pool_id
information_schema.innodb_buffer_pool_stats.pool_id
information_schema.innodb_cmpmem.buffer_pool_instance
information_schema.innodb_cmpmem_reset.buffer_pool_instance
Thanks to Vladislav Vaintroub for pointing this out.
Our benchmarking efforts indicate that the reasons for splitting the
buf_pool in commit c18084f71b
have mostly gone away, possibly as a result of
mysql/mysql-server@ce6109ebfd
or similar work.
Only in one write-heavy benchmark, where the working set size is
ten times the buffer pool size, was buf_pool->mutex in
buf_page_io_complete() less contended with 4 buffer pool instances
than with 1 instance. That contention could be alleviated
further by making more use of std::atomic and by splitting
buf_pool_t::mutex further (MDEV-15053).
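As an illustration of the kind of change meant by making more use of
std::atomic (a hypothetical sketch, not the actual buf_pool code):

  #include <atomic>
  #include <cstddef>

  /* A statistic updated on the I/O completion path. A relaxed atomic
  update avoids acquiring buf_pool->mutex merely to maintain it. */
  std::atomic<std::size_t> n_pend_reads{0};

  void on_read_complete_sketch()
  {
    n_pend_reads.fetch_sub(1, std::memory_order_relaxed);
  }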
We will deprecate and ignore the following parameters:
innodb_buffer_pool_instances
innodb_page_cleaners
There will be only one buffer pool and one page cleaner task.
In a number of INFORMATION_SCHEMA views, columns that indicated
the buffer pool instance will be removed:
information_schema.innodb_buffer_page.pool_id
information_schema.innodb_buffer_page_lru.pool_id
information_schema.innodb_buffer_pool_stats.pool_id
information_schema.innodb_cmpmem.buffer_pool_instance
information_schema.innodb_cmpmem_reset.buffer_pool_instance
During native table rebuild or index creation, InnoDB used to skip
redo logging and write MLOG_INDEX_LOAD records to inform crash recovery
and Mariabackup of the gaps in redo log. This is fragile and prohibits
some optimizations, such as skipping the doublewrite buffer for
newly (re)initialized pages (MDEV-19738).
row_merge_write_redo(): Remove. We do not write MLOG_INDEX_LOAD
records any more. Instead, we write full redo log.
FlushObserver: Remove.
fseg_free_page_func(): Remove the parameter log. Redo logging
cannot be disabled.
fil_space_t::redo_skipped_count: Remove.
We cannot remove buf_block_t::skip_flush_check, because PageBulk
will temporarily generate invalid B-tree pages in the buffer pool.
Let us define page_id_t as a thin wrapper of uint64_t so that
the comparison operators can be simplified. This is a follow-up
to the original commit 14be814380.
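A minimal sketch of the thin wrapper (member and method names are
illustrative, not necessarily the actual ones):

  #include <cstdint>

  /* Encode the tablespace identifier in the most significant 32 bits
  and the page number in the least significant 32 bits, so that every
  comparison operator reduces to one uint64_t comparison. */
  class page_id_sketch_t
  {
    uint64_t m_id;
  public:
    page_id_sketch_t(uint32_t space, uint32_t page_no)
      : m_id(uint64_t{space} << 32 | page_no) {}
    uint32_t space() const { return uint32_t(m_id >> 32); }
    uint32_t page_no() const { return uint32_t(m_id); }
    bool operator==(const page_id_sketch_t &o) const
    { return m_id == o.m_id; }
    bool operator<(const page_id_sketch_t &o) const
    { return m_id < o.m_id; }
  };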
The comparison operator for recv_sys.pages.emplace() turned out to be
a busy spot in a recovery benchmark. That data structure was introduced
in MDEV-19586 in commit 177a571e01.
The linear scan of recv_sys_t::blocks() in recv_sys_t::free()
turns out to dominate the execution time in crash recovery.
Let us scan the much shorter buf_pool->chunks lists instead.
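A sketch of the shorter scan (identifiers are illustrative): locate
the chunk owning a block by address range, visiting only the handful
of chunks instead of every block.

  #include <cstddef>

  struct chunk_sketch_t { const char *mem; std::size_t mem_size; };

  const chunk_sketch_t *find_chunk_sketch(const chunk_sketch_t *chunks,
                                          std::size_t n, const void *ptr)
  {
    const char *p = static_cast<const char*>(ptr);
    for (std::size_t i = 0; i < n; i++)
      if (p >= chunks[i].mem && p < chunks[i].mem + chunks[i].mem_size)
        return &chunks[i];
    return nullptr;
  }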
ut_align_down(): Preserve the const qualifier. Use C++ casts.
ha_delete_hash_node(): Correct an assertion expression.
fil_page_get_type(): Perform an assumed-aligned read (see the sketch
after this list of changes).
page_align(): Preserve the const qualifier. Assume (some) alignment.
page_get_max_trx_id(): Check the index page type.
page_header_get_field(): Perform an assumed-aligned read.
page_get_autoinc(): Perform an assumed-aligned read.
page_dir_get_nth_slot(): Perform an assumed-aligned read.
Preserve the const qualifier.
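The assumed-aligned reads noted above can be sketched like this
(hypothetical helper built on a GCC/Clang builtin; the alignment
promise must actually hold, as it does for fields of an aligned
page frame):

  #include <cstdint>

  /* Read a 16-bit big-endian field, telling the compiler that the
  pointer is 2-byte aligned so it may emit a single aligned load. */
  inline uint16_t read2_aligned_sketch(const unsigned char *ptr)
  {
    ptr = static_cast<const unsigned char*>
      (__builtin_assume_aligned(ptr, 2));
    return uint16_t(ptr[0] << 8 | ptr[1]);
  }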
- The n_ext value may be less than dtuple_get_n_ext(dtuple) when the
PK is being updated and the new record inherits the externally stored
fields from the delete-marked old record.
Use bit-fields for some mtr_t members to improve locality of reference.
Because mtr_t is never shared between threads, there are no considerations
regarding concurrent access.
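A sketch of the packing (member names are illustrative, not the
exact mtr_t layout):

  #include <cstdint>

  /* Several small state fields packed into one word improve locality
  of reference. No atomicity is needed: a mini-transaction is only
  ever accessed by the thread that owns it. */
  struct mtr_state_sketch_t
  {
    uint32_t m_state:2;         /* e.g. active or committing */
    uint32_t m_log_mode:2;      /* redo logging mode */
    uint32_t m_modifications:1; /* whether any page was modified */
    uint32_t m_made_dirty:1;    /* whether a clean page was dirtied */
  };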
Since commit 5e62b6a5e0 (MDEV-16264),
purge_sys_t::stop() no longer waited for all purge activity to stop.
This caused problems on FLUSH TABLES...FOR EXPORT because of
purge running concurrently with the buffer pool flush.
The assertion at the end of buf_flush_dirty_pages() could fail.
The fix, implemented by Vladislav Vaintroub, aims to eliminate race
conditions when stopping or resuming purge (sketched below):
waitable_task::disable(): Wait for the task to complete, then replace
the task callback function with noop.
waitable_task::enable(): Restore the original task callback function
after disable().
purge_sys_t::stop(): Invoke purge_coordinator_task.disable().
purge_sys_t::resume(): Invoke purge_coordinator_task.enable().
purge_sys_t::running(): Add const qualifier, and clarify the comment.
The purge coordinator task will remain active as long as any purge
worker task is active.
purge_worker_callback(): Assert purge_sys.running().
srv_purge_wakeup(): Merge with the only caller purge_sys_t::resume().
purge_coordinator_task: Use static linkage.
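A simplified sketch of the disable()/enable() idea (not the actual
tpool interface; the mutex here stands in for whatever serializes
task execution):

  #include <functional>
  #include <mutex>

  class waitable_task_sketch_t
  {
    std::mutex m_mutex;            /* also held while executing */
    std::function<void()> m_func;  /* current callback */
    std::function<void()> m_saved; /* original callback */
  public:
    explicit waitable_task_sketch_t(std::function<void()> f)
      : m_func(std::move(f)) {}
    void execute()
    {
      std::lock_guard<std::mutex> g(m_mutex);
      m_func();
    }
    /* Wait for any ongoing execution to complete (by acquiring the
    mutex), then replace the callback with a no-op. */
    void disable()
    {
      std::lock_guard<std::mutex> g(m_mutex);
      m_saved = std::move(m_func);
      m_func = []{};
    }
    /* Restore the original callback after disable(). */
    void enable()
    {
      std::lock_guard<std::mutex> g(m_mutex);
      m_func = std::move(m_saved);
    }
  };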
Problem:
=======
After the table is discarded, fts_optimize_thread aborts during
shutdown. InnoDB fails to remove the table from fts_optimize_wq,
which leads fts_optimize_thread to look up the auxiliary table;
that lookup fails.
Fix:
====
While discarding the fts table, remove the table from fts_optimize_wq.
srv_export_innodb_status(): While gathering
innodb_mem_adaptive_hash, acquire btr_search_latches[i]
in order to prevent a race condition with buffer pool resizing.
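A generic illustration of the fixed race (placeholder types, not
the actual btr_search code): each partition's heap size is read only
while holding that partition's latch, so a concurrent resize cannot
free the memory mid-read.

  #include <cstddef>
  #include <shared_mutex>

  struct ahi_part_sketch_t
  {
    std::shared_mutex latch;  /* stands in for btr_search_latches[i] */
    std::size_t heap_size;
  };

  std::size_t adaptive_hash_memory_sketch(ahi_part_sketch_t *parts,
                                          std::size_t n)
  {
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; i++)
    {
      std::shared_lock<std::shared_mutex> s(parts[i].latch);
      total += parts[i].heap_size;
    }
    return total;
  }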
Release memory as soon as redo log records are processed.
Because the memory allocation and deallocation of parsed redo log
records must be protected by recv_sys.mutex, it is better to avoid
using a std::atomic field for bookkeeping.
buf_page_t::access_time: Keep track of the recv_sys.pages record
allocations. The most significant 16 bits will count allocated
blocks (which were previously counted by buf_page_t::buf_fix_count
in the debug version), and the least significant 16 bits indicate
the number of allocated bytes in the block (which was previously
managed in buf_block_t::modify_clock), which must be a positive
number, up to innodb_page_size. The byte offset 65536 is represented
as the value 0.
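The 16+16 bit encoding can be sketched as follows (helper names are
illustrative):

  #include <cstdint>

  /* Most significant 16 bits: the number of allocated blocks.
  Least significant 16 bits: bytes allocated in the current block;
  the maximum, 65536, is encoded as 0. */
  inline uint32_t pack_sketch(uint32_t blocks, uint32_t bytes)
  {
    return blocks << 16 | (bytes & 0xffff);
  }
  inline uint32_t allocated_blocks_sketch(uint32_t v) { return v >> 16; }
  inline uint32_t allocated_bytes_sketch(uint32_t v)
  {
    uint32_t b = v & 0xffff;
    return b ? b : 65536; /* 0 represents the byte offset 65536 */
  }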
recv_recover_page(): Let the caller erase the log.
recv_validate_tablespace(): Acquire recv_sys_t::mutex.
row_log_table_get_pk_old_col(): When replacing a NULL value of a
column of the primary key that is being added, look up the correct
default value, even if columns had been instantly reordered or
dropped earlier. This appears to have been broken ever since
commit 0e5a4ac253 (MDEV-15562).
ha_innobase::commit_inplace_alter_table(): After
ALTER_STORED_COLUMN_ORDER, ensure that the virtual column metadata
will be reloaded also when the table is not being rebuilt.
The column INFORMATION_SCHEMA.INNODB_MUTEXES.NAME has not been
populated ever since commit 2e814d4702 applied the InnoDB changes
from MySQL 5.7.9 to MariaDB Server 10.2.2.
Since the same commit, the view has only been providing information
about rw_lock_t, not any mutexes.
For now, let us convert the source code file name and line number of
the rw_lock_t creation into a name. A better option in the future might
be to store the information somewhere where it can be looked up by
mysql_pfs_key_t, and possibly to remove the CREATE_FILE and CREATE_LINE
columns.
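A sketch of that conversion (formatting details are illustrative):

  #include <cstddef>
  #include <cstdio>

  /* Derive a display name for the NAME column from the rw_lock_t
  creation site, e.g. "buf0buf.cc:1234". */
  static void make_lock_name_sketch(char *buf, std::size_t size,
                                    const char *cfile_name,
                                    unsigned cline)
  {
    std::snprintf(buf, size, "%s:%u", cfile_name, cline);
  }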
Problem:-
When we do a bulk insert with more than
MI_MIN_ROWS_TO_DISABLE_INDEXES (100) rows, we try to disable the
indexes to speed up the insert. But the current logic also disables
the long unique indexes.
Solution:- In ha_myisam::start_bulk_insert, if we find a long hash
index (HA_KEY_ALG_LONG_HASH), we will not disable it (see the sketch
below).
This commit also refactors the mi_disable_indexes_for_rebuild
function: since it is called in only one place, it is inlined into
start_bulk_insert.
mi_clear_key_active is added into myisamdef.h because it is now also
used in the ha_myisam.cc file.
(The same is done for the Aria storage engine.)
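A self-contained sketch of the check (types and names are
illustrative; in the server this corresponds to skipping
mi_clear_key_active() for HA_KEY_ALG_LONG_HASH keys):

  #include <cstdint>

  typedef uint64_t key_map_sketch_t;
  enum key_alg_sketch_t { KEY_ALG_BTREE, KEY_ALG_LONG_HASH };

  /* Clear the "active" bit of every key except long unique hash
  indexes, which must stay enabled during the bulk insert. */
  void disable_indexes_sketch(key_map_sketch_t &active,
                              const key_alg_sketch_t *alg,
                              unsigned n_keys)
  {
    for (unsigned i = 0; i < n_keys; i++)
      if (alg[i] != KEY_ALG_LONG_HASH)
        active &= ~(key_map_sketch_t{1} << i);
  }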