log_t::create(), log_t::attach(): Return whether the initialisation
succeeded. It may fail if too large an innodb_log_buffer_size is specified.
recv_sys_t::close_files(): Actually close the data files so that the
test mariabackup.huge_lsn,strict_crc32 will not fail on Microsoft Windows
when renaming ib_logfile101 due to a leaked file handle of ib_logfile0.
recv_sys_t::find_checkpoint(): Register recv_sys.files[0] as OS_FILE_CLOSED
because the file handle has already been attached to log_sys.log and
we do not want to close the file twice.
recv_sys_t::read(): Access the first log file via log_sys.log.
This is a port of commit 6e9b421f77
adapted to commit 685d958e38 (MDEV-14425).
The test case is omitted, because it would fail to fail when the log
is stored in persistent memory (or "fake PMEM" on Linux /dev/shm).
MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT monitor should has
MONITOR_DISPLAY_CURRENT flag set in its definition, as it shows the
current state and does not accumulate anything.
Reviewed by: Marko Mäkelä
Add threadpool functionality to restrict concurrency during "batch"
periods (where tasks are added in rapid succession).
This will throttle thread creation more agressively than usual, while
keeping performance at least on-par.
One of these cases is bufferpool load, where async read IOs are executed
without any throttling. There can be as much as 650K read IOs for
loading 10GB buffer pool.
Another one is recovery, where "fake read" IOs are executed.
Why there are more threads than we expect?
Worker threads are not be recognized as idle, until they return to the
standby list, and to return to that list, they need to acquire
mutex currently held in the submit_task(). In those cases, submit_task()
has no worker to wake, and would create threads until default concurrency
level (2*ncpus) is satisfied. Only after that throttling would happen.
innodb_max_purge_lag_wait_update(): Return immediately if we are
in high_level_read_only mode.
srv_wake_purge_thread_if_not_active(): Relax a debug assertion.
If srv_read_only_mode holds, purge_sys.enabled() will not hold
and this function will do nothing.
trx_t::commit_in_memory(): Remove a redundant condition before
invoking srv_wake_purge_thread_if_not_active().
In commit 03ca6495df and
commit ff5d306e29
we forgot to remove some Google copyright notices related to
a contribution of using atomic memory access in the old InnoDB
mutex_t and rw_lock_t implementation.
The copyright notices had been mostly added in
commit c6232c06fa
due to commit a1bb700fd2.
The following Google contributions remain:
* some logic related to the parameter innodb_io_capacity
* innodb_encrypt_tables, added in MariaDB Server 10.1
Before MDEV-24671, the wait time was derived from my_interval_timer() /
1000 (nanoseconds converted to microseconds, and not microseconds to
milliseconds like I must have assumed). The lock_sys.wait_time and
lock_sys.wait_time_max are already in milliseconds; we should not divide
them by 1000.
In MDEV-24738 the millisecond counts lock_sys.wait_time and
lock_sys.wait_time_max were changed to a 32-bit type. That would
overflow in 49.7 days. Keep using a 64-bit type for those millisecond
counters.
Reviewed by: Marko Mäkelä
innodb_undo_log_truncate_update(): A callback function. If
SET GLOBAL innodb_undo_log_truncate=ON, invoke
srv_wake_purge_thread_if_not_active().
srv_wake_purge_thread_if_not_active(): If innodb_undo_log_truncate=ON,
always wake up the purge subsystem.
srv_do_purge(): If the history is empty, invoke
trx_purge_truncate_history() in order to free undo log pages.
trx_purge_truncate_history(): If head.trx_no==0, consider the
cached undo logs to be free.
trx_purge(): Remove the parameter "bool truncate" and let the
caller invoke trx_purge_truncate_history() directly.
Reviewed by: Vladislav Lesin
srv_export_innodb_status(): Update
export_vars.innodb_buffer_pool_read_requests as it was done
before commit a55b951e60 (MDEV-26827).
If innodb_status_variables[] pointed to a sharded variable, it would
only access the first shard.
trx_purge_truncate_history(): Only call trx_purge_truncate_rseg_history()
if the rollback segment is safe to process. This will avoid leaking undo
log pages that are not yet ready to be processed. This fixes a regression
that was introduced in
commit 0de3be8cfd (MDEV-30671).
trx_sys_t::any_active_transactions(): Separately count XA PREPARE
transactions.
srv_purge_should_exit(): Terminate slow shutdown if the history size
does not change and XA PREPARE transactions exist in the system.
This will avoid a hang of the test innodb.recovery_shutdown.
Tested by: Matthias Leich
srv_export_innodb_status(): Update
export_vars.innodb_buffer_pool_read_requests as it was done
before commit a55b951e60 (MDEV-26827).
If innodb_status_variables[] pointed to a sharded variable, it would
only access the first shard.
trx_purge_truncate_history(): Only call trx_purge_truncate_rseg_history()
if the rollback segment is safe to process. This will avoid leaking undo
log pages that are not yet ready to be processed. This fixes a regression
that was introduced in
commit 0de3be8cfd (MDEV-30671).
trx_sys_t::any_active_transactions(): Separately count XA PREPARE
transactions.
srv_purge_should_exit(): Terminate slow shutdown if the history size
does not change and XA PREPARE transactions exist in the system.
This will avoid a hang of the test innodb.recovery_shutdown.
Tested by: Matthias Leich
The InnoDB buffer pool and locking were heavily refactored in
MariaDB Server 10.6. Among other things, dict_sys.mutex was removed,
and the contended lock_sys.mutex was replaced with a combination of
lock_sys.latch and distributed latches in hash tables. Also, a
default value was changed to innodb_flush_method=O_DIRECT to improve
performance in write-heavy workloads.
One thing where an adjustment was missing is around the parameters
innodb_max_purge_lag (number of committed transactions waiting to
be purged), and innodb_max_purge_lag_delay
(maximum number of microseconds to delay a DML operation).
purge_coordinator_state::do_purge(): Pass the history_size to trx_purge()
and reset srv_dml_needed_delay if the history is empty.
Keep executing the loop non-stop as long as srv_dml_needed_delay is set.
trx_purge_dml_delay(): Made part of trx_purge().
Set srv_dml_needed_delay=0 when nothing can be purged (!n_pages_handled).
row_mysql_delay_if_needed(): Mimic the logic of
innodb_max_purge_lag_wait_update().
Reviewed by: Thirunarayanan Balathandayuthapani
When commit a5a2ef079c
implemented asynchronous doublewrite, the writes via
the doublewrite buffer started to be counted incorrectly,
without multiplying them by innodb_page_size.
srv_export_innodb_status(): Correctly count the
Innodb_data_written.
buf_dblwr_t: Remove submitted(), because it is close to written()
and only Innodb_data_written was interested in it. According to
its name, it should count completed and not submitted writes.
Tested by: Axel Schwenke
os_aio_wait_until_no_pending_reads(), os_aio_wait_until_pending_writes():
Add a Boolean parameter to indicate whether the wait should be declared
in the thread pool.
buf_flush_wait(): The callers have already declared a wait, so let us
avoid doing that again, just call os_aio_wait_until_pending_writes(false).
buf_flush_wait_flushed(): Do not declare a wait in the rare case that
the buf_flush_page_cleaner thread has been shut down already.
buf_flush_page_cleaner(), buf_flush_buffer_pool(): In the code that runs
during shutdown, do not declare waits.
buf_flush_buffer_pool(): Remove a debug assertion that might fail.
What really matters here is buf_pool.flush_list.count==0.
buf_read_recv_pages(), srv_prepare_to_delete_redo_log_file():
Do not declare waits during InnoDB startup.
Let us remove explicit updates of MONITOR_NUM_UNDO_SLOT_USED
and MONITOR_NUM_UNDO_SLOT_CACHED, and let us compute the rough values
from trx_sys.rseg_array[] on demand.
buf_LRU_get_free_block(): Always wake up the page cleaner if needed
before exiting the inner loop.
srv_prepare_to_delete_redo_log_file():
Replace a debug assertion with a wait in debug builds.
Starting with commit 7e31a8e7fa
the debug assertion ut_ad(!os_aio_pending_writes())
could occasionally fail, while it would hold in core dumps of crashes.
The failure can be reproduced more easily by adding a sleep to the
write completion callback function, right before releasing to
write_slots.
srv_start(): Remove a bogus debug assertion
ut_ad(!os_aio_pending_writes()) that could fail in
mariadb-backup --prepare. In an rr replay trace, we had
buf_pool.flush_list.count==0 but write_slots->m_cache.m_pos==1
and buf_page_t::write_complete() was executing u_unlock().
fil_space_t::create(), fil_space_t::add(): Expect the caller to
acquire and release fil_system.mutex. In this way, creating a tablespace
and adding the first (usually only) data file will be atomic.
recv_sys_t::recover_deferred(): Correctly protect some changes by
holding fil_system.mutex.
Tested by: Matthias Leich