Commit graph

1122 commits

Author SHA1 Message Date
Marko Mäkelä
cc5c0eda4c MDEV-33156 Crash on innodb_buf_flush_list_now=ON and innodb_force_recovery=6
srv_start(): Move a read only mode startup tweak from
innodb_init_params() to the correct location. Also if
innodb_force_recovery=6 we will disable the doublewrite buffer,
because InnoDB must run in read-only mode to prevent further corruption.

This change only affects debug checks. Whenever srv_read_only_mode holds,
the buf_pool.flush_list will be empty, that is, there will be no writes
of persistent InnoDB data pages.

Reviewed by: Thirunarayanan Balathandayuthapani
2024-01-03 12:08:21 +02:00
Marko Mäkelä
c638051d80 MDEV-32798 innodb_fast_shutdown=0 hang after incomplete startup
innodb_preshutdown(): Only wait for active transactions to be terminated
if InnoDB was started and innodb_force_recovery=3 or larger does not
prevent a rollback.

This fixes the following:

./mtr --parallel=auto --mysqld=--innodb-fast-shutdown=0 \
innodb.log_file_size innodb.innodb_force_recovery \
innodb.read_only_recovery innodb.read_only_recover_committed \
mariabackup.apply-log-only-incr
2023-11-14 14:35:51 +02:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
Thirunarayanan Balathandayuthapani
85751ed81d MDEV-31851 After crash recovery, undo tablespace fails to open
srv_all_undo_tablespaces_open(): While opening the extra unused
undo tablespaces, InnoDB should use ULINT_UNDEFINED instead of
SRV_SPACE_ID_UPPER_BOUND.
2023-10-19 15:39:44 +05:30
Thirunarayanan Balathandayuthapani
3da5d047b8 MDEV-31851 After crash recovery, undo tablespace fails to open
Problem:
========
- InnoDB fails to open undo tablespace when page0 is corrupted
and fails to throw error.

Solution:
=========
- InnoDB throws DB_CORRUPTION error when InnoDB encounters
page0 corruption of undo tablespace.

- InnoDB restores the page0 of undo tablespace from
doublewrite buffer if it encounters page corruption

- Moved Datafile::restore_from_doublewrite() to
recv_dblwr_t::restore_first_page(). So that undo
tablespace and system tablespace can use this function
instead of duplicating the code

srv_undo_tablespace_open(): Returns 0 if file doesn't exist
or ULINT_UNDEFINED if page0 is corrupted.
2023-10-17 18:41:21 +05:30
Marko Mäkelä
6e9b421f77 MDEV-32364 Server crashes when starting server with high innodb_log_buffer_size
log_t::create(): Return whether the initialisation succeeded.
It may fail if too large an innodb_log_buffer_size is specified.
2023-10-06 14:16:01 +03:00
Vlad Lesin
96ae37abc5 MDEV-30658 lock_row_lock_current_waits counter in information_schema.innodb_metrics may become negative
The MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT monitor should have the
MONITOR_DISPLAY_CURRENT flag set in its definition, as it shows the
current state and does not accumulate anything.

Reviewed by: Marko Mäkelä
2023-10-05 18:27:54 +03:00
Marko Mäkelä
aeb8eae5c8 Merge 10.4 into 10.5 2023-08-24 10:12:13 +03:00
Marko Mäkelä
02878f128e MDEV-31813 SET GLOBAL innodb_max_purge_lag_wait hangs if innodb_read_only
innodb_max_purge_lag_wait_update(): Return immediately if we are
in high_level_read_only mode.

srv_wake_purge_thread_if_not_active(): Relax a debug assertion.
If srv_read_only_mode holds, purge_sys.enabled() will not hold
and this function will do nothing.

trx_t::commit_in_memory(): Remove a redundant condition before
invoking srv_wake_purge_thread_if_not_active().
2023-08-24 10:08:51 +03:00
Marko Mäkelä
c25b496724 MDEV-31382 SET GLOBAL innodb_undo_log_truncate=ON has no effect on logically empty undo logs
innodb_undo_log_truncate_update(): A callback function. If
SET GLOBAL innodb_undo_log_truncate=ON, invoke
srv_wake_purge_thread_if_not_active().

srv_wake_purge_thread_if_not_active(): If innodb_undo_log_truncate=ON,
always wake up the purge subsystem.

srv_do_purge(): If the history is empty, invoke
trx_purge_truncate_history() in order to free undo log pages.

trx_purge_truncate_history(): If head.trx_no==0, consider the
cached undo logs to be free.

trx_purge(): Remove the parameter "bool truncate" and let the
caller invoke trx_purge_truncate_history() directly.

Reviewed by: Vladislav Lesin
2023-06-08 09:18:21 +03:00
Marko Mäkelä
e0084b9d31 MDEV-31234 InnoDB does not free UNDO after the fix of MDEV-30671
trx_purge_truncate_history(): Only call trx_purge_truncate_rseg_history()
if the rollback segment is safe to process. This will avoid leaking undo
log pages that are not yet ready to be processed. This fixes a regression
that was introduced in
commit 0de3be8cfd (MDEV-30671).

trx_sys_t::any_active_transactions(): Separately count XA PREPARE
transactions.

srv_purge_should_exit(): Terminate slow shutdown if the history size
does not change and XA PREPARE transactions exist in the system.
This will avoid a hang of the test innodb.recovery_shutdown.

Tested by: Matthias Leich
2023-05-19 12:19:26 +03:00
Marko Mäkelä
50f3b7d164 MDEV-31124 Innodb_data_written miscounts doublewrites
When commit a5a2ef079c
implemented asynchronous doublewrite, the writes via
the doublewrite buffer started to be counted incorrectly,
without multiplying them by innodb_page_size.

srv_export_innodb_status(): Correctly count the
Innodb_data_written.

buf_dblwr_t: Remove submitted(), because it is close to written()
and only Innodb_data_written was interested in it. According to
its name, it should count completed and not submitted writes.
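
The unit mismatch described above can be pictured with a minimal sketch. It assumes that writes via the doublewrite buffer are counted in pages while other writes are counted in bytes. The function and parameter names are hypothetical and do not reflect the actual code of srv_export_innodb_status().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper that combines a byte counter with a page counter.
// direct_bytes: bytes written outside the doublewrite buffer (assumed counter)
// dblwr_pages:  pages written via the doublewrite buffer (assumed counter)
// page_size:    innodb_page_size in bytes
constexpr uint64_t data_written_bytes(uint64_t direct_bytes,
                                      uint64_t dblwr_pages,
                                      uint64_t page_size)
{
  // The miscount amounts to adding dblwr_pages without multiplying it
  // by page_size, which mixes pages and bytes in one sum.
  return direct_bytes + dblwr_pages * page_size;
}

// With 16 KiB pages, 10 doublewrite pages contribute 163840 bytes, not 10.
static_assert(data_written_bytes(4096, 10, 16384) == 4096 + 163840,
              "pages must be converted to bytes");
```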

Tested by: Axel Schwenke
2023-04-25 12:17:06 +03:00
Oleksandr Byelkin
ac5a534a4c Merge remote-tracking branch '10.4' into 10.5 2023-03-31 21:32:41 +02:00
Vlad Lesin
4c226c1850 MDEV-29050 mariabackup issues error messages during InnoDB tablespaces export on partial backup preparing
The solution is to suppress error messages for missing tablespaces if
mariabackup is launched with "--prepare --export" options.

"mariabackup --prepare --export" invokes itself with --mysqld parameter.
If the parameter is set, it starts the server to feed "FLUSH TABLES ...
FOR EXPORT;" queries for the exported tablespaces. Because this is a
"normal" server start, a new srv_operation value is introduced.

Reviewed by Marko Makela.
2023-03-27 20:15:10 +03:00
Marko Mäkelä
1495f057c8 MDEV-30860 Race condition between buffer pool flush and log file deletion in mariadb-backup --prepare
srv_start(): If we are going to close the log file in
mariadb-backup --prepare, call buf_flush_sync() before
calling recv_sys.debug_free() to ensure that the log file
will not be accessed.

This fixes a rather rare failure in the test
mariabackup.innodb_force_recovery where buf_flush_page_cleaner()
would invoke log_checkpoint_low() because !recv_recovery_is_on()
would hold due to the fact that recv_sys.debug_free() had
already been called. Then, the log write for the checkpoint
would fail because srv_start() had invoked log_sys.log.close_file().
2023-03-16 13:39:23 +02:00
Vlad Lesin
7d6b3d4008 MDEV-30775 Performance regression in fil_space_t::try_to_close() introduced in MDEV-23855
fil_node_open_file_low() tries to close files from the top of
fil_system.space_list if the number of opened files is exceeded.

It invokes fil_space_t::try_to_close(), which iterates the list searching
for the first opened space. Then it just closes the space, leaving it in
the same position in fil_system.space_list.

When many files are being opened, such as during 'SHOW TABLE STATUS ...'
execution, and the limit on the number of opened files is reached,
fil_space_t::try_to_close() iterates over more and more closed spaces before
reaching any opened space for each fil_node_open_file_low() call. This
causes a performance regression if the number of spaces is large enough.

The fix is to keep opened spaces at the top of fil_system.space_list,
and to move closed spaces to the end of the list.

For this purpose, the fil_space_t::space_list_last_opened pointer is
introduced. It points to the last inserted opened space in
fil_space_t::space_list. When a space is opened, it is inserted at the
position just after the one the pointer points to in
fil_space_t::space_list, to preserve the logic introduced in MDEV-23855.
Any closed space is added to the end of fil_space_t::space_list.

As opened spaces are located at the top of fil_space_t::space_list,
fil_space_t::try_to_close() finds opened space faster.
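
The list discipline described above can be sketched as follows. This is a toy model under stated assumptions, not the InnoDB implementation: std::list stands in for the real list type, and the names Space, SpaceList, add_closed(), open(), close() and names() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for a tablespace; not the real fil_space_t.
struct Space { std::string name; bool open; };

class SpaceList {
public:
  using Iter = std::list<Space>::iterator;

  SpaceList() : last_opened_(list_.end()) {}
  SpaceList(const SpaceList&) = delete;             // iterators would dangle
  SpaceList& operator=(const SpaceList&) = delete;

  // New spaces start out closed and go to the end of the list.
  Iter add_closed(const std::string& name)
  {
    list_.push_back(Space{name, false});
    return std::prev(list_.end());
  }

  // Move a newly opened space right after the last opened one.
  void open(Iter it)
  {
    assert(it != list_.end());
    if (it->open) return;
    it->open = true;
    Iter pos = last_opened_ == list_.end() ? list_.begin()
                                           : std::next(last_opened_);
    list_.splice(pos, list_, it);   // no-op if already in place
    last_opened_ = it;
  }

  // Move a closed space to the end of the list.
  void close(Iter it)
  {
    assert(it != list_.end());
    if (!it->open) return;
    it->open = false;
    if (it == last_opened_)
      last_opened_ = it == list_.begin() ? list_.end() : std::prev(it);
    list_.splice(list_.end(), list_, it);
  }

  // Close the first opened space. Returns how many nodes were examined,
  // or 0 if no space is open.
  std::size_t try_to_close()
  {
    std::size_t examined = 0;
    for (Iter it = list_.begin(); it != list_.end(); ++it) {
      ++examined;
      if (it->open) { close(it); return examined; }
    }
    return 0;
  }

  // Names in list order, for inspection.
  std::vector<std::string> names() const
  {
    std::vector<std::string> out;
    for (const Space& s : list_) out.push_back(s.name);
    return out;
  }

private:
  std::list<Space> list_;
  Iter last_opened_;  // most recently inserted open space, or end() if none
};
```

As long as opened spaces form a prefix of the list, try_to_close() in this sketch examines exactly one node when any space is open, no matter how many closed spaces follow.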

Opened and closed spaces can still be mixed in
fil_space_t::space_list if fil_system.freeze_space_list was set during
fil_node_open_file_low() execution. But this should not cause any error,
as fil_space_t::try_to_close() still iterates over the spaces in the list.

No test case is needed for the fix, as it does not change any
functionality; it only fixes a performance regression.
2023-03-10 18:31:10 +03:00
Marko Mäkelä
08267ba0c8 MDEV-30819 InnoDB fails to start up after downgrading from MariaDB 11.0
While downgrades are not supported, and misguided attempts at them could
cause serious corruption, especially after
commit b07920b634,
it might be useful if InnoDB would start up even after an upgrade to
MariaDB Server 11.0 or later had removed the change buffer.

innodb_change_buffering_update(): Disallow anything else than
innodb_change_buffering=none when the change buffer is corrupted.

ibuf_init_at_db_start(): Mention a possible downgrade in the corruption
error message. If innodb_change_buffering=none, ignore the error but do
not initialize ibuf.index.

ibuf_free_excess_pages(), ibuf_contract(), ibuf_merge_space(),
ibuf_update_max_tablespace_id(), ibuf_delete_for_discarded_space(),
ibuf_print(): Check for !ibuf.index.

ibuf_check_bitmap_on_import(): Remove some unnecessary code.
This function is only accessing change buffer bitmap pages in a
data file that is not attached to the rest of the database.
It is not accessing the change buffer tree itself, hence it does
not need any additional mutex protection.

This has been tested both by starting up MariaDB Server 10.8 on
a 11.0 data directory, and by running ./mtr --big-test while
ibuf_init_at_db_start() was tweaked to always fail.
2023-03-09 16:16:58 +02:00
Marko Mäkelä
e02ed04d17 MDEV-23855 fixup: Remove SRV_MASTER_CHECKPOINT_INTERVAL 2023-01-25 13:53:10 +02:00
Marko Mäkelä
b8f4b984f9 MDEV-24685 fixup: Remove srv_n_file_io_threads
The variable was not really being used for anything. The parameters
innodb_read_io_threads, innodb_write_io_threads have replaced
innodb_file_io_threads.
2022-12-16 17:08:56 +02:00
Marko Mäkelä
db14eb16f9 MDEV-30106 InnoDB fails to validate the change buffer on startup
ibuf_init_at_db_start(): Validate the change buffer root page.
A later version may stop creating a change buffer, and this
validation check will prevent a downgrade from such later versions.

ibuf_max_size_update(): If the change buffer was not loaded, do nothing.

dict_boot(): Merge the local variable "error" to "err". Ignore
failures of ibuf_init_at_db_start() if innodb_force_recovery>=4.
2022-11-28 11:34:22 +02:00
Marko Mäkelä
165564d3c3 MDEV-30009 InnoDB shutdown hangs when the change buffer is corrupted
The InnoDB change buffer (ibuf.index, stored in the system tablespace)
and the change buffer bitmaps in persistent tablespaces could get out
of sync with each other: According to the bitmap, no changes exist for
a page, while there actually exist buffered entries in ibuf.index.

InnoDB performs lazy deletion of buffered changes. When a secondary
index leaf page is freed (possibly as part of DROP INDEX), any
buffered changes will not be deleted. Instead, they would be deleted
on a subsequent buf_page_create_low().

One scenario where InnoDB failed to delete buffered changes is
as follows:
1. Some changes were buffered for a secondary index leaf page.
2. The index page had been freed.
3. ibuf_read_merge_pages() invoked ibuf_merge_or_delete_for_page(),
which noticed that the page had been freed, and reset the change buffer
bits, but did not delete the records from ibuf.index.
4. The index page was reallocated for something else.
5. The index page was removed from the buffer pool.
6. Some changes were buffered for the newly created page.
7. Finally, the buffered changes from both 1. and 6. were merged.
8. The index is corrupted.

An alternative outcome is:
4. Shutdown with innodb_fast_shutdown=0 gets into an infinite loop.

An alternative scenario is:
3. ibuf_set_bitmap_for_bulk_load() reset the IBUF_BITMAP_BUFFERED bit
but did not delete the ibuf.index records for that page number.

The shutdown hang was already once fixed in
commit d7a2401750, refactored for
10.5 in commit 77e8a311e1 and
disabled in commit 310dff5d84
due to corruption.

We will fix this as follows:

ibuf_delete_recs(): Delete all ibuf.index entries for the specified page.

ibuf_merge_or_delete_for_page(): When the change buffer bitmap bits
were set and the page had been freed, and the page does not belong
to ibuf.index itself, invoke ibuf_delete_recs(). This prevents the
corruption from occurring when a DML operation is allocating a
previously freed page for which changes had been buffered.
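
Steps 1 to 7 of the scenario above, and the effect of deleting the records when the bitmap bits are reset, can be illustrated with a toy model. Everything in it (ToyChangeBuffer, on_freed_page(), run_scenario(), the page number 42) is hypothetical; the real change buffer is a B-tree with per-tablespace bitmap pages, not a pair of standard containers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy model only; names and containers are illustrative.
struct ToyChangeBuffer {
  std::multimap<unsigned, std::string> recs;  // page number -> buffered change
  std::set<unsigned> bitmap;                  // pages flagged as having changes

  // Buffer a change for a page and flag the page in the bitmap.
  void buffer(unsigned page, std::string change)
  {
    assert(!change.empty());
    recs.emplace(page, std::move(change));
    bitmap.insert(page);
  }

  // The page was found to be freed: reset the bitmap bit and, if
  // delete_recs is set (modelling the fix), also drop the records.
  void on_freed_page(unsigned page, bool delete_recs)
  {
    bitmap.erase(page);
    if (delete_recs) recs.erase(page);
  }

  // Merge all buffered changes for a page; returns how many were applied.
  std::size_t merge(unsigned page)
  {
    const std::size_t n = recs.count(page);
    recs.erase(page);
    bitmap.erase(page);
    return n;
  }
};

// Replays the scenario; returns the number of changes merged in step 7.
inline std::size_t run_scenario(bool delete_recs)
{
  ToyChangeBuffer ibuf;
  ibuf.buffer(42, "change for the old index page");  // step 1
  ibuf.on_freed_page(42, delete_recs);               // steps 2 and 3
  ibuf.buffer(42, "change for the new page");        // steps 4 to 6
  return ibuf.merge(42);                             // step 7
}
```

In this model, run_scenario(false) merges two changes, one of which belongs to the old incarnation of the page, while run_scenario(true) merges only the change buffered for the new page.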

ibuf_set_bitmap_for_bulk_load(): When the change buffer bitmap bits
were set, invoke ibuf_delete_recs(). This prevents the corruption
from occurring when CREATE INDEX is reusing a previously freed page.

ibuf_read_merge_pages(): On slow shutdown, remove the orphan records
by invoking ibuf_delete_recs(). This fixes the hang when the change
buffer had become corrupted. We also remove the dops[] accounting,
because nothing can monitor it during shutdown. We invoke
ibuf_delete_recs() if:
(a) buf_page_get_gen() failed to load the page or merge changes
(b) the page is not a valid index leaf page
(c) the page number is out of tablespace bounds

srv_shutdown(): Invoke ibuf_max_size_update(0) to ensure that
the race condition that motivated us to disable the code in
ibuf_read_merge_pages() in commit 310dff5d84
is no longer possible. That is, during slow shutdown, both the
rollback of transactions and the purge of history will return
early from ibuf_insert_low().

ibuf_merge_space(), ibuf_delete_for_discarded_space(): Cleanup:
Do not allocate a memory heap.

This was implemented by Thirunarayanan Balathandayuthapani
and tested with innodb_change_buffering_debug=1 by Matthias Leich.
2022-11-23 17:34:05 +02:00
Marko Mäkelä
dc2741be52 MDEV-29984 innodb_fast_shutdown=0 fails to report change buffer merge progress
ibuf.size, ibuf.max_size: Changed the type to Atomic_relaxed<ulint>
in order to fix some (not all) race conditions.

ibuf_contract(): Renamed from ibuf_merge_pages(ulint*).

ibuf_merge(), ibuf_merge_all(): Removed.

srv_shutdown(): Invoke log_free_check() and ibuf_contract(). Even though
ibuf_contract() is not writing anything, it will trigger calls of
ibuf_merge_or_delete_for_page(), which will write something. Because
we cannot invoke log_free_check() at that low level, we must invoke
it at the high level.

srv_shutdown_print(): Replaces srv_shutdown_print_master_pending().
Report progress and remaining work every 15 seconds. For the
change buffer merge, the remaining work is indicated by ibuf.size.
2022-11-14 15:40:28 +02:00
Marko Mäkelä
ee7fba1749 MDEV-16264 fixup: Remove unused variables 2022-11-14 14:01:37 +02:00
Marko Mäkelä
2283f82de5 MDEV-29905 fixup: Remove some unnecessary code
srv_shutdown(): Do not call log_free_check(), because it will now
be repeatedly called by ibuf_merge_all(). Do not call
srv_sync_log_buffer_in_background(), because we do not actually care
about durability during shutdown. Log writes will already be triggered
by buf_flush_page_cleaner() for writing back modified pages, possibly by
log_free_check().

logs_empty_and_mark_files_at_shutdown(): Clean up a condition.
This function is the caller of srv_shutdown(), and it will ensure that
the log and the buffer pool will be in clean state before shutdown.
2022-11-14 10:22:29 +02:00
Marko Mäkelä
e29fb95614 Cleanup: Remove innobase_destroy_background_thd()
We do not need a non-inline wrapper for the function
destroy_background_thd().
2022-09-30 08:25:00 +03:00
Marko Mäkelä
fe7c95ec78 Cleanup: Declare srv_shutdown_bg_undo_sources() static 2022-09-26 13:45:53 +03:00
Marko Mäkelä
0792aff161 Merge 10.4 into 10.5 2022-09-20 13:17:02 +03:00
Marko Mäkelä
0c0a569028 Merge 10.3 into 10.4 2022-09-20 12:38:25 +03:00
Marko Mäkelä
c22dff21a5 InnoDB cleanup: Replace UNIV_LINUX, UNIV_SOLARIS, UNIV_AIX
Let us use the normal platform-specific preprocessor symbols
__linux__, __sun__, _AIX instead of some homebrew ones.

The preprocessor symbol UNIV_HPUX must have lost its meaning
by f6deb00a56 (note: the symbol
UNIV_HPUX10 is being checked for, but only UNIV_HPUX is defined).
2022-09-19 12:20:53 +03:00
Vladislav Vaintroub
fb70bb44d0 MDEV-29513 avoid useless os_thread_sleep() during srv_purge_shutdown()
use waitable_task.wait() function to wait for the end of previous purge
2022-09-12 12:24:26 +02:00
Marko Mäkelä
e5c4f4e590 Merge 10.3 into 10.4 2022-07-27 14:25:36 +03:00
Marko Mäkelä
0ee1082bd2 MDEV-28495 InnoDB corruption due to lack of file locking
Starting with commit da094188f6 (MDEV-24393),
MariaDB will no longer acquire advisory file locks on InnoDB data
files by default, because it would create a large number of
entries in Linux /proc/locks.

The motivation for acquiring the file locks is to prevent accidental
concurrent startup of multiple server processes on the same data files.
Such a mistake still turns out to be relatively common, based on
corruption bug reports from the community.

To prevent corruption due to concurrent startup attempts, the
Aria storage engine would unconditionally acquire an advisory lock
on one of its log files.

Solution: InnoDB will always lock its system tablespace files.
(Ever since commit 685d958e38
the InnoDB log file will not necessarily be open while the
server is running, because it can be accessed via memory-mapped I/O.)

If more protection is desired, then the option --external-locking
can be used.

The mandatory advisory lock also fixes intermittent failures of
some crash recovery tests. It turns out that when the mtr test harness
kills and restarts the server, it will not actually ensure that the
old process has terminated before starting the new one.
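
For illustration, a non-blocking exclusive advisory lock can be taken as in the following minimal sketch. It assumes POSIX fcntl() record locks; the commit message does not say which locking call InnoDB uses, and the helper name try_lock_exclusive() is hypothetical.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Try to take an exclusive advisory lock on the whole file without
// blocking. Returns true on success, false if another process holds a
// conflicting lock or on any other fcntl() error.
inline bool try_lock_exclusive(int fd)
{
  assert(fd >= 0);
  struct flock lk{};
  lk.l_type = F_WRLCK;     // exclusive (write) lock
  lk.l_whence = SEEK_SET;
  lk.l_start = 0;
  lk.l_len = 0;            // 0 means "until end of file"
  return fcntl(fd, F_SETLK, &lk) == 0;
}
```

Locks of this kind belong to the process, so a second server process that tries to lock the same file fails, while repeated calls within one process succeed. The lock is released when the process closes the file or exits.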
2022-07-27 14:15:14 +03:00
Marko Mäkelä
ea40c75c27 Merge 10.4 into 10.5 2022-05-25 14:24:51 +03:00
Marko Mäkelä
99c8aed00d MDEV-28601 InnoDB history list length was reverted to 32 bits
srv_do_purge(): In commit edde1f6e0d
when the de-facto 32-bit trx_sys_t::history_size() was replaced with
32-bit trx_sys.rseg_history_len, some more variables were changed
from ulint (size_t) to uint32_t.

The history list length is the number of committed transactions whose
undo logs are waiting to be purged. Each TRX_RSEG_HISTORY list is
storing the number of entries in a 32-bit field and each transaction
will occupy at least one undo log page. It is thinkable that the
length of each TRX_RSEG_HISTORY list may approach the maximum
representable number. The number cannot be exceeded, because the
rollback segment header is allocated from the same tablespace as
the undo log header pages it is pointing to, and because the page
numbers of a tablespace are stored in 32 bits. In any case, it is
possible that the total number of unpurged committed transactions
cannot be represented in 32 bits but needs up to 39 bits (corresponding to
128 rollback segments and undo tablespaces).
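
The 39-bit figure follows from 128 = 2^7 lists of fewer than 2^32 entries each. A short sketch of the arithmetic, with illustrative constant names that are not InnoDB identifiers:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constants, not InnoDB identifiers.
constexpr uint64_t kRollbackSegments = 128;            // 2^7 history lists
constexpr uint64_t kPerListBound = UINT64_C(1) << 32;  // exclusive bound of a 32-bit length

// Exclusive upper bound on the total history length over all lists.
constexpr uint64_t max_total_history()
{
  return kRollbackSegments * kPerListBound;
}

// Smallest number of bits b such that every value in [0, n) fits in b bits.
constexpr unsigned bits_for_range(uint64_t n, unsigned bits = 0)
{
  return (UINT64_C(1) << bits) >= n ? bits : bits_for_range(n, bits + 1);
}

static_assert(max_total_history() == UINT64_C(1) << 39, "128 * 2^32 = 2^39");
static_assert(bits_for_range(max_total_history()) == 39, "39 bits suffice");
static_assert(max_total_history() > UINT32_MAX, "does not fit in uint32_t");
```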
2022-05-25 14:06:04 +03:00
Sergei Golubchik
7970ac7fe8 Merge branch '10.4' into 10.5 2022-05-18 09:50:26 +02:00
Sergei Golubchik
23ddc3518f Merge branch '10.3' into 10.4 2022-05-18 01:25:30 +02:00
Marko Mäkelä
3e564d468d MDEV-28541 Unused counter Innodb_encryption_key_rotation_list_length
The counter srv_stats.key_rotation_list_length is never updated, and
therefore Innodb_encryption_key_rotation_list_length will always be 0.

The view INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION comes close
to reporting this information.
2022-05-16 13:45:17 +03:00
Marko Mäkelä
4e1bf2bb23 MDEV-28537 Unused or useless InnoDB counters num_index_pages_written, num_non_index_pages_written
The counters were added in commit 5e55d1ced5
and any code to update them was
inadvertently removed in commit 2e814d4702
when applying InnoDB changes from MySQL 5.7.

Let us remove these counters that never reported anything useful. If such
statistics are really needed in a special case, they can be obtained by
instrumenting the code by some means, such as eBPF or a source code patch.
2022-05-16 13:41:53 +03:00
Marko Mäkelä
c009ce7dd0 MDEV-27094 Debug builds include useless InnoDB "disabled" options
This is a backport of commit 4489a89c71
in order to remove the test innodb.redo_log_during_checkpoint
that would cause trouble in the DBUG subsystem invoked by
safe_mutex_lock() via log_checkpoint(). Before
commit 7cffb5f6e8
these mutexes were of different type.

The following options were introduced in
commit 2e814d4702 (mariadb-10.2.2)
and have little use:

innodb_disable_resize_buffer_pool_debug had no effect even in
MariaDB 10.2.2 or MySQL 5.7.9. It was introduced in
mysql/mysql-server@5c4094cf49
to work around a problem that was fixed in
mysql/mysql-server@2957ae4f99
(but the parameter was not removed).

innodb_page_cleaner_disabled_debug and innodb_master_thread_disabled_debug
are only used by the test innodb.redo_log_during_checkpoint
that will be removed as part of this commit.

innodb_dict_stats_disabled_debug is only used by that test,
and it is redundant because one could simply use
innodb_stats_persistent=OFF or the STATS_PERSISTENT=0 attribute
of the table in the test to achieve the same effect.
2022-04-22 12:48:40 +03:00
Marko Mäkelä
5d8dcfd86c MDEV-25975: Merge 10.4 into 10.5 2022-04-06 10:30:49 +03:00
Marko Mäkelä
d172df9913 MDEV-25975: Merge 10.3 into 10.4 2022-04-06 09:18:38 +03:00
Marko Mäkelä
e9735a8185 MDEV-25975 innodb_disallow_writes causes shutdown to hang
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.

During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:

sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.

sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.

The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.

Reviewed by: Jan Lindström
2022-04-06 08:06:49 +03:00
Vlad Lesin
6a3545dd1e MDEV-26322 Last binlog file and position are "empty" in mariabackup --prepare output
The issue is caused by commit 59a0236da4.
The initial intention of the commit was to speed up
"mariabackup --prepare".

The call stack of binlog position reading is the following:
trx_rseg_mem_restore
  trx_rseg_array_init
    trx_lists_init_at_db_start
      srv_start
Both trx_lists_init_at_db_start() and trx_rseg_mem_restore() contain
special cases for the condition srv_operation == SRV_OPERATION_RESTORE.
Under this condition, only rseg headers are read to parse the binlog
position, so the performance impact is small.

The solution is to revert 59a0236da4.
2022-04-04 12:19:09 +03:00
Marko Mäkelä
42609c240d Cleanup: Replace log_sys.n_pending_checkpoint_writes with a Boolean
Only one checkpoint may be in progress at a time.
The counter log_sys.n_pending_checkpoint_writes
was being protected by log_sys.mutex.
Let us replace it with the Boolean log_sys.checkpoint_pending.
2022-03-29 14:56:44 +03:00
Marko Mäkelä
b7016bd379 MDEV-26626 fixup: SIGFPE during startup
srv_start(): Set srv_startup_is_before_trx_rollback_phase before
starting the buf_flush_page_cleaner() thread, so that it will not
invoke log_checkpoint() before the log file has been created.

This race condition was reproduced with https://rr-project.org.
This fixes up commit 15efb7ed48
2022-03-29 14:53:51 +03:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Oleksandr Byelkin
41a163ac5c Merge branch '10.2' into 10.3 2022-01-29 15:41:05 +01:00
Daniel Black
410c4edef3 MDEV-27467: innodb to enforce the minimum innodb_buffer_pool_size in SET GLOBAL
.. to be the same as startup.

In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the innodb buffer pool size. Obviously we need more than just
flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.

The minimum innodb_buffer_pool_chunk_size (1M) restricts the minimum;
otherwise we'd have a pool made up of different chunk sizes.

The resulting minimum innodb buffer pool sizes are:

Page size   Previous minimum (startup)   With change
       4k                           5M            2M
       8k                           5M            3M
      16k                           5M            5M
      32k                          24M           10M
      64k                          24M           20M

With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.
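
The new minimums in the table above can be reproduced with a short calculation. The sketch assumes the rule implied by the text: BUF_LRU_MIN_LEN (256) pages plus an extra 25%, rounded up to a whole number of 1M chunks. The function and constant names are hypothetical, and the actual computation in ha_innodb may be written differently.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t MiB = UINT64_C(1) << 20;
constexpr uint64_t kMinLruPages = 256;    // BUF_LRU_MIN_LEN
constexpr uint64_t kMinChunkBytes = MiB;  // minimum innodb_buffer_pool_chunk_size

// Assumed rule: 256 pages plus 25% (that is, 256 * page_size * 5 / 4),
// rounded up to a multiple of the minimum chunk size.
constexpr uint64_t min_pool_bytes(uint64_t page_size)
{
  return (kMinLruPages * page_size * 5 / 4 + kMinChunkBytes - 1)
         / kMinChunkBytes * kMinChunkBytes;
}

static_assert(min_pool_bytes(4096) == 2 * MiB, "4k pages");
static_assert(min_pool_bytes(8192) == 3 * MiB, "8k pages");
static_assert(min_pool_bytes(16384) == 5 * MiB, "16k pages");
static_assert(min_pool_bytes(32768) == 10 * MiB, "32k pages");
static_assert(min_pool_bytes(65536) == 20 * MiB, "64k pages");
```

For example, with 8k pages the raw requirement is 2.5M, which rounds up to three 1M chunks.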

The evident minimum system variable size for innodb_buffer_pool_size
is 2M; however, this is only settable if using the 4k page size. As
the order of the page_size and buffer_pool_size isn't fixed, we can't
hide this change.

Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised the sizes used for the
  pool resize due to the new minimums. The chunk size also needed an increase,
  as the test relies on pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced its use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced it with a constant definition in
  MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to allow access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
  innodb.restart.opt test so that under debug mode, resizing of the
  innodb buffer pool can occur.
2022-01-19 11:10:45 +11:00
Marko Mäkelä
4c3ad24413 MDEV-27416 InnoDB hang in buf_flush_wait_flushed(), on log checkpoint
InnoDB could sometimes hang when triggering a log checkpoint. This is
due to commit 7b1252c03d (MDEV-24278),
which introduced an untimed wait to buf_flush_page_cleaner().

The hang was noticed by occasional failures of IMPORT TABLESPACE tests,
such as innodb.innodb-wl5522, which would (unnecessarily) invoke
log_make_checkpoint() from row_import_cleanup().

The reason for the hang was that buf_flush_page_cleaner() would enter an
untimed sleep despite buf_flush_sync_lsn being set. The exact failure
scenario is unclear, because buf_flush_sync_lsn should actually be
protected by buf_pool.flush_list_mutex. We prevent the hang by
invoking buf_pool.page_cleaner_set_idle(false) whenever we are
setting buf_flush_sync_lsn and signaling buf_pool.do_flush_list.

The bulk of these changes was originally developed as a preparation
for MDEV-26827, to invoke buf_flush_list() from fewer threads,
and tested on 10.6 by Matthias Leich.

This fix was tested by running 100 repetitions of 100 concurrent instances
of the test innodb.innodb-wl5522 on a RelWithDebInfo build, using ext4fs
and innodb_flush_method=O_DIRECT on a SATA SSD with 4096-byte block size.
During the test, the call to log_make_checkpoint() in row_import_cleanup()
was present.

buf_flush_list(): Make static.

buf_flush_wait(): Wait for buf_pool.get_oldest_modification()
to reach a target, by work done in the buf_flush_page_cleaner.
If buf_flush_sync_lsn is going to be set, we will invoke
buf_pool.page_cleaner_set_idle(false).

buf_flush_ahead(): If buf_flush_sync_lsn or buf_flush_async_lsn
is going to be set and the page cleaner woken up, we will invoke
buf_pool.page_cleaner_set_idle(false).

buf_flush_wait_flushed(): Invoke buf_flush_wait().

buf_flush_sync(): Invoke recv_sys.apply() at the start in case
crash recovery is active. Invoke buf_flush_wait().

buf_flush_sync_batch(): A lower-level variant of buf_flush_sync()
that is only called by recv_sys_t::apply().

buf_flush_sync_for_checkpoint(): Do not trigger log apply
or checkpoint during recovery.

buf_dblwr_t::create(): Only initiate a buffer pool flush, not
a checkpoint.

row_import_cleanup(): Do not unnecessarily invoke log_make_checkpoint().
Invoking buf_flush_list_space() before starting to generate redo log
for the imported tablespace should suffice.

srv_prepare_to_delete_redo_log_file():
Set recv_sys.recovery_on in order to prevent
buf_flush_sync_for_checkpoint() from initiating a checkpoint
while the log is inaccessible. Remove a wait loop that is already
part of buf_flush_sync().
Do not invoke fil_names_clear() if the log is being upgraded,
because the FILE_MODIFY record is specific to the latest format.

create_log_file(): Clear recv_sys.recovery_on only after calling
log_make_checkpoint(), to prevent buf_flush_page_cleaner from
invoking a checkpoint.

innodb_shutdown(): Simplify the logic in mariadb-backup --prepare.

os_aio_wait_until_no_pending_writes(): Update the function comment.
Apart from row_quiesce_table_start() during FLUSH TABLES...FOR EXPORT,
this is being called by buf_flush_list_space(), which is invoked
by ALTER TABLE...IMPORT TABLESPACE as well as some encryption operations.
2022-01-04 07:40:31 +02:00