mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-17 04:22:27 +01:00

Author	SHA1	Message	Date
Daniel Black	fbd11d5f29	MDEV-18200 MariaBackup full backup failed with InnoDB: Failing assertion: success Review cleanups.	2023-10-13 09:48:57 +11:00
Daniel Black	c79ca7c7ad	MDEV-18200 MariaBackup full backup failed with InnoDB: Failing assertion: success There are many filesystem related errors that can occur with MariaBackup. These are already output to stderr with a good description of the error. Many of these are permission or resource (file descriptor) limits where the assertion and resulting core crash doesn't offer developers anything more than the log message. To the user, assertions and core crashes come across as poor error handling. As such we return an error and handle this all the way up the stack.	2023-10-12 21:37:27 +11:00
Marko Mäkelä	6e9b421f77	MDEV-32364 Server crashes when starting server with high innodb_log_buffer_size log_t::create(): Return whether the initialisation succeeded. It may fail if too large an innodb_log_buffer_size is specified.	2023-10-06 14:16:01 +03:00
Marko Mäkelä	84dbd0253d	MDEV-31487: Recovery or backup failure after innodb_undo_log_truncate=ON recv_sys_t::parse(): For undo tablespace truncation mini-transactions, remember the start_lsn instead of the end LSN. This is what we expect after commit `461402a564` (MDEV-30479).	2023-06-27 09:12:38 +03:00
Marko Mäkelä	bb9da13baf	MDEV-31373 innodb_undo_log_truncate=ON recovery results in a corrupted undo log recv_sys_t::apply(): When applying an undo log truncation operation, invoke os_file_truncate() on space->recv_size, which must not be less than the original truncated file size. Alternatively, as pointed out by Thirunarayanan Balathandayuthapani, we could assign space->size = t.pages, so that fil_system_t::extend_to_recv_size() would extend the file back to space->recv_size.	2023-06-01 12:11:18 +03:00
Oleksandr Byelkin	ac5a534a4c	Merge remote-tracking branch '10.4' into 10.5	2023-03-31 21:32:41 +02:00
Vlad Lesin	4c226c1850	MDEV-29050 mariabackup issues error messages during InnoDB tablespaces export on partial backup preparing The solution is to suppress error messages for missing tablespaces if mariabackup is launched with "--prepare --export" options. "mariabackup --prepare --export" invokes itself with --mysqld parameter. If the parameter is set, then it starts server to feed "FLUSH TABLES ... FOR EXPORT;" queries for exported tablespaces. This is "normal" server start, that's why new srv_operation value is introduced. Reviewed by Marko Makela.	2023-03-27 20:15:10 +03:00
Marko Mäkelä	5300c0fb76	MDEV-30657 InnoDB: Not applying UNDO_APPEND due to corruption This almost completely reverts commit `acd23da4c2` and retains a safe optimization: recv_sys_t::parse(): Remove any old redo log records for the truncated tablespace, to free up memory earlier. If recovery consists of multiple batches, then recv_sys_t::apply() must invoke recv_sys_t::trim() again to avoid wrongly applying old log records to an already truncated undo tablespace.	2023-02-15 18:16:41 +02:00
Thirunarayanan Balathandayuthapani	1a5c7552ea	MDEV-30552 InnoDB recovery crashes when error handling scenario - InnoDB fails to reset the after_apply variable before applying the redo log in last batch during multi-batch recovery.	2023-02-14 14:36:17 +05:30
Thirunarayanan Balathandayuthapani	3eea2e8e10	MDEV-30551 InnoDB recovery hangs when buffer pool ran out of memory - During a non-last batch of multi-batch recovery, InnoDB holds log_sys.mutex and preallocates the block, which may initiate a page flush, which may initiate a log flush, which requires acquiring log_sys.mutex again. This leads to an assertion failure. So InnoDB recovery should release log_sys.mutex before preallocating the block.	2023-02-14 14:35:35 +05:30
Marko Mäkelä	acd23da4c2	MDEV-30479 optimization: Invoke recv_sys_t::trim() earlier recv_sys_t::parse(): Discard old page-level redo log when parsing a TRIM_PAGES record. recv_sys_t::apply(): trim() was invoked in parse() already. recv_sys_t::truncated_undo_spaces[]: Only store the size, no LSN.	2023-02-06 20:29:42 +02:00
Marko Mäkelä	461402a564	MDEV-30479 OPT_PAGE_CHECKSUM mismatch after innodb_undo_log_truncate=ON page_recv_t::trim(): Do remove log records for mini-transactions that end right at the threshold LSN. This will avoid an inconsistency where a dirty page had been evicted from the buffer pool during undo tablespace truncation, and recovery would attempt to apply log records for which the last available copy in the data file is too new. These changes would be discarded anyway.	2023-02-06 20:29:29 +02:00
Thirunarayanan Balathandayuthapani	17858e03a7	MDEV-30179 mariabackup --backup fails with FATAL ERROR: ... failed to copy datafile - Mariabackup fails to copy the undo log tablespace when it undergoes truncation. So Mariabackup should detect the redo log which does undo tablespace truncation and also backup should read the minimum file size of the tablespace and ignore the error while reading. - Throw error when innodb undo tablespace read failed, but backup doesn't find the redo log for undo tablespace truncation	2023-01-10 15:47:13 +05:30
Marko Mäkelä	bd694bb7b2	MDEV-24412 InnoDB: Upgrade after a crash is not supported recv_log_recover_10_4(): Widen the operand of bitwise and to 64 bits, so that the upgrade check will work when the redo log record is located more than 4 gigabytes from the start of the first file.	2022-11-28 11:56:09 +02:00
Marko Mäkelä	9d388192c7	Cleanup: Say "mariadbd" instead of "mysqld" in InnoDB messages	2022-11-22 15:32:47 +02:00
Marko Mäkelä	cff9939d09	MDEV-30068 Confusing error message when encryption is not available on recovery fil_name_process(): If fil_ibd_load() returns FIL_LOAD_INVALID, display the file name and the tablespace identifier.	2022-11-22 15:31:12 +02:00
Marko Mäkelä	0b25551a61	MDEV-29999 innodb_undo_log_truncate=ON is not crash safe If a log checkpoint occurs at the end LSN of mtr.commit_shrink(space) in trx_purge_truncate_history(), then recovery may fail because it could try to apply too old log records to too old copies of undo log pages. This was repeated with the following test: ./mtr innodb.undo_log_truncate,4k,strict_full_crc32 recv_sys_t::trim(): Move some code to the caller. recv_sys_t::apply(): For undo tablespace truncation, discard all old redo log for the undo tablespace, and then truncate the file to the desired size. Tested by: Matthias Leich	2022-11-15 16:56:13 +02:00
Marko Mäkelä	e0e096faaa	MDEV-29982 Improve the InnoDB log overwrite error message The InnoDB write-ahead log ib_logfile0 is of fixed size, specified by innodb_log_file_size. If the tail of the log manages to overwrite the head (latest checkpoint) of the log, crash recovery will be broken. Let us clarify the messages about this, including adding a message on the completion of a log checkpoint that notes that the dangerous situation is over. To reproduce the dangerous scenario, we will introduce the debug injection label ib_log_checkpoint_avoid_hard, which will avoid log checkpoints even harder than the previous ib_log_checkpoint_avoid. log_t::overwrite_warned: The first known dangerous log sequence number. Set in log_close() and cleared in log_write_checkpoint_info(), which will output a "Crash recovery was broken" message.	2022-11-14 12:18:03 +02:00
Marko Mäkelä	2283f82de5	MDEV-29905 fixup: Remove some unnecessary code srv_shutdown(): Do not call log_free_check(), because it will now be repeatedly called by ibuf_merge_all(). Do not call srv_sync_log_buffer_in_background(), because we do not actually care about durability during shutdown. Log writes will already be triggered by buf_flush_page_cleaner() for writing back modified pages, possibly by log_free_check(). logs_empty_and_mark_files_at_shutdown(): Clean up a condition. This function is the caller of srv_shutdown(), and it will ensure that the log and the buffer pool will be in clean state before shutdown.	2022-11-14 10:22:29 +02:00
Marko Mäkelä	a732d5e2ba	Merge 10.4 into 10.5	2022-11-08 17:01:28 +02:00
Marko Mäkelä	93b4f84ab2	Merge 10.3 into 10.4	2022-11-08 16:04:01 +02:00
Marko Mäkelä	9ac8be4e29	Include some advice in the crash-upgrade message	2022-11-08 10:39:29 +02:00
Marko Mäkelä	2f1a4328cb	MDEV-29613 fixup: clang -Wunused-but-set-variable	2022-10-11 15:36:24 +03:00
Marko Mäkelä	fed0d85de7	MDEV-29559 Recovery of INSERT_HEAP_DYNAMIC into secondary index fails log_phys_t::apply(): When parsing an INSERT_HEAP_DYNAMIC record, allow ll==rlen to hold for the last part. A secondary index record may inherit all preceding bytes from the infimum pseudo-record. For INSERT_HEAP_REDUNDANT, some header bytes will always be present because the header will never be copied from the page infimum. We will tolerate ll==rlen also in that case to be consistent with the parsing of INSERT_HEAP_DYNAMIC.	2022-09-19 11:46:25 +03:00
Marko Mäkelä	244fdc435d	MDEV-29438 Recovery or backup of instant ALTER TABLE is incorrect This bug was found in MariaDB Server 10.6 thanks to the OPT_PAGE_CHECKSUM record that was implemented in commit `4179f93d28` for catching this type of recovery failures. page_cur_insert_rec_low(): If the previous record is the page infimum, correctly limit the end of the record. We do not want to copy data from the header of the page supremum. This omission caused the incorrect recovery of DB_TRX_ID in an instant ALTER TABLE metadata record, because part of the DB_TRX_ID was incorrectly copied from the n_owned of the page supremum, which in recovery would be updated after the copying, but in normal operation would already have been updated at the time the common prefix was being determined. log_phys_t::apply(): If a data page is found to be corrupted, do not flag the log corrupted but instead return a new status APPLIED_CORRUPTED so that the caller may discard all log for this page. We do not want the recovery of unrelated pages to fail in recv_recover_page(). No test case is included, because the known test case would only work in 10.6, and even after this fix, it would trigger another bug in instant ALTER TABLE crash recovery.	2022-09-05 09:54:47 +03:00
Marko Mäkelä	5d8dcfd86c	MDEV-25975: Merge 10.4 into 10.5	2022-04-06 10:30:49 +03:00
Marko Mäkelä	cbdf62ae90	MDEV-25975 merge fixup	2022-04-06 10:13:21 +03:00
Marko Mäkelä	d172df9913	MDEV-25975: Merge 10.3 into 10.4	2022-04-06 09:18:38 +03:00
Marko Mäkelä	e9735a8185	MDEV-25975 innodb_disallow_writes causes shutdown to hang We will remove the parameter innodb_disallow_writes because it is badly designed and implemented. The parameter was never allowed at startup. It was only internally used by Galera snapshot transfer. If a user executed SET GLOBAL innodb_disallow_writes=ON; the server could hang even on subsequent read operations. During Galera snapshot transfer, we will block writes to implement an rsync friendly snapshot, as follows: sst_flush_tables() will acquire a global lock by executing FLUSH TABLES WITH READ LOCK, which will block any writes at the high level. sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true), will suspend or disable InnoDB background tasks or threads that could initiate writes. As part of this, log_make_checkpoint() will be invoked to ensure that anything in the InnoDB buf_pool.flush_list will be written to the data files. This has the nice side effect that the Galera joiner will avoid crash recovery. The changes to sql/wsrep.cc and to the tests are based on a prototype that was developed by Jan Lindström. Reviewed by: Jan Lindström	2022-04-06 08:06:49 +03:00
Marko Mäkelä	42609c240d	Cleanup: Replace log_sys.n_pending_checkpoint_writes with a Boolean Only one checkpoint may be in progress at a time. The counter log_sys.n_pending_checkpoint_writes was being protected by log_sys.mutex. Let us replace it with the Boolean log_sys.checkpoint_pending.	2022-03-29 14:56:44 +03:00
Marko Mäkelä	5503c40460	Stabilize innodb.redo_log_during_checkpoint Externally kill and restart the server, and remove the unreliable crash_after_checkpoint.	2022-03-11 09:46:50 +02:00
Vladislav Vaintroub	881918bf77	MDEV-27754 : Assertion with innodb_flush_method=O_DSYNC If innodb_flush_method=O_DSYNC, log_sys.flushed_to_disk_lsn is changed without 'flush_lock' protection inside log_write(). This leads to a race condition if there are 2 threads running in parallel, doing log_write_up_to() with different values for 'flush_to_disk'. In this case, log_write() and log_write_flush_to_disk_low() can execute at the same time, and both would change flushed_lsn. The fix is to remove special treatment of durable writes from log_write(). There is no apparent reason for this special treatment; log_write_flush_to_disk_low() is already optimized for durable writes. Nor is there an apparent reason to call log_flush_notify() more often for O_DSYNC.	2022-02-07 09:14:00 +01:00
Marko Mäkelä	56f5599f09	MDEV-27610 Unnecessary wait in InnoDB crash recovery In recv_sys_t::apply(), we were unnecessarily looking up pages in buf_pool.page_hash and potentially waiting for exclusive page latches. Before buf_page_get_low() would return an x-latched page, that page will have to be read and buf_page_read_complete() would have invoked recv_recover_page() to apply the log to the page. Therefore, it suffices to invoke recv_read_in_area() to trigger a transition from RECV_NOT_PROCESSED. recv_read_in_area(): Take the iterator as a parameter, and remove page_id lookups. Should the page already be in buf_pool.page_hash, buf_page_init_for_read() will return nullptr to buf_read_page_low() and buf_read_page_background(). recv_sys_t::apply(): Replace goto, remove dead code, and add assertions to guarantee that the iteration will make progress. Reviewed by: Vladislav Lesin	2022-01-26 13:29:34 +02:00
Marko Mäkelä	8535c260dd	Remove FIXME comments that refer to an early MDEV-14425 plan In MDEV-14425, an early plan was to introduce a separate log file for file-level records and checkpoint information. The reasoning was that fil_system.mutex contention would be reduced by not having to maintain fil_system.named_spaces. The mutex contention was actually fixed in MDEV-23855 by making some data fields in fil_space_t and fil_node_t use std::atomic. Using a single circular log file simplifies recovery and backup.	2022-01-14 20:27:51 +02:00
Eugene Kosov	f443cd1100	MDEV-27022 Buffer pool is being flushed during recovery The problem was introduced by the removal of buf_pool.flush_rbt in commit `46b1f50098` (MDEV-23399). recv_sys_t::apply(): don't write to disk and fsync() the last batch. Instead, sort it by oldest_modification for MariaDB server and some mariabackup operations. log_sort_flush_list(): a thread-safe function which sorts buf_pool::flush_list	2022-01-11 16:20:20 +03:00
Marko Mäkelä	4c3ad24413	MDEV-27416 InnoDB hang in buf_flush_wait_flushed(), on log checkpoint InnoDB could sometimes hang when triggering a log checkpoint. This is due to commit `7b1252c03d` (MDEV-24278), which introduced an untimed wait to buf_flush_page_cleaner(). The hang was noticed by occasional failures of IMPORT TABLESPACE tests, such as innodb.innodb-wl5522, which would (unnecessarily) invoke log_make_checkpoint() from row_import_cleanup(). The reason of the hang was that buf_flush_page_cleaner() would enter untimed sleep despite buf_flush_sync_lsn being set. The exact failure scenario is unclear, because buf_flush_sync_lsn should actually be protected by buf_pool.flush_list_mutex. We prevent the hang by invoking buf_pool.page_cleaner_set_idle(false) whenever we are setting buf_flush_sync_lsn and signaling buf_pool.do_flush_list. The bulk of these changes was originally developed as a preparation for MDEV-26827, to invoke buf_flush_list() from fewer threads, and tested on 10.6 by Matthias Leich. This fix was tested by running 100 repetitions of 100 concurrent instances of the test innodb.innodb-wl5522 on a RelWithDebInfo build, using ext4fs and innodb_flush_method=O_DIRECT on a SATA SSD with 4096-byte block size. During the test, the call to log_make_checkpoint() in row_import_cleanup() was present. buf_flush_list(): Make static. buf_flush_wait(): Wait for buf_pool.get_oldest_modification() to reach a target, by work done in the buf_flush_page_cleaner. If buf_flush_sync_lsn is going to be set, we will invoke buf_pool.page_cleaner_set_idle(false). buf_flush_ahead(): If buf_flush_sync_lsn or buf_flush_async_lsn is going to be set and the page cleaner woken up, we will invoke buf_pool.page_cleaner_set_idle(false). buf_flush_wait_flushed(): Invoke buf_flush_wait(). buf_flush_sync(): Invoke recv_sys.apply() at the start in case crash recovery is active. Invoke buf_flush_wait(). 
buf_flush_sync_batch(): A lower-level variant of buf_flush_sync() that is only called by recv_sys_t::apply(). buf_flush_sync_for_checkpoint(): Do not trigger log apply or checkpoint during recovery. buf_dblwr_t::create(): Only initiate a buffer pool flush, not a checkpoint. row_import_cleanup(): Do not unnecessarily invoke log_make_checkpoint(). Invoking buf_flush_list_space() before starting to generate redo log for the imported tablespace should suffice. srv_prepare_to_delete_redo_log_file(): Set recv_sys.recovery_on in order to prevent buf_flush_sync_for_checkpoint() from initiating a checkpoint while the log is inaccessible. Remove a wait loop that is already part of buf_flush_sync(). Do not invoke fil_names_clear() if the log is being upgraded, because the FILE_MODIFY record is specific to the latest format. create_log_file(): Clear recv_sys.recovery_on only after calling log_make_checkpoint(), to prevent buf_flush_page_cleaner from invoking a checkpoint. innodb_shutdown(): Simplify the logic in mariadb-backup --prepare. os_aio_wait_until_no_pending_writes(): Update the function comment. Apart from row_quiesce_table_start() during FLUSH TABLES...FOR EXPORT, this is being called by buf_flush_list_space(), which is invoked by ALTER TABLE...IMPORT TABLESPACE as well as some encryption operations.	2022-01-04 07:40:31 +02:00
Marko Mäkelä	1df05a0854	Correct some copyright messages Most of the Facebook contribution mysql/mysql-server@72d656acdf was removed in commit `5bea43f5e0` (MDEV-12353). Mainly the configuration parameter innodb_compression_level remains. It had been renamed to page_zip_level in mysql/mysql-server@5b38f2a712.	2022-01-03 07:23:39 +02:00
Marko Mäkelä	c14dd0d19d	Cleanup: Remove RECV_READ_AHEAD_AREA Let us directly use the constant 32 in recv_read_in_area().	2022-01-03 07:23:18 +02:00
Marko Mäkelä	cfcfdc65df	MDEV-27190 InnoDB upgrade from 10.2, 10.3, 10.4 is not crash-safe During startup, InnoDB must write a FILE_CHECKPOINT record. However, before MDEV-12353 (in MariaDB Server 10.2, 10.3, 10.4) the corresponding record MLOG_CHECKPOINT was encoded in a different way. When we are upgrading from a logically empty 10.2, 10.3, or 10.4 redo log, we must not write anything to the old log file, because if the server were killed during the upgrade, we would end up with a corrupted log file, and both the old and the new server would refuse to start up. On upgrade, we must simply create a new logically empty log file and replace the old ib_logfile0 with that.	2021-12-07 17:00:46 +02:00
Eugene Kosov	890c55177d	MDEV-27183 optimize std::map lookup during crash recovery This is low-hanging fruit. Before this patch, std::map::emplace() was ~50% of the whole recv_sys_t::parse() operation in my test. After the fix it's only ~20%. recv_sys_t::parse(): recv_sys_t::pages is a collection of all pages to recover. Often, there are multiple changes for a single page. Often, they come in a row, and for such cases let's avoid a lookup in the std::map. cached_pages_it serves as a cache of size 1. recv_sys_t::add(): replace page_id argument with a std::map::iterator	2021-12-07 15:50:00 +06:00
Eugene Kosov	0064316f19	cleanup: reduce code bloat	2021-12-06 14:06:17 +06:00
Eugene Kosov	5d7da02793	MDEV-27139 32-bit systems fail to use big innodb-log-file-size log_write_buf(): do not cast to size_t, which prevents writing to files that are bigger than 4G, and remove a useless assertion	2021-12-03 14:57:23 +06:00
Marko Mäkelä	a0fda162eb	Fix GCC 11.2.0 -m32 (IA-32) warnings page_create_low(): Fix -Warray-bounds log_buffer_extend(): Fix -Wstringop-overflow	2021-10-21 15:31:21 +03:00
Marko Mäkelä	a736a3174a	Merge 10.3 into 10.4	2021-10-13 12:03:32 +03:00
Marko Mäkelä	4a7dfda373	Merge 10.2 into 10.3	2021-10-13 11:38:21 +03:00
Marko Mäkelä	2bb8d7c2f3	MDEV-26811: Assertion "log_sys->n_pending_flushes == 1" fails In commit `1cb218c37c` (MDEV-26450) we introduced the function log_write_and_flush(), which may compete with log_checkpoint() invoking log_write_flush_to_disk_low() from another thread. The assertion n_pending_flushes==1 is too strict. There is no possibility of a race condition here, because fil_flush() is protected by fil_system->mutex and the rest will be protected by log_sys->mutex. log_write_flush_to_disk_low(), log_write_and_flush(): Relax the assertions to test for a nonzero count.	2021-10-13 10:38:41 +03:00
Marko Mäkelä	f5fddae3cb	MDEV-26450: Corruption due to innodb_undo_log_truncate At least since commit `055a3334ad` (MDEV-13564) the undo log truncation in InnoDB did not work correctly. The main issue is that during the execution of trx_purge_truncate_history() some pages of the newly truncated undo tablespace could be discarded. This is improved from commit `1cb218c37c` which was applied to earlier-version branches. fsp_try_extend_data_file(): Apply the peculiar rounding of fil_space_t::size_in_header only to the system tablespace, whose size can be expressed in megabytes in a configuration parameter. Other files may freely grow by a number of pages. fseg_alloc_free_page_low(): Do allow the extension of undo tablespaces, and mention the file name in the error message. mtr_t::commit_shrink(): Implement crash-safe shrinking of a tablespace: (1) durably write the log (2) release the page latches of the rebuilt tablespace (3) release the mutexes (4) truncate the file (5) release the tablespace latch This is refactored from trx_purge_truncate_history(). log_write_and_flush_prepare(), log_write_and_flush(): New functions to durably write log during mtr_t::commit_shrink().	2021-09-24 08:22:19 +03:00
Marko Mäkelä	9024498e88	Merge 10.3 into 10.4	2021-09-22 18:26:54 +03:00
Marko Mäkelä	b46cf33ab8	Merge 10.2 into 10.3	2021-09-22 18:01:41 +03:00
Marko Mäkelä	1cb218c37c	MDEV-26450: Corruption due to innodb_undo_log_truncate At least since commit `055a3334ad` (MDEV-13564) the undo log truncation in InnoDB did not work correctly. The main issue is that during the execution of trx_purge_truncate_history() some pages of the newly truncated undo tablespace could be discarded. fsp_try_extend_data_file(): Apply the peculiar rounding of fil_space_t::size_in_header only to the system tablespace, whose size can be expressed in megabytes in a configuration parameter. Other files may freely grow by a number of pages. fseg_alloc_free_page_low(): Do allow the extension of undo tablespaces, and mention the file name in the error message. mtr_t::commit_shrink(): Implement crash-safe shrinking of a tablespace file. First, durably write the log, then shrink the file, and finally release the page latches of the rebuilt tablespace. Refactored from trx_purge_truncate_history(). log_write_and_flush_prepare(), log_write_and_flush(): New functions to durably write log during mtr_t::commit_shrink().	2021-09-22 14:15:00 +03:00
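
As a hedged illustration of the MDEV-24412 entry above (widening the operand of a bitwise AND to 64 bits), the following sketch shows how a mask computed in 32 bits can silently clear the upper bits of a 64-bit offset once the offset exceeds 4 gigabytes. The names align_down_narrow() and align_down_wide() and the 512-byte block size are assumptions made for illustration; this is not the code of recv_log_recover_10_4().

```cpp
#include <cstdint>

// Hypothetical block size, chosen only for this illustration.
constexpr uint32_t BLOCK_SIZE = 512;

// Problematic pattern: ~(BLOCK_SIZE - 1) is evaluated in 32 bits
// (0xFFFFFE00) and then zero-extended to 64 bits, so the AND also
// clears bits 32..63 of the offset.
inline uint64_t align_down_narrow(uint64_t offset)
{
  return offset & ~(BLOCK_SIZE - 1);
}

// Widened pattern: convert the operand to 64 bits before taking the
// complement, so that only the low 9 bits are cleared.
inline uint64_t align_down_wide(uint64_t offset)
{
  return offset & ~uint64_t{BLOCK_SIZE - 1};
}
```

For offsets below 4 GiB the two functions agree, which is why this kind of defect tends to surface only with large log files.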


817 commits