mariadb

mirror of https://github.com/MariaDB/server.git synced 2026-05-16 20:07:13 +02:00

Author	SHA1	Message	Date
Daniele Sciascia	648d2da8f2	MDEV-33540 Avoid writes to TRX_SYS page during mariabackup operations Fix a scenario where `mariabackup --prepare` fails with assertion `!m_modifications || !recv_no_log_write' in `mtr_t::commit()`. This happens if the prepare step of the backup encounters a data directory which happens to store wsrep xid position in TRX SYS page (this is no longer the case since 10.3.5). And since MDEV-17458, `trx_rseg_array_init()` handles this case by copying the xid position to rollback segments, before clearing the xid from TRX SYS page. However, this step should be avoided when `trx_rseg_array_init()` is invoked from mariabackup. The relevant code was surrounded by the condition `srv_operation == SRV_OPERATION_NORMAL`. An additional check ensures that we are not trying to copy a xid position which has already been zeroed.	2024-03-08 16:02:38 +01:00
mariadb-DebarunBanerjee	afe9632913	MDEV-33593 Auto increment deadlock error causes ASSERT in subsequent save point The issue here is ha_innobase::get_auto_increment() could cause a deadlock involving auto-increment lock and rollback the transaction implicitly. For such cases, storage engines usually call thd_mark_transaction_to_rollback() to inform SQL engine about it which in turn takes appropriate actions and close the transaction. In innodb, we call it while converting Innodb error code to MySQL. However, since ::innobase_get_autoinc() returns void, we skip the call for error code conversion and also miss marking the transaction for rollback for deadlock error. We assert eventually while releasing a savepoint as the transaction state is not active. Since convert_error_code_to_mysql() is handling some generic error handling part, like invoking the callback when needed, we should call that function in ha_innobase::get_auto_increment() even if we don't return the resulting mysql error code back.	2024-03-07 21:54:06 +05:30
Thirunarayanan Balathandayuthapani	6e5333fc8c	MDEV-32445 InnoDB may corrupt its log before upgrading it on startup Problem: ======== During upgrade, InnoDB writes redo log records for adjusting the tablespace size or tablespace flags even before the log has been upgraded to the configured format. This could lead to data inconsistency if a crash happens during the upgrade process. Fix: === srv_start(): Write the redo log for the tablespace flags adjustment and the increased tablespace size only after the redo log upgrade. log_write_low(), log_reserve_and_write_fast(): Check whether the redo log is in physical format.	2024-03-06 15:01:26 +05:30
Thirunarayanan Balathandayuthapani	738da4918d	MDEV-32346 Assertion failure sym_node->table != NULL in pars_retrieve_table_def on UPDATE - During update operation, InnoDB should avoid initializing the FTS_DOC_ID of the foreign table if the foreign table is discarded	2024-03-06 14:04:49 +05:30
Marko Mäkelä	0772ac1f16	MDEV-33508 Performance regression due to frequent scan of full buf_pool.flush_list buf_flush_page_cleaner(): Remove a loop that had originally been added in commit `9d1466522e` (MDEV-32029) and made redundant by commit `5b53342a6a` (MDEV-32588). Starting with commit `d34479dc66` (MDEV-33053) this loop would cause a significant performance regression in workloads where buf_pool.need_LRU_eviction() constantly holds in buf_flush_page_cleaner(). Thanks to Steve Shaw of Intel for noticing this. Reviewed by: Debarun Banerjee Tested by: Matthias Leich	2024-02-28 12:48:38 +02:00
mariadb-DebarunBanerjee	969669767b	MDEV-33011 mariabackup --backup: FATAL ERROR: ... Can't open datafile cool_down/t3 The root cause is the WAL logging of file operation when the actual operation fails afterwards. It creates a situation with a log entry for an operation that would always fail. I could simulate both the backup scenario error and Innodb recovery failure exploiting the weakness. We are following WAL for file rename operation and once logged the operation must eventually complete successfully, or it is a major catastrophe. Right now, we fail for rename and handle it as normal error and it is the problem. I created a patch to address RENAME operation to a non existing schema where the destination schema directory is missing. The patch checks for the missing schema before logging in an attempt to avoid the failure after WAL log is written/flushed. I also checked that the schema cannot be dropped or there cannot be any race with other rename to the same file. This is protected by the MDL lock in SQL today. The patch should thus be a good improvement over the current situation and solves the issue at hand.	2024-02-27 17:59:20 +05:30
Marko Mäkelä	71834ccb6c	MDEV-24671 fixup: Remove srv_max_n_threads The variable srv_max_n_threads lost its usefulness in commit `db006a9a43` (MDEV-21452) and commit `e71e613353` (MDEV-24671).	2024-02-27 11:14:28 +02:00
Thirunarayanan Balathandayuthapani	57cc8605eb	MDEV-19044 Alter table corrupts while applying the modification log Problem: ======== - InnoDB reads the length of the variable length field wrongly while applying the modification log of instant table. Solution: ======== rec_init_offsets_comp_ordinary(): For the temporary instant file record, InnoDB should read the length of the variable length field from the record itself.	2024-02-27 12:59:46 +05:30
Thirunarayanan Balathandayuthapani	e309e02447	MDEV-30655 IMPORT TABLESPACE fails with column count or index count mismatch update_vcol_pos(): pass table id as table_id_t instead of ulint.	2024-02-23 13:42:46 +05:30
Thirunarayanan Balathandayuthapani	e66928ab28	MDEV-33462 Server aborts while altering an InnoDB statistics table Problem: ======= - When online alter of InnoDB statistics table happens, any transaction which updates the statistics table has to read the undo log and log the DML changes during transaction commit. Applying undo log (UndorecApplier::apply_undo_rec) requires a shared lock on dictionary cache but dict_stats_save() already holds write lock on dictionary cache. This leads to abort of server during commit of statistics table changes. Solution: ======== - Disallow LOCK=NONE operation for the InnoDB statistics table. The reasoning is that statistics tables are typically rather small, so any blocking would be rather short. Writes to the statistics tables should be a rare operation.	2024-02-22 16:57:04 +05:30
Marko Mäkelä	042c3fc432	MDEV-24167 fixup: srw_lock_debug for SUX_LOCK_GENERIC srw_lock_debug::have_rd(), srw_lock_debug::have_wr(): For SUX_LOCK_GENERIC (no futex based synchronization primitives), we cannot check if the underlying srw_lock is held by us. Thanks to Dmitry Shulga for pointing out this build failure.	2024-02-21 13:03:25 +02:00
Thirunarayanan Balathandayuthapani	903ae30069	MDEV-30655 IMPORT TABLESPACE fails with column count or index count mismatch Problem: ======== Currently import operation fails with schema mismatch when cfg file has hidden fts document id and hidden fts document index. Fix: ==== To fix this issue, simply add the fts doc id column, indexes in table definition and try to import the table. In case of success: 1) update the fts document id in sys columns. 2) update the number of columns in sys tables. 3) insert the new fts index entry in sys indexes table and sys fields. 4) Reload the table with new table definition	2024-02-20 19:48:25 +05:30
Marko Mäkelä	53c6c823dc	MDEV-33464 Crash when innodb_max_undo_log_size is set to innodb_page_size*4294967296 purge_sys_t::truncating_tablespace(): Clamp the innodb_max_undo_log_size to the maximum number of pages before converting the result into a 32-bit unsigned integer. This fixes up commit `f8c88d905b` (MDEV-33213). In later major versions, we would use 32-bit unsigned integer here due to commit `ca501ffb04` and the code would crash also on 64-bit processors. Reviewed by: Debarun Banerjee	2024-02-15 12:34:04 +02:00
Marko Mäkelä	691f923906	Merge 10.5 into 10.6	2024-02-13 20:42:59 +02:00
Marko Mäkelä	b770633e07	Merge 10.4 into 10.5	2024-02-13 14:25:21 +02:00
Marko Mäkelä	68d9deb69a	MDEV-33332 SIGSEGV in buf_read_ahead_linear() when bpage is in buf_pool.watch buf_read_ahead_linear(): If buf_pool.watch_is_sentinel(*bpage), do not attempt to read the page frame because the pointer would be null for the elements of buf_pool.watch[]. Hitting this bug requires the use of a non-default value of innodb_change_buffering.	2024-02-13 14:10:44 +02:00
Marko Mäkelä	ca88eac835	MDEV-30528 CREATE FULLTEXT INDEX assertion failure WITH SYSTEM VERSIONING ha_innobase::check_if_supported_inplace_alter(): Require ALGORITHM=COPY when creating a FULLTEXT INDEX on a versioned table. row_merge_buf_add(), row_merge_read_clustered_index(): Remove the parameter or local variable history_fts that had been added in the attempt to fix MDEV-25004. Reviewed by: Thirunarayanan Balathandayuthapani Tested by: Matthias Leich	2024-02-12 16:52:55 +01:00
Marko Mäkelä	81f3e97bc8	MDEV-33383: Corrupted red-black tree due to incorrect comparison fts_doc_id_cmp(): Replaces several duplicated functions for comparing two doc_id_t. On IA-32, AMD64, ARMv7, ARMv8, RISC-V this should make use of some conditional ALU instructions. On POWER there will be conditional jumps. Unlike the original functions, these will return the correct result even if the difference of the two doc_id does not fit in the int data type. We use static_assert() and offsetof() to check at compilation time that this function is compatible with the rbt_create() calls. fts_query_compare_rank(): As documented, return -1 and not 1 when the rank are equal and r1->doc_id < r2->doc_id. This will affect the result of ha_innobase::ft_read(). fts_ptr2_cmp(), fts_ptr1_ptr2_cmp(): These replace fts_trx_table_cmp(), fts_trx_table_id_cmp(). The fts_savepoint_t::tables will be sorted by dict_table_t rather than dict_table_t::id. There was no correctness bug in the previous comparison predicates. We can avoid one level of unnecessary pointer dereferencing in this way. Actually, fts_savepoint_t is duplicating trx_t::mod_tables. MDEV-33401 was filed about removing it. The added unit test innodb_rbt-t covers both the previous buggy comparison predicate and the revised fts_doc_id_cmp(), using keys which led to finding the bug. Thanks to Shaohua Wang from Alibaba for providing the example and the revised comparison predicate. Reviewed by: Thirunarayanan Balathandayuthapani	2024-02-12 17:01:45 +02:00
Marko Mäkelä	47122a6112	MDEV-33383: Replace fts_doc_id_cmp, ib_vector_sort fts_doc_ids_sort(): Sort an array of doc_id_t by C++11 std::sort(). fts_doc_id_cmp(), ib_vector_sort(): Remove. The comparison was returning an incorrect result when the difference exceeded the int range. Reviewed by: Thirunarayanan Balathandayuthapani	2024-02-12 17:01:17 +02:00
Marko Mäkelä	8ec12e0d6d	Merge 10.4 into 10.5	2024-02-12 11:38:13 +02:00
Marko Mäkelä	466069b184	Merge 10.5 into 10.6	2024-02-08 10:38:53 +02:00
Marko Mäkelä	0381921e26	MDEV-33277 In-place upgrade causes invalid AUTO_INCREMENT values MDEV-33308 CHECK TABLE is modifying .frm file even if --read-only As noted in commit `d0ef1aaf61`, MySQL as well as older versions of MariaDB server would during ALTER TABLE ... IMPORT TABLESPACE write bogus values to the PAGE_MAX_TRX_ID field to pages of the clustered index, instead of letting that field remain 0. In commit `8777458a6e` this field was repurposed for PAGE_ROOT_AUTO_INC in the clustered index root page. To avoid trouble when upgrading from MySQL or older versions of MariaDB, we will try to detect and correct bogus values of PAGE_ROOT_AUTO_INC when opening a table for the first time from the SQL layer. btr_read_autoinc_with_fallback(): Add the parameters to mysql_version,max to indicate the TABLE_SHARE::mysql_version of the .frm file and the maximum value allowed for the type of the AUTO_INCREMENT column. In case the table was originally created in MySQL or an older version of MariaDB, read also the maximum value of the AUTO_INCREMENT column from the table and reset the PAGE_ROOT_AUTO_INC if it is above the limit. dict_table_t::get_index(const dict_col_t &) const: Find an index that starts with the specified column. ha_innobase::check_for_upgrade(): Return HA_ADMIN_FAILED if InnoDB needs upgrading but is in read-only mode. In this way, the call to update_frm_version() will be skipped. row_import_autoinc(): Adjust the AUTO_INCREMENT column at the end of ALTER TABLE...IMPORT TABLESPACE. This refinement was suggested by Debarun Banerjee. The changes outside InnoDB were developed by Michael 'Monty' Widenius: Added print_check_msg() service for easy reporting of check/repair messages in ENGINE=Aria and ENGINE=InnoDB. Fixed that CHECK TABLE do not update the .frm file under --read-only. Added 'handler_flags' to HA_CHECK_OPT as a way for storage engines to store state from handler::check_for_upgrade(). Reviewed by: Debarun Banerjee	2024-02-08 10:35:45 +02:00
Marko Mäkelä	85db534731	MDEV-33400 Adaptive hash index corruption after DISCARD TABLESPACE row_discard_tablespace(): Do not invoke dict_index_t::clear_instant_alter() because that would corrupt any adaptive hash index entries in the table. row_import_for_mysql(): Invoke dict_index_t::clear_instant_alter() after detaching any adaptive hash index entries.	2024-02-08 09:17:47 +01:00
Marko Mäkelä	b2654ba826	MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL lock_table_children(): A new function to lock all child tables of a table. We will only hold dict_sys.latch while traversing dict_table_t::referenced_set. To prevent a race condition with std::set::erase() we will copy the pointers to the child tables to a local vector. Once we have acquired MDL and references to all child tables, we can safely release dict_sys.latch, wait for the locks, and finally release the references. dict_acquire_mdl_shared(): A new variant that takes mdl_context as a parameter. lock_table_for_trx(): Assert that we are not holding dict_sys.latch. ha_innobase::truncate(): When foreign_key_checks=ON, assert that no child tables exist (other than the current table). In any case, we will invoke lock_table_children() so that the child table metadata can be safely updated. (It is possible that a child table is being created concurrently with TRUNCATE TABLE.) ha_innobase::delete_table(): Before and after acquiring exclusive locks on the current table as well as all child tables, check that FOREIGN KEY constraints will not be violated. In this way, we can reject impossible DROP TABLE without having to wait for locks first. This fixes up commit `2ca1123464` (MDEV-26217) and commit `c3c53926c4` (MDEV-26554).	2024-02-08 14:22:35 +11:00
Marko Mäkelä	5f2dcd112b	MDEV-24167 fixup: srw_lock_debug instrumentation While the index_lock and block_lock include debug instrumentation to keep track of shared lock holders, such instrumentation was never part of the simpler srw_lock, and therefore some users of the class implemented a limited form of bookkeeping. srw_lock_debug encapsulates srw_lock and adds the data members writer, readers_lock, and readers to keep track of the threads that hold the exclusive latch or any shared latches. The debug checks are available also with SUX_LOCK_GENERIC (in environments that do not implement a futex-like system call). dict_sys_t::latch: Use srw_lock_debug in debug builds. This makes the debug fields latch_ex, latch_readers redundant. fil_space_t::latch: Use srw_lock_debug in debug builds. This makes the debug field latch_count redundant. The field latch_owner must be preserved, because fil_space_t::is_owner() is being used in all builds. lock_sys_t::latch: Use srw_lock_debug in debug builds. This makes the debug fields writer, readers redundant. lock_sys_t::is_holder(): A new debug predicate to check if the current thread is holding lock_sys.latch in any mode. trx_rseg_t::latch: Use srw_lock_debug in debug builds.	2024-02-08 14:22:35 +11:00
mariadb-DebarunBanerjee	5e7047067e	MDEV-33274 The test encryption.innodb-redo-nokeys often fails If we fail to open a tablespace while looking for FILE_CHECKPOINT, we set the corruption flag. Specifically, if encryption key is missing, we would not be able to open an encrypted tablespace and the flag could be set. We miss checking for this flag and report "Missing FILE_CHECKPOINT" Address review comment to improve the test. Flush pages before starting no-checkpoint block. It should improve the number of cases where the test is skipped because some intermediate checkpoint is triggered.	2024-02-08 08:13:16 +05:30
mariadb-DebarunBanerjee	fb9da7f751	MDEV-33023 Crash in mariadb-backup --prepare --export after --prepare mariadb-backup with --prepare option could result in empty redo log file. When --prepare is followed by --prepare --export, we exit early in srv_start function without opening the ibdata1 tablespace. Later while trying to read rollback segment header page, we hit the debug assert which claims that the system space should already have been opened. There are two assert cases here. Issue-1: System tablespace object is not there in fil space hash i.e. srv_sys_space.open_or_create() is not called. Issue-2: The system tablespace data file ibdata1 is not opened i.e. fil_system.sys_space->open() is not called. Fix: For empty redo log and restore operation, open system tablespace before returning.	2024-02-07 23:12:15 +05:30
Marko Mäkelä	8d54d173d7	Cleanup: Remove ut_format_name() This follows up commit `383f77cd84` which simplified dict_table_schema_check(). Note: We can display quoted names like this: my_snprintf(buf, sizeof buf, "%`.*s.%`s", int(t->name.dblen()), t->name.m_name, t->name.basename());	2024-02-07 13:56:31 +02:00
Marko Mäkelä	91a2192bf2	Merge 10.5 into 10.6	2024-02-07 13:51:03 +02:00
Daniel Black	e06b159f02	MDEV-33397: Innodb include OS error information when failing to write to iblogfileX Without an OS error it could be one of the many errors from write.	2024-02-07 17:27:35 +11:00
Oleksandr Byelkin	8e7314992f	Merge branch '10.5' into mariadb-10.5.24	2024-02-06 18:29:14 +01:00
mariadb-DebarunBanerjee	66bb229e91	MDEV-18288 Transportable Tablespaces leave AUTO_INCREMENT in mismatched state, causing INSERT errors in newly imported tables when .cfg is not used. During import, if cfg file is not specified, we don't update the autoinc field in innodb dictionary object dict_table_t. The next insert tries to insert from the starting position of auto increment and fails. It can be observed that the issue is resolved once server is restarted as the persistent value is read correctly from PAGE_ROOT_AUTO_INC from index root page. The patch fixes the issue by reading the auto increment value directly from PAGE_ROOT_AUTO_INC during import if cfg file is not specified. Test Fix: 1. import_bugs.test: Embedded mode warning has absolute path. Regular expression replacement in test. 2. full_crc32_import.test: Table level auto increment mismatch after import. It was using the auto increment data from the table prior to discard and import which is not right. This value has cached auto increment value higher than the actual inserted value and value stored in PAGE_ROOT_AUTO_INC. Updated the result file and added validation for checking the maximum value of auto increment column.	2024-02-06 13:45:30 +05:30
Sergei Golubchik	3f6038bc51	Merge branch '10.5' into 10.6	2024-01-31 18:04:03 +01:00
Sergei Golubchik	01f6abd1d4	Merge branch '10.4' into 10.5	2024-01-31 17:32:53 +01:00
Thirunarayanan Balathandayuthapani	21f18bd9d7	MDEV-33341 innodb.undo_space_dblwr test case fails with Unknown Storage Engine InnoDB Reason: ====== undo_space_dblwr test case fails if the first page of undo tablespace is not flushed before restart the server. While restarting the server, InnoDB fails to detect the first page of undo tablespace from doublewrite buffer. Fix: === Use "ib_log_checkpoint_avoid_hard" debug sync point to avoid checkpoint and make sure to flush the dirtied page before killing the server. innodb_make_page_dirty(): Fails to set srv_fil_make_page_dirty_debug variable.	2024-01-31 15:55:09 +05:30
Marko Mäkelä	b7d1f65b81	MDEV-12266 fixup: Remove dead code Ever since commit `5e84ea9634` this "else if" branch was unreachable because the preceding "if" condition covered it.	2024-01-30 13:10:53 +02:00
Marko Mäkelä	bc2849579b	MDEV-33251 Redundant check on prebuilt::n_rows_fetched overflow row_search_mvcc(): Revise an overflow check, disabling it on 64-bit systems. The maximum number of consecutive record reads in a key range scan should be limited by the maximum number of records per page (less than 2^13) and the maximum number of pages per tablespace (2^32) to less than 2^45. On 32-bit systems we can simplify the overflow check. Reviewed by: Vladislav Lesin	2024-01-30 13:10:46 +02:00
Marko Mäkelä	495e7f1b3d	MDEV-33053 fixup: Correct a condition before a message	2024-01-22 08:24:08 +02:00
Marko Mäkelä	5c243d4caf	Merge 10.5 into 10.6	2024-01-22 08:20:08 +02:00
Daniel Black	0c23f84d8d	MDEV-32983 cosmetic improvement on path separator near ib_buffer_pool A mix of path separators looks odd. InnoDB: Loading buffer pool(s) from C:\xampp\mysql\data/ib_buffer_pool This was changed in `cf552f5886` Both forward slashes and backward slashes work on Windows. We do not use \\?\ names. So we improve the consistent look of it so it doesn't look like a bug. Normalize, in this case, the path separator to \ for making the filename. Reported thanks to Github user @celestinoxp. Closes: https://github.com/ApacheFriends/xampp-build/issues/33 Reviewed by: Marko Mäkelä and Vladislav Vaintroub	2024-01-22 16:56:00 +11:00
Thirunarayanan Balathandayuthapani	7573fe8b07	MDEV-32968 InnoDB fails to restore tablespace first page from doublewrite buffer when page is empty recv_dblwr_t::find_first_page(): Free the allocated memory to read the first 3 pages from tablespace. innodb.doublewrite: Added sleep to ensure page cleaner thread wake up from my_cond_wait	2024-01-19 17:01:36 +05:30
Marko Mäkelä	21560bee9d	Revert "MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL" This reverts commit `569da6a7ba`, commit `768a736174`, and commit `ba6bf7ad9e` because of a regression that was filed as MDEV-33104.	2024-01-19 12:46:11 +02:00
Marko Mäkelä	7e65e3027e	MDEV-33275 buf_flush_LRU(): mysql_mutex_assert_owner(&buf_pool.mutex) failed In commit `a55b951e60` (MDEV-26827) an error was introduced in a rarely executed code path of the buf_flush_page_cleaner() thread. As a result, the function buf_flush_LRU() could be invoked while not holding buf_pool.mutex. Reviewed by: Debarun Banerjee	2024-01-19 12:40:32 +02:00
Marko Mäkelä	d34479dc66	MDEV-33053 InnoDB LRU flushing does not run before running out of buffer pool buf_flush_LRU(): Display a warning if no pages could be evicted and no writes initiated. buf_pool_t::need_LRU_eviction(): Renamed from buf_pool_t::ran_out(). Check if the amount of free pages is smaller than innodb_lru_scan_depth instead of checking if it is 0. buf_flush_page_cleaner(): For the final LRU flush after a checkpoint flush, use a "budget" of innodb_io_capacity_max, like we do in the case when we are not in "furious" checkpoint flushing. Co-developed by: Debarun Banerjee Reviewed by: Debarun Banerjee Tested by: Matthias Leich	2024-01-19 12:40:16 +02:00
Daniel Black	82e8633420	innodb: IO Error message missing space Noted by Susmeet Khaire - thanks.	2024-01-19 17:33:44 +11:00
Marko Mäkelä	a6290a5bc5	MDEV-33095 innodb_flush_method=O_DIRECT creates excessive errors on Solaris The directio(3C) function on Solaris is supported on NFS and UFS while the majority of users should be on ZFS, which is a copy-on-write file system that implements transparent compression and therefore cannot support unbuffered I/O. Let us remove the call to directio() and simply treat innodb_flush_method=O_DIRECT in the same way as the previous default value innodb_flush_method=fsync on Solaris. Also, let us remove some dead code around calls to os_file_set_nocache() on platforms where fcntl(2) is not usable with O_DIRECT. On IBM AIX, O_DIRECT is not documented for fcntl(2), only for open(2).	2024-01-19 15:34:33 +11:00
Marko Mäkelä	ee1407f74d	MDEV-32268: GNU libc posix_fallocate() may be extremely slow os_file_set_size(): Let us invoke the Linux system call fallocate(2) directly, because the GNU libc posix_fallocate() implements a fallback that writes to the file 1 byte every 4096 or fewer bytes. In one environment, invoking fallocate() directly would lead to 4 times the file growth rate during ALTER TABLE. Presumably, what happened was that the NFS server used a smaller allocation block size than 4096 bytes and therefore created a heavily fragmented sparse file when posix_fallocate() was used. For example, extending a file by 4 MiB would create 1,024 file fragments. When the file is actually being written to with data, it would be "unsparsed". The built-in EOPNOTSUPP fallback in os_file_set_size() writes a buffer of 1 MiB of NUL bytes. This was always used on musl libc and other Linux implementations of posix_fallocate().	2024-01-18 11:00:27 +02:00
Marko Mäkelä	f63045b119	MDEV-33213 fixup: GCC 5 -Wconversion	2024-01-18 10:14:21 +02:00
Marko Mäkelä	3a96eba25f	Merge 10.5 into 10.6	2024-01-17 13:35:05 +02:00
Marko Mäkelä	f8c88d905b	MDEV-33213 History list is not shrunk unless there is a pause in the workload The parameter innodb_undo_log_truncate=ON enables a multi-phased logic: 1. Any "producers" (new starting transactions) are prohibited from using the rollback segments that reside in the undo tablespace. 2. Any transactions that use any of the rollback segments must be committed or aborted. 3. The purge of committed transaction history must process all the rollback segments. 4. The undo tablespace is truncated and rebuilt. 5. The rollback segments are re-enabled for new transactions. There was one flaw in this logic: The first step was not being invoked as often as it could be, and therefore innodb_undo_log_truncate=ON would have no chance to work during a heavy write workload. Independent of innodb_undo_log_truncate, even after commit `86767bcc0f` we are missing some chances to free processed undo log pages. If we prohibited the creation of new transactions in one busy rollback segment at a time, we would be eventually guaranteed to be able to free such pages. purge_sys_t::skipped_rseg: The current candidate rollback segment for shrinking the history independent of innodb_undo_log_truncate. purge_sys_t::iterator::free_history_rseg(): Renamed from trx_purge_truncate_rseg_history(). Implement the logic around purge_sys.m_skipped_rseg. purge_sys_t::truncate_undo_space: Renamed from truncate. purge_sys.truncate_undo_space.last: Changed the type to integer to get rid of some pointer dereferencing and conditional branches. purge_sys_t::truncating_tablespace(), purge_sys_t::undo_truncate_try(): Refactored from trx_purge_truncate_history(). Set purge_sys.truncate_undo_space.current if applicable, or return an already set purge_sys.truncate_undo_space.current. purge_coordinator_state::do_purge(): Invoke purge_sys_t::truncating_tablespace() as part of the normal work loop, to implement innodb_undo_log_truncate=ON as often as possible. trx_purge_truncate_rseg_history(): Remove a redundant parameter. trx_undo_truncate_start(): Replace dead code with a debug assertion. Correctness tested by: Matthias Leich Performance tested by: Axel Schwenke Reviewed by: Debarun Banerjee	2024-01-17 11:14:24 +02:00
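
Note on MDEV-33383 (commits `81f3e97bc8` and `47122a6112`): the messages say the old doc_id comparison returned an incorrect result when the difference of the two values did not fit in `int`. The sketch below shows that general class of bug next to an overflow-safe three-way comparison. It is an illustration only, not the InnoDB code: the alias `doc_id_t` is assumed here to be a 64-bit unsigned integer, and the function names `cmp_by_subtraction()`, `cmp_three_way()` and `self_check()` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: a doc id is a 64-bit unsigned integer.
using doc_id_t = std::uint64_t;

// Subtraction-based comparison. Narrowing the 64-bit difference to int
// drops the high bits, so ids that differ by a multiple of 2^32 look equal
// and other large gaps can flip the sign.
inline int cmp_by_subtraction(doc_id_t a, doc_id_t b)
{
  return static_cast<int>(a - b);
}

// Overflow-safe three-way comparison: returns -1, 0 or 1 for any inputs.
inline int cmp_three_way(doc_id_t a, doc_id_t b)
{
  return (a > b) - (a < b);
}

// Small self-check showing where the two comparisons disagree.
inline void self_check()
{
  const doc_id_t low = 0;
  const doc_id_t high = doc_id_t{1} << 32;  // differs from low by 2^32

  assert(cmp_three_way(high, low) == 1);
  assert(cmp_three_way(low, high) == -1);
  // With the usual modulo-2^32 narrowing the difference becomes 0,
  // so the subtraction-based comparison claims the keys are equal.
  assert(cmp_by_subtraction(high, low) == 0);
}
```

A sorted container such as a red-black tree relies on the comparison being a consistent ordering, which is presumably why the inconsistent result led to the corruption described in the commit title.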


10,230 commits
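
Note on MDEV-33464 (commit `53c6c823dc`): the message describes clamping innodb_max_undo_log_size to the maximum number of pages before converting the result into a 32-bit unsigned integer. The sketch below illustrates why the order matters, using the value from the commit title (innodb_page_size*4294967296). It is not the code of purge_sys_t::truncating_tablespace(); the function names and the 16384-byte page size used in the check are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: narrow first. A page count of exactly 2^32 wraps to 0.
inline std::uint32_t pages_truncating(std::uint64_t size_bytes,
                                      std::uint64_t page_size)
{
  return static_cast<std::uint32_t>(size_bytes / page_size);
}

// Hypothetical helper: clamp the 64-bit page count to the largest 32-bit
// value, then narrow. The result can no longer wrap around.
inline std::uint32_t pages_clamped(std::uint64_t size_bytes,
                                   std::uint64_t page_size)
{
  const std::uint64_t pages = size_bytes / page_size;
  const std::uint64_t limit = std::numeric_limits<std::uint32_t>::max();
  return static_cast<std::uint32_t>(std::min(pages, limit));
}

// Self-check with an assumed page size of 16384 bytes.
inline void check_clamp()
{
  const std::uint64_t page_size = 16384;
  const std::uint64_t size = page_size * 4294967296ULL;  // 2^32 pages

  assert(pages_truncating(size, page_size) == 0);
  assert(pages_clamped(size, page_size) == 4294967295U);
}
```

Unsigned narrowing in C++ is defined as reduction modulo 2^32, so the wrapped value of 0 in the first helper is deterministic rather than compiler-dependent.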