mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Oleksandr Byelkin	9e1fb104a3	MariaDB 11.4.4 release -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEF39AEP5WyjM2MAMF8WVvJMdM0dgFAmck77AACgkQ8WVvJMdM 0dgccQ/+Lls8fWt4D+gMPP7x+drJSO/IE/gZFt3ugbWF+/p3B2xXAs5AAE83wxEh QSbp4DCkb/9PnuakhLmzg0lFbxMUlh4rsJ1YyiuLB2J+YgKbAc36eQQf+rtYSipd DT5uRk36c9wOcOXo/mMv4APEvpPXBIBdIL4VvpKFbIOE7xT24Sp767zWXdXqrB1f JgOQdM2ct+bvSPC55oZ5p1kqyxwvd6K6+3RB3CIpwW9zrVSLg7enT3maLjj/761s jvlRae+Cv+r+Hit9XpmEH6n2FYVgIJ3o3WhdAHwN0kxKabXYTg7OCB7QxDZiUHI9 C/5goKmKaPB1PCQyuTQyLSyyK9a8nPfgn6tqw/p/ZKDQhKT9sWJv/5bSWecrVndx LLYifSTrFC/eXLzgPvCnNv/U8SjsZaAdMIKS681+qDJ0P5abghUIlGnMYTjYXuX1 1B6Vrr0bdrQ3V1CLB3tpkRjpUvicrsabtuAUAP65QnEG2G9UJXklOer+DE291Gsl f1I0o6C1zVGAOkUUD3QEYaHD8w7hlvyfKme5oXKUm3DOjaAar5UUKLdr6prxRZL4 ebhmGEy42Mf8fBYoeohIxmxgvv6h2Xd9xCukgPp8hFpqJGw8abg7JNZTTKH4h2IY J51RpD10h4eoi6WRn3opEcjexTGvZ+xNR7yYO5WxWw6VIre9IUA= =s+WW -----END PGP SIGNATURE----- Merge tag '11.4' into 11.6 MariaDB 11.4.4 release	2024-11-08 07:17:00 +01:00
Oleksandr Byelkin	69d033d165	Merge branch '10.11' into 11.2	2024-10-29 16:42:46 +01:00
Oleksandr Byelkin	3d0fb15028	Merge branch '10.6' into 10.11	2024-10-29 15:24:38 +01:00
Marko Mäkelä	7701ccb72d	MDEV-35149 Race condition around SET GLOBAL innodb_lru_scan_depth A debug assertion in buf_LRU_get_free_block() could fail if SET GLOBAL innodb_lru_scan_depth is being executed during a workload that involves allocating buffer pool pages. buf_pool_t::LRU_scan_depth: Replaces srv_LRU_scan_depth. buf_pool_t::flush_neighbors: Replaces srv_flush_neighbors. innodb_buf_pool_update<T>(): Update a parameter of buf_pool while holding buf_pool.mutex.	2024-10-21 10:08:58 +03:00
Marko Mäkelä	ebefef658e	Merge 10.11 into 11.2	2024-10-18 11:32:22 +03:00
Marko Mäkelä	8a6a4c947a	Cleanup: Replace some is_predefined_tablespace() In some places, there were redundant comparisons against TRX_SYS_SPACE or SRV_TMP_SPACE_ID. The temporary tablespace is never the subject of log-based recovery. Also, consistently check for SRV_SPACE_ID_UPPER_BOUND. Reviewed by: Debarun Barerjee	2024-10-04 13:41:12 +03:00
Marko Mäkelä	43465352b9	Merge 11.4 into 11.6	2024-10-03 16:09:56 +03:00
Marko Mäkelä	12a91b57e2	Merge 10.11 into 11.2	2024-10-03 13:24:43 +03:00
Yuchen Pei	ba7088d462	Merge '11.4' into 11.6	2024-10-03 15:59:20 +10:00
Marko Mäkelä	6acada713a	MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems When using the default innodb_log_buffer_size=2m, mariadb-backup --backup would spend a lot of time re-reading and re-parsing the log. For reads, it would be beneficial to memory-map the entire ib_logfile0 to the address space (typically 48 bits or 256 TiB) and read it from there, both during --backup and --prepare. We will introduce the Boolean read-only parameter innodb_log_file_mmap that will be OFF by default on most platforms, to avoid aggressive read-ahead of the entire ib_logfile0 in when only a tiny portion would be accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON, because those platforms define a specific mmap(2) option for enabling such read-ahead and therefore it can be assumed that the default would be on-demand paging. This parameter will only have impact on the initial InnoDB startup and recovery. Any writes to the log will use regular I/O, except when the ib_logfile0 is stored in a specially configured file system that is backed by persistent memory (Linux "mount -o dax"). We also experimented with allowing writes of the ib_logfile0 via a memory mapping and decided against it. A fundamental problem would be unnecessary read-before-write in case of a major page fault, that is, when a new, not yet cached, virtual memory page in the circular ib_logfile0 is being written to. There appears to be no way to tell the operating system that we do not care about the previous contents of the page, or that the page fault handler should just zero it out. Many references to HAVE_PMEM have been replaced with references to HAVE_INNODB_MMAP. The predicate log_sys.is_pmem() has been replaced with log_sys.is_mmap() && !log_sys.is_opened(). Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the way that an open file handle to ib_logfile0 will be retained. In both code paths, log_sys.is_mmap() will hold. Holding a file handle open will allow log_t::clear_mmap() to disable the interface with fewer operations. It should be noted that ever since commit `685d958e38` (MDEV-14425) most 64-bit Linux platforms on our CI platforms (s390x a.k.a. IBM System Z being a notable exception) read and write /dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is persistent memory (mount -o dax). So, the memory mapping based log parsing that this change is enabling by default on Linux and FreeBSD has already been extensively tested on Linux. ::log_mmap(): If a log cannot be opened as PMEM and the desired access is read-only, try to open a read-only memory mapping. xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile(): Copy the InnoDB log in mariadb-backup --backup from a memory mapped file.	2024-09-26 18:47:12 +03:00
Marko Mäkelä	f0de610d0c	Merge 10.11 into 11.2	2024-09-10 18:35:16 +03:00
Marko Mäkelä	f9f92b480e	Merge 10.6 into 10.11	2024-09-06 16:17:42 +03:00
Marko Mäkelä	9f0b106631	MDEV-34845 buf_flush_buffer_pool(): Assertion `!os_aio_pending_reads()' failed buf_flush_buffer_pool(): Wait for any pending asynchronous reads to complete. This assertion failed in a run where buf_read_ahead_linear() had been triggered in an SQL statement that was executed right before shutdown. Reviewed by: Debarun Banerjee	2024-09-03 18:22:10 +03:00
Alexander Barkov	a2a5ba14a8	Merge remote-tracking branch 'origin/11.5' into 11.6	2024-07-10 19:18:52 +04:00
Vladislav Vaintroub	186a1afe63	MDEV-32537 due to Linux, restrict thread name to 15 characters, also in PS. Rename some threads to workaround this restrictions, e.g "rpl_parallel_thread"->"rpl_parallel", "slave_background" -> "slave_bg" etc.	2024-07-09 13:20:49 +02:00
Vladislav Vaintroub	584fc85e21	MDEV-32537 Name threads to improve debugging experience and diagnostics. Use SetThreadDescription/pthread_setname_np to give threads a name.	2024-07-09 13:17:20 +02:00
Oleksandr Byelkin	2447dda2c0	Merge branch '10.11' into 11.1	2024-07-08 22:40:16 +02:00
Marko Mäkelä	4ca355d863	MDEV-33894: Resurrect innodb_log_write_ahead_size As part of commit `685d958e38` (MDEV-14425) the parameter innodb_log_write_ahead_size was removed, because it was thought that determining the physical block size would be a sufficient replacement. However, we can only determine the physical block size on Linux or Microsoft Windows. On some file systems, the physical block size is not relevant. For example, XFS uses a block size of 4096 bytes even if the underlying block size may be smaller. On Linux, we failed to determine the physical block size if innodb_log_file_buffered=OFF was not requested or possible. This will be fixed. log_sys.write_size: The value of the reintroduced parameter innodb_log_write_ahead_size. To keep it simple, this is read-only and a power of two between 512 and 4096 bytes, so that the previous alignment guarantees are fulfilled. This will replace the previous log_sys.get_block_size(). log_sys.block_size, log_t::get_block_size(): Remove. log_t::set_block_size(): Ensure that write_size will not be less than the physical block size. There is no point to invoke this function with 512 or less, because that is the minimum value of write_size. innodb_params_adjust(): Add some disabled code for adjusting the minimum value and default value of innodb_log_write_ahead_size to reflect the log_sys.write_size. log_t::set_recovered(): Mark the recovery completed. This is the place to adjust some things if we want to allow write_size>4096. log_t::resize_write_buf(): Refer to write_size. log_t::resize_start(): Refer to write_size instead of get_block_size(). log_write_buf(): Simplify some arithmetics and remove a goto. log_t::write_buf(): Refer to write_size. If we are writing less than that, do not switch buffers, but keep writing to the same buffer. Move some code to improve the locality of reference. recv_scan_log(): Refer to write_size instead of get_block_size(). os_file_create_func(): For type==OS_LOG_FILE on Linux, always invoke os_file_log_maybe_unbuffered(), so that log_sys.set_block_size() will be invoked even if we are not attempting to use O_DIRECT. recv_sys_t::find_checkpoint(): Read the entire log header in a single 12 KiB request into log_sys.buf. Tested with: ./mtr --loose-innodb-log-write-ahead-size=4096 ./mtr --loose-innodb-log-write-ahead-size=2048	2024-06-27 16:38:08 +03:00
Marko Mäkelä	d34289a3e2	Merge 10.11 into 11.1	2024-06-17 09:21:50 +03:00
Marko Mäkelä	b81d717387	Merge 10.6 into 10.11	2024-06-11 12:50:10 +03:00
mariadb-DebarunBanerjee	b12c14e3b4	MDEV-34265 Possible hang during IO burst with innodb_flush_sync enabled When checkpoint age goes beyond the sync flush threshold and buf_flush_sync_lsn is set, page cleaner enters into "furious flush" stage to aggressively flush dirty pages from flush list and pull checkpoint LSN above safe margin. In this stage, page cleaner skips doing LRU flush and eviction. In 10.6, all other threads entirely rely on page cleaner to generate free pages. If free pages get over while page cleaner is busy in "furious flush" stage, a session thread could wait for free page in the middle of a min-transaction(mtr) while holding latches on other pages. It, in turn, can prevent page cleaner to flush such pages preventing checkpoint LSN to move forward creating a deadlock situation. Even otherwise, it could create a stall and hang like situation for large BP with plenty of dirty pages to flush before the stage could finish. Fix: During furious flush, check and evict LRU pages after each flush iteration.	2024-06-05 18:11:29 +05:30
Yuchen Pei	2d3e2c58b6	Merge branch '10.11' into 11.1	2024-05-31 10:54:31 +10:00
Marko Mäkelä	22ba7e4ff8	Merge 10.6 into 10.11	2024-05-30 16:04:00 +03:00
Marko Mäkelä	5ba542e9ee	Merge 10.5 into 10.6	2024-05-30 14:27:07 +03:00
mariadb-DebarunBanerjee	b2944adb76	MDEV-34166 Server could hang with BP < 80M under stress BUF_LRU_MIN_LEN (256) is too high value for low buffer pool(BP) size. For example, for BP size lower than 80M and 16 K page size, the limit is more than 5% of total BP and for lowest BP 5M, it is 80% of the BP. Non-data objects like explicit locks could occupy part of the BP pool reducing the pages available for LRU. If LRU reaches minimum limit and if no free pages are available, server would hang with page cleaner not able to free any more pages. Fix: To avoid such hang, we adjust the LRU limit lower than the limit for data objects as checked in buf_LRU_check_size_of_non_data_objects() i.e. one page less than 5% of BP.	2024-05-21 14:13:29 +05:30
Sergei Golubchik	f9807aadef	Merge branch '10.11' into 11.0	2024-05-12 12:18:28 +02:00
Sergei Golubchik	a6b2f820e0	Merge branch '10.6' into 10.11	2024-05-10 20:02:18 +02:00
Sergei Golubchik	7b53672c63	Merge branch '10.5' into 10.6	2024-05-08 20:06:00 +02:00
Vladislav Vaintroub	b18259ecf5	Compiling - Fix MSVC compile warnings, on x86	2024-05-03 22:00:56 +02:00
Marko Mäkelä	3f9f5ca48e	MDEV-33447: libpmem is not available in RHEL 8 Because the Red Hat Enterprise Linux 8 core repository does not include libpmem, let us implement the necessary subset ourselves. pmem_persist(): Implement for 64-bit x86, ARM, POWER, RISC-V, Loongarch in a way that should be compatible with the https://github.com/pmem/pmdk/ implementation of pmem_persist(). The CMake option WITH_INNODB_PMEM can be used for enabling or disabling this interface at compile time. By default, it is enabled on all applicable systems that are covered by our CI system. Note: libpmem had not been previously enabled for Loongarch in our Debian packaging. It was enabled for RISC-V, but we will not enable it by default on RISC-V or Loongarch because we lack CI coverage. The generated code for x86_64 was reviewed and tested on two Intel implementations: one that only supports clflush, and another that supports both clflushopt and clwb. The generated machine code was also reviewed on https://godbolt.org using various compiler versions. Godbolt helpfully includes an option to compile to binary code and display the encoding, which was useful on POWER. Reviewed by: Vladislav Vaintroub	2024-04-19 10:54:08 +03:00
Marko Mäkelä	1122ac978e	MDEV-33545: Improve innodb_doublewrite to cover NO_FSYNC In commit `24648768b4` (MDEV-30136) the parameter innodb_flush_method was deprecated, with no direct replacement for innodb_flush_method=O_DIRECT_NO_FSYNC. Let us change innodb_doublewrite from Boolean to ENUM that can be changed while the server is running: OFF: Assume that writes of innodb_page_size are atomic ON: Prevent torn writes (the default) fast: Like ON, but avoid synchronizing writes to data files The deprecated start-up parameter innodb_flush_method=NO_FSYNC will cause innodb_doublewrite=ON to be changed to innodb_doublewrite=fast, which will prevent InnoDB from making any durable writes to data files. This would normally be done right before the log checkpoint LSN is updated. Depending on the file systems being used and their configuration, this may or may not be safe. The value innodb_doublewrite=fast differs from the previous combination of innodb_doublewrite=ON and innodb_flush_method=O_DIRECT_NO_FSYNC by always invoking os_file_flush() on the doublewrite buffer itself in buf_dblwr_t::flush_buffered_writes_completed(). This should be safer when there are multiple doublewrite batches between checkpoints. Typically, once per second, buf_flush_page_cleaner() would write out up to innodb_io_capacity pages and advance the log checkpoint. Also typically, innodb_io_capacity>128, which is the size of the doublewrite buffer in pages. Should os_file_flush_func() not be invoked between doublewrite batches, writes could be reordered in an unsafe way. The setting innodb_doublewrite=fast could be safe when the doublewrite buffer (the first file of the system tablespace) and the data files reside in the same file system. This was tested by running "./mtr --rr innodb.alter_kill". On the first server startup, with innodb_doublewrite=fast, os_file_flush_func() would only be invoked on the ibdata1 file and possibly ib_logfile0. On subsequent startups with innodb_doublewrite=OFF, os_file_flush_func() will be invoked on the individual data files during log_checkpoint(). Note: The setting debug_no_sync (in the code, my_disable_sync) would disable all durable writes to InnoDB files, which would be much less safe. IORequest::Type: Introduce special values WRITE_DBL and PUNCH_DBL for asynchronous writes that are submitted via the doublewrite buffer. In this way, fil_space_t::use_doublewrite() or buf_dblwr.in_use() will only be consulted during buf_page_t::flush() and the doublewrite buffer can be enabled or disabled without any fear of inconsistency. buf_dblwr_t::block_size: Replaces block_size(). buf_dblwr_t::flush_buffered_writes(): If !in_use() and the doublewrite buffer is empty, just invoke fil_flush_file_spaces() and return. The doublewrite buffer could have been disabled while a batch was in progress. innodb_init_params(): If innodb_flush_method=O_DIRECT_NO_FSYNC, set innodb_doublewrite=fast or innodb_doublewrite=fearless. Thanks to Mark Callaghan for reporting this, and Vladislav Vaintroub for feedback.	2024-04-04 08:12:54 +03:00
Marko Mäkelä	fec2fd6add	Merge 10.11 into 11.0	2024-03-28 10:51:36 +02:00
Marko Mäkelä	788953463d	Merge 10.6 into 10.11 Some fixes related to commit `f838b2d799` and Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row() for system-versioned tables were provided by Nikita Malyavin. This was required by test versioning.rpl,trx_id,row.	2024-03-28 09:16:57 +02:00
Marko Mäkelä	ccb7a1e9a1	Merge 10.5 into 10.6	2024-03-27 15:00:56 +02:00
Marko Mäkelä	f0590db5c5	MDEV-33591 MONITOR_INC_VALUE_CUMULATIVE is executed regardless of "if" condition MONITOR_INC_VALUE_CUMULATIVE is a multiline macro, so the second statement will be executed always, regardless of "if" condition. These problems first started with commit `b1ab211dee` (MDEV-15053). Thanks to Yury Chaikou from ServiceNow for the report.	2024-03-22 14:57:00 +02:00
Marko Mäkelä	fa8a46eb68	MDEV-33613 InnoDB may still hang when temporarily running out of buffer pool By design, InnoDB has always hung when permanently running out of buffer pool, for example when several threads are waiting to allocate a block, and all of the buffer pool is buffer-fixed by the active threads. The hang that we are fixing here occurs when the buffer pool is only temporarily running out and the situation could be rescued by writing out some dirty pages or evicting some clean pages. buf_LRU_get_free_block(): Simplify the way how we wait for the buf_flush_page_cleaner thread. This fixes occasional hangs of the test encryption.innochecksum that were introduced by commit `a55b951e60` (MDEV-26827). To play it safe, we use a timed wait when waiting for the buf_flush_page_cleaner() thread to perform its job. Should that thread get stuck, we will invoke buf_pool.LRU_warn() in order to display a message that pages could not be freed, and keep trying to wake up the buf_flush_page_cleaner() thread. The INFORMATION_SCHEMA.INNODB_METRICS counters buffer_LRU_single_flush_failure_count and buffer_LRU_get_free_waits will be removed. The latter is represented by buffer_pool_wait_free. Also removed will be the message "InnoDB: Difficult to find free blocks in the buffer pool" because in `d34479dc66` we introduced a more precise message "InnoDB: Could not free any blocks in the buffer pool" in the buf_flush_page_cleaner thread. buf_pool_t::LRU_warn(): Issue the warning message that we could not free any blocks in the buffer pool. This may also be invoked by buf_LRU_get_free_block() if buf_flush_page_cleaner() appears to be stuck. buf_pool_t::n_flush_dec(): Remove. buf_pool_t::n_flush_dec_holding_mutex(): Rename to n_flush_dec(). buf_flush_LRU_list_batch(): Increment the eviction counter for blocks of temporary, discarded or dropped tablespaces. buf_flush_LRU(): Make static, and remove the constant parameter evict=false. The only caller will be the buf_flush_page_cleaner() thread. IORequest::is_LRU(): Remove. The only case of evicting pages on write completion will be when we are writing out pages of the temporary tablespace. Those pages are not in buf_pool.flush_list, only in buf_pool.LRU. buf_page_t::flush(): Remove the parameter evict. buf_page_t::write_complete(): Change the parameter "bool temporary" to "bool persistent" and add a parameter for an already read state(). Reviewed by: Debarun Banerjee	2024-03-22 14:17:39 +02:00
Marko Mäkelä	bf0b82d24b	MDEV-33515 log_sys.lsn_lock causes excessive context switching The log_sys.lsn_lock is a very contended resource with a small critical section in log_sys.append_prepare(). On many processor microarchitectures, replacing the system call based log_sys.lsn_lock with a pure spin lock would fare worse during high concurrency workloads, wasting a significant amount of CPU cycles in the spin loop. On other microarchitectures, we would see a significant amount of time being spent in native_queued_spin_lock_slowpath() in the Linux kernel, plus context switching between user and kernel address space. This was pointed out by Steve Shaw from Intel Corporation. Depending on the workload and the hardware implementation, it may be useful to use a pure spin lock in log_sys.append_prepare(). We will introduce a parameter. The statement SET GLOBAL INNODB_LOG_SPIN_WAIT_DELAY=50; would enable a spin lock that will execute that many MY_RELAX_CPU() operations (such as the x86 PAUSE instruction) between successive attempts of acquiring the spin lock. The use of a system call based log_sys.lsn_lock (which is the default setting) can be enabled by SET GLOBAL INNODB_LOG_SPIN_WAIT_DELAY=0; This patch will also introduce #ifdef LOG_LATCH_DEBUG (part of cmake -DWITH_INNODB_EXTRA_DEBUG=ON) for more accurate tracking of log_sys.latch ownership and reorganize the fields of log_sys to improve the locality of reference and to reduce the chances of false sharing. When a spin lock is being used, it will be maintained in the most significant bit of log_sys.buf_free. This is useful, because that is one of the fields that is covered by the lock. For IA-32 or AMD64, we implement the spin lock specially via log_t::lsn_lock_bts(), employing the i386 LOCK BTS instruction. A straightforward std::atomic::fetch_or() would translate into an inefficient loop around LOCK CMPXCHG. mtr_t::spin_wait_delay: The value of innodb_log_spin_wait_delay. mtr_t::finisher: Pointer to the currently used mtr_t::finish_write() implementation. This allows to avoid introducing conditional branches. We no longer invoke log_sys.is_pmem() at the mini-transaction level, but we would do that in log_write_up_to(). mtr_t::finisher_update(): Update finisher when spin_wait_delay is changed from or to 0 (the spin lock is changed to log_sys.lsn_lock or vice versa).	2024-03-22 12:29:01 +02:00
Marko Mäkelä	0772ac1f16	MDEV-33508 Performance regression due to frequent scan of full buf_pool.flush_list buf_flush_page_cleaner(): Remove a loop that had originally been added in commit `9d1466522e` (MDEV-32029) and made redundant by commit `5b53342a6a` (MDEV-32588). Starting with commit `d34479dc66` (MDEV-33053) this loop would cause a significant performance regression in workloads where buf_pool.need_LRU_eviction() constantly holds in buf_flush_page_cleaner(). Thanks to Steve Shaw of Intel for noticing this. Reviewed by: Debarun Banerjee Tested by: Matthias Leich	2024-02-28 12:48:38 +02:00
Marko Mäkelä	8155342a96	Merge 10.11 into 11.0	2024-02-20 15:31:18 +02:00
Marko Mäkelä	3dd7b0a80c	Cleanup: Remove OS_FILE_ON_ERROR_NO_EXIT Ever since commit `412ee0330c` or commit `a440d6ed3a` InnoDB should generally not abort when failing to open or create files. In Datafile::open_or_create() we had failed to set the flag to avoid abort() on failure, but everywhere else we were setting it. We may still call abort() via os_file_handle_error(). Reviewed by: Vladislav Vaintroub	2024-02-20 11:22:52 +02:00
Marko Mäkelä	35cc4b6c05	Merge 10.11 into 11.0	2024-01-22 10:10:50 +02:00
Marko Mäkelä	b3ca7fa089	Merge 10.6 into 10.11	2024-01-22 08:49:04 +02:00
Marko Mäkelä	495e7f1b3d	MDEV-33053 fixup: Correct a condition before a message	2024-01-22 08:24:08 +02:00
Marko Mäkelä	7e65e3027e	MDEV-33275 buf_flush_LRU(): mysql_mutex_assert_owner(&buf_pool.mutex) failed In commit `a55b951e60` (MDEV-26827) an error was introduced in a rarely executed code path of the buf_flush_page_cleaner() thread. As a result, the function buf_flush_LRU() could be invoked while not holding buf_pool.mutex. Reviewed by: Debarun Banerjee	2024-01-19 12:40:32 +02:00
Marko Mäkelä	d34479dc66	MDEV-33053 InnoDB LRU flushing does not run before running out of buffer pool buf_flush_LRU(): Display a warning if no pages could be evicted and no writes initiated. buf_pool_t::need_LRU_eviction(): Renamed from buf_pool_t::ran_out(). Check if the amount of free pages is smaller than innodb_lru_scan_depth instead of checking if it is 0. buf_flush_page_cleaner(): For the final LRU flush after a checkpoint flush, use a "budget" of innodb_io_capacity_max, like we do in the case when we are not in "furious" checkpoint flushing. Co-developed by: Debarun Banerjee Reviewed by: Debarun Banerjee Tested by: Matthias Leich	2024-01-19 12:40:16 +02:00
Marko Mäkelä	c2da55ac01	Merge 10.11 into 11.0	2024-01-10 12:42:56 +02:00
Marko Mäkelä	1eb11da3e5	Merge 10.6 into 10.11	2024-01-10 12:37:19 +02:00
Marko Mäkelä	3613fb2aa8	MDEV-33112 innodb_undo_log_truncate=ON is blocking page write When innodb_undo_log_truncate=ON causes an InnoDB undo tablespace to be truncated, we must guarantee that the undo tablespace will be rebuilt atomically: After mtr_t::commit_shrink() has durably written the mini-transaction that rebuilds the undo tablespace, we must not write any old pages to the tablespace. To guarantee this, in trx_purge_truncate_history() we used to traverse the entire buf_pool.flush_list in order to acquire exclusive latches on all pages for the undo tablespace that reside in the buffer pool, so that those pages cannot be written and will be evicted during mtr_t::commit_shrink(). But, this traversal may interfere with the page writing activity of buf_flush_page_cleaner(). It would be better to lazily discard the old pages of the truncated undo tablespace. fil_space_t::is_being_truncated, fil_space_t::clear_stopping(): Remove. fil_space_t::create_lsn: A new field, identifying the LSN of the latest rebuild of a tablespace. buf_page_t::flush(), buf_flush_try_neighbors(): Evict pages whose FIL_PAGE_LSN is below fil_space_t::create_lsn. mtr_t::commit_shrink(): Update fil_space_t::create_lsn and fil_space_t::size right before the log is durably written and the tablespace file is being truncated. fsp_page_create(), trx_purge_truncate_history(): Simplify the logic. Reviewed by: Thirunarayanan Balathandayuthapani, Vladislav Lesin Performance tested by: Axel Schwenke Correctness tested by: Matthias Leich	2024-01-10 11:53:00 +02:00
Marko Mäkelä	6538a91e94	Merge 10.5 into 10.6	2024-01-08 14:39:56 +02:00
Marko Mäkelä	0b612619ef	MDEV-33098: Fix some instrumentation for innodb.doublewrite_debug buf_flush_page_cleaner(): A continue or break inside DBUG_EXECUTE_IF actually is a no-op. Use an explicit call to _db_keyword_() to actually avoid advancing the checkpoint. buf_flush_list_now_set(): Invoke os_aio_wait_until_no_pending_writes() to ensure that the page write to the system tablespace is completed.	2024-01-08 14:36:54 +02:00

1 2 3 4 5 ...

487 commits