mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-31 02:51:44 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	4ae105a37d	Merge 10.4 into 10.5	2023-12-18 08:59:07 +02:00
Thirunarayanan Balathandayuthapani	59a984b4d8	MDEV-32725 innodb.import_update_stats accesses uninitialized ib_table->stat_n_rows - InnoDB should write all zeros into a table and its indexes statistics members when table is unreadable.	2023-12-15 15:43:19 +05:30
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Marko Mäkelä	47fc64c19f	MDEV-32833 InnoDB wrong error message trx_t::commit_in_memory(): Empty the detailed_error string, so that FOREIGN KEY error messages from an earlier transaction will not be wrongly reused in ha_innobase::get_error_message(). Reviewed by: Thirunarayanan Balathandayuthapani	2023-11-29 10:52:25 +02:00
Marko Mäkelä	78c9a12c8f	MDEV-32861 InnoDB hangs when running out of I/O slots When the constant OS_AIO_N_PENDING_IOS_PER_THREAD is changed from 256 to 1 and the server is run with the minimum parameters innodb_read_io_threads=1 and innodb_write_io_threads=2, two hangs were observed. tpool::cache<T>::put(T*): Ensure that get() in io_slots::acquire() will be woken up when the cache previously was empty. buf_pool_t::io_buf_t::reserve(): Schedule a possibly partial doublewrite batch so that os_aio_wait_until_no_pending_writes() has a chance of returning. Add a Boolean parameter and pass wait_for_reads=false inside buf_page_decrypt_after_read(), because those calls will be executed inside a read completion callback, and therefore os_aio_wait_until_no_pending_reads() would block indefinitely.	2023-11-22 16:54:41 +02:00
Marko Mäkelä	a3d0d5fc33	MDEV-26055: Improve adaptive flushing This is a 10.5 backport from 10.6 commit `9593cccf28`. Adaptive flushing is enabled by setting innodb_max_dirty_pages_pct_lwm>0 (not default) and innodb_adaptive_flushing=ON (default). There is also the parameter innodb_adaptive_flushing_lwm (default: 10 per cent of the log capacity). It should enable some adaptive flushing even when innodb_max_dirty_pages_pct_lwm=0. That is not being changed here. This idea was first presented by Inaam Rana several years ago, and I discussed it with Jean-François Gagné at FOSDEM 2023. buf_flush_page_cleaner(): When we are not near the log capacity limit (neither buf_flush_async_lsn nor buf_flush_sync_lsn is set), also try to move clean blocks from the buf_pool.LRU list to buf_pool.free or initiate writes (but not the eviction) of dirty blocks, until the remaining I/O capacity has been consumed. buf_flush_LRU_list_batch(): Add the parameter bool evict, to specify whether dirty least recently used pages (from buf_pool.LRU) should be evicted immediately after they have been written out. Callers outside buf_flush_page_cleaner() will pass evict=true, to retain the existing behaviour. buf_do_LRU_batch(): Add the parameter bool evict. Return counts of evicted and flushed pages. buf_flush_LRU(): Add the parameter bool evict. Assume that the caller holds buf_pool.mutex and will invoke buf_dblwr.flush_buffered_writes() afterwards. buf_flush_list_holding_mutex(): A low-level variant of buf_flush_list() whose caller must hold buf_pool.mutex and invoke buf_dblwr.flush_buffered_writes() afterwards. buf_flush_wait_batch_end_acquiring_mutex(): Remove. It is enough to have buf_flush_wait_batch_end(). page_cleaner_flush_pages_recommendation(): Avoid some floating-point arithmetic. buf_flush_page(), buf_flush_check_neighbor(), buf_flush_check_neighbors(), buf_flush_try_neighbors(): Rename the parameter "bool lru" to "bool evict". buf_free_from_unzip_LRU_list_batch(): Remove the parameter. Only actual page writes will contribute towards the limit. buf_LRU_free_page(): Evict freed pages of temporary tables. buf_pool.done_free: Broadcast whenever a block is freed (and buf_pool.try_LRU_scan is set). buf_pool_t::io_buf_t::reserve(): Retry indefinitely. During the test encryption.innochecksum we easily run out of these buffers for PAGE_COMPRESSED or ENCRYPTED pages. Tested by Matthias Leich and Axel Schwenke	2023-11-16 17:45:18 +02:00
Oleksandr Byelkin	6cfd2ba397	Merge branch '10.4' into 10.5	2023-11-08 12:59:00 +01:00
Kristian Nielsen	9fa718b1a1	Fix mariabackup InnoDB recovered binlog position on server upgrade Before MariaDB 10.3.5, the binlog position was stored in the TRX_SYS page, while after it is stored in rollback segments. There is code to read the legacy position from TRX_SYS to handle upgrades. The problem was if the legacy position happens to compare larger than the position found in rollback segments; in this case, the old TRX_SYS position would incorrectly be preferred over the newer position from rollback segments. Fixed by always preferring a position from rollback segments over a legacy position. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-11-03 09:13:51 +01:00
Kristian Nielsen	f8f5ed2280	Revert: MDEV-22351 InnoDB may recover wrong information after RESET MASTER This commit can cause the wrong (old) binlog position to be recovered by mariabackup --prepare. It compares the value of FIL_PAGE_LSN to determine which binlog position is the last one and should be recovered. However, it is not guaranteed that the FIL_PAGE_LSN order matches the commit order, as is assumed by the code. This is because the page LSN could be modified by an unrelated update of the page after the commit. In one example, the recovery first encountered this in trx_rseg_mem_restore(): lsn=27282754 binlog position (./master-bin.000001, 472908) and then later: lsn=27282699 binlog position (./master-bin.000001, 477164) The last one 477164 is the correct position. However, because the LSN encountered for the first one is higher, that position is recovered instead. This results in a too-old binlog position, and a newly provisioned slave will start replicating too early and get a duplicate key error or similar. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-11-03 09:13:51 +01:00
Marko Mäkelä	bfab4ab000	MDEV-18867 fixup: Remove DBUG injection In commit `75e82f71f1` the code to rename internal tables for FULLTEXT INDEX that had been created on Microsoft Windows using incompatible names was removed. Let us also remove the related fault injection.	2023-11-02 15:27:52 +02:00
Marko Mäkelä	15ae97b1c2	MDEV-32578 row_merge_fts_doc_tokenize() handles parser plugin inconsistently When mysql/mysql-server@0c954c2289 added a plugin interface for FULLTEXT INDEX tokenization to MySQL 5.7, fts_tokenize_ctx::processed_len got a second meaning, which is only partly implemented in row_merge_fts_doc_tokenize(). This inconsistency could cause a crash when using FULLTEXT...WITH PARSER. A test case that would crash MySQL 8.0 when using an n-gram parser and single-character words would fail to crash in MySQL 5.7, because the buf_full condition in row_merge_fts_doc_tokenize() was not met. This change is inspired by mysql/mysql-server@38e9a0779a that appeared in MySQL 5.7.44.	2023-10-27 13:13:49 +03:00
Thirunarayanan Balathandayuthapani	3da5d047b8	MDEV-31851 After crash recovery, undo tablespace fails to open Problem: ======== - InnoDB fails to open undo tablespace when page0 is corrupted and fails to throw error. Solution: ========= - InnoDB throws DB_CORRUPTION error when InnoDB encounters page0 corruption of undo tablespace. - InnoDB restores the page0 of undo tablespace from doublewrite buffer if it encounters page corruption - Moved Datafile::restore_from_doublewrite() to recv_dblwr_t::restore_first_page(). So that undo tablespace and system tablespace can use this function instead of duplicating the code srv_undo_tablespace_open(): Returns 0 if file doesn't exist or ULINT_UNDEFINED if page0 is corrupted.	2023-10-17 18:41:21 +05:30
Daniel Black	fbd11d5f29	MDEV-18200 MariaBackup full backup failed with InnoDB: Failing assertion: success Review cleanups.	2023-10-13 09:48:57 +11:00
Daniel Black	c79ca7c7ad	MDEV-18200 MariaBackup full backup failed with InnoDB: Failing assertion: success There are many filesystem-related errors that can occur with MariaBackup. These are already output to stderr with a good description of the error. Many of these are permission or resource (file descriptor) limits where the assertion and resulting core crash doesn't offer developers anything more than the log message. To the user, assertions and core crashes come across as poor error handling. As such we return an error and handle this all the way up the stack.	2023-10-12 21:37:27 +11:00
Marko Mäkelä	f9d471e2d5	Cleanup: Remove innobase_init_vc_templ() This fixes up a merge of commit `4fb8f7d07a` with respect to commit `ea37b14409`.	2023-10-12 09:48:54 +03:00
Marko Mäkelä	6e9b421f77	MDEV-32364 Server crashes when starting server with high innodb_log_buffer_size log_t::create(): Return whether the initialisation succeeded. It may fail if too large an innodb_log_buffer_size is specified.	2023-10-06 14:16:01 +03:00
Daniel Black	ca66a2cbfa	MDEV-18200 MariaBackup full backup failed with InnoDB: Failing assertion: success There are many filesystem-related errors that can occur with MariaBackup. These are already output to stderr with a good description of the error. Many of these are permission or resource (file descriptor) limits where the assertion and resulting core crash doesn't offer developers anything more than the log message. To the user, assertions and core crashes come across as poor error handling. As such we return an error and handle this all the way up the stack.	2023-09-26 08:55:52 +10:00
Vlad Lesin	95730372bd	MDEV-30165 X-lock on supremum for prepared transaction for RR trx_t::set_skip_lock_inheritance() must be invoked at the very beginning of lock_release_on_prepare(). Currently trx_t::set_skip_lock_inheritance() is invoked at the end of lock_release_on_prepare() when lock_sys and trx are released, and there can be a case when locks on prepare are released, but "not inherit gap locks" bit has not yet been set, and page split inherits lock to supremum. Also reset supremum bit and rebuild waiting queue when XA is prepared. Reviewed by: Marko Mäkelä	2023-09-21 20:07:53 +03:00
Marko Mäkelä	d58f43f8b4	MDEV-21174 fixup: Remove unused ut_bit_set_nth() This fixes up commit `56f6dab1d0`	2023-09-19 18:02:56 +03:00
Oleksandr Byelkin	7564be1352	Merge branch '10.4' into 10.5	2023-07-26 16:02:57 +02:00
Oleksandr Byelkin	f52954ef42	Merge commit '10.4' into 10.5	2023-07-20 11:54:52 +02:00
Vlad Lesin	1bfd3cc457	MDEV-10962 Deadlock with 3 concurrent DELETEs by unique key PROBLEM: A deadlock was possible when a transaction tried to "upgrade" an already held Record Lock to Next Key Lock. SOLUTION: This patch is based on observations that: (1) a Next Key Lock is equivalent to Record Lock combined with Gap Lock (2) a GAP Lock never has to wait for any other lock In case we request a Next Key Lock, we check if we already own a Record Lock of equal or stronger mode, and if so, then we change the requested lock type to GAP Lock, which we either already have, or can be granted immediately, as GAP locks don't conflict with any other lock types. (We don't consider Insert Intention Locks a Gap Lock in above statements). The reason why we don't upgrade Record Lock to Next Key Lock is the following. Imagine a transaction which does something like this: for each row { request lock in LOCK_X|LOCK_REC_NOT_GAP mode request lock in LOCK_S mode } If we upgraded lock from Record Lock to Next Key lock, there would be created only two lock_t structs for each page, one for LOCK_X|LOCK_REC_NOT_GAP mode and one for LOCK_S mode, and then used their bitmaps to mark all records from the same page. The situation would look like this: request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 1: // -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for // 1 request lock in LOCK_S mode on row 1: // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1, // so it upgrades it to X request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2: // -> creates a new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode (because we // don't have any after we've upgraded!) and sets bit for 2 request lock in LOCK_S mode on row 2: // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2, // so it upgrades it to X ...etc...etc.. Each iteration of the loop creates a new lock_t struct, and in the end we have a lot (one for each record!) of LOCK_X locks, each with single bit set in the bitmap. Soon we run out of space for lock_t structs. If we create LOCK_GAP instead of lock upgrading, the above scenario works like the following: // -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for // 1 request lock in LOCK_S mode on row 1: // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1, // so it creates LOCK_S|LOCK_GAP only and sets bit for 1 request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2: // -> reuses the lock_t for LOCK_X|LOCK_REC_NOT_GAP by setting bit for 2 request lock in LOCK_S mode on row 2: // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2, // so it reuses LOCK_S|LOCK_GAP setting bit for 2 In the end we have just two locks per page, one for each mode: LOCK_X|LOCK_REC_NOT_GAP and LOCK_S|LOCK_GAP. Another benefit of this solution is that it avoids not-entirely const-correct, (and otherwise looking risky) "upgrading". The fix was ported from mysql/mysql-server@bfba840dfa mysql/mysql-server@75cefdb1f7 Reviewed by: Marko Mäkelä	2023-07-06 15:06:10 +03:00
Thirunarayanan Balathandayuthapani	bd076d4dff	MDEV-31442 page_cleaner thread aborts while releasing the tablespace - InnoDB shouldn't acquire the tablespace when it is being stopped or closed	2023-06-16 14:58:48 +05:30
Thirunarayanan Balathandayuthapani	841e905f20	MDEV-31442 page_cleaner thread aborts while releasing the tablespace After further I/O on a tablespace has been stopped (for example due to DROP TABLE or an operation that rebuilds a table), page cleaner thread tries to flush the pending writes for the tablespace and releases the tablespace reference even though it was not acquired. fil_space_t::flush(): Don't release the tablespace when it is being stopped and closed Thanks to Marko Mäkelä for suggesting this patch.	2023-06-09 18:15:33 +05:30
Marko Mäkelä	c25b496724	MDEV-31382 SET GLOBAL innodb_undo_log_truncate=ON has no effect on logically empty undo logs innodb_undo_log_truncate_update(): A callback function. If SET GLOBAL innodb_undo_log_truncate=ON, invoke srv_wake_purge_thread_if_not_active(). srv_wake_purge_thread_if_not_active(): If innodb_undo_log_truncate=ON, always wake up the purge subsystem. srv_do_purge(): If the history is empty, invoke trx_purge_truncate_history() in order to free undo log pages. trx_purge_truncate_history(): If head.trx_no==0, consider the cached undo logs to be free. trx_purge(): Remove the parameter "bool truncate" and let the caller invoke trx_purge_truncate_history() directly. Reviewed by: Vladislav Lesin	2023-06-08 09:18:21 +03:00
Marko Mäkelä	3e40f9a7f3	MDEV-31355 innodb_undo_log_truncate=ON fails to wait for purge of enough transaction history purge_sys_t::sees(): Wrapper for view.sees(). trx_purge_truncate_history(): Invoke purge_sys.sees() instead of comparing to head.trx_no, to determine if undo pages can be safely freed. The test innodb.cursor-restore-locking was adjusted by Vladislav Lesin, as was the debug instrumentation in row_purge_del_mark(). Reviewed by: Vladislav Lesin	2023-06-08 09:17:52 +03:00
Vlad Lesin	b54e7b0cea	MDEV-31185 rw_trx_hash_t::find() unpins pins too early rw_trx_hash_t::find() acquires element->mutex, then unpins pins, used for lf_hash element search. After that the "element" can be deallocated and reused by some other thread. If we take a look at the rw_trx_hash_t::insert()->lf_hash_insert()->lf_alloc_new() calls, we will not find any element->mutex acquisition, as it was not initialized yet before its allocation. rw_trx_hash_t::insert() can reuse the chunk, unpinned in rw_trx_hash_t::find(). The scenario is the following: 1. Thread 1 has just executed lf_hash_search() in rw_trx_hash_t::find(), but has not acquired element->mutex yet. 2. Thread 2 has removed the element from hash table with rw_trx_hash_t::erase() call. 3. Thread 1 acquired element->mutex and unpinned pin 2 pin with lf_hash_search_unpin(pins) call. 4. Some thread purged memory of the element. 5. Thread 3 reused the memory for the element, filled element->id, element->trx. 6. Thread 1 crashes with failed "DBUG_ASSERT(trx_id == trx->id)" assertion. Note that trx_t objects are also reused, see the code around trx_pools for details. The fix is to invoke "lf_hash_search_unpin(pins);" after element->trx is stored in local variable in rw_trx_hash_t::find(). Reviewed by: Nikita Malyavin, Marko Mäkelä.	2023-05-19 15:50:20 +03:00
Marko Mäkelä	06d555a41a	Merge bb-10.5-release into 10.5	2023-05-19 14:23:04 +03:00
Marko Mäkelä	e0084b9d31	MDEV-31234 InnoDB does not free UNDO after the fix of MDEV-30671 trx_purge_truncate_history(): Only call trx_purge_truncate_rseg_history() if the rollback segment is safe to process. This will avoid leaking undo log pages that are not yet ready to be processed. This fixes a regression that was introduced in commit `0de3be8cfd` (MDEV-30671). trx_sys_t::any_active_transactions(): Separately count XA PREPARE transactions. srv_purge_should_exit(): Terminate slow shutdown if the history size does not change and XA PREPARE transactions exist in the system. This will avoid a hang of the test innodb.recovery_shutdown. Tested by: Matthias Leich	2023-05-19 12:19:26 +03:00
Marko Mäkelä	477285c8ea	MDEV-31253 Freed data pages are not always being scrubbed fil_space_t::flush_freed(): Renamed from buf_flush_freed_pages(); this is a backport of `aa45850687` from 10.6. Invoke log_write_up_to() on last_freed_lsn, instead of avoiding the operation when the log has not yet been written. A more costly alternative would be that log_checkpoint() would invoke this function on every affected tablespace.	2023-05-12 14:57:14 +03:00
Marko Mäkelä	50f3b7d164	MDEV-31124 Innodb_data_written miscounts doublewrites When commit `a5a2ef079c` implemented asynchronous doublewrite, the writes via the doublewrite buffer started to be counted incorrectly, without multiplying them by innodb_page_size. srv_export_innodb_status(): Correctly count the Innodb_data_written. buf_dblwr_t: Remove submitted(), because it is close to written() and only Innodb_data_written was interested in it. According to its name, it should count completed and not submitted writes. Tested by: Axel Schwenke	2023-04-25 12:17:06 +03:00
Oleksandr Byelkin	1d74927c58	Merge branch '10.4' into 10.5	2023-04-24 12:43:47 +02:00
Alexander Barkov	9f98a2acd7	MDEV-30968 mariadb-backup does not copy Aria logs if aria_log_dir_path is used - `mariadb-backup --backup` was fixed to fetch the value of the @@aria_log_dir_path server variable and copy aria_log* files from @@aria_log_dir_path directory to the backup directory. Absolute and relative (to --datadir) paths are supported. Before this change aria_log* files were copied to the backup only if they were in the default location in @@datadir. - `mariadb-backup --copy-back` now understands a new my.cnf and command line parameter --aria-log-dir-path. `mariadb-backup --copy-back` in the main loop in copy_back() (when copying back from the backup directory to --datadir) was fixed to ignore all aria_log* files. A new function copy_back_aria_logs() was added. It consists of a separate loop copying back aria_log* files from the backup directory to the directory specified in --aria-log-dir-path. Absolute and relative (to --datadir) paths are supported. If --aria-log-dir-path is not specified, aria_log* files are copied to --datadir by default. - The function is_absolute_path() was fixed to understand MTR style paths on Windows with forward slashes, e.g. --aria-log-dir-path=D:/Buildbot/amd64-windows/build/mysql-test/var/...	2023-04-21 19:08:35 +04:00
Oleksandr Byelkin	ac5a534a4c	Merge remote-tracking branch '10.4' into 10.5	2023-03-31 21:32:41 +02:00
Marko Mäkelä	402f36dd65	MDEV-30936 fixup fil_space_t::~fil_space_t(): Invoke ut_free(name) because doing so in the callers would trip MSAN_OPTIONS=poison_in_dtor=1	2023-03-28 15:10:32 +03:00
Marko Mäkelä	dfa90257f6	MDEV-30936 clang 15.0.7 -fsanitize=memory fails massively handle_slave_io(), handle_slave_sql(), os_thread_exit(): Remove a redundant pthread_exit(nullptr) call, because it would cause SIGSEGV. mysql_print_status(): Add MEM_MAKE_DEFINED() to work around some missing instrumentation around mallinfo2(). que_graph_free_stat_list(): Invoke que_node_get_next(node) before que_graph_free_recursive(node). That is the logical and MSAN_OPTIONS=poison_in_dtor=1 compatible way of freeing memory. ins_node_t::~ins_node_t(): Invoke mem_heap_free(entry_sys_heap). que_graph_free_recursive(): Rely on ins_node_t::~ins_node_t(). fts_t::~fts_t(): Invoke mem_heap_free(fts_heap). fts_free(): Replace with direct calls to fts_t::~fts_t(). The failures in free_root() due to MSAN_OPTIONS=poison_in_dtor=1 will be covered in MDEV-30942.	2023-03-28 11:44:24 +03:00
Vlad Lesin	4c226c1850	MDEV-29050 mariabackup issues error messages during InnoDB tablespaces export on partial backup preparing The solution is to suppress error messages for missing tablespaces if mariabackup is launched with "--prepare --export" options. "mariabackup --prepare --export" invokes itself with --mysqld parameter. If the parameter is set, then it starts server to feed "FLUSH TABLES ... FOR EXPORT;" queries for exported tablespaces. This is "normal" server start, that's why new srv_operation value is introduced. Reviewed by Marko Makela.	2023-03-27 20:15:10 +03:00
Vlad Lesin	7d6b3d4008	MDEV-30775 Performance regression in fil_space_t::try_to_close() introduced in MDEV-23855 fil_node_open_file_low() tries to close files from the top of fil_system.space_list if the number of opened files is exceeded. It invokes fil_space_t::try_to_close(), which iterates the list searching for the first opened space. Then it just closes the space, leaving it in the same position in fil_system.space_list. On heavy files opening, like during 'SHOW TABLE STATUS ...' execution, if the number of opened files limit is reached, fil_space_t::try_to_close() iterates more and more closed spaces before reaching any opened space for each fil_node_open_file_low() call. This causes a performance regression if the number of spaces is big enough. The fix is to keep opened spaces at the top of fil_system.space_list, and move closed files to the end of the list. For this purpose fil_space_t::space_list_last_opened pointer is introduced. It points to the last inserted opened space in fil_space_t::space_list. When space is opened, it's inserted to the position just after the pointer points to in fil_space_t::space_list to preserve the logic, introduced in MDEV-23855. Any closed space is added to the end of fil_space_t::space_list. As opened spaces are located at the top of fil_space_t::space_list, fil_space_t::try_to_close() finds opened space faster. There can be the case when opened and closed spaces are mixed in fil_space_t::space_list if fil_system.freeze_space_list was set during fil_node_open_file_low() execution. But this should not cause any error, as fil_space_t::try_to_close() still iterates spaces in the list. There is no need for any test case for the fix, as it does not change any functionality, but just fixes performance regression.	2023-03-10 18:31:10 +03:00
Marko Mäkelä	c14a39431b	MDEV-30753 Possible corruption due to trx_purge_free_segment() Starting with commit `0de3be8cfd` (MDEV-30671), the field TRX_UNDO_NEEDS_PURGE lost its previous meaning. The following scenario is possible: (1) InnoDB is killed at a point of time corresponding to the durable execution of some fseg_free_step_not_header() but not trx_purge_remove_log_hdr(). (2) After restart, the affected pages are allocated for something else. (3) Purge will attempt to access the newly reallocated pages when looking for some old undo log records. trx_purge_free_segment(): Invoke trx_purge_remove_log_hdr() as the first thing, to be safe. If the server is killed, some pages will never be freed. That is the lesser evil. Also, before each mtr.start(), invoke log_free_check() to prevent ib_logfile0 overrun.	2023-02-28 15:39:23 +02:00
Marko Mäkelä	0de3be8cfd	MDEV-30671 InnoDB undo log truncation fails to wait for purge of history It is not safe to invoke trx_purge_free_segment() or execute innodb_undo_log_truncate=ON before all undo log records in the rollback segment have been processed. A prominent failure that would occur due to premature freeing of undo log pages is that trx_undo_get_undo_rec() would crash when trying to copy an undo log record to fetch the previous version of a record. If trx_undo_get_undo_rec() was not invoked in the unlucky time frame, then the symptom would be that some committed transaction history is never removed. This would be detected by CHECK TABLE...EXTENDED that was implemented in commit `ab0190101b`. Such a garbage collection leak should be possible even when using innodb_undo_log_truncate=OFF, just involving trx_purge_free_segment(). trx_rseg_t::needs_purge: Change the type from Boolean to a transaction identifier, noting the most recent non-purged transaction, or 0 if everything has been purged. On transaction start, we initialize this to 1 more than the transaction start ID. On recovery, the field may be adjusted to the transaction end ID (TRX_UNDO_TRX_NO) if it is larger. The field TRX_UNDO_NEEDS_PURGE becomes write-only; only some debug assertions would validate the value. The field reflects the old inaccurate Boolean field trx_rseg_t::needs_purge. trx_undo_mem_create_at_db_start(), trx_undo_lists_init(), trx_rseg_mem_restore(): Remove the parameter max_trx_id. Instead, store the maximum in trx_rseg_t::needs_purge, where trx_rseg_array_init() will find it. trx_purge_free_segment(): Contiguously hold a lock on trx_rseg_t to prevent any concurrent allocation of undo log. trx_purge_truncate_rseg_history(): Only invoke trx_purge_free_segment() if the rollback segment is empty and there are no pending transactions associated with it. trx_purge_truncate_history(): Only proceed with innodb_undo_log_truncate=ON if trx_rseg_t::needs_purge indicates that all history has been purged. Tested by: Matthias Leich	2023-02-24 14:24:44 +02:00
Marko Mäkelä	5300c0fb76	MDEV-30657 InnoDB: Not applying UNDO_APPEND due to corruption This almost completely reverts commit `acd23da4c2` and retains a safe optimization: recv_sys_t::parse(): Remove any old redo log records for the truncated tablespace, to free up memory earlier. If recovery consists of multiple batches, then recv_sys_t::apply() must invoke recv_sys_t::trim() again to avoid wrongly applying old log records to an already truncated undo tablespace.	2023-02-15 18:16:41 +02:00
Marko Mäkelä	c41c79650a	Merge 10.4 into 10.5	2023-02-10 12:02:11 +02:00
Vicențiu Ciorbaru	08c852026d	Apply clang-tidy to remove empty constructors / destructors This patch is the result of running run-clang-tidy -fix -header-filter=.* -checks='-,modernize-use-equals-default' . Code style changes have been done on top. The result of this change leads to the following improvements: 1. Binary size reduction. For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by ~400kb. * A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb. 2. Compiler can better understand the intent of the code, thus it leads to more optimization possibilities. Additionally it enabled detecting unused variables that had an empty default constructor but not marked so explicitly. Particular change required following this patch in sql/opt_range.cc result_keys, an unused template class Bitmap now correctly issues unused variable warnings. Setting Bitmap template class constructor to default allows the compiler to identify that there are no side-effects when instantiating the class. Previously the compiler could not issue the warning as it assumed Bitmap class (being a template) would not be performing a NO-OP for its default constructor. This prevented the "unused variable warning".	2023-02-09 16:09:08 +02:00
Marko Mäkelä	acd23da4c2	MDEV-30479 optimization: Invoke recv_sys_t::trim() earlier recv_sys_t::parse(): Discard old page-level redo log when parsing a TRIM_PAGES record. recv_sys_t::apply(): trim() was invoked in parse() already. recv_sys_t::truncated_undo_spaces[]: Only store the size, no LSN.	2023-02-06 20:29:42 +02:00
Thirunarayanan Balathandayuthapani	17858e03a7	MDEV-30179 mariabackup --backup fails with FATAL ERROR: ... failed to copy datafile - Mariabackup fails to copy the undo log tablespace when it undergoes truncation. So Mariabackup should detect the redo log which does undo tablespace truncation and also backup should read the minimum file size of the tablespace and ignore the error while reading. - Throw error when innodb undo tablespace read failed, but backup doesn't find the redo log for undo tablespace truncation	2023-01-10 15:47:13 +05:30
Marko Mäkelä	8b9b4ab3f5	Merge 10.4 into 10.5	2023-01-03 17:08:42 +02:00
Marko Mäkelä	fb0808c450	Merge 10.3 into 10.4	2023-01-03 16:10:02 +02:00
Aleksey Midenkov	e056efdd6c	MDEV-25004 Missing row in FTS_DOC_ID_INDEX during DELETE HISTORY 1. In case of system-versioned table add row_end into FTS_DOC_ID index in fts_create_common_tables() and innobase_create_key_defs(). fts_n_uniq() returns 1 or 2 depending on whether the table is system-versioned. After this patch, the FTS_DOC_ID index must be recreated for existing system-versioned tables. If you see this message in error log or server warnings: "InnoDB: Table db/t1 contains 2 indexes inside InnoDB, which is different from the number of indexes 1 defined in the MariaDB" use this command to fix the table: ALTER TABLE db.t1 FORCE; 2. Fix duplicate history for secondary unique index like it was done in MDEV-23644 for clustered index (`932ec586aa`). In case of existing history row which conflicts with currently inserted row we check in row_ins_scan_sec_index_for_duplicate() whether that row was inserted as part of current transaction. In that case we indicate with DB_FOREIGN_DUPLICATE_KEY that new history row is not needed and should be silently skipped. 3. Some parts of MDEV-21138 (`7410ff436e`) reverted. Skipping of FTS_DOC_ID index for history rows caused problems with the purge system. Now this is fixed differently by p.2. 4. wait_all_purged.inc checks that we didn't affect non-history rows so they are deleted and purged correctly. Additional FTS fixes fts_init_get_doc_id(): exclude history rows from max_doc_id calculation. fts_init_get_doc_id() callback is used only for crash recovery. fts_add_doc_by_id(): set max value for row_end field. fts_read_stopword(): stopwords table can be system-versioned too. We now read stopwords only for current data. row_insert_for_mysql(): exclude history rows from doc_id validation. row_merge_read_clustered_index(): exclude history_rows from doc_id processing. fts_load_user_stopword(): for versioned table retrieve row_end field and skip history rows. For non-versioned table we retrieve 'value' field twice (just for uniformity). FTS tests for System Versioning now include maybe_versioning.inc which adds 3 combinations: 'vers' for debug build sets sysvers_force and sysvers_hide. sysvers_force makes every created table system-versioned, sysvers_hide hides WITH SYSTEM VERSIONING for SHOW CREATE. Note: basic.test, stopword.test and versioning.test do not require debug for 'vers' combination. This is controlled by $modify_create_table in maybe_versioning.inc and these tests run WITH SYSTEM VERSIONING explicitly which allows testing the 'vers' combination on non-debug builds. 'vers_trx' like 'vers' sets sysvers_force_trx and sysvers_hide. That tests FTS with trx_id-based System Versioning. 'orig' works like before: no System Versioning is added, no debug is required. Upgrade/downgrade test for System Versioning is done by innodb_fts.versioning. It has 2 combinations: 'prepare' makes binaries in std_data (requires old server and OLD_BINDIR). It tests upgrade/downgrade against old server as well. 'upgrade' tests upgrade against binaries in std_data. Cleanups: Removed innodb-fts-stopword.test as it duplicates stopword.test	2022-12-27 00:02:02 +03:00
Marko Mäkelä	b8f4b984f9	MDEV-24685 fixup: Remove srv_n_file_io_threads The variable was not really being used for anything. The parameters innodb_read_io_threads, innodb_write_io_threads have replaced innodb_file_io_threads.	2022-12-16 17:08:56 +02:00
Marko Mäkelä	92ff7bb63f	MDEV-30227 [ERROR] [FATAL] InnoDB: fdatasync() returned 9 fil_space_t::flush<false>(): If the CLOSING flag is set, the file may already have been closed, resulting in EBADF being returned by fdatasync(). In any case, the thread that had set the flag should take care of invoking os_file_flush_func(). The crash occurred during the execution of FLUSH TABLES...FOR EXPORT. Tested by: Matthias Leich	2022-12-15 12:45:26 +02:00
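
The lock_t counting argument in the MDEV-10962 entry (`1bfd3cc457`) may be easier to follow as a toy model. The sketch below is not InnoDB code: `LockStruct`, `set_bit()` and `run()` are hypothetical names, one page is assumed, and a lock struct is reduced to a mode string plus a set of row numbers standing in for the bitmap. It only counts how many structs each policy leaves behind for the "for each row" loop described in that commit message.

```python
from dataclasses import dataclass, field

# Mode labels taken from the commit message; the model itself is hypothetical.
REC_X = "LOCK_X|LOCK_REC_NOT_GAP"
GAP_S = "LOCK_S|LOCK_GAP"
NEXT_KEY_X = "LOCK_X"


@dataclass
class LockStruct:
    """Toy stand-in for one lock_t: a mode plus a per-page bitmap of rows."""
    mode: str
    rows: set = field(default_factory=set)


def set_bit(locks, mode, row):
    """Reuse the struct that already has this mode, else create a new one."""
    for lock in locks:
        if lock.mode == mode:
            lock.rows.add(row)
            return lock
    lock = LockStruct(mode, {row})
    locks.append(lock)
    return lock


def run(n_rows, upgrade):
    """Replay the loop: per row, request X|REC_NOT_GAP, then LOCK_S."""
    locks = []
    for row in range(1, n_rows + 1):
        rec = set_bit(locks, REC_X, row)
        if upgrade:
            # Old policy: the record lock struct itself becomes LOCK_X, so the
            # next row finds no X|REC_NOT_GAP struct left to reuse.
            rec.mode = NEXT_KEY_X
        else:
            # New policy: keep the record lock and add only a gap lock.
            set_bit(locks, GAP_S, row)
    return locks


if __name__ == "__main__":
    print(len(run(100, upgrade=True)), len(run(100, upgrade=False)))
```

Under these assumptions, 100 rows on one page leave 100 single-bit structs with upgrading and 2 structs with the gap-lock approach, which matches the "one for each record" versus "two locks per page" contrast in the commit message.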

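The two binlog position entries by Kristian Nielsen (`9fa718b1a1` and `f8f5ed2280`) describe a selection rule that a small sketch can illustrate. This is an assumption-laden model, not the server's recovery code: `BinlogPos`, `RsegEntry`, `pick_by_lsn()` and `pick_by_position()` are hypothetical names, positions are compared as (file name, offset) tuples, and the LSN and offset values are the ones quoted in the revert commit message.

```python
from typing import NamedTuple, Optional, Sequence


class BinlogPos(NamedTuple):
    file: str
    offset: int


class RsegEntry(NamedTuple):
    """A binlog position found in a rollback segment page, with that page's LSN."""
    page_lsn: int
    pos: BinlogPos


def pick_by_lsn(entries: Sequence[RsegEntry]) -> BinlogPos:
    """Reverted behaviour (as described): trust the entry with the highest LSN."""
    return max(entries, key=lambda e: e.page_lsn).pos


def pick_by_position(entries: Sequence[RsegEntry],
                     legacy: Optional[BinlogPos] = None) -> Optional[BinlogPos]:
    """Prefer any rollback segment position over the legacy TRX_SYS one.

    Assumption: among rollback segments the largest (file, offset) wins.
    This toy comparison does not model RESET MASTER.
    """
    if entries:
        return max(e.pos for e in entries)
    return legacy


if __name__ == "__main__":
    entries = [
        RsegEntry(27282754, BinlogPos("./master-bin.000001", 472908)),
        RsegEntry(27282699, BinlogPos("./master-bin.000001", 477164)),
    ]
    print(pick_by_lsn(entries).offset, pick_by_position(entries).offset)
```

With the quoted values, ordering by page LSN selects offset 472908, while comparing positions selects 477164, the position the commit message identifies as correct. A legacy position is used only when no rollback segment carries one.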

3664 commits