page_zip_reorganize(): Restore the page on failure.
In callers, omit now-redundant calls to page_zip_decompress().
btr_page_reorganize_low(): Define in static scope only, and
remove the z_level parameter. Assert that ROW_FORMAT is not COMPRESSED.
btr_page_reorganize_block(), btr_page_reorganize(): Invoke
page_zip_reorganize() for ROW_FORMAT=COMPRESSED.
page_zip_compress_write_log_no_data(): Remove.
We no longer write the MLOG_ZIP_PAGE_COMPRESS_NO_DATA record.
Instead, we will write MLOG_ZIP_PAGE_COMPRESS records.
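A minimal caller sketch of the new contract; the parameter list shown for page_zip_reorganize() is an assumption for illustration, only the restore-on-failure behaviour comes from the description above:
    /* Sketch only: the signature of page_zip_reorganize() is assumed. */
    if (!page_zip_reorganize(block, index, mtr)) {
        /* Previously, the caller had to invoke page_zip_decompress()
        here to restore the uncompressed copy of the page.
        page_zip_reorganize() now restores the page itself on failure,
        so the caller merely reports the error. */
        return false;
    }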
Our benchmarking efforts indicate that the reasons for splitting the
buf_pool in commit c18084f71b
have mostly gone away, possibly as a result of
mysql/mysql-server@ce6109ebfd
or similar work.
Only in one write-heavy benchmark, where the working set size is
ten times the buffer pool size, was buf_pool->mutex less contended
in buf_page_io_complete() with 4 buffer pool instances than with
1 instance. That contention could be alleviated
further by making more use of std::atomic and by splitting
buf_pool_t::mutex further (MDEV-15053).
We will deprecate and ignore the following parameters:
innodb_buffer_pool_instances
innodb_page_cleaners
There will be only one buffer pool and one page cleaner task.
In a number of INFORMATION_SCHEMA views, columns that indicated
the buffer pool instance will be removed:
information_schema.innodb_buffer_page.pool_id
information_schema.innodb_buffer_page_lru.pool_id
information_schema.innodb_buffer_pool_stats.pool_id
information_schema.innodb_cmpmem.buffer_pool_instance
information_schema.innodb_cmpmem_reset.buffer_pool_instance
During native table rebuild or index creation, InnoDB used to skip
redo logging and write MLOG_INDEX_LOAD records to inform crash recovery
and Mariabackup of the gaps in redo log. This is fragile and prohibits
some optimizations, such as skipping the doublewrite buffer for
newly (re)initialized pages (MDEV-19738).
row_merge_write_redo(): Remove. We do not write MLOG_INDEX_LOAD
records any more. Instead, we write full redo log.
FlushObserver: Remove.
fseg_free_page_func(): Remove the parameter log. Redo logging
cannot be disabled.
fil_space_t::redo_skipped_count: Remove.
We cannot remove buf_block_t::skip_flush_check, because PageBulk
will temporarily generate invalid B-tree pages in the buffer pool.
I found that memcpy_aligned was used incorrectly in the redo log code and
decided to put assertions into the aligned functions, which uncovered even
more incorrect cases. Given the number of bugs discovered, the assertions
are left in place to prevent future bugs.
my_assume_aligned(): a function template that replaces the MY_ASSUME_ALIGNED macro.
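A minimal sketch of the idea, assuming GCC/clang __builtin_assume_aligned() is available; the function body shown is illustrative, not the exact implementation:
    template<unsigned Alignment, typename T>
    inline T *my_assume_aligned(T *ptr)
    {
        /* catch callers that pass an insufficiently aligned pointer */
        assert(reinterpret_cast<uintptr_t>(ptr) % Alignment == 0);
        return static_cast<T*>(__builtin_assume_aligned(ptr, Alignment));
    }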
Pass buf_block_t* to more functions that write redo log.
page_zip_write_node_ptr(), page_zip_write_blob_ptr(),
page_zip_compress_write_log_no_data():
Take buf_block_t* as parameter, and do not tolerate mtr=NULL.
page_zip_compress(): Do not tolerate mtr=NULL.
page_zip_dir_insert(): Take page_cur_t* as parameter.
mlog_write_initial_log_record(): Remove. This function was unused.
RecIterator::remove(): Remove the redundant page_zip parameter.
PageConverter::m_page_zip_ptr: Remove.
offset_t: a type that represents one record offset.
It is an unsigned short int.
a lot of functions: replace ulint with offset_t
btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
allocate record offsets on the stack instead of waiting for rec_get_offsets()
to allocate them from mem_heap_t, thus reducing memory allocations.
RECORD_OFFSET, INDEX_OFFSET:
it is now less convenient to store pointers in an offset_t*
array, because one pointer now occupies several offset_t elements. These
constants are the start indexes into the array where the pointer values are stored.
REC_OFFS_HEADER_SIZE: adjusted for the new reality
REC_OFFS_NORMAL_SIZE:
increase the size from 100 to 300, which means fewer heap allocations.
sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) is now 600 bytes, which
is smaller than the previous 800 bytes.
REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality
rem0rec.h, rem0rec.ic, rem0rec.cc:
the types of various arguments, return values and local variables were changed
to fix numerous integer conversion issues.
enum field_type_t:
introduces the concept of offset types, which replaces the old offset flags.
As in earlier versions, the 2 upper bits are used to store the offset type,
and this enum represents those types.
REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed
get_type(), set_type(), get_value(), combine():
convenience functions for working with offsets and their types
(see the sketch after this list)
rec_offs_base()[0]:
still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL
rec_offs_base()[i]:
these now have type offset_t; the two upper bits contain the type.
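A self-contained sketch of the scheme described above; the enumerator names and values are illustrative assumptions, only the "2 upper bits hold the type" layout is taken from the description:
    #include <cstdint>

    typedef uint16_t offset_t;      /* one record offset */

    /* illustrative: the type tag kept in the two most significant bits */
    enum field_type_t {
        STORED_IN_RECORD = 0 << 14,
        STORED_OFFPAGE   = 1 << 14,
        SQL_NULL         = 2 << 14,
        DEFAULT          = 3 << 14
    };

    static const offset_t VALUE_MASK = 0x3fff;

    inline field_type_t get_type(offset_t o)
    { return static_cast<field_type_t>(o & ~VALUE_MASK); }

    inline offset_t get_value(offset_t o)
    { return static_cast<offset_t>(o & VALUE_MASK); }

    inline offset_t combine(offset_t value, field_type_t type)
    { return static_cast<offset_t>(get_value(value) | type); }

    inline void set_type(offset_t &o, field_type_t type)
    { o = combine(o, type); }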
In commit af5947f433
the function btr_discard_page() is invoking btr_set_min_rec_mark()
with the wrong buf_block_t* object. node_ptr is on merge_block,
not block.
btr_discard_page(): Remove the variables merge_page, page, and
always refer to block->frame or merge_block->frame instead.
Also, limit the scope of node_ptr and avoid duplicated conditions.
btr_set_min_rec_mark(): Add a template parameter, so that the
caller can specify whether the page is supposed to have a left sibling.
Otherwise, the assertion (which was introduced in the same commit)
would fail in btr_discard_page().
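A hedged sketch of the template-parameter idea; the parameter name, the signature, and the page_has_prev() check are assumptions for illustration:
    /* Sketch: 'has_prev' states whether the caller expects the page to
    have a left sibling, so btr_discard_page() can pass the appropriate
    value and the assertion still holds. */
    template<bool has_prev>
    void btr_set_min_rec_mark(rec_t *rec, const buf_block_t &block, mtr_t *mtr)
    {
        ut_ad(block.frame == page_align(rec));
        ut_ad(has_prev == page_has_prev(block.frame));
        /* ... set REC_INFO_MIN_REC_FLAG and write the redo log ... */
    }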
mtr_t::memcpy(): Replaces mlog_write_string(), mlog_log_string().
The buf_block_t is passed as a parameter, so that
mlog_write_initial_log_record_low() can be used instead of
mlog_write_initial_log_record_fast().
fil_space_crypt_t::write_page0(): Remove the fil_space_t* parameter.
Passing buf_block_t helps us avoid calling
mlog_write_initial_log_record_fast() and page_get_page_no(),
and allows us to implement more debug checks, such as
that on ROW_FORMAT=COMPRESSED index pages, only the page header
may be modified by MLOG_MEMSET records.
fseg_n_reserved_pages(): Add a buf_block_t parameter.
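An illustrative sketch of the replacement, assuming a member signature along the lines of mtr_t::memcpy(const buf_block_t&, ulint offset, const void*, ulint len); the exact prototype is an assumption:
    /* before (sketch): the page identity had to be derived from the
    frame pointer for mlog_write_initial_log_record_fast() */
    // mlog_write_string(block->frame + offset, data, len, mtr);

    /* after (sketch): the block descriptor carries the page identity,
    so mlog_write_initial_log_record_low() can be used internally */
    mtr->memcpy(*block, offset, data, len);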
mtr_t::write(): Replaces mlog_write_ulint(), mlog_write_ull().
Optimize away writes if the page contents do not change,
except when a dummy write has been explicitly requested.
Because the member function template takes a block descriptor as a
parameter, it is possible to introduce better consistency checks.
Due to this, the code for handling file-based lists, undo logs
and user transactions was refactored to pass around buf_block_t.
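An illustrative usage sketch; the template form mtr->write<N>(block, ptr, value) follows the description above, and the "dummy write" behaviour mentioned in the comment refers to the explicit request described earlier:
    /* before (sketch) */
    // mlog_write_ulint(page + FIL_PAGE_TYPE, FIL_PAGE_INDEX,
    //                  MLOG_2BYTES, mtr);

    /* after (sketch): the write is optimized away when the two bytes
    already contain FIL_PAGE_INDEX, unless a dummy write has been
    explicitly requested */
    mtr->write<2>(*block, FIL_PAGE_TYPE + block->frame, FIL_PAGE_INDEX);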
btr_set_min_rec_mark(): Write MLOG_1BYTE instead of
MLOG_REC_MIN_MARK or MLOG_COMP_REC_MIN_MARK.
On ROW_FORMAT=COMPRESSED pages, the minimum record flag is not stored
at all. The flag is computed for the uncompressed page by
page_zip_decompress(). Hence, nothing needs to be logged for
ROW_FORMAT=COMPRESSED tables for this operation.
To facilitate crash-upgrade and hot backup from older versions,
we will retain the code to parse and apply the old log record types
MLOG_REC_MIN_MARK and MLOG_COMP_REC_MIN_MARK.
Introduce memcpy_aligned<N>(), memcmp_aligned<N>(), memset_aligned<N>()
and use them for accessing InnoDB page header fields that are known
to be aligned.
MY_ASSUME_ALIGNED(): Wrapper for the GCC/clang __builtin_assume_aligned().
Nothing similar seems to exist in Microsoft Visual Studio, and the
C++20 std::assume_aligned is not available to us yet.
Explicitly specified alignment guarantees allow compilers to generate
faster code on platforms with strict alignment rules, instead of
emitting calls to potentially unaligned memcpy(), memcmp(), or memset().
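For illustration, a hedged example of using such a wrapper on a page header field that is known to be aligned (FIL_PAGE_LSN is at byte offset 16 of a page frame that is aligned to the page size); the source buffer lsn_field is assumed:
    /* sketch: copy the 8-byte FIL_PAGE_LSN field with a known-aligned
    memcpy instead of a plain memcpy() that the compiler must treat as
    potentially unaligned */
    memcpy_aligned<8>(FIL_PAGE_LSN + page, lsn_field, 8);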
Apart from page latches (buf_block_t::lock), mini-transactions
are keeping track of at most one dict_index_t::lock and
fil_space_t::latch at a time, and in a rare case, purge_sys.latch.
Let us introduce interfaces for acquiring an index latch
or a tablespace latch.
In a later version, we may want to introduce mtr_t members
for holding a latched dict_index_t* and fil_space_t*,
and replace the remaining use of mtr_t::m_memo
with std::set<buf_block_t*> or with a map<buf_block_t*,byte*>
pointing to log records.
btr_create(), btr_root_raise_and_insert(): Write a MLOG_MEMSET record
to set FIL_PAGE_PREV,FIL_PAGE_NEXT to FIL_NULL, instead of writing
two MLOG_4BYTES records.
For ROW_FORMAT=COMPRESSED pages, we will not use MLOG_MEMSET
because we want the crash-downgrade to earlier 10.4 releases to succeed.
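A hedged sketch of the change; the helper mtr_t::memset() and its parameter order are assumptions, only the "one 8-byte memset of 0xff instead of two 4-byte writes" idea comes from the description:
    /* before (sketch): two MLOG_4BYTES records */
    // mtr->write<4>(*block, FIL_PAGE_PREV + block->frame, FIL_NULL);
    // mtr->write<4>(*block, FIL_PAGE_NEXT + block->frame, FIL_NULL);

    /* after (sketch): one MLOG_MEMSET record covering both adjacent
    4-byte fields, starting at offset FIL_PAGE_PREV */
    mtr->memset(block, FIL_PAGE_PREV, 8, 0xff);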
mlog_parse_nbytes(): Relax the too strict assertion. There is no problem
with MLOG_MEMSET records that affect the uncompressed header of
ROW_FORMAT=COMPRESSED index pages.
page_rec_write_field(): Remove.
dict_create_index_tree_step(): If the SYS_INDEXES.PAGE does not change,
do not update it in the data dictionary. Typically, all index page numbers
would be unchanged before and after IMPORT TABLESPACE, except if some
secondary indexes were created after loading some data.
btr_root_fseg_adjust_on_import(): Remove the redundant mtr_t* parameter.
Redo logging is disabled during the page adjustments that IMPORT TABLESPACE
is performing.
Except for fil_name_process(), which invokes os_normalize_path(),
the redo log record parser will not modify the redo log records.
Add const qualifiers accordingly.
The assertion that was added in
commit c0c003beb4
to augment the fix of MDEV-20805 turns out to be invalid when
innodb_immediate_scrub_data_uncompressed is enabled.
In this mode, fsp_init_file_page() will be invoked on data pages
that have been freed, causing writes of almost-all-zero pages.
btr_page_free(): Adjust the comment.
buf_flush_init_for_writing(): Disable the assertion with a note
that it should be re-enabled in MDEV-15528.
We will remove the InnoDB background operation of merging buffered
changes to secondary index leaf pages. Changes will only be merged as a
result of an operation that accesses a secondary index leaf page,
such as a SQL statement that performs a lookup via that index,
or is modifying the index. Also ROLLBACK and some background operations,
such as purging the history of committed transactions, or computing
index cardinality statistics, can cause change buffer merge.
Encryption key rotation will not perform change buffer merge.
The motivation of this change is to simplify the I/O logic and to
allow crash recovery to happen in the background (MDEV-14481).
We also hope that this will reduce the number of "mystery" crashes
due to corrupted data. Because change buffer merge will typically
take place as a result of executing SQL statements, there should be
a clearer connection between the crash and the SQL statements that
were executed when the server crashed.
In many cases, a slight performance improvement was observed.
This is joint work with Thirunarayanan Balathandayuthapani
and was tested by Axel Schwenke and Matthias Leich.
The InnoDB monitor counter innodb_ibuf_merge_usec will be removed.
On slow shutdown (innodb_fast_shutdown=0), we will continue to
merge all buffered changes (and purge all undo log history).
Two InnoDB configuration parameters will be changed as follows:
innodb_disable_background_merge: Removed.
This parameter existed only in debug builds.
All change buffer merges will use synchronous reads.
innodb_force_recovery will be changed as follows:
* innodb_force_recovery=4 will be the same as innodb_force_recovery=3
(the change buffer merge cannot be disabled; it can only happen as
a result of an operation that accesses a secondary index leaf page).
The option used to be capable of corrupting secondary index leaf pages.
Now that capability is removed, and innodb_force_recovery=4 becomes 'safe'.
* innodb_force_recovery=5 (which essentially hard-wires
SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED)
becomes safe to use. Bogus data can be returned to SQL, but
persistent InnoDB data files will not be corrupted further.
* innodb_force_recovery=6 (ignore the redo log files)
will be the only option that can potentially cause
persistent corruption of InnoDB data files.
Code changes:
buf_page_t::ibuf_exist: New flag, to indicate whether buffered
changes exist for a buffer pool page. Pages with pending changes
can be returned by buf_page_get_gen(). Previously, the changes
were always merged inside buf_page_get_gen() if needed.
ibuf_page_exists(const buf_page_t&): Check if buffered changes
exist for an X-latched or read-fixed page.
buf_page_get_gen(): Add the parameter allow_ibuf_merge=false.
All callers that know that they may be accessing a secondary index
leaf page must pass this parameter as allow_ibuf_merge=true,
unless it does not matter for that caller whether all buffered
changes have been applied. Assert that whenever allow_ibuf_merge
holds, the page actually is a leaf page. Attempt change buffer
merge only to secondary B-tree index leaf pages.
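An illustrative caller sketch; the full parameter list of buf_page_get_gen() is abbreviated here and the exact parameter positions are assumptions:
    /* sketch: a caller that may be fetching a secondary index leaf
    page requests a change buffer merge; other parameters elided */
    buf_block_t *block = buf_page_get_gen(
        page_id, zip_size, RW_X_LATCH,
        /* guess= */ nullptr, BUF_GET, __FILE__, __LINE__, mtr, &err,
        /* allow_ibuf_merge= */ true);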
btr_block_get(): Add parameter 'bool merge'.
All callers of btr_block_get() should know whether the page could be
a secondary index leaf page. If it is not, we should avoid consulting
the change buffer bitmap to even consider a merge. This is the main
interface to requesting index pages from the buffer pool.
ibuf_merge_or_delete_for_page(), recv_recover_page(): Replace
buf_page_get_known_nowait() with much simpler logic, because
it is now guaranteed that the block is x-latched or read-fixed.
mlog_init_t::mark_ibuf_exist(): Renamed from mlog_init_t::ibuf_merge().
On crash recovery, we will no longer merge any buffered changes
for the pages that we read into the buffer pool during the last batch
of applying log records.
buf_page_get_gen_known_nowait(), BUF_MAKE_YOUNG, BUF_KEEP_OLD: Remove.
btr_search_guess_on_hash(): Merge buf_page_get_gen_known_nowait()
to its only remaining caller.
buf_page_make_young_if_needed(): Define as an inline function.
Add the parameter buf_pool.
buf_page_peek_if_young(), buf_page_peek_if_too_old(): Add the
parameter buf_pool.
fil_space_validate_for_mtr_commit(): Remove a bogus comment
about background merge of the change buffer.
btr_cur_open_at_rnd_pos_func(), btr_cur_search_to_nth_level_func(),
btr_cur_open_at_index_side_func(): Use narrower data types and scopes.
ibuf_read_merge_pages(): Replaces buf_read_ibuf_merge_pages().
Merge the change buffer by invoking buf_page_get_gen().
btr_page_get_split_rec_to_left(): Assert that in the leftmost leaf page,
if the metadata record exists, index->is_instant() must hold.
The assertion of commit 01f45becd1
could fail during innobase_instant_try().
btr_page_get_split_rec_to_left(): Assert that in the leftmost leaf page,
the metadata record exists if and only if index->is_instant().
page_validate(): Correct the wording of a message.
rec_init_offsets(): Assert that whenever a record is in "instant ALTER"
format, index->is_instant() must hold.
In MDEV-11369 (instant ADD COLUMN) in MariaDB Server 10.3,
we introduced the hidden metadata record that must be the
first record in the clustered index if and only if
index->is_instant() holds.
To catch MDEV-19783, in
commit ed0793e096 and
commit 99dc40d6ac
we added some assertions to find cases where
the metadata record is missing while it should not be, or a
record exists when it should not. Those assertions were invalid
when traversing the PAGE_FREE list. That list can contain anything;
we must only be able to determine the successor and the size of
each garbage record in it.
page_validate(), page_simple_validate_old(), page_simple_validate_new():
Do not invoke page_rec_get_next_const() for traversing the PAGE_FREE
list, but instead use a lower-level accessor that does not attempt to
validate the REC_INFO_MIN_REC_FLAG.
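A conceptual sketch of the traversal; whether page_rec_get_next_low() is the exact accessor used, and the loop termination shown, are assumptions:
    /* sketch: walk the PAGE_FREE list with a low-level 'next' accessor
    that does not validate REC_INFO_MIN_REC_FLAG; only the successor
    pointer and the garbage record size are trusted here */
    const ulint comp = page_is_comp(page);
    for (const rec_t *rec = page_header_get_ptr(page, PAGE_FREE);
         rec != NULL; rec = page_rec_get_next_low(rec, comp)) {
        /* validate only the offset bounds and the record size */
    }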
page_copy_rec_list_end_no_locks(),
page_copy_rec_list_start(), page_delete_rec_list_start():
Add assertions.
btr_page_get_split_rec_to_left(): Remove a redundant return value,
and make the output parameter the return value.
btr_page_get_split_rec_to_right(), btr_page_split_and_insert(): Clean up.
btr_cur_pessimistic_delete(): the code was changed in a way that allows
more REC_INFO_MIN_REC_FLAG assertions to be placed inside btr_set_min_rec_mark().
Without that change, the tests innodb.innodb-table-online,
innodb.temp_table_savepoint and innodb_zip.prefix_index_liftedlimit would fail.
Removed essentially duplicated page_zip_validate() calls,
which failed because of a temporary(!) invariant violation.
That fixed innodb_zip.wl5522_debug_zip and
innodb_zip.prefix_index_liftedlimit.
Until now, InnoDB inefficiently compared the aligned fields
FIL_PAGE_PREV, FIL_PAGE_NEXT to the byte-order-agnostic value FIL_NULL.
This is a backport of 32170f8c6d
from MariaDB Server 10.3.
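For illustration, a sketch of the byte-order-agnostic check, relying on FIL_PAGE_PREV and FIL_PAGE_NEXT being adjacent 4-byte fields at an 8-byte-aligned offset and on FIL_NULL being 0xFFFFFFFF (whose stored form is the same in any byte order); the helper name is illustrative:
    /* sketch: true if either sibling pointer differs from FIL_NULL */
    inline bool page_has_siblings(const page_t *page)
    {
        static_assert(FIL_PAGE_PREV % 8 == 0, "alignment");
        static_assert(FIL_PAGE_NEXT == FIL_PAGE_PREV + 4, "adjacency");
        /* both fields equal FIL_NULL exactly when the combined
        8 bytes are all ones, regardless of byte order */
        return *reinterpret_cast<const uint64_t*>(page + FIL_PAGE_PREV)
            != ~uint64_t(0);
    }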