mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-30 18:41:56 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	a8de8f261d	Merge 10.2 into 10.3	2020-10-28 10:01:50 +02:00
Thirunarayanan Balathandayuthapani	bc540b8706	MDEV-23693 Failing assertion: my_atomic_load32_explicit(&lock->lock_word, MY_MEMORY_ORDER_RELAXED) == X_LOCK_DECR InnoDB frees the block lock during buffer pool shrinking when other thread is yet to release the block lock. While shrinking the buffer pool, InnoDB allows the page to be freed unless it is buffer fixed. In some cases, InnoDB releases the latch after unfixing the block. Fix: ==== - InnoDB should unfix the block after releases the latch. - Add more assertion to check buffer fix while accessing the page. - Introduced block_hint structure to store buf_block_t pointer and allow accessing the buf_block_t pointer only by passing a functor. It returns original buf_block_t* pointer if it is valid or nullptr if the pointer become stale. - Replace buf_block_is_uncompressed() with buf_pool_t::is_block_pointer() This change is motivated by a change in mysql-5.7.32: mysql/mysql-server@46e60de444 Bug #31036301 ASSERTION FAILURE: SYNC0RW.IC:429:LOCK->LOCK_WORD	2020-10-27 18:30:00 +05:30
Marko Mäkelä	7e07e38cf6	Merge 10.2 into 10.3	2020-09-09 13:06:46 +03:00
Marko Mäkelä	c26eae0cc0	MDEV-23456 fixup: Fix mtr_t::get_fix_count() Before commit `05fa4558e0` (MDEV-22110) we have slot->type == MTR_MEMO_MODIFY that are unrelated to incrementing the buffer-fix count. FindBlock::operator(): In debug builds, skip MTR_MEMO_MODIFY entries. Also, simplify the code a little. This fixes an infinite loop in the tests innodb.innodb_defragment and innodb.innodb_wl6326_big.	2020-09-09 12:01:03 +03:00
Thirunarayanan Balathandayuthapani	b1009ae5c1	MDEV-23456 fil_space_crypt_t::write_page0() is accessing an uninitialized page buf_page_create() is invoked when page is initialized. So that previous contents of the page ignored. In few cases, it calls buf_page_get_gen() is called to fetch the page from buffer pool. It should take x-latch on the page. If other thread uses the block or block io state is different from BUF_IO_NONE then release the mutex and check the state and buffer fix count again. For compressed page, use the existing free block from LRU list to create new page. Retry to fetch the compressed page if it is in flush list fseg_create(), fseg_create_general(): Introduce block as a parameter where segment header is placed. It is used to avoid repetitive x-latch on the same page Change the assert to check whether the page has SX latch and X latch in all callee function of buf_page_create() mtr_t::get_fix_count(): Get the buffer fix count of the given block added by the mtr FindBlock is added to find the buffer fix count of the given block acquired by the mini-transaction	2020-09-09 11:58:15 +05:30
Marko Mäkelä	acc58fd835	Merge 10.2 into 10.3	2020-07-20 15:11:59 +03:00
Marko Mäkelä	ca9276e37e	Merge 10.1 into 10.2	2020-07-20 14:53:24 +03:00
Marko Mäkelä	57ec42bc32	MDEV-23190 InnoDB data file extension is not crash-safe When InnoDB is extending a data file, it is updating the FSP_SIZE field in the first page of the data file. In commit `8451e09073` (MDEV-11556) we removed a work-around for this bug and made recovery stricter, by making it track changes to FSP_SIZE via redo log records, and extend the data files before any changes are being applied to them. It turns out that the function fsp_fill_free_list() is not crash-safe with respect to this when it is initializing the change buffer bitmap page (page 1, or generally, N*innodb_page_size+1). It uses a separate mini-transaction that is committed (and will be written to the redo log file) before the mini-transaction that actually extended the data file. Hence, recovery can observe a reference to a page that is beyond the current end of the data file. fsp_fill_free_list(): Initialize the change buffer bitmap page in the same mini-transaction. The rest of the changes are fixing a bug that the use of the separate mini-transaction was attempting to work around. Namely, we must ensure that no other thread will access the change buffer bitmap page before our mini-transaction has been committed and all page latches have been released. That is, for read-ahead as well as neighbour flushing, we must avoid accessing pages that might not yet be durably part of the tablespace. fil_space_t::committed_size: The size of the tablespace as persisted by mtr_commit(). fil_space_t::max_page_number_for_io(): Limit the highest page number for I/O batches to committed_size. MTR_MEMO_SPACE_X_LOCK: Replaces MTR_MEMO_X_LOCK for fil_space_t::latch. mtr_x_space_lock(): Replaces mtr_x_lock() for fil_space_t::latch. mtr_memo_slot_release_func(): When releasing MTR_MEMO_SPACE_X_LOCK, copy space->size to space->committed_size. In this way, read-ahead or flushing will never be invoked on pages that do not yet exist according to FSP_SIZE.	2020-07-20 14:48:56 +03:00
Marko Mäkelä	1df1a63924	Merge 10.2 into 10.3	2020-07-02 06:17:51 +03:00
Marko Mäkelä	c36834c832	MDEV-20377: Make WITH_MSAN more usable MemorySanitizer (clang -fsanitize=memory) requires that all code be compiled with instrumentation enabled. The only exception is the C runtime library. Failure to use instrumented libraries will cause bogus messages about memory being uninitialized. In WITH_MSAN builds, we must avoid calling getservbyname(), because even though it is a standard library function, it is not instrumented, not even in clang 10. Note: Before MariaDB Server 10.5, ./mtr will typically fail due to the old PCRE library, which was updated in MDEV-14024. The following cmake options were tested on 10.5 in commit `94d0bb4dbe`: cmake \ -DCMAKE_C_FLAGS='-march=native -O2' \ -DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \ -DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \ -DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \ -DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \ -DWITH_SAFEMALLOC=OFF \ -DWITH_{ZLIB,SSL,PCRE}=bundled \ -DHAVE_LIBAIO_H=0 \ -DWITH_MSAN=ON MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED() and __msan_unpoison(). MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow(). InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros. ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in functions instead of inline assembler when building WITH_MSAN. This will require at least -msse4.2 when building for IA-32 or AMD64. The inline assembler would not be instrumented, and would thus cause bogus failures.	2020-07-01 17:23:00 +03:00
Marko Mäkelä	3d4a801533	MDEV-12353 preparation: Replace mtr_x_lock() and friends Apart from page latches (buf_block_t::lock), mini-transactions are keeping track of at most one dict_index_t::lock and fil_space_t::latch at a time, and in a rare case, purge_sys.latch. Let us introduce interfaces for acquiring an index latch or a tablespace latch. In a later version, we may want to introduce mtr_t members for holding a latched dict_index_t* and fil_space_t, and replace the remaining use of mtr_t::m_memo with std::set<buf_block_t> or with a map<buf_block_t,byte> pointing to log records.	2019-11-14 11:40:33 +02:00
Marko Mäkelä	bc5cfe7769	Merge 10.2 into 10.3	2019-11-14 10:51:06 +02:00
Marko Mäkelä	3b573c0783	Clean up mtr_t::commit() further memo_block_unfix(), memo_latch_release(): Merge to ReleaseLatches. memo_slot_release(), ReleaseAll: Clean up the formatting.	2019-11-13 09:51:28 +02:00
Marko Mäkelä	5098d708a0	Merge 10.2 into 10.3	2019-11-12 16:42:58 +02:00
Marko Mäkelä	2570cb8b91	MDEV-12353 preparation: Clean up mtr_t mtr_t::Impl, mtr_t::Command: Merge to mtr_t. MTR_MAGIC_N: Remove. MTR_STATE_COMMITTING: Remove. This state was only being set internally during mtr_t::commit(). mtr_t::Command::m_locks_released: Remove (set-and-never-read member). mtr_t::Command::m_start_lsn: Replaced with the return value of finish_write() and a parameter to release_blocks(). mtr_t::Command::m_end_lsn: Removed as a duplicate of mtr_t::m_commit_lsn. mtr_t::Command::prepare_write(): Replace a switch () with a comparison against 0. Only 2 m_log_mode are allowed.	2019-11-12 15:46:57 +02:00
Marko Mäkelä	f42a23178e	MDEV-20425: Fix -Wimplicit-fallthrough With --skip-debug-assert, DBUG_ASSERT(false) will allow execution to continue. Hence, we will need /* fall through */ after them. Some DBUG_ASSERT(0) were replaced by break; when the switch () statement was followed by DBUG_ASSERT(0).	2019-08-30 14:11:59 +03:00
Marko Mäkelä	192aa295b4	Merge 10.2 into 10.3	2019-06-19 08:56:10 +03:00
Marko Mäkelä	2b660fb4c2	mtr_t::is_block_dirtied(): Define inline	2019-06-16 15:55:09 +03:00
Marko Mäkelä	e9795d0208	Add mtr_buf_t::for_each_block_in_reverse() const Avoid unnecessary creation of named objects for invoking mtr_buf_t::for_each_block_in_reverse().	2019-06-16 15:53:06 +03:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	c0ac0b8860	Update FSF address	2019-05-11 19:25:02 +03:00
Vicențiu Ciorbaru	f177f125d4	Merge branch '5.5' into 10.1	2019-05-11 19:15:57 +03:00
Vicențiu Ciorbaru	15f1e03d46	Follow-up to changing FSF address Some places didn't match the previous rules, making the Floor address wrong. Additional sed rules: sed -i -e 's/Place.Suite ., Boston/Street, Fifth Floor, Boston/g' sed -i -e 's/Suite .*, Boston/Fifth Floor, Boston/g'	2019-05-11 18:30:45 +03:00
Marko Mäkelä	fd58bb71e2	Merge 10.2 into 10.3	2018-11-19 18:45:53 +02:00
Marko Mäkelä	ff88e4bb8a	Remove many redundant #include from InnoDB	2018-11-19 11:42:14 +02:00
Marko Mäkelä	df563e0c03	Merge 10.2 into 10.3 main.derived_cond_pushdown: Move all 10.3 tests to the end, trim trailing white space, and add an "End of 10.3 tests" marker. Add --sorted_result to tests where the ordering is not deterministic. main.win_percentile: Add --sorted_result to tests where the ordering is no longer deterministic.	2018-11-06 09:40:39 +02:00
Marko Mäkelä	abcd09c95a	mtr_t::start(): Remove unused parameters The parameters bool sync=true, bool read_only=false of mtr_t::start() were added in `eca5b0fc17` (MySQL 5.7.3). The parameter read_only was never used anywhere. The parameter sync was only copied around, and would be returned by the unused function mtr_t::is_async(). We do not need this dead code in MariaDB.	2018-11-01 10:48:56 +02:00
Marko Mäkelä	1eb2d8f6e8	Merge 10.2 into 10.3	2018-08-16 08:54:58 +03:00
Marko Mäkelä	b853b4fd88	Report InnoDB redo log corruption better recv_parse_log_recs(): Check for corruption before checking for end-of-log-buffer. mlog_parse_initial_log_record(), page_cur_parse_delete_rec(): Flag corruption for out-of-bounds values, and let the caller dump the corrupted redo log extract.	2018-08-10 13:02:01 +03:00
Marko Mäkelä	2b27ac8282	Fix many -Wunused-parameter Remove unused InnoDB function parameters and functions. i_s_sys_virtual_fill_table(): Do not allocate heap memory. mtr_is_block_fix(): Replace with mtr_memo_contains(). mtr_is_page_fix(): Replace with mtr_memo_contains_page().	2018-05-01 16:52:19 +03:00
Marko Mäkelä	9801715cb0	Use compile_time_assert() in InnoDB Replace most use of #error. Some checks were impossible to evaluate in the preprocessor due to the use of named integer constants or enumerations.	2018-04-30 18:22:52 +03:00
Marko Mäkelä	d73a898d64	MDEV-16045: Allocate log_sys statically There is only one redo log subsystem in InnoDB. Allocate the object statically, to avoid unnecessary dereferencing of the pointer. log_t::create(): Renamed from log_sys_init(). log_t::close(): Renamed from log_shutdown(). log_t::checkpoint_buf_ptr: Remove. Allocate log_t::checkpoint_buf statically.	2018-04-29 09:46:24 +03:00
Marko Mäkelä	715e4f4320	MDEV-12218 Clean up InnoDB parameter validation Bind more InnoDB parameters directly to MYSQL_SYSVAR and remove "shadow variables". innodb_change_buffering: Declare as ENUM, not STRING. innodb_flush_method: Declare as ENUM, not STRING. innodb_log_buffer_size: Bind directly to srv_log_buffer_size, without rounding it to a multiple of innodb_page_size. LOG_BUFFER_SIZE: Remove. SysTablespace::normalize_size(): Renamed from normalize(). innodb_init_params(): A new function to initialize and validate InnoDB startup parameters. innodb_init(): Renamed from innobase_init(). Invoke innodb_init_params() before actually trying to start up InnoDB. srv_start(bool): Renamed from innobase_start_or_create_for_mysql(). Added the input parameter create_new_db. SRV_ALL_O_DIRECT_FSYNC: Define only for _WIN32. xb_normalize_init_values(): Merge to innodb_init_param().	2018-04-29 09:41:42 +03:00
Marko Mäkelä	a90100d756	Replace univ_page_size and UNIV_PAGE_SIZE Try to use one variable (srv_page_size) for innodb_page_size. Also, replace UNIV_PAGE_SIZE_SHIFT with srv_page_size_shift.	2018-04-28 20:45:45 +03:00
Marko Mäkelä	ba19764209	Fix most -Wsign-conversion in InnoDB Change innodb_buffer_pool_size, innodb_fill_factor to unsigned.	2018-04-28 20:45:45 +03:00
Marko Mäkelä	362151e8c8	MDEV-15914: Simplify mlog_open_and_write_index()	2018-04-26 22:53:33 +03:00
Marko Mäkelä	4cad42392a	MDEV-12266: Change dict_table_t::space to fil_space_t* InnoDB always keeps all tablespaces in the fil_system cache. The fil_system.LRU is only for closing file handles; the fil_space_t and fil_node_t for all data files will remain in main memory. Between startup to shutdown, they can only be created and removed by DDL statements. Therefore, we can let dict_table_t::space point directly to the fil_space_t. dict_table_t::space_id: A numeric tablespace ID for the corner cases where we do not have a tablespace. The most prominent examples are ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file. There are a few functional differences; most notably: (1) DROP TABLE will delete matching .ibd and .cfg files, even if they were not attached to the data dictionary. (2) Some error messages will report file names instead of numeric IDs. There still are many functions that use numeric tablespace IDs instead of fil_space_t, and many functions could be converted to fil_space_t member functions. Also, Tablespace and Datafile should be merged with fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use fil_space_t& instead of a numeric ID, and after moving to a single buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to fil_space_t::page_hash. FilSpace: Remove. Only few calls to fil_space_acquire() will remain, and gradually they should be removed. mtr_t::set_named_space_id(ulint): Renamed from set_named_space(), to prevent accidental calls to this slower function. Very few callers remain. fseg_create(), fsp_reserve_free_extents(): Take fil_space_t as a parameter instead of a space_id. fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(), fil_name_write_rename(), fil_rename_tablespace(). Mariabackup passes the parameter log=false; InnoDB passes log=true. dict_mem_table_create(): Take fil_space_t* instead of space_id as parameter. dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter 'status' with 'bool cached'. dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name. fil_ibd_open(): Return the tablespace. fil_space_t::set_imported(): Replaces fil_space_set_imported(). truncate_t: Change many member function parameters to fil_space_t*, and remove page_size parameters. row_truncate_prepare(): Merge to its only caller. row_drop_table_from_cache(): Assert that the table is persistent. dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL if the tablespace has been discarded. row_import_update_discarded_flag(): Remove a constant parameter.	2018-03-29 22:02:05 +03:00
Marko Mäkelä	428e02895b	MDEV-12266: Remove dict_index_t::table_name Replace index->table_name with index->table->name.	2018-03-29 20:47:37 +03:00
Marko Mäkelä	604fea1ad6	MDEV-12266: Remove dict_index_t::space We can rely on the dict_table_t::space. All indexes of a table object are always in the same tablespace. (For fulltext indexes, the data is located in auxiliary tables, and these will continue to have their own table objects, separate from the main table.)	2018-03-29 20:47:37 +03:00
Marko Mäkelä	2ac8b1a907	MDEV-12266: Add fil_system.sys_space, temp_space Add fil_system_t::sys_space, fil_system_t::temp_space. These will replace lookups for TRX_SYS_SPACE or SRV_TMP_SPACE_ID. mtr_t::m_undo_space, mtr_t::m_sys_space: Remove. mtr_t::set_sys_modified(): Remove. fil_space_get_type(), fil_space_get_n_reserved_extents(): Remove. fsp_header_get_tablespace_size(), fsp_header_inc_size(): Merge to the only caller, innobase_start_or_create_for_mysql().	2018-03-29 19:18:11 +03:00
Marko Mäkelä	33714d2065	Merge bb-10.2-ext into 10.3	2018-01-30 21:04:48 +02:00
Marko Mäkelä	d9c77f0341	Revert "MDEV-6928: Add trx pointer to struct mtr_t" This reverts commit `3486135bb5`. The commit comment ended in the words: "This is needed later." Apparently the "later" never arrived.	2018-01-29 15:45:16 +02:00
Marko Mäkelä	a4948dafcd	MDEV-11369 Instant ADD COLUMN for InnoDB For InnoDB tables, adding, dropping and reordering columns has required a rebuild of the table and all its indexes. Since MySQL 5.6 (and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing concurrent modification of the tables. This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously, with only minor changes performed to the table structure. The counter innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS is incremented whenever a table rebuild operation is converted into an instant ADD COLUMN operation. ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN. Some usability limitations will be addressed in subsequent work: MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY and ALGORITHM=INSTANT MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE The format of the clustered index (PRIMARY KEY) is changed as follows: (1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT, and a new field PAGE_INSTANT will contain the original number of fields in the clustered index ('core' fields). If instant ADD COLUMN has not been used or the table becomes empty, or the very first instant ADD COLUMN operation is rolled back, the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset to 0 and FIL_PAGE_INDEX. (2) A special 'default row' record is inserted into the leftmost leaf, between the page infimum and the first user record. This record is distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the same format as records that contain values for the instantly added columns. This 'default row' always has the same number of fields as the clustered index according to the table definition. The values of 'core' fields are to be ignored. For other fields, the 'default row' will contain the default values as they were during the ALTER TABLE statement. (If the column default values are changed later, those values will only be stored in the .frm file. The 'default row' will contain the original evaluated values, which must be the same for every row.) The 'default row' must be completely hidden from higher-level access routines. Assertions have been added to ensure that no 'default row' is ever present in the adaptive hash index or in locked records. The 'default row' is never delete-marked. (3) In clustered index leaf page records, the number of fields must reside between the number of 'core' fields (dict_index_t::n_core_fields introduced in this work) and dict_index_t::n_fields. If the number of fields is less than dict_index_t::n_fields, the missing fields are replaced with the column value of the 'default row'. Note: The number of fields in the record may shrink if some of the last instantly added columns are updated to the value that is in the 'default row'. The function btr_cur_trim() implements this 'compression' on update and rollback; dtuple::trim() implements it on insert. (4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new status value REC_STATUS_COLUMNS_ADDED will indicate the presence of a new record header that will encode n_fields-n_core_fields-1 in 1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header always explicitly encodes the number of fields.) We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for covering the insert of the 'default row' record when instant ADD COLUMN is used for the first time. Subsequent instant ADD COLUMN can use TRX_UNDO_UPD_EXIST_REC. This is joint work with Vin Chen (陈福荣) from Tencent. The design that was discussed in April 2017 would not have allowed import or export of data files, because instead of the 'default row' it would have introduced a data dictionary table. The test rpl.rpl_alter_instant is exactly as contributed in pull request #408. The test innodb.instant_alter is based on a contributed test. The redo log record format changes for ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPACT are as contributed. (With this change present, crash recovery from MariaDB 10.3.1 will fail in spectacular ways!) Also the semantics of higher-level redo log records that modify the PAGE_INSTANT field is changed. The redo log format version identifier was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1. Everything else has been rewritten by me. Thanks to Elena Stepanova, the code has been tested extensively. When rolling back an instant ADD COLUMN operation, we must empty the PAGE_FREE list after deleting or shortening the 'default row' record, by calling either btr_page_empty() or btr_page_reorganize(). We must know the size of each entry in the PAGE_FREE list. If rollback left a freed copy of the 'default row' in the PAGE_FREE list, we would be unable to determine its size (if it is in ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC) because it would contain more fields than the rolled-back definition of the clustered index. UNIV_SQL_DEFAULT: A new special constant that designates an instantly added column that is not present in the clustered index record. len_is_stored(): Check if a length is an actual length. There are two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL. dict_col_t::def_val: The 'default row' value of the column. If the column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT. dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(), instant_value(). dict_col_t::remove_instant(): Remove the 'instant ADD' status of a column. dict_col_t::name(const dict_table_t& table): Replaces dict_table_get_col_name(). dict_index_t::n_core_fields: The original number of fields. For secondary indexes and if instant ADD COLUMN has not been used, this will be equal to dict_index_t::n_fields. dict_index_t::n_core_null_bytes: Number of bytes needed to represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable). dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that n_core_null_bytes was not initialized yet from the clustered index root page. dict_index_t: Add the accessors is_instant(), is_clust(), get_n_nullable(), instant_field_value(). dict_index_t::instant_add_field(): Adjust clustered index metadata for instant ADD COLUMN. dict_index_t::remove_instant(): Remove the 'instant ADD' status of a clustered index when the table becomes empty, or the very first instant ADD COLUMN operation is rolled back. dict_table_t: Add the accessors is_instant(), is_temporary(), supports_instant(). dict_table_t::instant_add_column(): Adjust metadata for instant ADD COLUMN. dict_table_t::rollback_instant(): Adjust metadata on the rollback of instant ADD COLUMN. prepare_inplace_alter_table_dict(): First create the ctx->new_table, and only then decide if the table really needs to be rebuilt. We must split the creation of table or index metadata from the creation of the dictionary table records and the creation of the data. In this way, we can transform a table-rebuilding operation into an instant ADD COLUMN operation. Dictionary objects will only be added to cache when table rebuilding or index creation is needed. The ctx->instant_table will never be added to cache. dict_table_t::add_to_cache(): Modified and renamed from dict_table_add_to_cache(). Do not modify the table metadata. Let the callers invoke dict_table_add_system_columns() and if needed, set can_be_evicted. dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the system columns (which will now exist in the dict_table_t object already at this point). dict_create_table_step(): Expect the callers to invoke dict_table_add_system_columns(). pars_create_table(): Before creating the table creation execution graph, invoke dict_table_add_system_columns(). row_create_table_for_mysql(): Expect all callers to invoke dict_table_add_system_columns(). create_index_dict(): Replaces row_merge_create_index_graph(). innodb_update_n_cols(): Renamed from innobase_update_n_virtual(). Call my_error() if an error occurs. btr_cur_instant_init(), btr_cur_instant_init_low(), btr_cur_instant_root_init(): Load additional metadata from the clustered index and set dict_index_t::n_core_null_bytes. This is invoked when table metadata is first loaded into the data dictionary. dict_boot(): Initialize n_core_null_bytes for the four hard-coded dictionary tables. dict_create_index_step(): Initialize n_core_null_bytes. This is executed as part of CREATE TABLE. dict_index_build_internal_clust(): Initialize n_core_null_bytes to NO_CORE_NULL_BYTES if table->supports_instant(). row_create_index_for_mysql(): Initialize n_core_null_bytes for CREATE TEMPORARY TABLE. commit_cache_norebuild(): Call the code to rename or enlarge columns in the cache only if instant ADD COLUMN is not being used. (Instant ADD COLUMN would copy all column metadata from instant_table to old_table, including the names and lengths.) PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields. This is repurposing the 16-bit field PAGE_DIRECTION, of which only the least significant 3 bits were used. The original byte containing PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B. page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT. page_ptr_get_direction(), page_get_direction(), page_ptr_set_direction(): Accessors for PAGE_DIRECTION. page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION. page_direction_increment(): Increment PAGE_N_DIRECTION and set PAGE_DIRECTION. rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes, and assume that heap_no is always set. Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records, even if the record contains fewer fields. rec_offs_make_valid(): Add the parameter 'leaf'. rec_copy_prefix_to_dtuple(): Assert that the tuple is only built on the core fields. Instant ADD COLUMN only applies to the clustered index, and we should never build a search key that has more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR. All these columns are always present. dict_index_build_data_tuple(): Remove assertions that would be duplicated in rec_copy_prefix_to_dtuple(). rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose number of fields is between n_core_fields and n_fields. cmp_rec_rec_with_match(): Implement the comparison between two MIN_REC_FLAG records. trx_t::in_rollback: Make the field available in non-debug builds. trx_start_for_ddl_low(): Remove dangerous error-tolerance. A dictionary transaction must be flagged as such before it has generated any undo log records. This is because trx_undo_assign_undo() will mark the transaction as a dictionary transaction in the undo log header right before the very first undo log record is being written. btr_index_rec_validate(): Account for instant ADD COLUMN row_undo_ins_remove_clust_rec(): On the rollback of an insert into SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the last column from the table and the clustered index. row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(), trx_undo_update_rec_get_update(): Handle the 'default row' as a special case. dtuple_t::trim(index): Omit a redundant suffix of an index tuple right before insert or update. After instant ADD COLUMN, if the last fields of a clustered index tuple match the 'default row', there is no need to store them. While trimming the entry, we must hold a page latch, so that the table cannot be emptied and the 'default row' be deleted. btr_cur_optimistic_update(), btr_cur_pessimistic_update(), row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low(): Invoke dtuple_t::trim() if needed. row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling row_ins_clust_index_entry_low(). rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number of fields to be between n_core_fields and n_fields. Do not support infimum,supremum. They are never supposed to be stored in dtuple_t, because page creation nowadays uses a lower-level method for initializing them. rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the number of fields. btr_cur_trim(): In an update, trim the index entry as needed. For the 'default row', handle rollback specially. For user records, omit fields that match the 'default row'. btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete(): Skip locking and adaptive hash index for the 'default row'. row_log_table_apply_convert_mrec(): Replace 'default row' values if needed. In the temporary file that is applied by row_log_table_apply(), we must identify whether the records contain the extra header for instantly added columns. For now, we will allocate an additional byte for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table has been subject to instant ADD COLUMN. The ROW_T_DELETE records are fine, as they will be converted and will only contain 'core' columns (PRIMARY KEY and some system columns) that are converted from dtuple_t. rec_get_converted_size_temp(), rec_init_offsets_temp(), rec_convert_dtuple_to_temp(): Add the parameter 'status'. REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG \| REC_STATUS_COLUMNS_ADDED: An info_bits constant for distinguishing the 'default row' record. rec_comp_status_t: An enum of the status bit values. rec_leaf_format: An enum that replaces the bool parameter of rec_init_offsets_comp_ordinary().	2017-10-06 09:50:10 +03:00
Marko Mäkelä	48192f963a	Add the parameter bool leaf to rec_get_offsets() This should affect debug builds only. Debug builds will check that the status bits of ROW_FORMAT!=REDUNDANT records match the is_leaf parameter. The only observable change to non-debug should be the addition of the is_leaf parameter to the function rec_copy_prefix_to_dtuple(), and the removal of some calls to update the adaptive hash index (it is only built for the leaf pages). This change should have been made in MySQL 5.0.3, instead of introducing the status flags in the ROW_FORMAT=COMPACT record header.	2017-09-20 16:53:34 +03:00
Marko Mäkelä	6b687a0fde	Introduce page_rec_is_leaf() and clean up page0page.h Define some page accessor functions inline in page0page.h, reducing code duplication in page0page.ic. Use page_rec_is_leaf() instead of page_is_leaf() where possible.	2017-09-20 08:42:44 +03:00
Marko Mäkelä	dcdc1c6d09	MDEV-13452 Assertion `!recv_no_log_write' failed in log_reserve_and_open() The debug flag recv_no_log_write prohibits writes of redo log records for modifying page data. The debug assertion was failing when fil_names_clear() was writing the informative MLOG_FILE_NAME and MLOG_CHECKPOINT records which do not modify any data. log_reserve_and_open(), log_write_low(): Remove the debug assertion. log_pad_current_log_block(), mtr_write_log(), mtr_t::Command::prepare_write(): Add the debug assertion.	2017-08-07 13:54:37 +03:00
Marko Mäkelä	42f657cd2f	MDEV-13267 At startup with crash recovery: mtr_t::commit_checkpoint(lsn_t, bool): Assertion `!recv_no_log_write' failed This is a bogus debug assertion failure that should be possible starting with MariaDB 10.2.2 (which merged WL#7142 via MySQL 5.7.9). While generating page-change redo log records is strictly out of the question during tat certain parts of crash recovery, the fil_names_clear() is only emitting informational MLOG_FILE_NAME and MLOG_CHECKPOINT records to guarantee that if the server is killed during or soon after the crash recovery, subsequent crash recovery will be possible. The metadata buffer that fil_names_clear() is flushing to the redo log is being filled by recv_init_crash_recovery_spaces(), right before starting to apply redo log, by invoking fil_names_dirty() on every discovered tablespace for which there are changes to apply. When it comes to Mariabackup (xtrabackup --prepare), it is strictly out of the question to generate any redo log whatsoever, because that could break the restore of incremental backups by causing LSN deviation. So, the fil_names_dirty() call must be skipped when restoring backups. recv_recovery_from_checkpoint_start(): Do not invoke fil_names_clear() when restoring a backup. mtr_t::commit_checkpoint(): Remove the failing assertion. The only caller is fil_names_clear(), and it must be called by recv_recovery_from_checkpoint_start() for normal server startup to be crash-safe. The debug assertion in mtr_t::commit() will still catch rogue redo log writes.	2017-07-07 18:40:57 +03:00
Marko Mäkelä	8c71c6aa8b	MDEV-12548 Initial implementation of Mariabackup for MariaDB 10.2 InnoDB I/O and buffer pool interfaces and the redo log format have been changed between MariaDB 10.1 and 10.2, and the backup code has to be adjusted accordingly. The code has been simplified, and many memory leaks have been fixed. Instead of the file name xtrabackup_logfile, the file name ib_logfile0 is being used for the copy of the redo log. Unnecessary InnoDB startup and shutdown and some unnecessary threads have been removed. Some help was provided by Vladislav Vaintroub. Parameters have been cleaned up and aligned with those of MariaDB 10.2. The --dbug option has been added, so that in debug builds, --dbug=d,ib_log can be specified to enable diagnostic messages for processing redo log entries. By default, innodb_doublewrite=OFF, so that --prepare works faster. If more crash-safety for --prepare is needed, double buffering can be enabled. The parameter innodb_log_checksums=OFF can be used to ignore redo log checksums in --backup. Some messages have been cleaned up. Unless --export is specified, Mariabackup will not deal with undo log. The InnoDB mini-transaction redo log is not only about user-level transactions; it is actually about mini-transactions. To avoid confusion, call it the redo log, not transaction log. We disable any undo log processing in --prepare. Because MariaDB 10.2 supports indexed virtual columns, the undo log processing would need to be able to evaluate virtual column expressions. To reduce the amount of code dependencies, we will not process any undo log in prepare. This means that the --export option must be disabled for now. This also means that the following options are redundant and have been removed: xtrabackup --apply-log-only innobackupex --redo-only In addition to disabling any undo log processing, we will disable any further changes to data pages during --prepare, including the change buffer merge. This means that restoring incremental backups should reliably work even when change buffering is being used on the server. Because of this, preparing a backup will not generate any further redo log, and the redo log file can be safely deleted. (If the --export option is enabled in the future, it must generate redo log when processing undo logs and buffered changes.) In --prepare, we cannot easily know if a partial backup was used, especially when restoring a series of incremental backups. So, we simply warn about any missing files, and ignore the redo log for them. FIXME: Enable the --export option. FIXME: Improve the handling of the MLOG_INDEX_LOAD record, and write a test that initiates a backup while an ALGORITHM=INPLACE operation is creating indexes or rebuilding a table. An error should be detected when preparing the backup. FIXME: In --incremental --prepare, xtrabackup_apply_delta() should ensure that if FSP_SIZE is modified, the file size will be adjusted accordingly.	2017-07-05 11:43:28 +03:00
Marko Mäkelä	3e1d0ff574	Fix a merge error in commit `8f643e2063` A merge error caused InnoDB bootstrap to fail when innodb_undo_tablespaces was set to more than 2. This was because of a bug that was introduced to srv_undo_tablespaces_init() by the merge. Furthermore, some adjustments for Oracle Bug#25551311 aka Bug#23517560 changes were forgotten. We must minimize direct references to srv_undo_tablespaces_open and use predicates instead. srv_undo_tablespaces_init(): Increment srv_undo_tablespaces_open once, not twice, per loop iteration. is_system_or_undo_tablespace(): Remove (unused function). is_predefined_tablespace(): Invoke srv_is_undo_tablespace().	2017-06-27 21:23:12 +03:00

1 2 3

104 commits