mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-31 11:01:52 +01:00

Author	SHA1	Message	Date
Monty	65f831d17c	Fixed bugs found by valgrind - Some of the bug fixes are backports from 10.5! - The fix in innobase/fil/fil0fil.cc is just a backport to get fewer error messages in mysqld.1.err when running with valgrind. - Renamed HAVE_valgrind_or_MSAN to HAVE_valgrind	2020-07-02 17:57:34 +03:00
Marko Mäkelä	1df1a63924	Merge 10.2 into 10.3	2020-07-02 06:17:51 +03:00
Marko Mäkelä	c36834c832	MDEV-20377: Make WITH_MSAN more usable MemorySanitizer (clang -fsanitize=memory) requires that all code be compiled with instrumentation enabled. The only exception is the C runtime library. Failure to use instrumented libraries will cause bogus messages about memory being uninitialized. In WITH_MSAN builds, we must avoid calling getservbyname(), because even though it is a standard library function, it is not instrumented, not even in clang 10. Note: Before MariaDB Server 10.5, ./mtr will typically fail due to the old PCRE library, which was updated in MDEV-14024. The following cmake options were tested on 10.5 in commit `94d0bb4dbe`: cmake \ -DCMAKE_C_FLAGS='-march=native -O2' \ -DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \ -DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \ -DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \ -DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \ -DWITH_SAFEMALLOC=OFF \ -DWITH_{ZLIB,SSL,PCRE}=bundled \ -DHAVE_LIBAIO_H=0 \ -DWITH_MSAN=ON MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED() and __msan_unpoison(). MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow(). InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros. ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in functions instead of inline assembler when building WITH_MSAN. This will require at least -msse4.2 when building for IA-32 or AMD64. The inline assembler would not be instrumented, and would thus cause bogus failures.	2020-07-01 17:23:00 +03:00
Oleksandr Byelkin	7fb73ed143	Merge branch '10.2' into 10.3	2020-05-04 16:47:11 +02:00
Daniel Black	ba2061da52	MDEV-21595: innodb offset_t rename to rec_offs thanks to: perl -i -pe 's/\boffset_t\b/rec_offs/g' $(git grep -lw offset_t storage/innobase)	2020-04-29 12:02:47 +03:00
Marko Mäkelä	3466b47b0d	Merge 10.2 into 10.3	2019-12-13 10:08:57 +02:00
Eugene Kosov	f0aa073f2b	MDEV-20950 Reduce size of record offsets offset_t: this is a type which represents one record offset. It's unsigned short int. a lot of functions: replace ulint with offset_t btr_pcur_restore_position_func(), page_validate(), row_ins_scan_sec_index_for_duplicate(), row_upd_clust_rec_by_insert_inherit_func(), row_vers_impl_x_locked_low(), trx_undo_prev_version_build(): allocate record offsets on the stack instead of waiting for rec_get_offsets() to allocate it from mem_heap_t. So, reducing memory allocations. RECORD_OFFSET, INDEX_OFFSET: now it's less convenient to store pointers in offset_t* array. One pointer occupies now several offset_t. And those constants are start indexes into array to places where to store pointer values REC_OFFS_HEADER_SIZE: adjusted for the new reality REC_OFFS_NORMAL_SIZE: increase size from 100 to 300 which means fewer heap allocations. And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which is smaller than previous 800 bytes. REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality rem0rec.h, rem0rec.ic, rem0rec.cc: various arguments, return values and local variables types were changed to fix numerous integer conversion issues. enum field_type_t: the concept of offset types was introduced, which replaces old offset flags stuff. Like in earlier version, 2 upper bits are used to store offset type. And this enum represents those types. REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed get_type(), set_type(), get_value(), combine(): these are convenience functions to work with offsets and their types rec_offs_base()[0]: still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL rec_offs_base()[i]: these have type offset_t now. The two upper bits contain the type.	2019-12-13 00:26:50 +07:00
Marko Mäkelä	0a20e5ab77	Merge 10.2 into 10.3	2019-12-12 14:41:51 +02:00
Marko Mäkelä	d146e3dcfe	MDEV-21256: Simplify ut_rnd_interval() ut_rnd_interval(): Remove the first parameter, which was mostly passed as 0. Implement as a simple wrapper around ut_rnd_gen(). Trivially return 0 if the size of the interval is smaller than 2. ut_rnd_ulint_counter, ut_rnd_gen_next_ulint(), ut_rnd_gen_ulint(): Remove.	2019-12-10 16:58:28 +02:00
Marko Mäkelä	51fc8ab73e	MDEV-21256: Reduce the use of ut_rnd_gen_next_ulint() ut_rnd_set_seed(): Unused function; remove. ut_rnd_gen(): Renamed from page_cur_lcg_prng(). ut_rnd_current: The internal state of ut_rnd_gen(). page_cur_open_on_rnd_user_rec(): Replace linear search with page_rec_get_nth().	2019-12-10 16:58:28 +02:00
Marko Mäkelä	29d67d051a	Cleanup btr_page_get_prev(), btr_page_get_next() Remove the redundant parameter mtr_t*. Make use of page_has_prev(), page_has_next() whenever possible.	2019-11-11 13:36:21 +02:00
Marko Mäkelä	892378fb9d	Merge 10.2 into 10.3	2019-10-09 13:25:11 +03:00
Eugene Kosov	ed0793e096	MDEV-19783: Add more REC_INFO_MIN_REC_FLAG checks btr_cur_pessimistic_delete(): changed the code so that more REC_INFO_MIN_REC_FLAG assertions can be placed inside btr_set_min_rec_mark(). Without that change, the tests innodb.innodb-table-online, innodb.temp_table_savepoint and innodb_zip.prefix_index_liftedlimit fail. Removed essentially duplicated page_zip_validate() calls, which fail because of a temporary(!) invariant violation. That fixed innodb_zip.wl5522_debug_zip and innodb_zip.prefix_index_liftedlimit	2019-10-09 08:29:26 +03:00
Marko Mäkelä	d480d28f4f	Add page_has_prev(), page_has_next(), page_has_siblings() Until now, InnoDB inefficiently compared the aligned fields FIL_PAGE_PREV, FIL_PAGE_NEXT to the byte-order-agnostic value FIL_NULL. This is a backport of `32170f8c6d` from MariaDB Server 10.3.	2019-10-09 08:29:26 +03:00
Marko Mäkelä	b951fc4e7f	Merge 10.2 into 10.3	2019-07-24 15:34:24 +03:00
Marko Mäkelä	97055e6b11	MDEV-14154: Remove ut_time_us() Use microsecond_interval_timer() or my_interval_timer() [in nanoseconds] instead.	2019-07-23 17:25:02 +03:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	f177f125d4	Merge branch '5.5' into 10.1	2019-05-11 19:15:57 +03:00
Marko Mäkelä	c3a6c683e2	Merge 10.2 into 10.3	2019-03-25 11:03:19 +02:00
Marko Mäkelä	72b934e3f7	MDEV-14126: Detect unexpected emptying of B-tree pages If an index page becomes empty, btr_page_empty() should be called.	2019-03-25 10:53:01 +02:00
Marko Mäkelä	b59d484696	MDEV-14126: Remove page_is_root() The predicate page_is_root(), which was added in MariaDB Server 10.2.2, is based on a wrong assumption. Under some circumstances, InnoDB can transform B-trees into a degenerate state where a non-leaf page has no sibling pages. Because of this, we cannot assume that a page that has no siblings is the root page. This bug will be tracked as MDEV-19022. Because of the bug that may affect many InnoDB data files, we must remove and replace the wrong predicate. Using the wrong predicate can cause corruption. A leaf page is not allowed to be empty except if it is the root page, and the entire table is empty.	2019-03-25 10:53:00 +02:00
Marko Mäkelä	fd58bb71e2	Merge 10.2 into 10.3	2018-11-19 18:45:53 +02:00
Marko Mäkelä	ff88e4bb8a	Remove many redundant #include from InnoDB	2018-11-19 11:42:14 +02:00
Marko Mäkelä	755187c853	Terminology: 'metadata record' instead of 'default row' For instant ALTER TABLE, we store a hidden metadata record at the start of the clustered index, to indicate how the format of the records differs from the latest table definition. The term 'default row' is too specific, because it applies to instant ADD COLUMN only, and we will be supporting more classes of instant ALTER TABLE later on. For instant ADD COLUMN, we store the initial default values in the metadata record.	2018-09-19 07:21:24 +03:00
Marko Mäkelä	7830fb7f45	Merge 10.2 into 10.3	2018-08-28 12:22:56 +03:00
Marko Mäkelä	1b4c5b7327	MDEV-16868 Same query gives different results An INSERT into a temporary table would fail to set the index page as modified. If there were no other write operations (such as UPDATE or DELETE) to the page, and the page was evicted, we would read back the old contents of the page, causing corruption or loss of data. page_cur_insert_rec_write_log(): Call mtr_t::set_modified() for temporary tables. Normally this is part of the mlog_open() call, but the mlog_open() call was only present in debug builds. This regression was caused by commit `48192f963a` which was preparation for MDEV-11369 and supposed to affect debug builds only. Thanks to Thirunarayanan Balathandayuthapani for debugging.	2018-08-24 09:38:52 +03:00
Marko Mäkelä	1eb2d8f6e8	Merge 10.2 into 10.3	2018-08-16 08:54:58 +03:00
Marko Mäkelä	b853b4fd88	Report InnoDB redo log corruption better recv_parse_log_recs(): Check for corruption before checking for end-of-log-buffer. mlog_parse_initial_log_record(), page_cur_parse_delete_rec(): Flag corruption for out-of-bounds values, and let the caller dump the corrupted redo log extract.	2018-08-10 13:02:01 +03:00
Marko Mäkelä	93b6552182	Merge 10.2 into 10.3	2018-07-26 09:19:52 +03:00
Marko Mäkelä	0f90728bc0	MDEV-16809 Allow full redo logging for ALTER TABLE Introduce the configuration option innodb_log_optimize_ddl for controlling whether native index creation or table-rebuild in InnoDB should keep optimizing the redo log (and writing MLOG_INDEX_LOAD records to ensure that concurrent backup would fail). By default, we have innodb_log_optimize_ddl=ON, that is, the default behaviour that was introduced in MariaDB 10.2.2 (with the merge of InnoDB from MySQL 5.7) will be unchanged. BtrBulk::m_trx: Replaces m_trx_id. We must be able to check for KILL QUERY even if !m_flush_observer (innodb_log_optimize_ddl=OFF). page_cur_insert_rec_write_log(): Declare globally, so that this can be called from PageBulk::insert(). row_merge_insert_index_tuples(): Remove the unused parameter trx_id. row_merge_build_indexes(): Enable or disable redo logging based on the innodb_log_optimize_ddl parameter. PageBulk::init(), PageBulk::insert(), PageBulk::finish(): Write redo log records if needed. For ROW_FORMAT=COMPRESSED, redo log will be written in PageBulk::compress() unless we called m_mtr.set_log_mode(MTR_LOG_NO_REDO).	2018-07-26 08:44:42 +03:00
Marko Mäkelä	ba43914ec4	Replace dict_table_is_temporary(table) with table->is_temporary()	2018-05-12 22:12:12 +03:00
Marko Mäkelä	9801715cb0	Use compile_time_assert() in InnoDB Replace most use of #error. Some checks were impossible to evaluate in the preprocessor due to the use of named integer constants or enumerations.	2018-04-30 18:22:52 +03:00
Marko Mäkelä	9ed2b2b2b8	Do not divide or multiply by srv_page_size Instead, shift by srv_page_size_shift.	2018-04-28 20:52:22 +03:00
Marko Mäkelä	a90100d756	Replace univ_page_size and UNIV_PAGE_SIZE Try to use one variable (srv_page_size) for innodb_page_size. Also, replace UNIV_PAGE_SIZE_SHIFT with srv_page_size_shift.	2018-04-28 20:45:45 +03:00
Marko Mäkelä	ba19764209	Fix most -Wsign-conversion in InnoDB Change innodb_buffer_pool_size, innodb_fill_factor to unsigned.	2018-04-28 20:45:45 +03:00
Marko Mäkelä	604fea1ad6	MDEV-12266: Remove dict_index_t::space We can rely on the dict_table_t::space. All indexes of a table object are always in the same tablespace. (For fulltext indexes, the data is located in auxiliary tables, and these will continue to have their own table objects, separate from the main table.)	2018-03-29 20:47:37 +03:00
Marko Mäkelä	32170f8c6d	Add page_has_prev(), page_has_next(), page_has_siblings() Until now, InnoDB inefficiently compared the aligned fields FIL_PAGE_PREV, FIL_PAGE_NEXT to the byte-order-agnostic value FIL_NULL.	2018-02-08 22:34:21 +02:00
Marko Mäkelä	3e6fcb6ac8	MDEV-14935 Remove bogus conditions related to not redo-logging PAGE_MAX_TRX_ID changes InnoDB originally skipped the redo logging of PAGE_MAX_TRX_ID changes until I enabled it in commit `e76b873f24` that was part of MySQL 5.5.5 already. Later, when a more complete history of the InnoDB Plugin for MySQL 5.1 (aka branches/zip in the InnoDB subversion repository) and of the planned-to-be closed-source branches/innodb+ that became the basis of InnoDB in MySQL 5.5 was pushed to the MySQL source repository, the change was part of commit `509e761f06`: ------------------------------------------------------------------------ r5038 | marko | 2009-05-19 22:59:07 +0300 (Tue, 19 May 2009) | 30 lines branches/zip: Write PAGE_MAX_TRX_ID to the redo log. Otherwise, transactions that are started before the rollback of incomplete transactions has finished may have an inconsistent view of the secondary indexes. dict_index_is_sec_or_ibuf(): Auxiliary function for controlling updates and checks of PAGE_MAX_TRX_ID: check whether an index is a secondary index or the insert buffer tree. page_set_max_trx_id(), page_update_max_trx_id(), lock_rec_insert_check_and_lock(), lock_sec_rec_modify_check_and_lock(), btr_cur_ins_lock_and_undo(), btr_cur_upd_lock_and_undo(): Add the parameter mtr. page_set_max_trx_id(): Allow mtr to be NULL. When mtr==NULL, do not attempt to write to the redo log. This only occurs when creating a page or reorganizing a compressed page. In these cases, the PAGE_MAX_TRX_ID will be set correctly during the application of redo log records, even though there is no explicit log record about it. btr_discard_only_page_on_level(): Preserve PAGE_MAX_TRX_ID. This function should be unreachable, though. btr_cur_pessimistic_update(): Update PAGE_MAX_TRX_ID. Add some assertions for checking that PAGE_MAX_TRX_ID is set on all secondary index leaf pages. 
rb://115 tested by Michael, fixes Issue #211 ------------------------------------------------------------------------ After this fix, some bogus references to recv_recovery_is_on() remained. Also, some references could be replaced with references to index->is_dummy to prepare us for MDEV-14481 (background redo log apply).	2018-01-12 18:31:03 +02:00
Marko Mäkelä	a4948dafcd	MDEV-11369 Instant ADD COLUMN for InnoDB For InnoDB tables, adding, dropping and reordering columns has required a rebuild of the table and all its indexes. Since MySQL 5.6 (and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing concurrent modification of the tables. This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously, with only minor changes performed to the table structure. The counter innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS is incremented whenever a table rebuild operation is converted into an instant ADD COLUMN operation. ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN. Some usability limitations will be addressed in subsequent work: MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY and ALGORITHM=INSTANT MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE The format of the clustered index (PRIMARY KEY) is changed as follows: (1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT, and a new field PAGE_INSTANT will contain the original number of fields in the clustered index ('core' fields). If instant ADD COLUMN has not been used or the table becomes empty, or the very first instant ADD COLUMN operation is rolled back, the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset to 0 and FIL_PAGE_INDEX. (2) A special 'default row' record is inserted into the leftmost leaf, between the page infimum and the first user record. This record is distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the same format as records that contain values for the instantly added columns. This 'default row' always has the same number of fields as the clustered index according to the table definition. The values of 'core' fields are to be ignored. For other fields, the 'default row' will contain the default values as they were during the ALTER TABLE statement. 
(If the column default values are changed later, those values will only be stored in the .frm file. The 'default row' will contain the original evaluated values, which must be the same for every row.) The 'default row' must be completely hidden from higher-level access routines. Assertions have been added to ensure that no 'default row' is ever present in the adaptive hash index or in locked records. The 'default row' is never delete-marked. (3) In clustered index leaf page records, the number of fields must reside between the number of 'core' fields (dict_index_t::n_core_fields introduced in this work) and dict_index_t::n_fields. If the number of fields is less than dict_index_t::n_fields, the missing fields are replaced with the column value of the 'default row'. Note: The number of fields in the record may shrink if some of the last instantly added columns are updated to the value that is in the 'default row'. The function btr_cur_trim() implements this 'compression' on update and rollback; dtuple::trim() implements it on insert. (4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new status value REC_STATUS_COLUMNS_ADDED will indicate the presence of a new record header that will encode n_fields-n_core_fields-1 in 1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header always explicitly encodes the number of fields.) We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for covering the insert of the 'default row' record when instant ADD COLUMN is used for the first time. Subsequent instant ADD COLUMN can use TRX_UNDO_UPD_EXIST_REC. This is joint work with Vin Chen (陈福荣) from Tencent. The design that was discussed in April 2017 would not have allowed import or export of data files, because instead of the 'default row' it would have introduced a data dictionary table. The test rpl.rpl_alter_instant is exactly as contributed in pull request #408. The test innodb.instant_alter is based on a contributed test. 
The redo log record format changes for ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPACT are as contributed. (With this change present, crash recovery from MariaDB 10.3.1 will fail in spectacular ways!) Also the semantics of higher-level redo log records that modify the PAGE_INSTANT field is changed. The redo log format version identifier was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1. Everything else has been rewritten by me. Thanks to Elena Stepanova, the code has been tested extensively. When rolling back an instant ADD COLUMN operation, we must empty the PAGE_FREE list after deleting or shortening the 'default row' record, by calling either btr_page_empty() or btr_page_reorganize(). We must know the size of each entry in the PAGE_FREE list. If rollback left a freed copy of the 'default row' in the PAGE_FREE list, we would be unable to determine its size (if it is in ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC) because it would contain more fields than the rolled-back definition of the clustered index. UNIV_SQL_DEFAULT: A new special constant that designates an instantly added column that is not present in the clustered index record. len_is_stored(): Check if a length is an actual length. There are two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL. dict_col_t::def_val: The 'default row' value of the column. If the column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT. dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(), instant_value(). dict_col_t::remove_instant(): Remove the 'instant ADD' status of a column. dict_col_t::name(const dict_table_t& table): Replaces dict_table_get_col_name(). dict_index_t::n_core_fields: The original number of fields. For secondary indexes and if instant ADD COLUMN has not been used, this will be equal to dict_index_t::n_fields. dict_index_t::n_core_null_bytes: Number of bytes needed to represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable). 
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that n_core_null_bytes was not initialized yet from the clustered index root page. dict_index_t: Add the accessors is_instant(), is_clust(), get_n_nullable(), instant_field_value(). dict_index_t::instant_add_field(): Adjust clustered index metadata for instant ADD COLUMN. dict_index_t::remove_instant(): Remove the 'instant ADD' status of a clustered index when the table becomes empty, or the very first instant ADD COLUMN operation is rolled back. dict_table_t: Add the accessors is_instant(), is_temporary(), supports_instant(). dict_table_t::instant_add_column(): Adjust metadata for instant ADD COLUMN. dict_table_t::rollback_instant(): Adjust metadata on the rollback of instant ADD COLUMN. prepare_inplace_alter_table_dict(): First create the ctx->new_table, and only then decide if the table really needs to be rebuilt. We must split the creation of table or index metadata from the creation of the dictionary table records and the creation of the data. In this way, we can transform a table-rebuilding operation into an instant ADD COLUMN operation. Dictionary objects will only be added to cache when table rebuilding or index creation is needed. The ctx->instant_table will never be added to cache. dict_table_t::add_to_cache(): Modified and renamed from dict_table_add_to_cache(). Do not modify the table metadata. Let the callers invoke dict_table_add_system_columns() and if needed, set can_be_evicted. dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the system columns (which will now exist in the dict_table_t object already at this point). dict_create_table_step(): Expect the callers to invoke dict_table_add_system_columns(). pars_create_table(): Before creating the table creation execution graph, invoke dict_table_add_system_columns(). row_create_table_for_mysql(): Expect all callers to invoke dict_table_add_system_columns(). create_index_dict(): Replaces row_merge_create_index_graph(). 
innodb_update_n_cols(): Renamed from innobase_update_n_virtual(). Call my_error() if an error occurs. btr_cur_instant_init(), btr_cur_instant_init_low(), btr_cur_instant_root_init(): Load additional metadata from the clustered index and set dict_index_t::n_core_null_bytes. This is invoked when table metadata is first loaded into the data dictionary. dict_boot(): Initialize n_core_null_bytes for the four hard-coded dictionary tables. dict_create_index_step(): Initialize n_core_null_bytes. This is executed as part of CREATE TABLE. dict_index_build_internal_clust(): Initialize n_core_null_bytes to NO_CORE_NULL_BYTES if table->supports_instant(). row_create_index_for_mysql(): Initialize n_core_null_bytes for CREATE TEMPORARY TABLE. commit_cache_norebuild(): Call the code to rename or enlarge columns in the cache only if instant ADD COLUMN is not being used. (Instant ADD COLUMN would copy all column metadata from instant_table to old_table, including the names and lengths.) PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields. This is repurposing the 16-bit field PAGE_DIRECTION, of which only the least significant 3 bits were used. The original byte containing PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B. page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT. page_ptr_get_direction(), page_get_direction(), page_ptr_set_direction(): Accessors for PAGE_DIRECTION. page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION. page_direction_increment(): Increment PAGE_N_DIRECTION and set PAGE_DIRECTION. rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes, and assume that heap_no is always set. Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records, even if the record contains fewer fields. rec_offs_make_valid(): Add the parameter 'leaf'. rec_copy_prefix_to_dtuple(): Assert that the tuple is only built on the core fields. 
Instant ADD COLUMN only applies to the clustered index, and we should never build a search key that has more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR. All these columns are always present. dict_index_build_data_tuple(): Remove assertions that would be duplicated in rec_copy_prefix_to_dtuple(). rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose number of fields is between n_core_fields and n_fields. cmp_rec_rec_with_match(): Implement the comparison between two MIN_REC_FLAG records. trx_t::in_rollback: Make the field available in non-debug builds. trx_start_for_ddl_low(): Remove dangerous error-tolerance. A dictionary transaction must be flagged as such before it has generated any undo log records. This is because trx_undo_assign_undo() will mark the transaction as a dictionary transaction in the undo log header right before the very first undo log record is being written. btr_index_rec_validate(): Account for instant ADD COLUMN row_undo_ins_remove_clust_rec(): On the rollback of an insert into SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the last column from the table and the clustered index. row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(), trx_undo_update_rec_get_update(): Handle the 'default row' as a special case. dtuple_t::trim(index): Omit a redundant suffix of an index tuple right before insert or update. After instant ADD COLUMN, if the last fields of a clustered index tuple match the 'default row', there is no need to store them. While trimming the entry, we must hold a page latch, so that the table cannot be emptied and the 'default row' be deleted. btr_cur_optimistic_update(), btr_cur_pessimistic_update(), row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low(): Invoke dtuple_t::trim() if needed. row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling row_ins_clust_index_entry_low(). 
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number of fields to be between n_core_fields and n_fields. Do not support infimum,supremum. They are never supposed to be stored in dtuple_t, because page creation nowadays uses a lower-level method for initializing them. rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the number of fields. btr_cur_trim(): In an update, trim the index entry as needed. For the 'default row', handle rollback specially. For user records, omit fields that match the 'default row'. btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete(): Skip locking and adaptive hash index for the 'default row'. row_log_table_apply_convert_mrec(): Replace 'default row' values if needed. In the temporary file that is applied by row_log_table_apply(), we must identify whether the records contain the extra header for instantly added columns. For now, we will allocate an additional byte for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table has been subject to instant ADD COLUMN. The ROW_T_DELETE records are fine, as they will be converted and will only contain 'core' columns (PRIMARY KEY and some system columns) that are converted from dtuple_t. rec_get_converted_size_temp(), rec_init_offsets_temp(), rec_convert_dtuple_to_temp(): Add the parameter 'status'. REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED: An info_bits constant for distinguishing the 'default row' record. rec_comp_status_t: An enum of the status bit values. rec_leaf_format: An enum that replaces the bool parameter of rec_init_offsets_comp_ordinary().	2017-10-06 09:50:10 +03:00
Marko Mäkelä	9c373d4d1d	Fix bogus rec_get_offsets() debug assertion failures for ROW_FORMAT=REDUNDANT When the debug parameter 'bool leaf' was added to rec_get_offsets(), also some debug assertions for reading the heap_no of ROW_FORMAT=REDUNDANT records were added. However, the heap number is uninitialized when offsets are being computed for to-be-inserted records. For debug builds, initialize the heap number to a dummy value, so that the record will be interpreted as 'user record'. The infimum and supremum pseudo-records are never copied from the page frame and never inserted; they are part of the page creation. rec_convert_dtuple_to_rec_old(): Remove a bogus memset() in debug builds.	2017-09-21 10:14:30 +03:00
Marko Mäkelä	48192f963a	Add the parameter bool leaf to rec_get_offsets() This should affect debug builds only. Debug builds will check that the status bits of ROW_FORMAT!=REDUNDANT records match the is_leaf parameter. The only observable change to non-debug should be the addition of the is_leaf parameter to the function rec_copy_prefix_to_dtuple(), and the removal of some calls to update the adaptive hash index (it is only built for the leaf pages). This change should have been made in MySQL 5.0.3, instead of introducing the status flags in the ROW_FORMAT=COMPACT record header.	2017-09-20 16:53:34 +03:00
Marko Mäkelä	cd694d76ce	Merge 10.0 into 10.1	2017-09-06 15:32:56 +03:00
Marko Mäkelä	6b45355e6b	MDEV-13103 Assertion `flags & BUF_PAGE_PRINT_NO_CRASH' failed in buf_page_print buf_page_print(): Remove the parameter 'flags', and when a server abort is intended, perform that in the caller. In this way, page corruption reports due to different reasons can be distinguished better. This is non-functional code refactoring that does not fix any page corruption issues. The change is only made to avoid falsely grouping together unrelated causes of page corruption.	2017-09-06 14:01:15 +03:00
Marko Mäkelä	f9cc391863	Merge 10.1 into 10.2 This only merges MDEV-12253, adapting it to MDEV-12602 which is already present in 10.2 but not yet in the 10.1 revision that is being merged. TODO: Error handling in crash recovery needs to be improved. If a page cannot be decrypted (or read), we should cleanly abort the startup. If innodb_force_recovery is specified, we should ignore the problematic page and apply redo log to other pages. Currently, the test encryption.innodb-redo-badkey randomly fails like this (the last messages are from cmake -DWITH_ASAN): 2017-05-05 10:19:40 140037071685504 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1635994 2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Missing MLOG_FILE_NAME or MLOG_FILE_DELETE before MLOG_CHECKPOINT for tablespace 1 2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[2201] with error Data structure corruption 2017-05-05 10:19:41 140037071685504 [Note] InnoDB: Starting shutdown... ================================================================= ==5226==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x612000018588 in thread T0 #0 0x736750 in operator delete(void*) (/mariadb/server/build/sql/mysqld+0x736750) #1 0x1e4833f in LatchCounter::~LatchCounter() /mariadb/server/storage/innobase/include/sync0types.h:599:4 #2 0x1e480b8 in LatchMeta<LatchCounter>::~LatchMeta() /mariadb/server/storage/innobase/include/sync0types.h:786:17 #3 0x1e35509 in sync_latch_meta_destroy() /mariadb/server/storage/innobase/sync/sync0debug.cc:1622:3 #4 0x1e35314 in sync_check_close() /mariadb/server/storage/innobase/sync/sync0debug.cc:1839:2 #5 0x1dfdc18 in innodb_shutdown() /mariadb/server/storage/innobase/srv/srv0start.cc:2888:2 #6 0x197e5e6 in innobase_init(void*) /mariadb/server/storage/innobase/handler/ha_innodb.cc:4475:3	2017-05-05 10:38:53 +03:00
Marko Mäkelä	5684aa220c	MDEV-12488 Remove type mismatch in InnoDB printf-like calls Alias the InnoDB ulint and lint data types to size_t and ssize_t, which are the standard names for the machine-word-width data types. Correspondingly, define ULINTPF as "%zu" and introduce ULINTPFx as "%zx". In this way, better compiler warnings for type mismatch are possible. Furthermore, use PRIu64 for that 64-bit format, and define the feature macro __STDC_FORMAT_MACROS to enable it on Red Hat systems. Fix some errors in error messages, and replace some error messages with assertions. Most notably, an IMPORT TABLESPACE error message in InnoDB was displaying the number of columns instead of the mismatching flags.	2017-04-21 18:03:15 +03:00
Marko Mäkelä	4e1116b2c6	MDEV-12271 Port MySQL 8.0 Bug#23150562 REMOVE UNIV_MUST_NOT_INLINE AND UNIV_NONINL Also, remove empty .ic files that were not removed by my MySQL commit. Problem: InnoDB used to support a compilation mode that allowed to choose whether the function definitions in .ic files are to be inlined or not. This stopped making sense when InnoDB moved to C++ in MySQL 5.6 (and ha_innodb.cc started to #include .ic files), and more so in MySQL 5.7 when inline methods and functions were introduced in .h files. Solution: Remove all references to UNIV_NONINL and UNIV_MUST_NOT_INLINE from all files, assuming that the symbols are never defined. Remove the files fut0fut.cc and ut0byte.cc which only mattered when UNIV_NONINL was defined.	2017-03-17 12:42:07 +02:00
Marko Mäkelä	27b9989d31	MDEV-12121 Introduce build option WITH_INNODB_AHI to disable innodb_adaptive_hash_index The InnoDB adaptive hash index is sometimes degrading the performance of InnoDB, and it is sometimes disabled to get more consistent performance. We should have a compile-time option to disable the adaptive hash index. Let us introduce two options: OPTION(WITH_INNODB_AHI "Include innodb_adaptive_hash_index" ON) OPTION(WITH_INNODB_ROOT_GUESS "Cache index root block descriptors" ON) where WITH_INNODB_AHI always implies WITH_INNODB_ROOT_GUESS. As part of this change, the misleadingly named function trx_search_latch_release_if_reserved(trx) will be replaced with the macro trx_assert_no_search_latch(trx) that will be empty unless BTR_CUR_HASH_ADAPT is defined (cmake -DWITH_INNODB_AHI=ON). We will also remove the unused column INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_TIMEOUT. In MariaDB Server 10.1, it used to reflect the value of trx_t::search_latch_timeout which could be adjusted during row_search_for_mysql(). In 10.2, there is no such field. Other than the removal of the unused column TRX_ADAPTIVE_HASH_TIMEOUT, this is an almost non-functional change to the server when using the default build options. Some tests are adjusted so that they will work with both -DWITH_INNODB_AHI=ON and -DWITH_INNODB_AHI=OFF. The test innodb.innodb_monitor has been renamed to innodb.monitor in order to track MySQL 5.7, and the duplicate tests sys_vars.innodb_monitor_* are removed.	2017-03-03 16:55:50 +02:00
Marko Mäkelä	63574f1275	MDEV-11690 Remove UNIV_HOTBACKUP The InnoDB source code contains quite a few references to a closed-source hot backup tool which was originally called InnoDB Hot Backup (ibbackup) and later incorporated in MySQL Enterprise Backup. The open source backup tool XtraBackup uses the full database for recovery. So, the references to UNIV_HOTBACKUP are only cluttering the source code.	2016-12-30 16:05:42 +02:00
Marko Mäkelä	c868acdf65	MDEV-11487 Revert InnoDB internal temporary tables from WL#7682 WL#7682 in MySQL 5.7 introduced the possibility to create light-weight temporary tables in InnoDB. These are called 'intrinsic temporary tables' in InnoDB, and in MySQL 5.7, they can be created by the optimizer for sorting or buffering data in query processing. In MariaDB 10.2, the optimizer temporary tables cannot be created in InnoDB, so we should remove the dead code and related data structures.	2016-12-09 12:05:07 +02:00
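
The MDEV-20950 entry above (`f0aa073f2b`) describes a 16-bit record offset whose two upper bits carry an offset type, with get_type(), set_type(), get_value() and combine() as helpers. The following is only a rough sketch of that idea, not the code from rem0rec.h: the enumerator names, the mask constants and the exact bit values are assumptions made for illustration. The type is spelled rec_offs here, following the later rename in MDEV-21595.

```cpp
#include <cstdint>

// Hypothetical sketch: one record offset is an unsigned 16-bit integer.
typedef uint16_t rec_offs;

// Assumed layout: the low 14 bits hold the offset value and the
// two upper bits hold the offset type.
constexpr rec_offs TYPE_SHIFT = 14;
constexpr rec_offs VALUE_MASK = (1U << TYPE_SHIFT) - 1;  // 0x3FFF
constexpr rec_offs TYPE_MASK = 3U << TYPE_SHIFT;         // 0xC000

// Illustrative enumerator names, not taken from the InnoDB source.
enum field_type_t : uint16_t {
  IN_RECORD = 0U << TYPE_SHIFT,      // value stored in the record itself
  OFFPAGE = 1U << TYPE_SHIFT,        // value stored off-page
  SQL_NULL_FIELD = 2U << TYPE_SHIFT, // SQL NULL
  DEFAULT_FIELD = 3U << TYPE_SHIFT   // instantly added column default
};

// Extract the type stored in the two upper bits.
constexpr field_type_t get_type(rec_offs n) {
  return static_cast<field_type_t>(n & TYPE_MASK);
}

// Extract the 14-bit offset value.
constexpr rec_offs get_value(rec_offs n) {
  return static_cast<rec_offs>(n & VALUE_MASK);
}

// Pack a value and a type into one offset.
// Precondition: value must fit in 14 bits; excess bits are masked off here.
constexpr rec_offs combine(rec_offs value, field_type_t type) {
  return static_cast<rec_offs>((value & VALUE_MASK) | type);
}

// Replace the type while keeping the value.
inline void set_type(rec_offs &n, field_type_t type) {
  n = combine(get_value(n), type);
}

// The size arithmetic quoted in the commit message: 300 two-byte offsets
// occupy 600 bytes.
static_assert(sizeof(rec_offs[300]) == 600, "300 offsets take 600 bytes");
static_assert(get_value(combine(300, SQL_NULL_FIELD)) == 300,
              "the value survives packing");
static_assert(get_type(combine(300, SQL_NULL_FIELD)) == SQL_NULL_FIELD,
              "the type survives packing");
```

Under these assumptions an offset can address at most 16383 bytes. Whether the real enum uses these exact bit patterns should be checked against the commit itself.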

60 commits