- InnoDB bulk insert fails to use the encryption buffer for encrypting
the temporary log file. Declare m_crypt_block and m_crypt_pfx in
row_merge_bulk_t so that they can be used for encrypting the temporary file.
Problem:
=======
ALTER TABLE in InnoDB fails to detect duplicate entries
for the unique index when the character set or collation of
an indexed column is changed in such a way that the character
encoding is compatible with the old table definition.
In this case, any secondary indexes on the changed columns
would be rebuilt (DROP INDEX, ADD INDEX).
Solution:
========
During ALTER TABLE, InnoDB keeps track of columns whose collation
changed, and will fill in the correct metadata when sorting the
index records, or applying changes from concurrent DML.
This metadata will be allocated in the dict_index_t::heap of
the secondary indexes that are being created.
The fix was developed by Thirunarayanan Balathandayuthapani
and simplified by me.
This is loosely based on the InnoDB changes in
mysql/mysql-server@97fd8b1b69
that I had developed in 2015 or 2016.
We will allow an ASC/DESC flag to be associated with each B-tree key field.
When PRIMARY KEY fields are internally appended to secondary indexes,
the ASC/DESC attribute will be inherited, so that covering index scans
will work as expected.
Note: Until the subsequent commit, the DESC attribute will be ignored
(no HA_REVERSE_SORT flag will be written to .frm files).
dict_field_t::descending: A new flag to denote descending order.
cmp_data(), cmp_dfield_dfield(): Add a new parameter descending.
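As a hedged illustration of what the new parameter does (a sketch only,
not the actual cmp_data() implementation; cmp_raw() is a hypothetical
stand-in for the raw value comparison), a DESC key field simply flips
the sign of the three-way comparison:

  #include <cstring>

  /* Three-way comparison of two raw values (stand-in helper). */
  static int cmp_raw(const void *a, size_t a_len, const void *b, size_t b_len)
  {
    size_t n = a_len < b_len ? a_len : b_len;
    if (int c = std::memcmp(a, b, n))
      return c;
    return a_len == b_len ? 0 : (a_len < b_len ? -1 : 1);
  }

  /* With the new parameter, a descending field inverts the result. */
  static int cmp_data(const void *a, size_t a_len,
                      const void *b, size_t b_len, bool descending)
  {
    int c = cmp_raw(a, a_len, b, b_len);
    return descending ? -c : c;
  }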
cmp_dtuple_rec(), cmp_dtuple_rec_with_match(): Add a parameter "index".
dtuple_coll_eq(): Replaces dtuple_coll_cmp().
cmp_dfield_dfield_eq_prefix(): Replaces cmp_dfield_dfield_like_prefix().
dict_index_t::is_btree(): Check whether the index is a regular
B-tree index (not SPATIAL, FULLTEXT, the ibuf.index,
or a corrupted index).
btr_cur_search_to_nth_level_func(): Only attempt to use
the adaptive hash index if index->is_btree().
This function may also be invoked on ibuf.index, and
cmp_dtuple_rec_with_match_bytes() will no longer work on ibuf.index
because it assumes that the index and record fields exactly match.
The ibuf.index is a special variadic index tree.
Thanks to Thirunarayanan Balathandayuthapani for fixing some bugs:
MDEV-27439, MDEV-27374/MDEV-27445.
When inserting a number of rows into an empty table,
InnoDB will buffer and pre-sort the records for each index, and
build the indexes one page at a time.
For each index, a buffer of innodb_sort_buffer_size will be created.
If the buffer runs out of memory, we will create temporary files
for storing the data.
At the end of the statement, we will sort and apply the buffered
records. Ideally, we would do this at the end of the transaction
or only when starting to execute a non-INSERT statement on the table.
However, it could be awkward if duplicate keys or similar errors
would be reported during the execution of a later statement.
This will be addressed in MDEV-25036.
Any columns longer than 2000 bytes will be buffered in temporary files.
innodb_prepare_commit_versioned(): Apply all buffered bulk insert
operations at the end of each statement.
ha_commit_trans(): Handle errors from innodb_prepare_commit_versioned().
row_merge_buf_write(): Accept a blob file handle too, and write to it
any field data that is longer than 2000 bytes.
row_merge_bulk_t: Data structure to maintain the data during
a bulk insert operation.
trx_mod_table_time_t::start_bulk_insert(): Notify the start of a
bulk insert operation and create a new buffer for the given table.
trx_mod_table_time_t::add_tuple(): Buffer a record.
trx_mod_table_time_t::bulk_buffer_exist(): Whether buffer storage
exists for the bulk transaction.
trx_mod_table_time_t::write_bulk(): Write all buffered insert
operations for the transaction and the table.
row_ins_clust_index_entry_low(): Insert the data into the
bulk buffer if it already exists.
row_ins_sec_index_entry(): Buffer the secondary index tuple
if the bulk buffer already exists.
row_merge_bulk_buf_add(): Insert the tuple into the bulk insert buffer.
row_merge_buf_blob(): Write field data whose length is more than
2000 bytes into the blob temporary file, and write the file offset
and length into the tuple field.
row_merge_copy_blob_from_file(): Copy the blob from the blob file
based on the reference in the given tuple field.
row_merge_insert_index_tuples(): Handle blob for bulk insert
operation.
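A minimal sketch of this offload scheme, with illustrative names and a
plain FILE* standing in for the real temporary file handling:

  #include <cstdint>
  #include <cstdio>

  /* What gets stored in the tuple field instead of the long value. */
  struct blob_ref_t { uint64_t offset; uint32_t len; };

  /* Append a long field to the blob temporary file and fill in the
     (offset, length) reference for later retrieval. */
  static bool blob_offload(FILE *blob_file, const void *data, uint32_t len,
                           blob_ref_t *ref)
  {
    if (fseek(blob_file, 0, SEEK_END))
      return false;
    ref->offset = uint64_t(ftell(blob_file));
    ref->len = len;
    return fwrite(data, 1, len, blob_file) == len;
  }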
row_merge_bulk_t::row_merge_bulk_t(): Constructor. Initialize
the buffer and file for all the indexes except the FTS index.
row_merge_bulk_t::create_tmp_file(): Create new temporary file
for the given index.
row_merge_bulk_t::write_to_tmp_file(): Write the content from
buffer to disk file for the given index.
row_merge_bulk_t::add_tuple(): Insert the tuple into the merge
buffer for the given index. If memory runs out, InnoDB will
sort the buffer and write it to a file.
row_merge_bulk_t::write_to_index(): Do the bulk insert operation
from the merge file/merge buffer for the given index.
row_merge_bulk_t::write_to_table(): Do the bulk insert operation
for all the indexes.
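The buffer-then-spill behaviour of add_tuple() can be sketched as
follows (stand-in types; LIMIT stands in for innodb_sort_buffer_size):

  #include <algorithm>
  #include <cstddef>
  #include <string>
  #include <vector>

  using Tuple = std::string;            // stand-in for an index tuple
  using TempFile = std::vector<Tuple>;  // stand-in for a merge temp file

  struct bulk_buf_t
  {
    std::vector<Tuple> tuples;
    size_t bytes = 0;
    static constexpr size_t LIMIT = 1 << 20;  // sort buffer size stand-in

    void add_tuple(Tuple t, TempFile &file)
    {
      bytes += t.size();
      tuples.push_back(std::move(t));
      if (bytes < LIMIT)
        return;
      /* Out of memory budget: sort the run and spill it to the file. */
      std::sort(tuples.begin(), tuples.end());
      file.insert(file.end(), tuples.begin(), tuples.end());
      tuples.clear();
      bytes = 0;
    }
  };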
dict_stats_update(): If a bulk insert transaction is in progress,
treat the table as empty. The index creation could hold latches
for extended amounts of time.
In commit 1bd681c8b3 (MDEV-25506 part 3)
we introduced a "fake instant timeout" when a transaction would wait
for a table or record lock while holding dict_sys.latch. This prevented
a deadlock of the server but could cause bogus errors for operations
on the InnoDB persistent statistics tables.
A better fix is to ensure that whenever a transaction is being
executed in the InnoDB internal SQL parser (which will for now
require dict_sys.latch to be held), it will already have acquired
all locks that could be required for the execution. So, we will
acquire the following locks upfront, before acquiring dict_sys.latch
(see the sketch after the list):
(1) MDL on the affected user table (acquired by the SQL layer)
(2) If applicable (not for RENAME TABLE): InnoDB table lock
(3) If persistent statistics are going to be modified:
(3.a) MDL_SHARED on mysql.innodb_table_stats, mysql.innodb_index_stats
(3.b) exclusive table locks on the statistics tables
(4) Exclusive table locks on the InnoDB data dictionary tables
(not needed in ANALYZE TABLE and the like)
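The intended acquisition order, sketched with stand-in names (this is
illustrative, not the actual server code):

  enum dberr_t { DB_SUCCESS = 0, DB_LOCK_WAIT_TIMEOUT };  // stand-in values
  struct trx_t {};
  struct dict_table_t {};

  static dberr_t lock_user_table(trx_t*, dict_table_t*) { return DB_SUCCESS; }
  static dberr_t lock_statistics_tables(trx_t*) { return DB_SUCCESS; }
  static dberr_t lock_dictionary_tables(trx_t*) { return DB_SUCCESS; }
  static void dict_sys_latch_acquire() {}

  static dberr_t prepare_ddl_locks(trx_t *trx, dict_table_t *table)
  {
    /* (1) MDL on the user table was already acquired by the SQL layer. */
    /* (2) InnoDB table lock (not applicable to RENAME TABLE). */
    if (dberr_t err = lock_user_table(trx, table); err != DB_SUCCESS)
      return err;
    /* (3) MDL_SHARED and exclusive table locks on
       mysql.innodb_table_stats and mysql.innodb_index_stats,
       if persistent statistics are going to be modified. */
    if (dberr_t err = lock_statistics_tables(trx); err != DB_SUCCESS)
      return err;
    /* (4) Exclusive locks on the InnoDB data dictionary tables
       (not needed in ANALYZE TABLE and the like). */
    if (dberr_t err = lock_dictionary_tables(trx); err != DB_SUCCESS)
      return err;
    /* Only now may dict_sys.latch be acquired: no lock waits can
       occur while it is held. */
    dict_sys_latch_acquire();
    return DB_SUCCESS;
  }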
Note: Acquiring exclusive locks on the statistics tables may cause
more locking conflicts between concurrent DDL operations.
Notably, RENAME TABLE will lock the statistics tables
even if no persistent statistics are enabled for the table.
DROP DATABASE will only acquire locks on statistics tables if
persistent statistics are enabled for the tables on which the
SQL layer is invoking ha_innobase::delete_table().
For any "garbage collection" in innodb_drop_database(), a timeout
while acquiring locks on the statistics tables will result in any
statistics not being deleted for any tables that the SQL layer
did not know about.
If innodb_defragment=ON, information may be written to the statistics
tables even for tables for which InnoDB persistent statistics are
disabled. But, DROP TABLE will no longer attempt to delete that
information if persistent statistics are not enabled for the table.
This change should also fix the hangs related to InnoDB persistent
statistics and STATS_AUTO_RECALC (MDEV-15020), as well as
a bug where running ALTER TABLE on the statistics tables
concurrently with ALTER TABLE on other InnoDB tables could
cause trouble.
lock_rec_enqueue_waiting(), lock_table_enqueue_waiting():
Do not issue a fake instant timeout error when the transaction
is holding dict_sys.latch. Instead, assert that the dict_sys.latch
is never being held here.
lock_sys_tables(): A new function to acquire exclusive locks on all
dictionary tables, in case DROP TABLE or similar operation is
being executed. Locking non-hard-coded tables is optional to avoid
a crash in row_merge_drop_temp_indexes(). The SYS_VIRTUAL table was
introduced in MySQL 5.7 and MariaDB Server 10.2. Normally, we require
all these dictionary tables to exist before executing any DDL, but
the function row_merge_drop_temp_indexes() is an exception.
When upgrading from MariaDB Server 10.1 or MySQL 5.6 or earlier,
the table SYS_VIRTUAL would not exist at this point.
ha_innobase::commit_inplace_alter_table(): Invoke
log_write_up_to() while not holding dict_sys.latch.
dict_sys_t::remove(), dict_table_close(): No longer try to
drop index stubs that were left behind by aborted online ADD INDEX.
Such indexes should be dropped from the InnoDB data dictionary by
row_merge_drop_indexes() as part of the failed DDL operation.
Stubs for aborted indexes may only be left behind in the
data dictionary cache.
dict_stats_fetch_from_ps(): Use a normal read-only transaction.
ha_innobase::delete_table(), ha_innobase::truncate(), fts_lock_table():
While waiting for purge to stop using the table,
do not hold dict_sys.latch.
ha_innobase::delete_table(): Implement a work-around for the rollback
of ALTER TABLE...ADD PARTITION. MDL_EXCLUSIVE would not be held if
ALTER TABLE hits lock_wait_timeout while trying to upgrade the MDL
due to a conflicting LOCK TABLES, such as in the first ALTER TABLE
in the test case of Bug#53676 in parts.partition_special_innodb.
Therefore, we must explicitly stop purge, because it would not be
stopped by MDL.
dict_stats_func(), btr_defragment_chunk(): Allocate a THD so that
we can acquire MDL on the InnoDB persistent statistics tables.
mysqltest_embedded: Invoke ha_pre_shutdown() before free_used_memory()
in order to avoid ASAN heap-use-after-free related to acquire_thd().
trx_t::dict_operation_lock_mode: Changed the type to bool.
row_mysql_lock_data_dictionary(), row_mysql_unlock_data_dictionary():
Implemented as macros.
rollback_inplace_alter_table(): Apply an infinite timeout to lock waits.
innodb_thd_increment_pending_ops(): Wrapper for
thd_increment_pending_ops(). Never attempt async operation for
InnoDB background threads, such as the trx_t::commit() in
dict_stats_process_entry_from_recalc_pool().
lock_sys_t::cancel(trx_t*): Make dictionary transactions immune to KILL.
lock_wait(): Make dictionary transactions immune to KILL, and to
lock wait timeout when waiting for locks on dictionary tables.
parts.partition_special_innodb: Use lock_wait_timeout=0 to instantly
get ER_LOCK_WAIT_TIMEOUT.
main.mdl: Filter out MDL on InnoDB persistent statistics tables
Reviewed by: Thirunarayanan Balathandayuthapani
Back in 2006 or 2007, when MySQL AB and Innobase Oy existed as
separately controlled entities (Innobase had been acquired by
Oracle Corporation), MySQL 5.1 introduced a storage engine plugin
interface and Oracle made use of it by distributing a separate
InnoDB Plugin, which would contain some more bug fixes and
improvements, compared to the version of InnoDB that was statically
linked with the mysqld server that was distributed by MySQL AB.
The built-in InnoDB would export global symbols, which would clash
with the symbols of the dynamic InnoDB Plugin (which was supposed
to override the built-in one when present).
The solution to this problem was to declare all global symbols with
UNIV_INTERN, so that they would get the GCC function attribute that
specifies hidden visibility.
Later, in MariaDB Server, something based on Percona XtraDB (a fork of
MySQL InnoDB) became the statically linked implementation, and something
closer to MySQL InnoDB was available as a dynamic plugin. Starting with
version 10.2, MariaDB Server includes only one InnoDB implementation,
and hence any reason to have the UNIV_INTERN definition was lost.
btr_get_size_and_reserved(): Move to the same compilation unit with
the only caller.
innodb_set_buf_pool_size(): Remove. Modify innobase_buffer_pool_size
directly.
fil_crypt_calculate_checksum(): Merge to the only caller.
ha_innobase::innobase_reset_autoinc(): Merge to the only caller.
thd_query_start_micro(): Remove. Call thd_start_utime() directly.
Make DDL operations that involve FULLTEXT INDEX atomic.
In particular, we must drop the internal FTS_ tables in the same
DDL transaction with ALTER TABLE.
Remove all references to fts_drop_orphaned_tables().
row_merge_drop_temp_indexes(): Drop also the internal FTS_ tables
that are associated with index stubs that were created in
prepare_inplace_alter_table_dict() for
CREATE FULLTEXT INDEX before the server was killed.
fts_clear_all(): Remove the fts_drop_tables() call. It has to be
executed before the transaction is committed!
dict_load_indexes(): Do not load any metadata for index stubs
that had been created by prepare_inplace_alter_table_dict()
fts_create_one_common_table(), fts_create_common_tables(),
fts_create_one_index_table(), fts_create_index_tables():
Remove redundant error handling. The tables will be dropped
just fine by dict_drop_index_tree().
commit_try_norebuild(): Also drop the FTS_ tables when dropping
FULLTEXT INDEX.
The changes to the test case innodb_fts.crash_recovery have been
extensively tested. The non-debug server will be killed while the
3 ALTER TABLE statements are in any phase of execution. With the
debug server, DEBUG_SYNC should make the test deterministic.
InnoDB used to support at most one CREATE TABLE or DROP TABLE
per transaction. This caused complications for DDL operations on
partitioned tables (where each partition is treated as a separate
table by InnoDB) and FULLTEXT INDEX (where each index is maintained
in a number of internal InnoDB tables).
dict_drop_index_tree(): Extend the MDEV-24589 logic and treat
the purge or rollback of SYS_INDEXES records of clustered indexes
specially: by dropping the tablespace if it exists. This is the only
form of recovery that we will need.
trx_undo_ddl_type: Document the DDL undo log record types better.
trx_t::dict_operation: Change the type to bool.
trx_t::ddl: Remove.
trx_t::table_id, trx_undo_t::table_id: Remove.
dict_build_table_def_step(): Remove trx_t::table_id logging.
dict_table_close_and_drop(), row_merge_drop_table(): Remove.
row_merge_lock_table(): Merged to the only callers, which can
call lock_table_for_trx() directly.
fts_aux_table_t, fts_aux_id, fts_space_set_t: Remove.
fts_drop_orphaned_tables(): Remove.
row_merge_rename_index_to_drop(): Remove. Thanks to MDEV-24589,
we can simply delete the to-be-dropped indexes from SYS_INDEXES,
while still being able to roll back the operation.
ha_innobase_inplace_ctx: Make a few data members const.
Preallocate trx.
prepare_inplace_alter_table_dict(): Simplify the logic. Let the
normal rollback take care of some cleanup.
row_undo_ins_remove_clust_rec(): Simplify the parsing of SYS_COLUMNS.
trx_rollback_active(): Remove the special DROP TABLE logic.
trx_undo_mem_create_at_db_start(), trx_undo_reuse_cached():
Always write TRX_UNDO_TABLE_ID as 0.
Problem:
========
InnoDB fails to clean up the index stub when it fails to add a
virtual index that contains a new virtual column, although it clears
the newly added virtual column from the index in clear_added_indexes()
during inplace_alter_table. On commit, InnoDB evicts and
reloads the table; in case of rollback, that does not happen.
InnoDB clears the ABORTED index while opening the table
or doing DDL. In the meantime, InnoDB can access
the dropped virtual index columns while creating a prebuilt
structure or during rollback of concurrent DML.
Solution:
==========
(1) InnoDB should maintain the newly added virtual column while
rolling back the newly added virtual index.
(2) InnoDB must not defer the index removal
if the ALTER TABLE is executed with LOCK=EXCLUSIVE.
(3) For LOCK=SHARED, InnoDB should check whether the table
has any transaction lock other than the ALTER transaction's
before deferring the removal of the index stub.
Replaced has_new_v_col with dict_add_vcol_info in dict_index_t to
indicate whether the index has any new virtual columns.
dict_index_t::has_new_v_col(): Return whether the index has
newly added virtual columns; it does not say which columns
are the newly added ones.
ha_innobase_inplace_ctx::is_new_vcol(): Return whether the
given column is added as a part of the current alter.
ha_innobase_inplace_ctx::clean_new_vcol_index(): Copy the newly
added virtual columns to new_vcol_info in dict_index_t. Replace
the columns in the index fields with the virtual columns stored
in new_vcol_info.
dict_index_t::assign_new_v_col(): Store the number of virtual
columns added to the index as a part of the ALTER TABLE.
dict_index_t::get_n_new_vcol(): Get the number of newly added
virtual columns.
dict_index_t::assign_drop_v_col(): Allocate the memory for
adding new virtual columns in new_vcol_info.
dict_index_t::add_drop_v_col(): Add the newly added virtual
column to new_vcol_info.
dict_table_t::has_lock_for_other_trx(): Whether the table has
any transaction lock other than the given transaction's.
row_merge_drop_indexes(): Add the parameter alter_trx and check
whether the table has any lock other than the ALTER transaction's.
fil_delete_tablespace(): Remove the unused parameter drop_ahi,
and add the parameter if_exists=false. We want to suppress
error messages if we know that the tablespace has been discarded.
dict_table_rename_in_cache(): Pass the new parameter to
fil_delete_tablespace(), that is, do not complain about
missing tablespace if the tablespace has been discarded.
row_make_new_pathname(): Declare as static.
row_drop_table_for_mysql(): Tolerate !table->data_dir_path
when the tablespace has been discarded.
row_rename_table_for_mysql(): Skip part of the RENAME TABLE
when fil_space_get_first_path() returns NULL.
Simplify the logging of ALTER TABLE operations, by making use of the
TRX_UNDO_RENAME_TABLE undo log record that was introduced in
commit 0bc36758ba.
commit_try_rebuild(): Invoke row_rename_table_for_mysql() and
actually rename the files before committing the transaction.
fil_mtr_rename_log(), commit_cache_rebuild(),
log_append_on_checkpoint(), row_merge_rename_tables_dict(): Remove.
mtr_buf_copy_t, log_t::append_on_checkpoint: Remove.
row_rename_table_for_mysql(): If !use_fk, ignore missing foreign
keys. Remove a call to dict_table_rename_in_cache(), because
trx_rollback_to_savepoint() should invoke the function if needed.
During native table rebuild or index creation, InnoDB used to skip
redo logging and write MLOG_INDEX_LOAD records to inform crash recovery
and Mariabackup of the gaps in redo log. This is fragile and prohibits
some optimizations, such as skipping the doublewrite buffer for
newly (re)initialized pages (MDEV-19738).
row_merge_write_redo(): Remove. We do not write MLOG_INDEX_LOAD
records any more. Instead, we write full redo log.
FlushObserver: Remove.
fseg_free_page_func(): Remove the parameter log. Redo logging
cannot be disabled.
fil_space_t::redo_skipped_count: Remove.
We cannot remove buf_block_t::skip_flush_check, because PageBulk
will temporarily generate invalid B-tree pages in the buffer pool.
offset_t: this is a type which represents one record offset.
It is an unsigned short int.
a lot of functions: replace ulint with offset_t
btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
allocate record offsets on the stack instead of waiting for rec_get_offsets()
to allocate them from a mem_heap_t, thus reducing memory allocations.
RECORD_OFFSET, INDEX_OFFSET:
it is now less convenient to store pointers in an offset_t*
array: one pointer now occupies several offset_t elements. These
constants are the starting indexes into the array where the
pointer values are stored.
REC_OFFS_HEADER_SIZE: adjusted for the new reality
REC_OFFS_NORMAL_SIZE:
increased from 100 to 300, which means fewer heap allocations.
sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) is now 600 bytes, which
is smaller than the previous 800 bytes.
REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality
rem0rec.h, rem0rec.ic, rem0rec.cc:
the types of various arguments, return values and local variables
were changed to fix numerous integer conversion issues.
enum field_type_t:
an offset type concept was introduced, replacing the old offset
flags. As in the earlier version, the 2 upper bits are used to
store the offset type, and this enum represents those types.
REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed
get_type(), set_type(), get_value(), combine():
convenience functions for working with offsets and their types.
rec_offs_base()[0]:
still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL
rec_offs_base()[i]:
these have type offset_t now. The two upper bits contain the type.
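A sketch of the resulting encoding (the constants are illustrative; the
point is that the two most significant bits of each offset_t carry the
type and the remaining 14 bits carry the value):

  #include <cstdint>

  typedef uint16_t offset_t;

  enum field_type_t : uint16_t
  {
    STORED_IN_RECORD = 0u << 14,
    STORED_OFFPAGE   = 1u << 14,  // externally stored (BLOB) field
    SQL_NULL         = 2u << 14,  // replaces REC_OFFS_SQL_NULL
    DEFAULT          = 3u << 14,  // column value not stored in the record
  };

  static const offset_t TYPE_MASK  = 0xC000;
  static const offset_t VALUE_MASK = 0x3FFF;

  inline field_type_t get_type(offset_t n)
  { return field_type_t(n & TYPE_MASK); }
  inline void set_type(offset_t &n, field_type_t t)
  { n = offset_t((n & VALUE_MASK) | t); }
  inline offset_t get_value(offset_t n)
  { return offset_t(n & VALUE_MASK); }
  inline offset_t combine(offset_t value, field_type_t t)
  { return offset_t(get_value(value) | t); }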
Some I/O functions and macros that are declared in os0file.h used to
return a Boolean status code (nonzero on success). In MySQL 5.7, they
were changed to return dberr_t instead. Alas, in MariaDB Server 10.2,
some uses of functions were not adjusted to the changed return value.
Until MDEV-19231, the valid values of dberr_t were always nonzero.
This means that some code that was incorrectly checking for a zero
return value from the functions would never detect a failure.
After MDEV-19231, some tests for ALTER ONLINE TABLE would fail with
cmake -DPLUGIN_PERFSCHEMA=NO. It turned out that the wrappers
pfs_os_file_read_no_error_handling_int_fd_func() and
pfs_os_file_write_int_fd_func() were wrongly returning
bool instead of dberr_t. Also the callers of these functions were
wrongly expecting bool (nonzero on success) instead of dberr_t.
This mistake had been made when the addition of these functions was
merged from MySQL 5.6.36 and 5.7.18 into MariaDB Server 10.2.7.
This fix also reverts commit 40becbc3c7
which attempted to work around the problem.
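The bug pattern in a nutshell (stand-in declarations; before MDEV-19231,
every dberr_t value, including DB_SUCCESS, was nonzero):

  enum dberr_t { DB_SUCCESS = 10, DB_IO_ERROR = 100 };  // all nonzero

  static dberr_t os_file_write_sketch() { return DB_IO_ERROR; }

  static bool caller_wrong()
  {
    /* Wrong: treats the return value as bool "nonzero on success",
       so the nonzero error code DB_IO_ERROR passes for success. */
    return os_file_write_sketch() != 0;
  }

  static bool caller_right()
  {
    /* Right: compare against DB_SUCCESS explicitly. */
    return os_file_write_sketch() == DB_SUCCESS;
  }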
The record MLOG_INDEX_LOAD is supposed to be written to indicate that
some page modifications bypassed redo logging, and that redo logging
is now re-enabled. It was not written for fulltext indexes during
ALTER TABLE.
row_merge_write_redo(): Declare globally. Assert that the index
is neither a spatial nor fulltext index.
recv_mlog_index_load(): Observe a MLOG_INDEX_LOAD operation.
recv_parse_log_recs(): Handle MLOG_INDEX_LOAD also in multi-record
mini-transactions. Because of this omission, we should keep writing
MLOG_INDEX_LOAD in single-record mini-transactions, because older
versions of Mariabackup would fail.
row_fts_merge_insert(): Write MLOG_INDEX_LOAD for the auxiliary
tables of fulltext indexes.
Also, related to MDEV-15522, MDEV-17304, MDEV-17835,
remove the Galera xtrabackup tests, because xtrabackup never worked
with MariaDB Server 10.3 due to InnoDB redo log format changes.
NULL values when there is no DEFAULT
The COPY and INPLACE algorithms work similarly for
NULL to NOT NULL conversion in the following cases:
(1) strict SQL mode - should return an error.
(2) non-strict SQL mode - should issue warnings only.
(3) ALTER IGNORE TABLE - should issue warnings only.
commit 2dbeebdb16 accidentally changed
ALTER_COLUMN_OPTION and ALTER_COLUMN_STORAGE_TYPE to be separate flags.
InnoDB and Mroonga are only checking for the latter;
the example storage engine is checking for the former only.
The expected impact of this bug is incorrect operation of Mroonga when
the column options GROONGA_TYPE or FLAGS are changed.
InnoDB does not define any column options, only table options,
so the flag ALTER_COLUMN_OPTION should never have been set.
Also, remove the unused flag ALTER_DROP_HISTORICAL.
- Allow the NOT NULL constraint to replace a NULL value in the row with
the explicit or implicit default value.
- If the default value is a non-constant value, then inplace ALTER
will not support it.
- ALTER IGNORE will ignore the error if the concurrent DML contains
a NULL value.
file IO, rather than int.
On Windows, it is suboptimal to depend on the C runtime, as it has a
limited number of file descriptors. This change eliminates
os_file_read_no_error_handling_int_fd(), os_file_write_int_fd(), and
the OS_FILE_FROM_FD() macro.
Rollback attempted to dereference DB_ROLL_PTR=0, which cannot possibly
be a valid undo log pointer. A safer canonical value would be
roll_ptr_t(1) << ROLL_PTR_INSERT_FLAG_POS
which is what was chosen in MDEV-12288, corresponding to reset_trx_id.
No deterministic test case for the bug was found. The simplest test
cases may be related to MDEV-11415, which suppresses undo logging for
ALGORITHM=COPY operations. In those operations, in the spirit of
MDEV-12288, we should actually have written reset_trx_id instead of
using the transaction identifier of the current transaction
(and a bogus value of DB_ROLL_PTR=0). However, thanks to MySQL Bug#28432
which I had fixed in MySQL 5.6.8 as part of WL#6255, access to the
rebuilt table by earlier-started transactions should actually have been
refused with ER_TABLE_DEF_CHANGED.
reset_trx_id: Move the definition to data0type.cc and the declaration
to data0type.h.
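For reference, a sketch of the canonical value as the 6-byte DB_TRX_ID
followed by the 7-byte big-endian DB_ROLL_PTR (the insert flag occupies
bit 55, hence the 0x80 in the most significant byte):

  static const unsigned DATA_TRX_ID_LEN = 6;
  static const unsigned DATA_ROLL_PTR_LEN = 7;

  static const unsigned char
  reset_trx_id[DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN] =
  {
    0, 0, 0, 0, 0, 0,       /* DB_TRX_ID = 0 */
    0x80, 0, 0, 0, 0, 0, 0  /* DB_ROLL_PTR = roll_ptr_t(1) << 55 */
  };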
btr_cur_ins_lock_and_undo(): When undo logging is disabled, use the
safe value that corresponds to reset_trx_id.
btr_cur_optimistic_insert(): Validate the DB_TRX_ID,DB_ROLL_PTR before
inserting into a clustered index leaf page.
ins_node_t::sys_buf[]: Replaces row_id_buf and trx_id_buf and some
heap usage.
row_ins_alloc_sys_fields(): Init ins_node_t::sys_buf[] to reset_trx_id.
row_ins_buf(): Only if undo logging is enabled, copy trx->id
to node->sys_buf. Otherwise, rely on the initialization in
row_ins_alloc_sys_fields().
row_purge_reset_trx_id(): Invoke mlog_write_string() with reset_trx_id
directly. (No functional change.)
trx_undo_page_report_modify(): Assert that the DB_ROLL_PTR is not 0.
trx_undo_get_undo_rec_low(): Assert that the roll_ptr is valid before
trying to dereference it.
dict_index_t::is_primary(): Check if the index is the primary key.
PageConverter::adjust_cluster_record(): Fix
MDEV-15249 Crash in MVCC read after IMPORT TABLESPACE
by resetting the system fields to reset_trx_id instead of writing
the current transaction ID (which will be committed at the
end of the IMPORT TABLESPACE) and DB_ROLL_PTR=0.
This can partially be viewed as a follow-up fix of MDEV-12288,
because IMPORT should already then have written
DB_TRX_ID=0 and DB_ROLL_PTR=1<<55 to prevent unnecessary
DB_TRX_ID lookups in subsequent accesses to the table.
remove remnants of 10.0 bugfix, incorrectly merged into 10.2
Using col_names[i] was obviously wrong; it must have been
col_names[ifield->col_no]. The incorrect column name resulted in
InnoDB having the index unique_id2(id1), while the server thought
it was unique_id2(id4).
But col_names[ifield->col_no] is wrong too, because `table` has non-renamed
columns, so the correct column name is always dict_table_get_col_name(table, ifield->col_no).
Reverted incorrect changes done for MDEV-7367 and MDEV-9469. This also
properly fixes the related bugs:
MDEV-13668: InnoDB unnecessarily rebuilds table when renaming a column and adding index
MDEV-9469: 'Incorrect key file' on ALTER TABLE
MDEV-9548: Alter table (renaming and adding index) fails with "Incorrect key file for table"
MDEV-10535: ALTER TABLE causes standalone/wsrep cluster crash
MDEV-13640: ALTER TABLE CHANGE and ADD INDEX on auto_increment column fails with "Incorrect key file for table..."
The root cause of all these bugs is the fact that a MariaDB .frm file
can contain virtual columns but the InnoDB dictionary does not, and
the previous fixes were incorrect or unnecessarily forced a table
rebuild. During index creation, key_part->fieldnr can be bigger than
the number of columns in the InnoDB data dictionary. We need to skip
fields that are not stored when calculating the correct column number
for the InnoDB data dictionary.
dict_table_get_col_name_for_mysql
Remove
innobase_match_index_columns
Revert incorrect change done on MDEV-7367
innobase_need_rebuild
Remove unnecessary rebuild force when column is renamed.
innobase_create_index_field_def
Calculate InnoDB column number correctly and remove
unnecessary column name set.
innobase_create_index_def, innobase_create_key_defs
Remove unneeded fields parameter. Revert unneeded memset.
prepare_inplace_alter_table_dict
Remove unneeded col_names parameter
index_field_t
Remove unneeded col_name member.
row_merge_create_index
Remove unneeded col_names parameter and resolution.
Affected tests:
innodb-alter-table : Add test case for MDEV-13668
innodb-alter : Remove MDEV-13668, MDEV-9469 FIXMEs
and restore original tests
innodb-wl5980-alter : Remove MDEV-13668, MDEV-9469 FIXMEs
and restore original tests
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
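The effect of (3) can be sketched as follows (stand-in types): any
trailing field that a record does not store takes its value from the
'default row':

  #include <cassert>
  #include <cstddef>
  #include <string>
  #include <vector>

  using Value = std::string;  // stand-in for a column value

  /* Materialize a row from a leaf-page record that stores only
     rec_fields.size() fields, where
     n_core_fields <= rec_fields.size() <= default_row.size(). */
  static std::vector<Value>
  materialize(const std::vector<Value> &rec_fields,
              const std::vector<Value> &default_row,
              size_t n_core_fields)
  {
    assert(rec_fields.size() >= n_core_fields);
    assert(rec_fields.size() <= default_row.size());
    std::vector<Value> row = rec_fields;
    /* Each missing trailing field is an instantly added column;
       take its value from the 'default row'. */
    for (size_t i = rec_fields.size(); i < default_row.size(); i++)
      row.push_back(default_row[i]);
    return row;
  }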
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
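A sketch of the check, with the magic lengths written out as
illustrative stand-ins for the real constants:

  #include <cstdint>

  static const uint32_t UNIV_SQL_NULL    = 0xFFFFFFFFu;  // illustrative
  static const uint32_t UNIV_SQL_DEFAULT = 0xFFFFFFFEu;  // illustrative

  inline bool len_is_stored(uint32_t len)
  {
    return len != UNIV_SQL_NULL && len != UNIV_SQL_DEFAULT;
  }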
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
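The bit layout can be sketched like this (helper names are illustrative):
the 16-bit slot keeps the 3 PAGE_DIRECTION bits in its least significant
positions and the 13-bit PAGE_INSTANT value above them:

  #include <cstdint>

  inline uint16_t get_instant(uint16_t slot)   /* n_core_fields */
  { return uint16_t(slot >> 3); }

  inline uint8_t get_direction(uint16_t slot)  /* PAGE_LEFT, PAGE_RIGHT, ... */
  { return uint8_t(slot & 7); }

  inline uint16_t set_instant(uint16_t slot, uint16_t n_core_fields)
  { return uint16_t((n_core_fields << 3) | (slot & 7)); }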
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
This should also fix the MariaDB 10.2.2 bug
MDEV-13826 CREATE FULLTEXT INDEX on encrypted table fails.
MDEV-12634 FIXME: Modify innodb-index-online, innodb-table-online
so that they will write and read merge sort files. InnoDB 5.7
introduced some optimizations to avoid using the files for small tables.
Many collation test results have been adjusted for MDEV-10191.
…porary file
Fixed by no longer writing the key version to the start of every
block that was encrypted. Instead, we will use a single key version
from the log_sys crypt info.
After this MDEV, blocks written to the row log are also encrypted, and
blocks read from the row log are decrypted, if encryption is configured
for the table.
innodb_status_variables[], struct srv_stats_t
Added status variables for merge block and row log block
encryption and decryption amounts.
Removed ROW_MERGE_RESERVE_SIZE define.
row_merge_fts_doc_tokenize
Remove ROW_MERGE_RESERVE_SIZE
row_log_t
Add index, crypt_tail, crypt_head to be used in case of
encryption.
row_log_online_op, row_log_table_close_func
Before writing a block, encrypt it if encryption is enabled.
row_log_table_apply_ops, row_log_apply_ops
After reading a block, decrypt it if encryption is enabled.
row_log_allocate
Allocate temporary buffers crypt_head and crypt_tail
if needed.
row_log_free
Free temporary buffers crypt_head and crypt_tail if they
exist.
row_merge_encrypt_buf, row_merge_decrypt_buf
Removed.
row_merge_buf_create, row_merge_buf_write
Remove ROW_MERGE_RESERVE_SIZE
row_merge_build_indexes
Allocate temporary buffer used in decryption and encryption
if needed.
log_tmp_blocks_crypt, log_tmp_block_encrypt, log_tmp_block_decrypt
New functions used in block encryption and decryption.
log_tmp_is_encrypted
New function to check whether encryption is enabled.
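A sketch of the flow (hypothetical helpers; a trivial XOR keystream
stands in for the real encryption primitive):

  #include <cstddef>
  #include <cstdint>

  struct log_crypt_t { uint32_t key_version; };

  /* Encryption is in use when a nonzero key version is configured. */
  static bool tmp_is_encrypted(const log_crypt_t &c)
  { return c.key_version != 0; }

  /* Stand-in for the real block cipher. */
  static void xor_keystream(uint8_t *dst, const uint8_t *src, size_t size,
                            uint32_t key_version, uint64_t offset)
  {
    for (size_t i = 0; i < size; i++)
      dst[i] = uint8_t(src[i] ^ uint8_t(key_version + offset + i));
  }

  /* Encrypt into crypt_buf before writing; reading does the inverse. */
  static const uint8_t *block_for_write(uint8_t *crypt_buf,
                                        const uint8_t *block, size_t size,
                                        const log_crypt_t &c,
                                        uint64_t offset)
  {
    if (!tmp_is_encrypted(c))
      return block;
    xor_keystream(crypt_buf, block, size, c.key_version, offset);
    return crypt_buf;
  }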
Added a test case, innodb-rowlog, that forces creating a row log and
verifies, using the introduced status variables, that the operations
are done.