mariadb

mirror of https://github.com/MariaDB/server.git synced 2026-05-02 13:15:32 +02:00

Author	SHA1	Message	Date
Marko Mäkelä	49e2c8f0a6	MDEV-25743: Unnecessary copying of table names in InnoDB dictionary Many InnoDB data dictionary cache operations require that the table name be copied so that it will be NUL terminated. (For example, SYS_TABLES.NAME is not guaranteed to be NUL-terminated.) dict_table_t::is_garbage_name(): Check if a name belongs to the background drop table queue. dict_check_if_system_table_exists(): Remove. dict_sys_t::load_sys_tables(): Load the non-hard-coded system tables SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL on startup. dict_sys_t::create_or_check_sys_tables(): Replaces dict_create_or_check_foreign_constraint_tables() and dict_create_or_check_sys_virtual(). dict_sys_t::load_table(): Replaces dict_table_get_low() and dict_load_table(). dict_sys_t::find_table(): Renamed from get_table(). dict_sys_t::sys_tables_exist(): Check whether all the non-hard-coded tables SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL exist. trx_t::has_stats_table_lock(): Moved to dict0stats.cc. Some error messages will now report table names in the internal databasename/tablename format, instead of `databasename`.`tablename`.	2021-05-21 18:03:40 +03:00
Monty	744a53806e	Fixed hang in concurrent DROP TABLE and BACKUP LOCK BLOCK_DDL The problem was that tdc_remove_referenced_share() did not take into account that someone could push things into share->free_tables() even if there is a MDL_EXCLUSIVE lock on the table. This can happen if flush_tables() uses the table cache to flush a a non transactional table to disk.	2021-05-19 22:54:14 +02:00
Marko Mäkelä	86dc7b4d4c	MDEV-24626 Remove synchronous write of page0 file during file creation During data file creation, InnoDB holds dict_sys mutex, tries to write page 0 of the file and flushes the file. This not only causing unnecessary contention but also a deviation from the write-ahead logging protocol. The clean sequence of operations is that we first start a dictionary transaction and write SYS_TABLES and SYS_INDEXES records that identify the tablespace. Then, we durably write a FILE_CREATE record to the write-ahead log and create the file. Recovery should not unnecessarily insist that the first page of each data file that is referred to by the redo log is valid. It must be enough that page 0 of the tablespace can be initialized based on the redo log contents. We introduce a new data structure deferred_spaces that keeps track of corrupted-looking files during recovery. The data structure holds the last LSN of a FILE_ record referring to the data file, the tablespace identifier, and the last known file name. There are two scenarios can happen during recovery: i) Sufficient memory: InnoDB can reconstruct the tablespace after parsing all redo log records. ii) Insufficient memory(multiple apply phase): InnoDB should store the deferred tablespace redo logs even though tablespace is not present. InnoDB should start constructing the tablespace when it first encounters deferred tablespace id. Mariabackup copies the zero filled ibd file in backup_fix_ddl() as the extension of .new file. Mariabackup test case does page flushing when it deals with DDL operation during backup operation. fil_ibd_create(): Remove the write of page0 and flushing of file fil_ibd_load(): Return FIL_LOAD_DEFER if the tablespace has zero filled page0 Datafile: Clean up the error handling, and do not report errors if we are in the middle of recovery. The caller will check Datafile::m_defer. fil_node_t::deferred: Indicates whether the tablespace loading was deferred during recovery FIL_LOAD_DEFER: Returned by fil_ibd_load() to indicate that tablespace file was cannot be loaded. recv_sys_t::recover_deferred(): Invoke deferred_spaces.create() to initialize fil_space_t based on buffered metadata and records to initialize page 0. Ignore the flags in fil_name_t, because they are intentionally invalid. fil_name_process(): Update deferred_spaces. recv_sys_t::parse(): Store the redo log if the tablespace id is present in deferred spaces recv_sys_t::recover_low(): Should recover the first page of the tablespace even though the tablespace instance is not present recv_sys_t::apply(): Initialize the deferred tablespace before applying the deferred tablespace records recv_validate_tablespace(): Skip the validation for deferred_spaces. recv_rename_files(): Moved and revised from recv_sys_t::apply(). For deferred-recovery tablespaces, do not attempt to rename the file if a deferred-recovery tablespace is associated with the name. recv_recovery_from_checkpoint_start(): Invoke recv_rename_files() and initialize all deferred tablespaces before applying redo log. fil_node_t::read_page0(): Skip page0 validation if the tablespace is deferred buf_page_create_deferred(): A variant of buf_page_create() when the fil_space_t is not available yet This is joint work with Thirunarayanan Balathandayuthapani, who implemented an initial prototype.	2021-05-17 18:12:33 +03:00
Marko Mäkelä	3a427c568b	Cleanup: Remove useless message "If you are installing InnoDB"	2021-05-14 08:26:51 +03:00
Marko Mäkelä	d2e2d32933	Merge 10.5 into 10.6	2021-04-14 12:32:27 +03:00
Marko Mäkelä	6c3e860cbf	Merge 10.4 into 10.5	2021-04-14 11:35:39 +03:00
Marko Mäkelä	5008171b05	Merge 10.3 into 10.4	2021-04-14 10:33:59 +03:00
Marko Mäkelä	450c017c2d	Merge 10.2 into 10.3	2021-04-09 14:32:06 +03:00
Srinidhi Kaushik	5bc5ecce08	MDEV-24197: Add "innodb_force_recovery" for "mariabackup --prepare" During the prepare phase of restoring backups, "mariabackup" does not seem to allow (or recognize) the option "innodb_force_recovery" for the embedded InnoDB server instance that it starts. If page corruption observed during page recovery, the prepare step fails. While this is indeed the correct behavior ideally, allowing this option to be set in case of emergencies might be useful when the current backup is the only copy available. Some error messages during "--prepare" suggest to set "innodb_force_recovery" to 1: [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption. For backwards compatibility, "mariabackup --innobackupex --apply-log" should also have this option. Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>	2021-04-01 13:34:40 +03:00
Vladislav Vaintroub	08cb5d8483	MDEV-25221 Do not remove source file, if copy_file() fails in mariabackup --move-back Remove an incompletely copied destination file.	2021-03-31 14:23:56 +02:00
Marko Mäkelä	d346763479	Merge 10.5 into 10.6	2021-03-08 10:51:31 +02:00
Marko Mäkelä	a5d3c1c819	Merge 10.4 into 10.5	2021-03-08 10:16:20 +02:00
Marko Mäkelä	a26e7a3726	Merge 10.3 into 10.4	2021-03-08 09:39:54 +02:00
Marko Mäkelä	bcd160753c	Merge 10.2 into 10.3	2021-03-05 10:06:42 +02:00
Vladislav Vaintroub	545cba13eb	MDEV-22929 fixup. Print "completed OK!" if page corruption and --log-innodb-page-corruption Since we do not stop at corrupted page error, there is no reason to log a backup error.	2021-03-05 09:04:30 +01:00
Marko Mäkelä	94b4578704	Merge 10.5 into 10.6	2021-02-17 19:39:05 +02:00
Marko Mäkelä	d82386b6b9	MDEV-24848 Assertion rlen<llen failed when applying MEMSET btr_cur_upd_rec_in_place(): Prefer WRITE to MEMSET for a single-byte operation. log_phys_t::apply(): Relax the assertion to allow a single-byte MEMSET, even though it is 1 byte longer than a WRITE record.	2021-02-17 16:18:55 +02:00
Marko Mäkelä	3cef4f8f0f	MDEV-515 Reduce InnoDB undo logging for insert into empty table We implement an idea that was suggested by Michael 'Monty' Widenius in October 2017: When InnoDB is inserting into an empty table or partition, we can write a single undo log record TRX_UNDO_EMPTY, which will cause ROLLBACK to clear the table. For this to work, the insert into an empty table or partition must be covered by an exclusive table lock that will be held until the transaction has been committed or rolled back, or the INSERT operation has been rolled back (and the table is empty again), in lock_table_x_unlock(). Clustered index records that are covered by the TRX_UNDO_EMPTY record will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot be distinguished from what MDEV-12288 leaves behind after purging the history of row-logged operations. Concurrent non-locking reads must be adjusted: If the read view was created before the INSERT into an empty table, then we must continue to imagine that the table is empty, and not try to read any records. If the read view was created after the INSERT was committed, then all records must be visible normally. To implement this, we introduce the field dict_table_t::bulk_trx_id. This special handling only applies to the very first INSERT statement of a transaction for the empty table or partition. If a subsequent statement in the transaction is modifying the initially empty table again, we must enable row-level undo logging, so that we will be able to roll back to the start of the statement in case of an error (such as duplicate key). INSERT IGNORE will continue to use row-level logging and locking, because implementing it would require the ability to roll back the latest row. Since the undo log that we write only allows us to roll back the entire statement, we cannot support INSERT IGNORE. We will introduce a handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage engines that INSERT IGNORE is being executed. In many test cases, we add an extra record to the table, so that during the 'interesting' part of the test, row-level locking and logging will be used. Replicas will continue to use row-level logging and locking until MDEV-24622 has been addressed. Likewise, this optimization will be disabled in Galera cluster until MDEV-24623 enables it. dict_table_t::bulk_trx_id: The latest active or committed transaction that initiated an insert into an empty table or partition. Protected by exclusive table lock and a clustered index leaf page latch. ins_node_t::bulk_insert: Whether bulk insert was initiated. trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert). Unlike earlier, this collection will cover also temporary tables. trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(), is_bulk_insert(), was_bulk_insert(). trx_undo_report_row_operation(): Before accessing any undo log pages, invoke trx->mod_tables.emplace() in order to determine whether undo logging was disabled, or whether this is the first INSERT and we are supposed to write a TRX_UNDO_EMPTY record. row_ins_clust_index_entry_low(): If we are inserting into an empty clustered index leaf page, set the ins_node_t::bulk_insert flag for the subsequent trx_undo_report_row_operation() call. lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock(): Remove the redundant parameter 'flags' that can be checked in the caller. btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation(). trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT), ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that the next statement will not be covered by table-level undo logging. ReadView::changes_visible(trx_id_t) const: New accessor for the case where the trx_id_t is not read from a potentially corrupted index page but directly from the memory. In this case, we can skip a sanity check. row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(): row_sel_try_search_shortcut_for_mysql(), row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id. row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees(). lock_sec_rec_cons_read_sees(): Replaced with lower-level code. btr_root_page_init(): Refactored from btr_create(). dict_index_t::clear(), dict_table_t::clear(): Empty an index or table, for the ROLLBACK of an INSERT operation. ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT into an empty table. This is joint work with Thirunarayanan Balathandayuthapani, who created a working prototype. Thanks to Matthias Leich for extensive testing.	2021-01-25 18:41:27 +02:00
Marko Mäkelä	1fe3dd003a	MDEV-24426 fil_crypt_thread keep spinning even if innodb_encryption_rotate_key_age=0 After MDEV-15528, two modes of operation in the fil_crypt_thread remains, depending on whether innodb_encryption_rotate_key_age=0 (whether key rotation is disabled). If the key rotation is disabled, the fil_crypt_thread miss the opportunity to sleep, which will result in lots of wasted CPU usage. fil_crypt_return_iops(): Add a parameter to specify whether other fil_crypt_thread should be woken up. fil_system_t::keyrotate_next(): Return the special value fil_system.temp_space to indicate that no work is to be done. fil_space_t::next(): Propagage the special value fil_system.temp_space to the caller. fil_crypt_find_space_to_rotate(): If no work is to be done, do not wake up other threads.	2020-12-17 13:46:21 +02:00
Marko Mäkelä	6a1e655cb0	Merge 10.4 into 10.5	2020-12-02 18:29:49 +02:00
Marko Mäkelä	589cf8dbf3	Merge 10.3 into 10.4	2020-12-01 19:51:14 +02:00
Marko Mäkelä	81ab9ea63f	Merge 10.2 into 10.3	2020-12-01 14:55:46 +02:00
Vlad Lesin	e6b3e38d62	MDEV-22929 MariaBackup option to report and/or continue when corruption is encountered The new option --log-innodb-page-corruption is introduced. When this option is set, backup is not interrupted if innodb corrupted page is detected. Instead it logs all found corrupted pages in innodb_corrupted_pages file in backup directory and finishes with error. For incremental backup corrupted pages are also copied to .delta file, because we can't do LSN check for such pages during backup, innodb_corrupted_pages will also be created in incremental backup directory. During --prepare, corrupted pages list is read from the file just after redo log is applied, and each page from the list is checked if it is allocated in it's tablespace or not. If it is not allocated, then it is zeroed out, flushed to the tablespace and removed from the list. If all pages are removed from the list, then --prepare is finished successfully and innodb_corrupted_pages file is removed from backup directory. Otherwise --prepare is finished with error message and innodb_corrupted_pages contains the list of the pages, which are detected as corrupted during backup, and are allocated in their tablespaces, what means backup directory contains corrupted innodb pages, and backup can not be considered as consistent. For incremental --prepare corrupted pages from .delta files are applied to the base backup, innodb_corrupted_pages is read from both base in incremental directories, and the same action is proceded for corrupted pages list as for full --prepare. innodb_corrupted_pages file is modified or removed only in base directory. If DDL happens during backup, it is also processed at the end of backup to have correct tablespace names in innodb_corrupted_pages.	2020-12-01 08:08:57 +03:00
Marko Mäkelä	533a13af06	Merge 10.3 into 10.4	2020-11-03 14:49:17 +02:00
Oleksandr Byelkin	8e1e2856f2	Merge branch '10.4' into 10.5	2020-11-01 14:26:15 +01:00
Oleksandr Byelkin	80c951ce28	Merge branch '10.3' into 10.4	2020-10-31 21:06:49 +01:00
Oleksandr Byelkin	794f665139	Merge branch '10.2' into 10.3	2020-10-30 17:23:53 +01:00
Marko Mäkelä	7b2bb67113	Merge 10.3 into 10.4	2020-10-29 13:38:38 +02:00
Vlad Lesin	6cb88685c4	MDEV-24026: InnoDB: Failing assertion: os_total_large_mem_allocated >= size upon incremental backup mariabackup deallocated uninitialized write_filt_ctxt.u.wf_incremental_ctxt in xtrabackup_copy_datafile() when some table should be skipped due to parsed DDL redo log record.	2020-10-29 07:39:43 +01:00
Marko Mäkelä	a8de8f261d	Merge 10.2 into 10.3	2020-10-28 10:01:50 +02:00
Marko Mäkelä	987df9b37a	MDEV-23720 Change innodb_log_optimize_ddl=OFF by default MariaDB 10.2.2 inherited from MySQL 5.7 a perceived optimization of ALTER TABLE, which skips the writing of redo log records. In MDEV-16809 we introduced a parameter that allows the redo log to be written, so that Mariabackup would not be impacted, but we kept the MySQL 5.7 behaviour enabled by default (innodb_log_optimize_ddl=ON). As noted in MDEV-19747 (Deprecate and ignore innodb_log_optimize_ddl, implemented in MariaDB 10.5.1), omitting the redo log writes can actually reduce performance, because we will have to wait for the data pages to be written out. When the redo log file is configured to be large enough, it actually can be much faster to write the redo log and avoid the extra page flushing. When the redo log is omitted (innodb_log_optimize_ddl=ON), also Mariabackup may have to perform a lot of extra work, to re-copy the entire data file if it is possible that any log was omitted during the backup. Starting with MariaDB 10.5.1, the parameter innodb_log_optimize_ddl is deprecated and ignored. We hereby deprecate (but will not ignore) the parameter in earlier versions as well.	2020-10-25 11:48:34 +02:00
Vlad Lesin	985ede9203	MDEV-20755 InnoDB: Database page corruption on disk or a failed file read of tablespace upon prepare of mariabackup incremental backup The problem: When incremental backup is taken, delta files are created for innodb tables which are marked as new tables during innodb ddl tracking. When such tablespace is tried to be opened during prepare in xb_delta_open_matching_space(), it is "created", i.e. xb_space_create_file() is invoked, instead of opening, even if a tablespace with the same name exists in the base backup directory. xb_space_create_file() writes page 0 header the tablespace. This header does not contain crypt data, as mariabackup does not have any information about crypt data in delta file metadata for tablespaces. After delta file is applied, recovery process is started. As the sequence of recovery for different pages is not defined, there can be the situation when crypt data redo log event is executed after some other page is read for recovery. When some page is read for recovery, it's decrypted using crypt data stored in tablespace header in page 0, if there is no crypt data, the page is not decryped and does not pass corruption test. This causes error for incremental backup --prepare for encrypted tablespaces. The error is not stable because crypt data redo log event updates crypt data on page 0, and recovery for different pages can be executed in undefined order. The fix: When delta file is created, the corresponding write filter copies only the pages which LSN is greater then some incremental LSN. When new file is created during incremental backup, the LSN of all it's pages must be greater then incremental LSN, so there is no need to create delta for such table, we can just copy it completely. The fix is to copy the whole file which was tracked during incremental backup with innodb ddl tracker, and copy it to base directory during --prepare instead of delta applying. There is also DBUG_EXECUTE_IF() in innodb code to avoid writing redo log record for crypt data updating on page 0 to make the test case stable. Note: The issue is not reproducible in 10.5 as optimized DDL's are deprecated in 10.5. But the fix is still useful because it allows to decrease data copy size during backup, as delta file contains some extra info. The test case should be removed for 10.5 as it will always pass.	2020-10-23 11:02:25 +03:00
Marko Mäkelä	882ce206db	Merge 10.4 into 10.5	2020-09-23 11:32:43 +03:00
Marko Mäkelä	952a028a52	Merge 10.3 into 10.4 We omit commit `a3bdce8f1e` and commit `a0e2a293bc` because they would make the test galera_3nodes.galera_gtid_2_cluster fail and disable it.	2020-09-21 17:42:02 +03:00
Marko Mäkelä	2cf489d430	Merge 10.2 into 10.3	2020-09-21 16:39:23 +03:00
Vlad Lesin	0a224edc3e	MDEV-23711 make mariabackup innodb redo log read error message more clear log_group_read_log_seg() returns error when: 1) Calculated log block number does not correspond to read log block number. This can be caused by: a) Garbage or an incompletely written log block. We can exclude this case by checking log block checksum if it's enabled(see innodb-log-checksums, encrypted log block contains checksum always). b) The log block is overwritten. In this case checksum will be correct and read log block number will be greater then requested one. 2) When log block length is wrong. In this case recv_sys->found_corrupt_log is set. 3) When redo log block checksum is wrong. In this case innodb code writes messages to error log with the following prefix: "Invalid log block checksum." The fix processes all the cases above.	2020-09-21 12:29:52 +03:00
Marko Mäkelä	3a423088ac	Merge 10.3 into 10.4	2020-09-21 12:29:00 +03:00
Marko Mäkelä	cbcb4ecabb	Merge 10.2 into 10.3	2020-09-21 11:04:04 +03:00
Vlad Lesin	80075ba011	MDEV-19264 Better support MariaDB GTID for Mariabackup's --slave-info option Parse SHOW SLAVE STATUS output for the "Using_Gtid" column. If the value is "No", then old log file and position is backed up, otherwise gtid_slave_pos is backed up.	2020-09-14 11:14:50 +03:00
Marko Mäkelä	e67daa5653	Merge 10.4 into 10.5	2020-07-15 14:51:22 +03:00
Marko Mäkelä	9936cfd531	Merge 10.3 into 10.4	2020-07-15 10:17:15 +03:00
Marko Mäkelä	8a0944080c	Merge 10.2 into 10.3	2020-07-14 22:59:19 +03:00
Marko Mäkelä	646a6005e7	Merge 10.1 into 10.2	2020-07-14 15:10:59 +03:00
Thirunarayanan Balathandayuthapani	e80183dbd5	MDEV-15662 mariabackup.huge_lsn fails sporadically with "log sequence number is in the future" - Problem is that test case creates iblogfile* files. So existing ibdata pages could point to future LSN. Fix is that taking the backup of data before iblogfile* creation and apply it before exiting the test case.	2020-07-14 13:24:37 +05:30
Marko Mäkelä	1813d92d0c	Merge 10.4 into 10.5	2020-07-02 09:41:44 +03:00
Marko Mäkelä	f347b3e0e6	Merge 10.3 into 10.4	2020-07-02 07:39:33 +03:00
Marko Mäkelä	1df1a63924	Merge 10.2 into 10.3	2020-07-02 06:17:51 +03:00
Oleksandr Byelkin	b0f836053b	MDEV-22983: Mariabackup's --help option disappeared return --help option	2020-07-01 23:30:44 +03:00
Sergei Golubchik	5a097c5556	MDEV-21222 mariabackup.incremental_backup failed with memory allocation failure mariabackup tries to allocate a buffer of page_size*page_size/4 size. for 64k page it means 1Gb, which doesn't work very well on 32-bit builders. Skip the 64k page test on 32bit.	2020-07-01 17:22:22 +03:00
Marko Mäkelä	c515b1d092	Merge 10.4 into 10.5	2020-06-18 13:58:54 +03:00

... 3 4 5 6 7 ...

481 commits