Commit graph

359 commits

Marko Mäkelä
15700f54c2 Merge 11.4 into 11.7 2025-01-09 09:41:38 +02:00
Marko Mäkelä
17f01186f5 Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
Marko Mäkelä
a54d151fc1 Merge 10.6 into 10.11 2024-12-19 15:38:53 +02:00
Marko Mäkelä
ddd7d5d8e3 MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure
Under unknown circumstances, the SQL layer may wrongly disregard an
invocation of thd_mark_transaction_to_rollback() when an InnoDB
transaction had been aborted (rolled back) due to one of the following errors:
* HA_ERR_LOCK_DEADLOCK
* HA_ERR_RECORD_CHANGED (if innodb_snapshot_isolation=ON)
* HA_ERR_LOCK_WAIT_TIMEOUT (if innodb_rollback_on_timeout=ON)

Such an error used to cause a crash of InnoDB during transaction commit.
These changes aim to catch and report the error earlier, so that not only
can this crash be avoided, but the original root cause can also be found
and fixed more easily later.

The idea of this fix is from Michael 'Monty' Widenius.

HA_ERR_ROLLBACK: A new error code that will be translated into
ER_ROLLBACK_ONLY, signalling that the current transaction
has been aborted and the only allowed action is ROLLBACK.
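
A hypothetical session illustrating the intended behaviour (table and data
are made up):

    START TRANSACTION;
    UPDATE t1 SET a = a + 1 WHERE id = 1; -- loses a deadlock; InnoDB rolls
                                          -- the transaction back
    SELECT * FROM t1;                     -- now fails with ER_ROLLBACK_ONLY
    ROLLBACK;                             -- the only allowed action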

trx_t::state: Add TRX_STATE_ABORTED that is like
TRX_STATE_NOT_STARTED, but noting that the transaction had been
rolled back and aborted.

trx_t::is_started(): Replaces trx_is_started().

ha_innobase: Check the transaction state in various places.
Simplify the logic around SAVEPOINT.

ha_innobase::is_valid_trx(): Replaces ha_innobase::is_read_only().

The InnoDB logic around transaction savepoints, commit, and rollback
was unnecessarily complex and might have contributed to this
inconsistency. So, we are simplifying that logic as well.

trx_savept_t: Replace with const undo_no_t*. When we roll back to
a savepoint, all we need to know is the number of undo log records
that must survive.

trx_named_savept_t, DB_NO_SAVEPOINT: Remove. We can store undo_no_t
directly in the space allocated at innobase_hton->savepoint_offset.

fts_trx_create(): Do not copy previous savepoints.

fts_savepoint_rollback(): If a savepoint was not found, roll back
everything after the default savepoint of fts_trx_create().
The test innodb_fts.savepoint is extended to cover this code.

Reviewed by: Vladislav Lesin
Tested by: Matthias Leich
2024-12-12 18:02:00 +02:00
Sergei Golubchik
d6add9a03d initial support for vector indexes
MDEV-33407 Parser support for vector indexes

The syntax is

  create table t1 (... vector index (v) ...);

limitations:
* v is a binary string and NOT NULL
* only one vector index per table
* temporary tables are not supported

MDEV-33404 Engine-independent indexes: subtable method

added support for so-called "high level indexes"; they are not visible
to the storage engine and are implemented on the SQL level. For every
such index in a table, say, t1, the server implicitly creates a second
table named like t1#i#05 (where "05" is the index number in t1).
This table has a fixed structure, has no frm, is not accessible directly,
does not go into the table cache, and needs no MDLs.

MDEV-33406 basic optimizer support for k-NN searches

for a query like SELECT ... ORDER BY func(), the optimizer will use
item_func->part_of_sortkey() to decide which keys can be used
to resolve the ORDER BY.
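
For example, a hypothetical k-NN query of the shape this targets
(VEC_DISTANCE_EUCLIDEAN stands in for whichever distance function the
optimizer recognizes):

    CREATE TABLE t1 (id INT PRIMARY KEY,
                     v BLOB NOT NULL,
                     VECTOR INDEX (v));
    -- the vector index can resolve this ORDER BY ... LIMIT:
    SELECT id FROM t1
    ORDER BY VEC_DISTANCE_EUCLIDEAN(v, @query_vec) LIMIT 10;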
2024-11-05 14:00:48 -08:00
Sergei Golubchik
062f8eb37d cleanup: key algorithm vs key flags
the information about the index algorithm was stored in two
places, inconsistently split between both.

A BTREE index could have key->algorithm == HA_KEY_ALG_BTREE if the user
explicitly specified USING BTREE, or HA_KEY_ALG_UNDEF if not.

An RTREE index had key->algorithm == HA_KEY_ALG_RTREE
and always had key->flags & HA_SPATIAL.

A FULLTEXT index had key->algorithm == HA_KEY_ALG_FULLTEXT
and always had key->flags & HA_FULLTEXT.

A HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF.

A long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH.
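
In SQL terms, an illustrative table covering the above mapping:

    CREATE TABLE t1 (
      a INT,
      b POINT NOT NULL,
      c TEXT,
      KEY (a) USING BTREE, -- HA_KEY_ALG_BTREE (HA_KEY_ALG_UNDEF if USING BTREE is omitted)
      SPATIAL KEY (b),     -- HA_KEY_ALG_RTREE, previously also HA_SPATIAL in key->flags
      FULLTEXT KEY (c)     -- HA_KEY_ALG_FULLTEXT, previously also HA_FULLTEXT in key->flags
    ) ENGINE=MyISAM;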

In this commit:

All indexes except BTREE and HASH always have key->algorithm
set; the HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except
for storage, to keep frms backward compatible).

As a side effect, ALTER TABLE now detects FULLTEXT index renames correctly.
2024-11-05 14:00:47 -08:00
Sergei Golubchik
32e6f8ff2e cleanup: remove unconditional #ifdef's 2024-11-05 14:00:47 -08:00
Sergei Golubchik
9fa31c1bd9 cleanup: spaces, casts, comments 2024-11-05 14:00:47 -08:00
Monty
b9f5793176 MDEV-9101 Limit size of created disk temporary files and tables
Two new variables added:
- max_tmp_space_usage: Limits the temporary space allowance per user
- max_total_tmp_space_usage: Limits the temporary space allowance for
  all users.

New status variables: tmp_space_used & max_tmp_space_used
New field in information_schema.process_list: TMP_SPACE_USED
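
A brief usage sketch (the quota value is illustrative):

    SET GLOBAL max_tmp_space_usage = 1073741824;  -- 1 GiB per user
    SHOW GLOBAL STATUS LIKE '%tmp_space_used';    -- tmp_space_used, max_tmp_space_used
    SELECT ID, TMP_SPACE_USED FROM information_schema.PROCESSLIST;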

The temporary space is counted for:
- All SQL level temporary files. This includes files for filesort,
  transaction temporary space, analyze, binlog_stmt_cache etc.
  It does not include engine internal temporary files used for repair,
  alter table, index pre sorting etc.
- All internal on disk temporary tables created as part of resolving a
  SELECT, multi-source update etc.

Special cases:
- When doing a commit, the last flush of the binlog_stmt_cache
  will not cause an error even if the temporary space limit is exceeded.
  This is to avoid giving errors on commit. This means that a user
  can temporarily go over the limit by up to binlog_stmt_cache_size.

Noteworthy issue:
- One has to be careful when using small values for max_tmp_space_usage
  together with binary logging and with non-transactional tables.
  If the binary log entry for the query is bigger than
  binlog_stmt_cache_size and one hits the limit of max_tmp_space_usage
  when flushing the entry to disk, the query will abort and the
  binary log will not contain the last changes to the table.
  This will also stop the slave!
  This is also true for all Aria tables, as Aria cannot do rollback
  (except in case of crashes)!
  One way to avoid it is to use @@binlog_format=statement for
  queries that update a lot of rows, as sketched below.
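
A hypothetical sketch (changing binlog_format requires sufficient
privileges and cannot be done inside a transaction):

    SET SESSION binlog_format = STATEMENT;
    UPDATE t1 SET a = a + 1;            -- one small statement event instead
                                        -- of a huge row-event cache entry
    SET SESSION binlog_format = MIXED;  -- restore the previous format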

Implementation:
- All writes to temporary files or internal temporary tables, that
  increases the file size, are routed through temp_file_size_cb_func()
  which updates and checks the temp space usage.
- Most of the temporary file monitoring is done inside IO_CACHE.
  Temporary file monitoring for Aria tables is done inside the Aria engine.
- MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for init_io_cache().
  MY_TRACK means that we track the file usage. MY_TRACK_WITH_LIMIT means
  that we track the file usage and give an error if the limit is
  breached. The latter is used to not give an error on commit when the
  binlog_stmt_cache is flushed.
- global_tmp_space_used contains the total tmp space used so far.
  This is needed to quickly check against max_total_tmp_space_usage.
- Temporary space errors use EE_LOCAL_TMP_SPACE_FULL and
  handler errors use HA_ERR_LOCAL_TMP_SPACE_FULL.
  This is needed until we move general errors to their own error space
  so that they cannot conflict with system error numbers.
- The return value of my_chsize() and mysql_file_chsize() has changed
  so that -1 is returned in the case where my_chsize() could not decrease
  the file size (very unlikely, and will not happen on modern systems).
  All calls to _chsize() are updated to check for > 0 as the error
  condition.
- At the destruction of THD we check that THD::tmp_file_space == 0
- At server end we check that global_tmp_space_used == 0
- As a precaution against errors in the tmp_space_used code, one can set
  max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable
  the tmp space quota errors.
- truncate_io_cache() function added.
- Aria tables using static or dynamic row length are registered in 8K
  increments to avoid some calls to update_tmp_file_size().

Other things:
- Ensure that all handler errors are registered. Before, some engine
  errors could be printed as "Unknown error".
- Fixed a bug in filesort() that caused an assert if there was an error
  when writing to the temporary file.
- Fixed that compute_window_func() now takes into account write errors.
- In case of parallel replication, rpl_group_info::cleanup_context()
  could call trans_rollback() with thd->error set, which would cause
  an assert. Fixed by resetting the error before calling trans_rollback().
- Fixed a bug in subselect3.inc which caused the following test to use
  heap tables with a low value for max_heap_table_size.
- Fixed a bug in sql_expression_cache where it did not overflow the
  heap table to an Aria table.
- Added Max_tmp_disk_space_used to slow query log.
- Fixed some bugs in log_slow_innodb.test
2024-05-27 12:39:04 +02:00
Marko Mäkelä
fec2fd6add Merge 10.11 into 11.0 2024-03-28 10:51:36 +02:00
Marko Mäkelä
788953463d Merge 10.6 into 10.11
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
2024-03-28 09:16:57 +02:00
Sergei Golubchik
f71d7f2f0f Merge branch '10.5' into 10.6 2024-03-13 21:02:34 +01:00
Marko Mäkelä
f703e72bd8 Merge 10.4 into 10.5 2024-03-11 10:08:20 +02:00
Kristian Nielsen
c73c6aea63 MDEV-33426: Aria temptables wrong thread-specific memory accounting in slave thread
Aria temporary tables account allocated memory as specific to the current
THD. But this fails for slave threads, where the temporary tables need to be
detached from any specific THD.

Introduce a new flag to mark temporary tables in replication as "global",
and use that inside Aria to not account memory allocations as thread
specific for such tables.

Based on original suggestion by Monty.

Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-02-16 12:48:30 +01:00
Sergei Golubchik
3f6bccb888 Merge branch '10.11' into 11.0 2023-09-29 12:24:54 +02:00
Thirunarayanan Balathandayuthapani
bf3b787e02 MDEV-31835 Remove unnecessary extra HA_EXTRA_IGNORE_INSERT call
- This commit is different from the 10.6 commit c438284863.
Due to commit 045757af4c (MDEV-24621),
InnoDB does buffer and pre-sort the records for each index, and builds
the indexes one page at a time.

Multiple large INSERT IGNORE statements abort the server during the bulk
insert operation. The problem is that an InnoDB merge record exceeds
the page size. To avoid this scenario, InnoDB should catch the
too-big record while buffering the insert operation itself.

row_merge_buf_encode(): returns length of the encoded index record

row_merge_buf_write(): catches the DB_TOO_BIG_RECORD error earlier and
returns it
2023-08-25 23:13:05 +05:30
Thirunarayanan Balathandayuthapani
c438284863 MDEV-31835 Remove unnecessary extra HA_EXTRA_IGNORE_INSERT call
- The HA_EXTRA_IGNORE_INSERT call is made for every inserted row,
and on partitioned tables for every row * every partition.
This leads to slowness during the LOAD DATA operation.

- Under a bulk operation, multiple-insert-statement error handling
will end up emptying the table. This behaviour was introduced by
commit 8ea923f55b (MDEV-24818).
This makes the HA_EXTRA_IGNORE_INSERT call redundant. We can
use the same behaviour for the INSERT..IGNORE statement as well.

- Removed the extra HA_EXTRA_IGNORE_INSERT call as the solution
to improve the performance of the LOAD command.
2023-08-25 17:22:17 +05:30
Oleksandr Byelkin
51f9d62005 Merge branch '10.11' into 11.0 2023-08-09 07:53:48 +02:00
Oleksandr Byelkin
34a8e78581 Merge branch '10.6' into 10.9 2023-08-04 08:01:06 +02:00
Oleksandr Byelkin
5ea5291d97 Merge branch '10.5' into 10.6 2023-08-04 07:52:54 +02:00
Sergei Golubchik
f7a9f446d7 cleanup: remove unused keyinfo flag
HA_UNIQUE_CHECK was
* only used internally by MyISAM/Aria
* only used for internal temporary tables (for DISTINCT)
* never saved in frm
* saved in MYI/MAD but only for temporary tables
* only set, never checked

it's safe to remove it and free the bit (there are only 16 of them)
2023-08-01 22:43:16 +02:00
Oleksandr Byelkin
6bf8483cac Merge branch '10.5' into 10.6 2023-08-01 15:08:52 +02:00
Oleksandr Byelkin
7564be1352 Merge branch '10.4' into 10.5 2023-07-26 16:02:57 +02:00
Yuchen Pei
734583b0d7
MDEV-31400 Simple plugin dependency resolution
We introduce simple plugin dependency resolution. A plugin init function may
return HA_ERR_RETRY_INIT. If this happens during server startup when
the server is trying to initialise all plugins, the failed plugins
will be retried, until no more plugins succeed in initialisation or
want to be retried.

This will fix Spider init bugs which are caused in part by its
dependency on Aria for initialisation.

The reason we need a new return code, instead of treating every
failure as a request for retry, is that it may be impossible to clean
up after a failed plugin initialisation. Take InnoDB for example: it
has a global variable `buf_page_cleaner_is_active`, which may not
satisfy an assertion during a second initialisation try, probably
because InnoDB does not expect the initialisation to be called
twice.
2023-07-25 18:24:20 +10:00
Oleksandr Byelkin
f52954ef42 Merge commit '10.4' into 10.5 2023-07-20 11:54:52 +02:00
Sergei Petrunia
b3edbf25a1 MDEV-31022: SIGSEGV in maria_create from create_internal_tmp_table
The code in create_internal_tmp_table() didn't take into account that
now temporary (derived) tables may have multiple indexes:

- one index due to duplicate removal
   = in this example, created by the conversion of a big IN (...) into a
     subquery (see the sketch after this list)
   = this index might be converted into a "unique constraint" if the key
     length is too large.
- one index added by the derived_with_keys optimization.

Make create_internal_tmp_table() handle multiple indexes.
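
As a concrete trigger for the first case, a hypothetical query whose big
IN-list may be rewritten once it exceeds in_predicate_conversion_threshold:

    SELECT * FROM t1
    WHERE t1.a IN (1, 2, 3 /* ... thousands of constants ... */);
    -- the rewritten IN-subquery is materialized into a temporary table with
    -- a duplicate-removal index, which may become a "unique constraint" if
    -- the key is too long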

Before this patch, use of a unique constraint was indicated in
TABLE_SHARE::uniques. This was OK as the unique constraint was the only
index in the table. Now that is no longer the case, so TABLE_SHARE::uniques
is removed and replaced with an in-memory-only flag HA_UNIQUE_HASH.

This patch is based on Monty's patch.
Co-Author: Monty <monty@mariadb.org>
2023-05-09 10:12:27 +03:00
Monty
1ef22e28ad MDEV-26258 Various crashes/asserts/corruptions when Aria encryption is enabled/used, but the encryption plugin is not loaded
The reason for the failures reported in the MDEV is that the tests enable
encryption for Aria but do not provide any encryption keys.

Fixed by checking that encryption keys exist before creating the table.
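
For example (hypothetical session; no encryption key plugin is loaded):

    SET GLOBAL aria_encrypt_tables = ON;
    CREATE TABLE t1 (a INT) ENGINE=Aria; -- now fails at CREATE with a clear
                                         -- error instead of crashing later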

Other things:
- maria.encrypt_wrong-key changed as we now get the error on CREATE
  instead of during INSERT.
2023-05-02 23:37:10 +03:00
Marko Mäkelä
820ebcec86 Merge 10.8 into 10.9 2023-01-10 14:50:58 +02:00
Marko Mäkelä
e441c32a0b Merge 10.5 into 10.6 2023-01-03 18:13:11 +02:00
Marko Mäkelä
8b9b4ab3f5 Merge 10.4 into 10.5 2023-01-03 17:08:42 +02:00
Marko Mäkelä
fb0808c450 Merge 10.3 into 10.4 2023-01-03 16:10:02 +02:00
musvaage
e9e6c7a3c5 header typos 2022-12-20 08:55:48 +11:00
Oleksandr Byelkin
ebf2121529 Merge branch '10.8' into 10.9 2022-11-01 10:33:44 +01:00
Marko Mäkelä
aeccbbd926 Merge 10.5 into 10.6
To prevent ASAN heap-use-after-poison in the MDEV-16549 part of
./mtr --repeat=6 main.derived,
the initialization of Name_resolution_context was cleaned up.
2022-10-25 14:25:42 +03:00
Marko Mäkelä
9a0b9e3360 Merge 10.4 into 10.5 2022-10-25 11:26:37 +03:00
Alexander Barkov
2a57396e59 MDEV-29481 mariadb-upgrade prints confusing statement
This is a new version of the patch instead of the reverted:

  MDEV-28727 ALTER TABLE ALGORITHM=NOCOPY does not work after upgrade

Ignore the difference in key packing flags HA_BINARY_PACK_KEY and HA_PACK_KEY
during ALTER to allow ALGORITHM=INSTANT and ALGORITHM=NOCOPY in more cases.

If for some reason (e.g. due to a bug fix such as MDEV-20704) these
cumulative (over all segments) flags in KEY::flags are different for
the old and new table inside compare_keys_but_name(), the difference
in HA_BINARY_PACK_KEY and HA_PACK_KEY in KEY::flags is not really important:

MyISAM and Aria can handle such cases well: per-segment flags are stored in
MYI and MAI files anyway, and they are read at ha_myisam::open() /
ha_maria::open() time. So indexes get opened with the correct per-segment
flags that were calculated at table CREATE time, no matter
what the old (CREATE time) and new (ALTER time) per-index compression
flags are, and no matter whether they are equal or not.

All other engines ignore key compression flags, so this change
is safe for them as well.
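
A hypothetical example of an ALTER that this enables (assuming the old and
new table differ only in those cumulative packing flags):

    ALTER TABLE t1 ADD COLUMN c INT, ALGORITHM=NOCOPY;
    -- before this fix, the flag difference alone forced ALGORITHM=COPY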
2022-10-22 14:22:20 +04:00
Aleksey Midenkov
92bfc0e8c4 MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT
:: Syntax change ::

Keyword AUTO enables history partition auto-creation.

Examples:

    CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
    PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;

    CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
    PARTITION BY SYSTEM_TIME INTERVAL 1 MONTH
    STARTS '2021-01-01 00:00:00' AUTO PARTITIONS 12;

    CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
    PARTITION BY SYSTEM_TIME LIMIT 1000 AUTO;

Or with explicit partitions:

    CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
    PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO
    (PARTITION p0 HISTORY, PARTITION pn CURRENT);

To disable or enable auto-creation one can use ALTER TABLE, adding
or removing AUTO from the partitioning specification:

    CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
    PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;

    # Disables auto-creation:
    ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR;

    # Enables auto-creation:
    ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;

If the rest of the partitioning specification is identical to CREATE TABLE,
no repartitioning will be done (for details see MDEV-27328).

:: Description ::

Before executing a history-generating DML command (see the list of commands
below), add N history partitions, so that N is sufficient for the potentially
generated history. N > 1 may be required when history partitions are switched
by INTERVAL and current_timestamp is N times further than the interval
boundary of the last history partition.

If the last history partition equals or exceeds LIMIT records, then a new
history partition is created and selected as the working partition. According
to MDEV-28411, partitions cannot be switched (or created) while the command is
running. Thus LIMIT is not a strict limitation, and the history partition
size must be planned as the LIMIT value plus the average amount of history
one DML command can generate.

Auto-creation is implemented by a synchronous fast_alter_partition_table()
call from the thread of the executed DML command, before the command itself
is run (by a fallback-and-retry mechanism similar to the Discovery feature;
see Open_table_context).

The names for newly added partitions are generated like default partition
names, with the extension of MDEV-22155 (which avoids name clashes by
advancing the assignment counter to the next free-enough gap).

These DML commands can trigger auto-creation:

    DELETE (including multitable DELETE, excluding DELETE HISTORY)
    UPDATE (including multitable UPDATE)
    REPLACE (including REPLACE .. SELECT)
    INSERT .. ON DUPLICATE KEY UPDATE (including INSERT .. SELECT .. ODKU)
    LOAD DATA .. REPLACE

:: Bug fixes ::

MDEV-23642 Locking timeout caused by auto-creation affects original DML

    The reasons for this are:

    - Do not disrupt the main business process (the history is an auxiliary
      service);

    - The consequences are non-fatal (history is not lost, but goes into the
      wrong partition; this is fixed by a partitioning rebuild);

    - The application has more freedom to decide whether to fail in this
      case or not: it may read the warning info and find the corresponding
      error number.

    - While a non-failing command is easy for an application to handle (and
      fail if desired), the opposite is hard to handle: there are no
      automatic actions to fix a failed command and retry, DBA intervention
      is required, and until then the application is non-functioning.

MDEV-23639 Auto-create does not work under LOCK TABLES or inside triggers

    Don't do tdc_remove_table() for OT_ADD_HISTORY_PARTITION because it is
    not possible in locked tables mode.

    LTM_LOCK_TABLES mode (and LTM_PRELOCKED_UNDER_LOCK_TABLES) works out
    of the box as fast_alter_partition_table() can reopen tables via
    locked_tables_list.

    In LTM_PRELOCKED we reopen and relock table manually.

:: More fixes ::

* some_table_marked_for_reopen flag fix

  some_table_marked_for_reopen affects only the reopen of
  m_locked_tables, i.e. Locked_tables_list::reopen_tables() reopens only
  tables from m_locked_tables.

* Unused can_recover_from_failed_open() condition

  Can recover_from_failed_open() really be used after
  open_and_process_routine()?

:: Reviewed by ::

Sergei Golubchik <serg@mariadb.org>
2022-05-06 15:11:02 +03:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
ca41fdba22 fix MDEV-27217 (4d5ae2b325)
Add the error from later versions to avoid changing HA_ERR_*
across versions and in already released versions.
2022-01-30 17:52:15 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Aleksey Midenkov
4d5ae2b325 MDEV-27217 DELETE partition selection doesn't work for history partitions
LIMIT history switching requires a number of history partitions to
be marked for read: from the first to the last non-empty one, plus one
empty. The least we can do is fail with an error message if the needed
partition was not marked for read. As this is the handler interface, we
require a new handler error code to display a user-friendly error message.

Switching by INTERVAL works out-of-the-box with
ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET error.
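
A hypothetical illustration of the failure mode:

    CREATE TABLE t1 (x INT) WITH SYSTEM VERSIONING
    PARTITION BY SYSTEM_TIME LIMIT 100
    (PARTITION p0 HISTORY, PARTITION p1 HISTORY, PARTITION pn CURRENT);
    -- LIMIT switching needs the non-empty history partitions plus one empty
    -- one marked for read; selecting only the current partition now fails
    -- with the new error:
    DELETE FROM t1 PARTITION (pn) WHERE x = 1;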
2022-01-13 23:35:16 +03:00
Marko Mäkelä
03c09837fc Merge 10.5 into 10.6 2021-09-16 20:17:12 +03:00
Monty
f03fee06b0 Improve error messages from Aria
- Error on commit now returns HA_ERR_COMMIT_ERROR instead of
  HA_ERR_INTERNAL_ERROR
- If checkpoint fails, it will now print out where it failed.
2021-09-15 19:27:34 +03:00
Marko Mäkelä
1bd681c8b3 MDEV-25506 (3 of 3): Do not delete .ibd files before commit
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.

The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.

Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() is closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.

HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.

ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.

ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.

dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.

dict_table_t::parse_name(): Use mdl_name instead of name.

dict_table_rename_in_cache(): Update mdl_name.

For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.

trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.

trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.

dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.

ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.

HA_ERR_TABLE_IN_FK_CHECK: Remove.

row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.

This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
2021-06-09 17:06:07 +03:00
Marko Mäkelä
03ff588d15 Merge 10.5 into 10.6 2021-03-05 16:05:47 +02:00
Marko Mäkelä
10d544aa7b Merge 10.4 into 10.5 2021-03-05 12:54:43 +02:00
Marko Mäkelä
fcc9f8b10c Remove unused HA_EXTRA_FAKE_START_STMT
This is fixup for commit f06a0b5338.
2021-03-05 10:40:16 +02:00
Marko Mäkelä
3cef4f8f0f MDEV-515 Reduce InnoDB undo logging for insert into empty table
We implement an idea that was suggested by Michael 'Monty' Widenius
in October 2017: When InnoDB is inserting into an empty table or partition,
we can write a single undo log record TRX_UNDO_EMPTY, which will cause
ROLLBACK to clear the table.

For this to work, the insert into an empty table or partition must be
covered by an exclusive table lock that will be held until the transaction
has been committed or rolled back, or the INSERT operation has been
rolled back (and the table is empty again), in lock_table_x_unlock().

Clustered index records that are covered by the TRX_UNDO_EMPTY record
will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
be distinguished from what MDEV-12288 leaves behind after purging the
history of row-logged operations.

Concurrent non-locking reads must be adjusted: If the read view was
created before the INSERT into an empty table, then we must continue
to imagine that the table is empty, and not try to read any records.
If the read view was created after the INSERT was committed, then
all records must be visible normally. To implement this, we introduce
the field dict_table_t::bulk_trx_id.

This special handling only applies to the very first INSERT statement
of a transaction for the empty table or partition. If a subsequent
statement in the transaction is modifying the initially empty table again,
we must enable row-level undo logging, so that we will be able to
roll back to the start of the statement in case of an error (such as
duplicate key).
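
A hypothetical transaction illustrating the scope of the optimization
(seq_1_to_1000 assumes the sequence engine is available):

    CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
    BEGIN;
    INSERT INTO t1 SELECT seq FROM seq_1_to_1000; -- first INSERT into the empty
                                                  -- table: one TRX_UNDO_EMPTY record
    INSERT INTO t1 VALUES (0);                    -- subsequent statement: row-level
                                                  -- undo logging is enabled again
    ROLLBACK;                                     -- clears the table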

INSERT IGNORE will continue to use row-level logging and locking, because
implementing it would require the ability to roll back the latest row.
Since the undo log that we write only allows us to roll back the entire
statement, we cannot support INSERT IGNORE. We will introduce a
handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
engines that INSERT IGNORE is being executed.

In many test cases, we add an extra record to the table, so that during
the 'interesting' part of the test, row-level locking and logging will
be used.

Replicas will continue to use row-level logging and locking until
MDEV-24622 has been addressed. Likewise, this optimization will be
disabled in Galera cluster until MDEV-24623 enables it.

dict_table_t::bulk_trx_id: The latest active or committed transaction
that initiated an insert into an empty table or partition.
Protected by exclusive table lock and a clustered index leaf page latch.

ins_node_t::bulk_insert: Whether bulk insert was initiated.

trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
Unlike earlier, this collection will also cover temporary tables.

trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
is_bulk_insert(), was_bulk_insert().

trx_undo_report_row_operation(): Before accessing any undo log pages,
invoke trx->mod_tables.emplace() in order to determine whether undo
logging was disabled, or whether this is the first INSERT and we are
supposed to write a TRX_UNDO_EMPTY record.

row_ins_clust_index_entry_low(): If we are inserting into an empty
clustered index leaf page, set the ins_node_t::bulk_insert flag for
the subsequent trx_undo_report_row_operation() call.

lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
Remove the redundant parameter 'flags' that can be checked in the caller.

btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().

trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
the next statement will not be covered by table-level undo logging.

ReadView::changes_visible(trx_id_t) const: New accessor for the case
where the trx_id_t is not read from a potentially corrupted index page
but directly from the memory. In this case, we can skip a sanity check.

row_sel(), row_sel_try_search_shortcut(), row_search_mvcc():
row_sel_try_search_shortcut_for_mysql(),
row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.

row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().

lock_sec_rec_cons_read_sees(): Replaced with lower-level code.

btr_root_page_init(): Refactored from btr_create().

dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
for the ROLLBACK of an INSERT operation.

ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
into an empty table.

This is joint work with Thirunarayanan Balathandayuthapani,
who created a working prototype.
Thanks to Matthias Leich for extensive testing.
2021-01-25 18:41:27 +02:00
Monty
6a3b581b90 MDEV-19745 BACKUP STAGE BLOCK_DDL hangs on flush sequence table
The problem was that FLUSH TABLES was trying to read the latest sequence
state, which conflicted with a running ALTER SEQUENCE. Removed the reading
of the state when opening a table for FLUSH, as it is not needed in this
case.
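
The hang scenario, sketched as two concurrent sessions (names are
illustrative):

    # session 1: a long-running ALTER holds the sequence state
    ALTER SEQUENCE s1 MAXVALUE 1000;
    # session 2: used to block trying to read the latest state of s1;
    # now opens the table for FLUSH without reading it
    FLUSH TABLES;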

Other things:
- Fixed a potential issue with concurrently running ALTER SEQUENCE where
  a later ALTER could read old data
2020-06-14 19:39:43 +03:00