mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Denis Protivensky	210db2935c	MDEV-30804 Rollback multi-engine transaction requiring 2PC but committing in one phase Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-04-19 14:08:57 +02:00
Thirunarayanan Balathandayuthapani	2bfd04e314	MDEV-31025 Redundant table alter fails when fixed column stored externally row_merge_buf_add(): Has strict assert that fixed length mismatch shouldn't happen while rebuilding the redundant row format table btr_index_rec_validate(): Fixed size column can be stored externally. So sum of inline stored length and external stored length of the column should be equal to total column length	2023-04-19 17:11:14 +05:30
Daniele Sciascia	feeeacc4d7	MDEV-30955 Explicit locks released too early in rollback path Assertion `thd->mdl_context.is_lock_owner()` fires when a client is disconnected, while transaction and and a table is opened through `HANDLER` interface. Reason for the assertion is that when a connection closes, its ongoing transaction is eventually rolled back in `Wsrep_client_state::bf_rollback()`. This method also releases explicit which are expected to survive beyond the transaction lifetime. This patch also removes calls to `mysql_ull_cleanup()`. User level locks are not supported in combination with Galera, making these calls unnecessary.	2023-04-18 13:57:59 +02:00
Daniele Sciascia	bc3bfcf943	MDEV-30862 Assertion `mode_ == m_high_priority' failed CREATE TABLE AS SELECT is not supported in combination with streaming replication.	2023-04-18 10:02:22 +02:00
Andrei	8f87023d3f	MDEV-28777 binlog.binlog_truncate_multi_engine failed in bb with Lost connection The 2013 error was right to catch the case B of the test unprepared for an expected simulated crash. The test gets refined to SELECT a (type of) bool value before the crash is invoked.	2023-04-17 20:07:37 +03:00
Sergei Petrunia	c7fe8e51de	Merge 10.11 into 11.0	2023-04-17 16:50:01 +03:00
Marko Mäkelä	656c2e18b1	Merge 10.10 into 10.11	2023-04-14 13:08:28 +03:00
Marko Mäkelä	a009280e60	Merge 10.9 into 10.10	2023-04-14 12:24:14 +03:00
Marko Mäkelä	44281b88f3	Merge 10.8 into 10.9	2023-04-14 11:32:36 +03:00
Marko Mäkelä	1d1e0ab2cc	Merge 10.6 into 10.8	2023-04-12 15:50:08 +03:00
Junqi Xie	d20a96f9c1	MDEV-21921 Make transaction_isolation and transaction_read_only into system variables In MariaDB, we have a confusing problem where: * The transaction_isolation option can be set in a configuration file, but it cannot be set dynamically. * The tx_isolation system variable can be set dynamically, but it cannot be set in a configuration file. Therefore, we have two different names for the same thing in different contexts. This is needlessly confusing, and it complicates the documentation. The same thing applys for transaction_read_only. MySQL 5.7 solved this problem by making them into system variables. https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-20.html This commit takes a similar approach by adding new system variables and marking the original ones as deprecated. This commit also resolves some legacy problems related to SET STATEMENT and transaction_isolation.	2023-04-12 11:04:29 +10:00
Marko Mäkelä	5bada1246d	Merge 10.5 into 10.6	2023-04-11 16:15:19 +03:00
Daniel Black	4472a7b4ff	MDEV-30205: /usr/share/mysql-test -> mariadb-test (fix) A suppression was needed for encryption.innodb-bad-key-change due to the path change.	2023-04-11 17:47:34 +10:00
Jan Lindström	f83b7ae13d	MDEV-26175 : Assertion `! thd->in_sub_stmt' failed in bool trans_rollback_stmt(THD*) If we are inside stored function or trigger we should not commit or rollback current statement transaction. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-04-11 09:07:33 +02:00
Alexander Barkov	ed2adc8c6f	MDEV-28190 sql_mode makes MDEV-371 virtual column expressions nondeterministic This problem was fixed earlier by MDEV-27653. Adding MTR tests only.	2023-04-06 16:17:50 +04:00
Alexander Barkov	fb72dfbf7f	MDEV-30415 MDEV-30415 PERIOD false positive overlap wtih utf8mb4_unicode_nopad_ci The problem was earlier fixed by the patch for MDEV-30034. Adding MTR tests only.	2023-04-06 09:45:05 +04:00
Alexander Barkov	62e137d4d7	Merge remote-tracking branch 'origin/10.4' into 10.5	2023-04-05 16:16:19 +04:00
Alexander Barkov	8f9bb82640	MDEV-30971 Add a new system variable aria_data_home_dir	2023-04-04 16:05:55 +04:00
Jan Lindström	afdf19cf33	MDEV-28641 : Query cache entries not invalidated on slave of a Galera cluster Query cache should be invalidated if we are not in applier. For some reason this condition was incorrect starting from 10.5 but it is correct in 10.4. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-04-04 12:12:16 +02:00
Alexander Barkov	8020b1bd73	MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations - Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars() and a flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES. The flag defines if strnncollsp_nchars() should emulate trailing spaces which were possibly trimmed earlier (e.g. in InnoDB CHAR compression). This is important for NOPAD collations. For example, with this input: - str1= 'a ' (Latin letter a followed by one space) - str2= 'a ' (Latin letter a followed by two spaces) - nchars= 3 if the flag is given, strnncollsp_nchars() will virtually restore one trailing space to str1 up to nchars (3) characters and compare two strings as equal: - str1= 'a ' (one extra trailing space emulated) - str2= 'a ' (as is) If the flag is not given, strnncollsp_nchars() does not add trailing virtual spaces, so in case of a NOPAD collation, str1 will be compared as less than str2 because it is shorter. - Field_string::cmp_prefix() now passes the new flag. Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do not pass the new flag. - The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc (which handles the CHAR data type) now also passed the new flag. - Fixing UCA collations to respect the new flag. Other collations are possibly also affected, however I had no success in making an SQL script demonstrating the problem. Other collations will be extended to respect this flags in a separate patch later. - Changing the meaning of the last parameter of Field::cmp_prefix() from "number of bytes" (internal length) to "number of characters" (user visible length). The code calling cmp_prefix() from handler.cc was wrong. After this change, the call in handler.cc became correct. The code calling cmp_prefix() from key_rec_cmp() in key.cc was adjusted according to this change. - Old strnncollsp_nchar() related tests in unittest/strings/strings-t.c now pass the new flag. A few new tests also were added, without the flag.	2023-04-04 12:30:50 +04:00
Lorna Luo	0cc1694e9c	Make 'move_file' command more reliable in 3 innodb tests The tests innodb.import_tablespace_race, innodn.restart, and innodb.innodb-wl5522 move the tablespace file between the data directory and the tmp directory specified by global environment variables. However this is risky because it's not unusual that the set tmp directory (often under /tmp) is mounted on another disk partition or device, and 'move_file' command may fail with "Errcode: 18 'Invalid cross-device link.'" For innodb.import_tablespace_race and innodb.innodb-wl5522, moving files across directories is not necessary. Modify the tests so they rename files under the same directory. For innodb.restart, instead of moving between datadir and MYSQL_TMPDIR, move the files under MYSQLTEST_VARDIR. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2023-04-03 14:36:11 +02:00
Oleksandr Byelkin	ac5a534a4c	Merge remote-tracking branch '10.4' into 10.5	2023-03-31 21:32:41 +02:00
Jan Lindström	eaebe8b560	MDEV-25045 : Assertion `client_state_.mode() != wsrep::client_state::m_toi' failed in int wsrep::transaction::before_commit() CREATE [TEMPORARY] SEQUENCE is internally CREATE+INSERT (initial value) and it is replicated using statement based replication. In Galera we use either TOI or RSU so we should skip commit time hooks for it. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-03-31 12:48:13 +02:00
Marko Mäkelä	2b61ff8f22	Merge 11.0 into 11.1	2023-03-29 17:23:21 +03:00
Marko Mäkelä	5e01255732	Merge 10.11 into 11.0	2023-03-29 17:20:42 +03:00
Marko Mäkelä	d84a282629	Merge 10.10 into 10.11	2023-03-29 16:53:37 +03:00
Marko Mäkelä	a6780df49b	MDEV-30453 Setting innodb_buffer_pool_filename to an empty string attempts to delete the data directory on shutdown Let us make innodb_buffer_pool_filename a read-only variable so that a malicious user cannot cause an important file to be deleted on InnoDB shutdown. An attempt to delete a directory will fail because it is not a regular file, but what if the variable pointed to (say) ibdata1, ib_logfile0 or some *.ibd file? It does not seem to make much sense for this parameter to be configurable in the first place, but we will not change that in order to avoid breaking compatibility.	2023-03-29 16:49:10 +03:00
Marko Mäkelä	191821f7df	Merge 10.9 into 10.10	2023-03-29 15:29:02 +03:00
Marko Mäkelä	55e78ebf41	Merge 10.8 into 10.9	2023-03-29 15:28:13 +03:00
Marko Mäkelä	dd2fe81122	Merge 10.6 into 10.8	2023-03-29 15:16:42 +03:00
Marko Mäkelä	0760ad3336	Merge 10.5 into 10.6	2023-03-28 15:25:52 +03:00
Sergei Golubchik	c2b6916393	MDEV-19629 post-merge fixes * it isn't "pfs" function, don't call it Item_func_pfs, don't use item_pfsfunc.* * tests don't depend on performance schema, put in the main suite * inherit from Item_str_ascii_func * use connection collation, not utf8mb3_general_ci * set result length in fix_length_and_dec * do not set maybe_null * use my_snprintf() where possible * don't set m_value.ptr on every invocation * update sys schema to use the format_pico_time() * len must be size_t (compilation error on Windows) * the correct function name for double->double is fabs() * drop volatile hack	2023-03-27 21:27:27 +02:00
Vlad Lesin	4c226c1850	MDEV-29050 mariabackup issues error messages during InnoDB tablespaces export on partial backup preparing The solution is to suppress error messages for missing tablespaces if mariabackup is launched with "--prepare --export" options. "mariabackup --prepare --export" invokes itself with --mysqld parameter. If the parameter is set, then it starts server to feed "FLUSH TABLES ... FOR EXPORT;" queries for exported tablespaces. This is "normal" server start, that's why new srv_operation value is introduced. Reviewed by Marko Makela.	2023-03-27 20:15:10 +03:00
Ahmed Ibrahim	d9808f79de	MDEV-19629: format_pico_time implementation	2023-03-27 16:34:29 +02:00
Igor Babaev	f33fc2fae5	MDEV-30539 EXPLAIN EXTENDED: no message with queries for DML statements EXPLAIN EXTENDED for an UPDATE/DELETE/INSERT/REPLACE statement did not produce the warning containing the text representation of the query obtained after the optimization phase. Such warning was produced for SELECT statements, but not for DML statements. The patch fixes this defect of EXPLAIN EXTENDED for DML statements.	2023-03-25 12:36:59 -07:00
Thirunarayanan Balathandayuthapani	e06c6046d2	MDEV-29545 InnoDB: Can't find record during replace stmt Problem: ======== - InnoDB replace statement returns can't find record as result during bulk insert operation. InnoDB returns DB_END_OF_INDEX blindly when bulk transaction is visible to current transaction even though the search tuple is inserted as a part of current replace statement. Solution: ========= row_search_mvcc(): InnoDB should allow the transaction to read all the rows when innodb intends to do any locking on the record even though bulk insert transaction changes are visible to the current transaction	2023-03-24 15:20:21 +05:30
Otto Kekalainen	50c8ef01fc	Fix trivial spelling errors - agressively -> aggressively - exising -> existing - occured -> occurred - releated -> related - seperated -> separated - sucess -> success - use use -> use All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2023-03-24 12:54:05 +11:00
Aleksey Midenkov	a8b616d1e9	MDEV-30421 rpl_parallel_.test cleanup Moved rpl_parallel_.inc to rpl_parallel_*.test	2023-03-23 22:31:55 +03:00
Aleksey Midenkov	91e5e47a50	MDEV-30421 more tests cleaned up All the .inc files that included from binlog_encryption are refactored.	2023-03-23 21:07:32 +03:00
Aleksey Midenkov	bdf5580611	MDEV-30421 rpl_parallel.test cleanup Moved rpl_parallel.inc to rpl_parallel.test	2023-03-23 21:07:32 +03:00
Anel Husakovic	c596ad734d	MDEV-30269: Remove rpl_semi_sync_[slave,master] usage in code - Description: - Before 10.3.8 semisync was a plugin that is built into the server with MDEV-13073,starting with commit `cbc71485e2`. There are still some usage of `rpl_semi_sync_master` in mtr. Note: - To recognize the replica in the `dump_thread`, replica is creating local variable `rpl_semi_sync_slave` (the keyword of plugin) in function `request_transmit`, that is catched by primary in `is_semi_sync_slave()`. This is the user variable and as such not related to the obsolete plugin. - Found in `sys_vars.all_vars` and `rpl_semi_sync_wait_point` tests, usage of plugins `rpl_semi_sync_master`, `rpl_semi_sync_slave`. The former test is disabled by default (`sys_vars/disabled.def`) and marked as `obsolete`, however this patch will remove the queries. - Add cosmetic fixes to semisync codebase Reviewer: <brandon.nesterenko@mariadb.com> Closes PR #2528, PR #2380	2023-03-23 13:39:46 +01:00
Marko Mäkelä	1efdf67e60	Merge 10.5 into 10.6	2023-03-22 15:54:45 +02:00
Yuchen Pei	7c91082e39	MDEV-27912 Fixing inconsistency w.r.t. expect files in tests. mtr uses group suffix, but some existing inc and test files use server_id for expect files. This patch aims to fix that. For spider: With this change we will not have to maintain a separate version of restart_mysqld.inc for spider, that duplicates code, just because spider tests use different names for expect files, and shutdown_mysqld requires magical names for them. With this change spider tests will also be able to use other features provided by restart_mysqld.inc without code duplication, like the parameter $restart_parameters (see e.g. the testcase mdev_29904.test in commit ef1161e5d4f). Tests run after this change: default, spider, rocksdb, galera, using the following command mtr --parallel=auto --force --max-test-fail=0 --skip-core-file mtr --suite spider,spider/,spider//* \ --skip-test="spider/oracle.\|./t\..*" --parallel=auto --big-test \ --force --max-test-fail=0 --skip-core-file mtr --suite galera --parallel=auto mtr --suite rocksdb --parallel=auto	2023-03-22 11:55:57 +11:00
Tingyao Nian	dccbb5a6db	[MDEV-30824] Fix binlog to use 'String' for setting 'character_set_client' Commit `a923d6f49c` disabled numeric setting of character_set_* variables with non-default values: MariaDB [(none)]> set character_set_client=224; ERROR 1115 (42000): Unknown character set: '224' However the corresponding binlog functionality still write numeric values for log event, and this will break binlog replay if the value is not default. Now make the server use 'String' type for 'character_set_client' when generating binlog events Before: /!\C utf8mb4 //!/; SET @@session.character_set_client=224,@@session.collation_connection=224,@@session.collation_server=33/!/; After: /!\C utf8mb4 //!/; SET @@session.character_set_client=utf8mb4,@@session.collation_connection=33,@@session.collation_server=8/!/; Note: prior to the previous commit, setting with '224' or '45' or 'utf8mb4' have the same effect, as they all set the parameter to 'utf8mb4'. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2023-03-21 17:19:44 +04:00
Marko Mäkelä	c73a65f55b	MDEV-29692 Assertion `(writeptr + (i * size)) != local_frame' failed upon IMPORT TABLESPACE fil_iterate(): Allocation bitmap pages are never encrypted. Reviewed by: Thirunarayanan Balathandayuthapani	2023-03-21 14:33:54 +02:00
Vlad Lesin	f8c3d4c2d5	MDEV-28187 mariadb-backup doesn't utilise innodb-undo-log-directory (if specified as a relative path) during copy-back operation Make absolute destination path from relative one, basing on mysql data directory. Reviewed by Alexander Barkov.	2023-03-21 11:16:51 +03:00
Vicențiu Ciorbaru	a2cb6d8760	Update feedback plugin URL to use feedback.mariadb.org subdomain	2023-03-20 18:15:17 +02:00
Marko Mäkelä	6e58d5ab6a	Merge 11.0 into 11.1	2023-03-17 15:04:38 +02:00
Marko Mäkelä	4c355d4e81	Merge 10.11 into 11.0	2023-03-17 15:03:17 +02:00
Marko Mäkelä	7343a2ceb6	Merge 10.10 into 10.11	2023-03-17 14:23:03 +02:00
Marko Mäkelä	df08731b58	Merge 10.9 into 10.10	2023-03-17 14:22:35 +02:00
Marko Mäkelä	1147e32688	Merge 10.8 into 10.9	2023-03-17 14:22:10 +02:00
Marko Mäkelä	fa56adff75	Merge 10.6 into 10.8	2023-03-17 14:19:17 +02:00
Thirunarayanan Balathandayuthapani	e8e0559ed2	MDEV-30870 Undo tablespace name displays wrongly for I_S queries - INNODB_SYS_TABLESPACES in information schema should display innodb_undo001, innodb_undo002 etc as tablespace name for undo tablespaces	2023-03-17 17:17:35 +05:30
Thirunarayanan Balathandayuthapani	18e4978edc	MDEV-29975 InnoDB fails to release savepoint during bulk insert - InnoDB does rollback the whole transaction and discards the savepoint when there is a failure happens during bulk insert operation. When server request to release the savepoint, InnoDB should return DB_SUCCESS when it deals with bulk insert operation	2023-03-17 16:41:27 +05:30
Marko Mäkelä	c50f849d64	Merge 10.10 into 10.11	2023-03-17 07:00:03 +02:00
Marko Mäkelä	3dd33789c1	Merge 10.9 into 10.10	2023-03-17 06:59:46 +02:00
Marko Mäkelä	fffa4b28a1	Merge 10.8 into 10.9	2023-03-17 06:58:33 +02:00
Andrei	d4339620be	MDEV-30780 optimistic parallel slave hangs after hit an error The hang could be seen as show slave status displaying an error like Last_Error: Could not execute Write_rows_v1 along with Slave_SQL_Running: Yes accompanied with one of the replication threads in show-processlist characteristically having status like 2394 \| system user \| \| NULL \| Slave_worker \| 50852\| closing tables It turns out that closing tables worker got entrapped in endless looping in mark_start_commit_inner() across already garbage-collected gco items. The reclaimed gco links are explained with actually possible out-of-order groups of events termination due to the Last_Error. This patch reinforces the correct ordering to perform finish_event_group's cleanup actions, incl unlinking gco:s from the active list.	2023-03-16 18:55:19 +02:00
Marko Mäkelä	acf46b7b36	Merge 10.6 into 10.8	2023-03-16 18:11:37 +02:00
Marko Mäkelä	a55b951e60	MDEV-26827 Make page flushing even faster For more convenient monitoring of something that could greatly affect the volume of page writes, we add the status variable Innodb_buffer_pool_pages_split that was previously only available via information_schema.innodb_metrics as "innodb_page_splits". This was suggested by Axel Schwenke. buf_flush_page_count: Replaced with buf_pool.stat.n_pages_written. We protect buf_pool.stat (except n_page_gets) with buf_pool.mutex and remove unnecessary export_vars indirection. buf_pool.flush_list_bytes: Moved from buf_pool.stat.flush_list_bytes. Protected by buf_pool.flush_list_mutex. buf_pool_t::page_cleaner_status: Replaces buf_pool_t::n_flush_LRU_, buf_pool_t::n_flush_list_, and buf_pool_t::page_cleaner_is_idle. Protected by buf_pool.flush_list_mutex. We will exclusively broadcast buf_pool.done_flush_list by the buf_flush_page_cleaner thread, and only wait for it when communicating with buf_flush_page_cleaner. There is no need to keep a count of pending writes by the buf_pool.flush_list processing. A single flag suffices for that. Waits for page write completion can be performed by simply waiting on block->page.lock, or by invoking buf_dblwr.wait_for_page_writes(). buf_LRU_block_free_non_file_page(): Broadcast buf_pool.done_free and set buf_pool.try_LRU_scan when freeing a page. This would be executed also as part of buf_page_write_complete(). buf_page_write_complete(): Do not broadcast buf_pool.done_flush_list, and do not acquire buf_pool.mutex unless buf_pool.LRU eviction is needed. Let buf_dblwr count all writes to persistent pages and broadcast a condition variable when no outstanding writes remain. buf_flush_page_cleaner(): Prioritize LRU flushing and eviction right after "furious flushing" (lsn_limit). Simplify the conditions and reduce the hold time of buf_pool.flush_list_mutex. Refuse to shut down or sleep if buf_pool.ran_out(), that is, LRU eviction is needed. buf_pool_t::page_cleaner_wakeup(): Add the optional parameter for_LRU. buf_LRU_get_free_block(): Protect buf_lru_free_blocks_error_printed with buf_pool.mutex. Invoke buf_pool.page_cleaner_wakeup(true) to to ensure that buf_flush_page_cleaner() will process the LRU flush request. buf_do_LRU_batch(), buf_flush_list(), buf_flush_list_space(): Update buf_pool.stat.n_pages_written when submitting writes (while holding buf_pool.mutex), not when completing them. buf_page_t::flush(), buf_flush_discard_page(): Require that the page U-latch be acquired upfront, and remove buf_page_t::ready_for_flush(). buf_pool_t::delete_from_flush_list(): Remove the parameter "bool clear". buf_flush_page(): Count pending page writes via buf_dblwr. buf_flush_try_neighbors(): Take the block of page_id as a parameter. If the tablespace is dropped before our page has been written out, release the page U-latch. buf_pool_invalidate(): Let the caller ensure that there are no outstanding writes. buf_flush_wait_batch_end(false), buf_flush_wait_batch_end_acquiring_mutex(false): Replaced with buf_dblwr.wait_for_page_writes(). buf_flush_wait_LRU_batch_end(): Replaces buf_flush_wait_batch_end(true). buf_flush_list(): Remove some broadcast of buf_pool.done_flush_list. buf_flush_buffer_pool(): Invoke also buf_dblwr.wait_for_page_writes(). buf_pool_t::io_pending(), buf_pool_t::n_flush_list(): Remove. Outstanding writes are reflected by buf_dblwr.pending_writes(). buf_dblwr_t::init(): New function, to initialize the mutex and the condition variables, but not the backing store. buf_dblwr_t::is_created(): Replaces buf_dblwr_t::is_initialised(). buf_dblwr_t::pending_writes(), buf_dblwr_t::writes_pending: Keeps track of writes of persistent data pages. buf_flush_LRU(): Allow calls while LRU flushing may be in progress in another thread. Tested by Matthias Leich (correctness) and Axel Schwenke (performance)	2023-03-16 17:19:58 +02:00
Marko Mäkelä	85cbfaefee	Merge 10.5 into 10.6	2023-03-16 15:48:08 +02:00
Lena Startseva	1e0a72a18b	MDEV-29390: Improve coverage for UPDATE and DELETE statements in MTR test suites Created tests for "delete" based on update_use_source.test For the update_use_source.test tests, data recovery in the table has been changed from a rollback transaction to a complete delete and re-insert of the data with optimize table. Cases are now being checked on three engines. Added tests for update/delete with LooseScan and DuplicateWeedout optimization strategies Added tests for engine MEMORY on delete and update Added tests for multi-update with JSON_TABLE Added tests for multi-update and multi-delete for engine Connect	2023-03-15 17:35:22 -07:00
Igor Babaev	88ca62dc68	MDEV-28965 Assertion failure when preparing UPDATE with derived table in WHERE This patch fixes not only the assertion failure in the function Field_iterator_table_ref::set_field_iterator() but also: - fixes the problem of forced materialization of derived tables used in subqueries contained in WHERE clauses of single-table and multi-table UPDATE and DELETE statements - fixes the problem of MDEV-17954 that prevented execution of multi-table DELETE statements if they use in their WHERE clauses references to the tables that are updated. The patch must be considered a complement to the patch for MDEV-28883. Approved by Oleksandr Byelkin <sanja@mariadb.com>	2023-03-15 17:35:22 -07:00
Igor Babaev	3a9358a410	MDEV-28883 Re-design the upper level of handling UPDATE and DELETE statements This patch introduces a new way of handling UPDATE and DELETE commands at the top level after the parsing phase. This new way of processing update and delete statements can be seen in the implementation of the prepare() and execute() methods from the new Sql_cmd_dml class. This class derived from the Sql_cmd class can be considered as an interface class for processing such commands as SELECT, INSERT, UPDATE, DELETE and other comands manipulating data in tables. With this patch processing of update and delete statements after parsing proceeds by the following schema: - precheck of the access rights is performed for the used tables - the used tables are opened - context analysis phase is performed for the statement - the used tables are locked - the statement is optimized and executed - clean-up is performed for the statement The implementation of the method Sql_cmd_dml::execute() adheres this schema. The virtual functions of the class Sql_cmd_dml used for precheck of the access rights, context analysis, optimization and execution allow to adjust this schema for processing data manipulation statements of any types. This schema of processing data manipulation statements is taken from the current MySQL code. Moreover the definition the class Sql_cmd_dml introduced in this patch is almost a full replica of such class in the existing MySQL. However the implementation of the derived classes for update and delete statements is quite different. This implementation employs the JOIN class for all kinds of update and delete statements. It allows to perform main bulk of context analysis actions by the function JOIN::prepare(). This guarantees that characteristics and properties of the statement tree discovered for optimization phase when doing context analysis are the same for single-table and multi-table updates and deletes. With this patch the following functions are gone: mysql_prepare_update(), mysql_multi_update_prepare(), mysql_update(), mysql_multi_update(), mysql_prepare_delete(), mysql_multi_delete_prepare(), mysql_delete(). The code within these functions have been used as much as possible though. The functions mysql_test_update() and mysql_test_delete() are also not needed anymore. The method Sql_cmd_dml::prepare() serves processing - update/delete statement - PREPARE stmt FROM "<update/delete statement>" - EXECUTE stmt when stmt is prepared from update/delete statement. Approved by Oleksandr Byelkin <sanja@mariadb.com>	2023-03-15 17:35:22 -07:00
Thirunarayanan Balathandayuthapani	dfdcd7ffab	MDEV-26198 Assertion `0' failed in row_log_table_apply_op during redundant table rebuild - InnoDB alter fails to apply the online log during redundant table rebuild. Problem is that InnoDB wrongly reads the length flags of the record while applying the temporary log record. rec_init_offsets_comp_ordinary(): For finding the n_core_null_bytes, InnoDB should use the same logic as rec_convert_dtuple_to_rec_comp().	2023-03-14 13:34:23 +05:30
Marko Mäkelä	7ca89af6f8	MDEV-30545 Remove innodb_defragment and related parameters The deprecated parameters will be removed: innodb_defragment innodb_defragment_n_pages innodb_defragment_stats_accuracy innodb_defragment_fill_factor_n_recs innodb_defragment_fill_factor innodb_defragment_frequency The mysql.innodb_index_stats.stat_name values 'n_page_split' and 'n_pages_freed' will lose their special meaning. The related changes to OPTIMIZE TABLE in InnoDB will be removed as well. The parameter innodb_optimize_fulltext_only will retain its special meaning in OPTIMIZE TABLE. Tested by: Matthias Leich	2023-03-11 10:45:35 +02:00
Alexander Barkov	b314f7b642	MDEV-18931 Rename Mariabackup's xtrabackup_* files to mariadb_backup_* Renaming the default MariaDB backup directory from xtrabackup_backupfiles to mariadb_backup_files. Renaming files: - xtrabackup_binlog_info to mariadb_backup_binlog_info - xtrabackup_checkpoints to mariadb_backup_checkpoints - xtrabackup_galera_info to mariadb_backup_galera_info - xtrabackup_info to mariadb_backup_info - xtrabackup_slave_info to mariadb_backup_slave_info	2023-03-10 12:41:58 +01:00
Sergei Golubchik	4ae97333f0	11.1 branch	2023-03-10 12:41:52 +01:00
Anel Husakovic	2f6bb9cda5	MDEV-30698 Cover missing test cases for mariadb-binlog options --raw [and] --flashback - Adding test case for --raw without -R - Adding unsuported combination of --raw and --flashback parameters and covered with test case	2023-03-08 12:26:00 +00:00
Ekaterine Papava	6b8370a90f	MDEV-30789: Add Georgian error messages and locale Test case and minor fixes by Daniel Black Reviewer: Alexander Barkov	2023-03-08 11:22:57 +11:00
Monty	7a277a3352	Allow firstmatch to use HASH joins Firstmatch_picker::check_qep() has an optimization that allows firstmatch to be used together with join buffer under some conditions. In this case the cost was assumed to be same as what best_access_path() had calculated. However if HASH+join_buffer was used, then fix_semijoin_strategies_for_picked_join_order() would remove the join_buffer (which would cause a full join to be used) and the cost assumption by Firstmatch_picker::check_qep() would be wrong. Later check_join_cache_usage() sees that it's a full scan and decides it can use join buffering, (But not the hash join). Fixed by also allowing HASH joins with firstmatch. This removes the need to change disable and re-enable join buffer. Test case changes: - HASH join used with firstmatch (Using join buffer (flat, BNLH join)) - Filtered could change with firstmatch as the conversion with and without join_buffered lost the filtering information. - The not "re-enabling join buffer" is shown in main.optimizer_trace Original code by Sergei, optimized by Monty. Author: Sergei Petrunia <sergey@mariadb.com>, monty@mariadb.org	2023-03-07 14:27:26 +02:00
Thirunarayanan Balathandayuthapani	062ba0bd4a	MDEV-30183 Assertion `!memcmp(rec_trx_id, old_pk_trx_id->data, 6 + 7)' failed in row_log_table_apply_update - This failure caused by commit `358921ce32` row_ins_duplicate_online() should consider if the record is an exact match of the tuple when number of matching fields equals with number of unique fields + DB_TRX_ID + DB_ROLL_PTR	2023-03-06 23:40:13 +05:30
Marko Mäkelä	c5fdb988b7	Merge 10.11 into 11.0	2023-03-06 16:06:52 +02:00
Marko Mäkelä	9267160c11	Merge 10.10 into 10.11	2023-03-06 13:39:12 +02:00
Marko Mäkelä	4ccb2be65f	Merge 10.9 into 10.10	2023-03-06 13:38:39 +02:00
Marko Mäkelä	46a7603813	Merge 10.8 into 10.9	2023-03-06 13:38:16 +02:00
Marko Mäkelä	669a0c6efb	Merge 10.6 into 10.8	2023-03-06 13:37:12 +02:00
Monty	ae05097714	Fixed crashing bug in recursive SQL if write to tmp table would fail This error was discovered while working on MDEV-30540 Wrong result with IN list length reaching IN_PREDICATE_CONVERSION_THRESHOLD If there is read error from handler::ha_rnd_next() during a recursive query, st_select_lex_unit::exec_recursive() will crash as it will try to get the error code from a structure that was deleted by the callee. The code was using the construct: sl->join->exec(); saved_error=sl->join->error; This does not work as sl->join was freed by the exec() and sl->join would be set to 0. Fixed by having JOIN::exec() return the error code. The included test case simulates the error in ha_rnd_next(), which causes a crash without the patch. scovered whle working on MDEV-30540 Wrong result with IN list length reaching IN_PREDICATE_CONVERSION_THRESHOLD If there is read error from handler::ha_rnd_next() during a recursive query, st_select_lex_unit::exec_recursive() will crash as it will try to get the error code from a structure that was deleted by the callee. The code was using the construct: sl->join->exec(); saved_error=sl->join->error; This does not work as sl->join was freed by the exec() and sl->join was set to 0. Fixed by having JOIN::exec() return the error code. The included test case simulates the error in ha_rnd_next(), which causes a crash without the patch.	2023-03-02 13:11:54 +02:00
Monty	bd9ca2a0e3	MDEV-30540 Wrong result with IN list length reaching IN_PREDICATE_CONVERSION_THRESHOLD The problem was the mysql_derived_prepare() did not correctly set 'distinct' when creating a temporary derivated table. Fixed by separating checking for distinct for queries with and without UNION. Other things: - Fixed bug in generate_derived_keys_for_table() where we set the wrong bit for join_tab->keys - Cleaned up JOIN::drop_unused_derived_keys() - Changed TABLE::use_index() to keep unique keys and update share->key_parts Author: Sergei Petrunia <sergey@mariadb.com>, monty@mariadb.org	2023-03-02 13:11:54 +02:00
Marko Mäkelä	085d0ac238	Merge 10.5 into 10.6	2023-02-28 16:05:21 +02:00
Marko Mäkelä	7a834d6248	Merge 10.11 into 11.0	2023-02-28 13:14:08 +02:00
Marko Mäkelä	95d51369c9	Merge 10.10 into 10.11	2023-02-28 10:52:42 +02:00
Marko Mäkelä	f14d9fa09a	Merge 10.9 into 10.10	2023-02-28 10:43:29 +02:00
Marko Mäkelä	c3246e4bf0	Merge 10.8 into 10.9	2023-02-28 10:37:11 +02:00
Marko Mäkelä	6ac44ac3ab	Merge 10.6 into 10.8	2023-02-28 10:36:17 +02:00
Monty	57c526ffb8	Added detection of memory overwrite with multi_malloc This patch also fixes some bugs detected by valgrind after this patch: - Not enough copy_func elements was allocated by Create_tmp_table() which causes an memory overwrite in Create_tmp_table::add_fields() I added an ASSERT() to be able to detect this also without valgrind. The bug was that TMP_TABLE_PARAM::copy_fields was not correctly set when calling create_tmp_table(). - Aria::empty_bits is not allocated if there is no varchar/char/blob fields in the table. Fixed code to take this into account. This cannot cause any issues as this is just a memory access into other Aria memory and the content of the memory would not be used. - Aria::last_key_buff was not allocated big enough. This may have caused issues with rtrees and ma_extra(HA_EXTRA_REMEMBER_POS) as they would use the same memory area. - Aria and MyISAM didn't take extended key parts into account, which caused problems when copying rec_per_key from engine to sql level. - Mark asan builds with 'asan' in version strihng to detect these in not_valgrind_build.inc. This is needed to not have main.sp-no-valgrind fail with asan.	2023-02-27 19:25:44 +02:00
Marko Mäkelä	3e2ad0e918	Merge 10.5 into 10.6	2023-02-27 13:17:35 +02:00
Marko Mäkelä	0de3be8cfd	MDEV-30671 InnoDB undo log truncation fails to wait for purge of history It is not safe to invoke trx_purge_free_segment() or execute innodb_undo_log_truncate=ON before all undo log records in the rollback segment has been processed. A prominent failure that would occur due to premature freeing of undo log pages is that trx_undo_get_undo_rec() would crash when trying to copy an undo log record to fetch the previous version of a record. If trx_undo_get_undo_rec() was not invoked in the unlucky time frame, then the symptom would be that some committed transaction history is never removed. This would be detected by CHECK TABLE...EXTENDED that was impleented in commit `ab0190101b`. Such a garbage collection leak should be possible even when using innodb_undo_log_truncate=OFF, just involving trx_purge_free_segment(). trx_rseg_t::needs_purge: Change the type from Boolean to a transaction identifier, noting the most recent non-purged transaction, or 0 if everything has been purged. On transaction start, we initialize this to 1 more than the transaction start ID. On recovery, the field may be adjusted to the transaction end ID (TRX_UNDO_TRX_NO) if it is larger. The field TRX_UNDO_NEEDS_PURGE becomes write-only; only some debug assertions that would validate the value. The field reflects the old inaccurate Boolean field trx_rseg_t::needs_purge. trx_undo_mem_create_at_db_start(), trx_undo_lists_init(), trx_rseg_mem_restore(): Remove the parameter max_trx_id. Instead, store the maximum in trx_rseg_t::needs_purge, where trx_rseg_array_init() will find it. trx_purge_free_segment(): Contiguously hold a lock on trx_rseg_t to prevent any concurrent allocation of undo log. trx_purge_truncate_rseg_history(): Only invoke trx_purge_free_segment() if the rollback segment is empty and there are no pending transactions associated with it. trx_purge_truncate_history(): Only proceed with innodb_undo_log_truncate=ON if trx_rseg_t::needs_purge indicates that all history has been purged. Tested by: Matthias Leich	2023-02-24 14:24:44 +02:00
Thirunarayanan Balathandayuthapani	db245e1140	MDEV-25984 Assertion `max_doc_id > 0' failed in fts_init_doc_id() - rollback_inplace_alter_table() locks the fts internal tables. At the time, insert tries to fetch the doc id from config table, fails to lock the config table and returns doc id as 0. fts_cmp_set_sync_doc_id(): Retry to fetch the doc id again if it encounter DB_LOCK_WAIT_TIMEOUT error	2023-02-22 18:54:00 +05:30
Sergei Golubchik	2e6a9886a9	MDEV-30526 Assertion `rights == merged->cols' failed in update_role_columns another case of the antipattern "iterate the HASH and delete elements as we go"	2023-02-21 23:22:56 +01:00
Thirunarayanan Balathandayuthapani	df9f9ba12b	MDEV-29871 innodb_fts.fulltext_misc unexpectedly reports a result - match()+0 returns the floating result and converts into integer value and it leads to sporadic failure.	2023-02-21 18:48:59 +05:30
Monty	3c1b7fb03e	Adjust costs for rowid filter - Use log2() insted of log() - Added missing ''+' when calculating rowid setup cost - Adjusted ROWID_FILTER_PER_ELEMENT_MODIFIER (from 3 to 1) Other things: - Adjusted cost for index_merge where rows_out < 1.0 The effects of the changes: - rowid filter will have higher setup cost - rowid filter will have slightly less costs per row This can be seen in mtr where some tests, with 'small tables or that uses rowid filters with many rows, will not use rowid filter anymore.	2023-02-21 15:35:27 +03:00
Marko Mäkelä	d5d7c8ba96	MDEV-30544 Deprecate innodb_defragment and related parameters There is a little used option innodb_defragment that would make OPTIMIZE TABLE not rebuild the table as usual for InnoDB, but instead cause the index B-trees to be optimized in place. This option uses excessive locking (exclusively locking index trees). It never covered SPATIAL INDEX or FULLTEXT INDEX. Storage space was never reclaimed. Because this option is not particularly useful and causes a maintenance burden (most recently in commit `de4030e4d4`), it is best to deprecate it, to prepare for its removal.	2023-02-21 13:33:47 +02:00
Marko Mäkelä	2b13ae1a31	MDEV-29694 fixup: Remove srv_change_buffer_max_size, adjust comments	2023-02-21 12:34:20 +02:00
Vlad Lesin	a474e3278c	MDEV-27701 Race on trx->lock.wait_lock between lock_rec_move() and lock_sys_t::cancel() The initial issue was in assertion failure, which checked the equality of lock to cancel with trx->lock.wait_lock in lock_sys_t::cancel(). If we analyze lock_sys_t::cancel() code from the perspective of trx->lock.wait_lock racing, we won't find the error there, except the cases when we need to reload it after the corresponding latches acquiring. So the fix is just to remove the assertion and reload trx->lock.wait_lock after acquiring necessary latches. Reviewed by: Marko Mäkelä <marko.makela@mariadb.com>	2023-02-20 20:31:24 +03:00
Sergei Golubchik	799f75953f	bump the maturity	2023-02-20 10:18:45 +01:00
Marko Mäkelä	2e431ff7e6	Merge 10.11 into 11.0	2023-02-16 13:34:45 +02:00
Thirunarayanan Balathandayuthapani	702d1af32c	MDEV-30615 Can't read from I_S.INNODB_SYS_INDEXES when having a discarded tablesace - MY_I_S_MAYBE_NULL field attributes is added PAGE_NO and SPACE in innodb_sys_index table. By doing this, InnoDB can set null for these fields when it encounters discarded tablespace	2023-02-16 16:04:46 +05:30
Marko Mäkelä	1fd0099839	Merge 10.10 into 10.11	2023-02-16 11:41:18 +02:00
Marko Mäkelä	345356b868	Merge 10.9 into 10.10	2023-02-16 11:36:38 +02:00
Marko Mäkelä	0d55914d96	Merge 10.8 into 10.9	2023-02-16 10:25:34 +02:00
Marko Mäkelä	b12cd88ce1	Merge 10.6 into 10.8	2023-02-16 10:24:23 +02:00
Marko Mäkelä	67a6ad0a4a	Merge 10.5 into 10.6	2023-02-16 10:17:58 +02:00
Marko Mäkelä	d3f35aa47b	MDEV-30552 fixup: Fix the test for non-debug	2023-02-16 10:16:38 +02:00
Marko Mäkelä	5abbe092e6	Merge 10.6 into 10.8	2023-02-16 09:17:06 +02:00
Julius Goryavsky	80b4fa54e1	MDEV-30318: galera error messages in mariadb log without galera enabled Post-fix to MDEV-30318 and MDEV-22570-related changes: unified handling of wsrep_provider by code so that "none" is interpreted as case-insensitive everywhere and that work with an empty string is supported everywhere.	2023-02-15 17:46:26 +01:00
Haidong Ji	03c9a4ef4a	MDEV-29091: Correct event_name in PFS for wait caused by FOR UPDATE When one session SELECT ... FOR UPDATE and holds the lock, subsequent sessions that SELECT ... FOR UPDATE will wait to get the lock. Currently, that event is labeled as `wait/io/table/sql/handler`, which is incorrect. Instead, it should have been `wait/lock/table/sql/handler`. Two factors contribute to this bug: 1. Instrumentation interface and the heavy usage of `TABLE_IO_WAIT` in `sql/handler.cc` file. See interface [^1] for better understanding; 2. The balancing act [^2] of doing instrumentation aggregration _AND_ having good performance. For example, EVENTS_WAITS_SUMMARY... is aggregated using EVENTS_WAITS_CURRENT. Aggregration needs to be based on the same wait class, and the code was overly aggressive in label a LOCK operation as an IO operation in this case. The proposed fix is pretty simple, but understanding the bug took a while. Hence the footnotes below. For future improvement and refactoring, we may want to consider renaming `TABLE_IO_WAIT` and making it less coarse and more targeted. Note that newly added test case, events_waits_current_MDEV-29091, initially didn't pass Buildbot CI for embedded build tests. Further research showed that other impacted tests all included not_embedded.inc. This oversight was fixed later. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc. [^1]: To understand `performance_schema` instrumentation interface, I found this URL is the most helpful: https://dev.mysql.com/doc/dev/mysql-server/latest/PAGE_PFS_PSI.html [^2]: The best place to understand instrumentation projection, composition, and aggregration is through the source file. Although I prefer reading Doxygen produced html file, but for whatever reason, the rendering is not ideal. Here is link to 10.6's pfs.cc: https://github.com/MariaDB/server/blob/10.6/storage/perfschema/pfs.cc	2023-02-15 14:36:05 +00:00
Daniel Black	fab166532f	MDEV-30630 locale: Chinese error messages for ZH_CN MDEV-28227 added the error messages in simplified characters. Lets use these for those running a zh_CN profile. From Haidong Ji in the MDEV, Taiwan/Hong Kong (zh_TW/zh_HK) would expect traditional characters so this is left for when we have these.	2023-02-15 22:39:38 +11:00
Sergei Petrunia	587646a476	Do a proper cleanup in testcase for MDEV-30569	2023-02-15 13:33:59 +03:00
Sergei Petrunia	10a974adc9	Merge 11.0-selectivity into 11.0	2023-02-15 12:03:12 +03:00
Marko Mäkelä	96a3b11d13	Merge 10.5 into 10.6	2023-02-14 15:23:23 +02:00
Jan Lindström	2988db1cd9	MDEV-30318 : galera error messages in mariadb log without galera enabled Do not compile wsrep_provider plugin if WITH_WSREP is not enabled. We should not enable wsrep_provider plugin if WSREP_ON=OFF and at that case we can only print information that Plugin 'wsrep-provider' is disabled. Make sure tests require Galera library 26.4.14 if needed.	2023-02-14 13:57:47 +01:00
Jan Lindström	00f202b22a	MDEV-30133 MariaDB startup does not validate plugin-wsrep-provider when wsrep_mode=off or wsrep_provider is not set Refuse to start if WSREP_ON=OFF or wsrep_provider is not set or it is set to 'none' if plugin-wsrep-provider is used.	2023-02-14 12:01:54 +01:00
Jan Lindström	4849b73c4b	MDEV-30120 Update the wsrep_provider_options read_only value in the system_variables table. When wsrep-provider-options plugin is initialized we need to update wsrep-provider-options variable as READ_ONLY.	2023-02-14 12:01:54 +01:00
Daniele Sciascia	79d0194eef	MDEV-22570 Implement wsrep_provider_options as plugin - Provider options are read from the provider during startup, before plugins are initialized. - New wsrep_provider plugin for which sysvars are generated dynamically from options read from the provider. - The plugin is enabled by option plugin-wsrep-provider=ON. If enabled, wsrep_provider_options can no longer be used, (an error is raised on attempts to do so). - Each option is either string, integer, double or bool - Options can be dynamic / readonly - Options can be deprecated Limitations: - We do not check that the value of a provider option falls within a certain range. This type of validation is still done in Galera side. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-02-14 12:01:54 +01:00
Thirunarayanan Balathandayuthapani	951d81d92e	MDEV-30426 Assertion !rec_offs_nth_extern(offsets2, n) during bulk insert - cmp_rec_rec_simple() fails to detect duplicate key error for bulk insert operation	2023-02-14 15:43:33 +05:30
Thirunarayanan Balathandayuthapani	1a5c7552ea	MDEV-30552 InnoDB recovery crashes when error handling scenario - InnoDB fails to reset the after_apply variable before applying the redo log in last batch during multi-batch recovery.	2023-02-14 14:36:17 +05:30
Thirunarayanan Balathandayuthapani	3eea2e8e10	MDEV-30551 InnoDB recovery hangs when buffer pool ran out of memory - During non-last batch of multi-batch recovery, InnoDB holds log_sys.mutex and preallocates the block which may intiate page flush, which may initiate log flush, which requires log_sys.mutex to acquire again. This leads to assert failure. So InnoDB recovery should release log_sys.mutex before preallocating the block.	2023-02-14 14:35:35 +05:30
Thirunarayanan Balathandayuthapani	81faf41786	MDEV-30597 Assertion `flag == 1' failed in row_build_index_entry_low - InnoDB tries to build the previous version of the record for the virtual index, but the undo log record doesn't contain virtual column information. This leads to assert failure while building the tuple.	2023-02-14 14:28:27 +05:30
asklavou	7aace5d5da	MDEV-28839: remove current_pos where not intentionally being tested Task: ===== Update tests to reflect MDEV-20122, deprecation of master_use_gtid=current_pos. Change Master (CM) statements were either removed or modified with current_pos --> slave_pos based on original intention of the test. Reviewed by: ============ Brandon Nesterenko <brandon.nesterenko@mariadb.com>	2023-02-13 21:04:52 +00:00
Vicențiu Ciorbaru	c7fba948e1	Fix RPL tests post DEBUG_SYNC change The tests expect the SIGNAL to not be cleared. Thus set NO_CLEAR_EVENT within DBUG_EXECUTE_IF	2023-02-10 14:44:45 +02:00
Marko Mäkelä	dbab3e8d90	Merge 10.6 into 10.8	2023-02-10 13:43:53 +02:00
Sergei Petrunia	a7666952e0	MDEV-30569: Assertion ...ha_table_flags() in Duplicate_weedout_picker::check_qep DuplicateWeedout semi-join optimization requires that the tables in the parent subquery provide rowids that can be compared across table scans. Most engines support this, federated is the only exception. DuplicateWeedout is the default catch-all semi-join strategy, which must be always available. If it is not available for some edge case, it's better to disable semi-join conversion altogether. This is what was done in the fix for MDEV-30395. However that fix has put the check before the view processing, so it didn't detect federated tables inside mergeable VIEWs. This patch moves the check to be done at a later phase, when mergeable views are already merged.	2023-02-10 13:35:32 +02:00
Monty	00704aff98	Fixed bug in extended key handling when there is no primary key Extended keys works by first checking if the engine supports extended keys. If yes, it extends secondary key with primary key components and mark the secondary keys as HA_EXT_NOSAME (unique). If we later notice that there where no primary key, the extended key information for secondary keys in share->key_info is reset. However the key_info->flag HA_EXT_NOSAME was not reset! This causes some strange things to happen: - Tables that have no primary key or secondary index that contained the primary key would be wrongly optimized as the secondary key could be thought to be unique when it was not and not unique when it was. - The problem was not shown in EXPLAIN because of a bug in create_ref_for_key() that caused EQ_REF to be displayed by EXPLAIN as REF when extended keys where used and the secondary key contained the primary key. This is fixed with: - Removed wrong test in make_join_select() which did not detect that key where unique when a secondary key contains the primary. - Moved initialization of extended keys from create_key_infos() to init_from_binary_frm_image() after we know if there is a usable primary key or not. One disadvantage with this approach is that key_info->key_parts may have not used slots (for keys we thought could be extended but could not). Fixed by adding a check for unused key_parts to copy_keys_from_share(). Other things: - Simplified copying of first key part in create_key_infos(). - Added a lot of code comments in code that I had to check as part of finding the issue. - Fixed some indentation. - Replaced a couple of looks using references to pointers in C context where the reference does not give any benefit. - Updated Aria and Maria to not assume the all key_info->rec_per_key are in one memory block (this could happen when using dervived tables with many keys). - Fixed a bug where key_info->rec_per_key where not allocated - Optimized TABLE::add_tmp_key() to only call alloc() once. (No logic changes) Test case changes: - innodb_mysql.test changed index as an index the optimizer thought was unique, was not. (Table had no primary key) TODO: - Move code that checks for partial or too long keys to the primary loop earlier that initally decides if we should add extended key fields. This is needed to ensure that HA_EXT_NOSAME is not set for partial or too long keys. It will also shorten the current code notable.	2023-02-10 13:35:31 +02:00
Marko Mäkelä	6aec87544c	Merge 10.5 into 10.6	2023-02-10 13:03:01 +02:00
Monty	01c82173dd	Removed /2 of InnoDB ref_per_key[] estimates The original code was there to favor index search over table scan. This is not needed anymore as the cost calculations for table scans and index lookups are now more exact.	2023-02-10 12:59:36 +02:00
Monty	5de734da6b	Added sys.optimizer_switch_on() and sys.optimizer_switch_off() These are helpful tools to quickly see what optimizer switch options are on or off. The different options are displayed alphabetically	2023-02-10 12:58:50 +02:00
Monty	9a4110aa57	MDEV-30256 Wrong result (missing rows) upon join with empty table The problem was an assignment in test_quick_select() that flagged empty tables with "Impossible where". This test was however wrong as it didn't work correctly for left join. Removed the test, but added checking of empty tables in DELETE and UPDATE to get similar EXPLAIN as before. The new tests is a bit more strict (better) than before as it catches all cases of empty tables in single table DELETE/UPDATE.	2023-02-10 12:58:50 +02:00
Monty	3fa99f0c0e	Change cost for REF to take into account cost for 1 extra key read_next The main difference in code path between EQ_REF and REF is that for REF we have to do an extra read_next on the index to check that there is no more matching rows. Before this patch we added a preference of EQ_REF by ensuring that REF would always estimate to find at least 2 rows. This patch adds the cost of the extra key read_next to REF access and removes the code that limited REF to at least 2 rows. For some queries this can have a big effect as the total estimated rows will be halved for each REF table with 1 rows. multi_range cost calculations are also changed to take into account the difference between EQ_REF and REF. The effect of the patch to the test suite: - About 80 test case changed - Almost all changes where for EXPLAIN where estimated rows for REF where changed from 2 to 1. - A few test cases using explain extended had a change of 'filtered'. This is because of the estimated rows are now closer to the calculated selectivity. - A very few test had a change of table order. This is because the change of estimated rows from 2 to 1 or the small cost change for REF (main.subselect_sj_jcl6, main.group_by, main.dervied_cond_pushdown, main.distinct, main.join_nested, main.order_by, main.join_cache) - No key statistics and the estimated rows are now smaller which cased estimated filtering to be lower. (main.subselect_sj_mat) - The number of total rows are halved. (main.derived_cond_pushdown) - Plans with 1 row changed to use RANGE instead of REF. (main.group_min_max) - ALL changed to REF (main.key_diff) - Key changed from ref + index_only to PRIMARY key for InnoDB, as OPTIMIZER_ROW_LOOKUP_COST + OPTIMIZER_ROW_NEXT_FIND_COST is smaller than OPTIMIZER_KEY_LOOKUP_COST + OPTIMIZER_KEY_NEXT_FIND_COST. (main.join_outer_innodb) - Cost changes printouts (main.opt_trace*) - Result order change (innodb_gis.rtree)	2023-02-10 12:58:50 +02:00
Denis Protivensky	4d09050ca7	MDEV-29281 Report events from provider (add node eviction event) Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-02-10 11:30:46 +01:00
Marko Mäkelä	c41c79650a	Merge 10.4 into 10.5	2023-02-10 12:02:11 +02:00
Daniel Black	cacea31687	MDEV-30621: Türkiye is the correct current country naming As requested to the UN the country formerly known as Turkey is to be refered to as Türkiye. Reviewer: Alexander Barkov	2023-02-10 17:07:38 +11:00
Daniel Black	483ddb5684	MDEV-30621: Türkiye is the correct current country naming As requested to the UN the country formerly known as Turkey is to be refered to as Türkiye.	2023-02-10 08:44:14 +11:00
Brandon Nesterenko	eecd4f1459	MDEV-30608: rpl.rpl_delayed_parallel_slave_sbm sometimes fails with Seconds_Behind_Master should not have used second transaction timestamp One of the constraints added in the MDEV-29639 patch, is that only the first event after idling should update last_master_timestamp; and as long as the replica has more events to execute, the variable should not be updated. The corresponding test, rpl_delayed_parallel_slave_sbm.test, aims to verify this; however, if the IO thread takes too long to queue events, the SQL thread can appear to catch up too fast. This fix ensures that the relay log has been fully written before executing the events. Note that the underlying cause of this test failure needs to be addressed as a bug-fix, this is a temporary fix to stop test failures. To track work on the bug-fix for the underlying issue, please see MDEV-30619.	2023-02-09 13:02:14 -07:00
Daniel Black	ecc93c9824	MDEV-30492 Crash when use mariabackup.exe with config 'innodb_flush_method=async_unbuffered' Normalize innodb_flush_method, the same as the service, before attempting to print it.	2023-02-07 20:14:26 +11:00
Christian Gonzalez	3622644836	MDEV-30498 Rename mysql_upgrade state file to mariadb_upgrade Renames the upgrade state file, and ensures the old file is properly removed when `mariadb-upgrade` tool is executed. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2023-02-07 07:29:04 +00:00
Oleksandr Byelkin	70a515df43	Merge branch '10.6.12' into 10.6	2023-02-06 20:18:44 +01:00
Oleksandr Byelkin	40adf52d1c	Merge branch '10.4.28' into 10.4	2023-02-06 20:12:55 +01:00
Marko Mäkelä	ff12a5b897	MariaDB 10.5.19 release -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEF39AEP5WyjM2MAMF8WVvJMdM0dgFAmPhCNUACgkQ8WVvJMdM 0dhMnA//cGJYi+Pj8kfy6lpETErEtX0LPIji6ZMivzZqsdhQhF2pqeN3D4dAPXwf +K8ktPlViqJN8XLsM8EGxyL4kGfrCIh6BMkqx+dS3G2n8xvke7myw2lu4j4iH25C xl9m90dDKQTl/UBZUSuwiPVnIeuLT3zIfnJWUSPPmFjsww2JsG5zKS0xi9/Oh0/h qu99r1imGaK01mXh1At5/jwniCEUYESNpzhADyrYFikhzYjNZBLuih8uVw2Orj0M /8SO6XEBv3iVMAsxsXWruLMn5QFisNZh0VMi+9FjTfVPEaGwcCU81iCK4rlVUfzD QYEOYbOHrCJa7OnO6++6J800XEOLlgHTM9JsVlIJlB78NUqs73xMwW8LNFtoF1qV U2GCae8stank0CJ7JVg89HGExI4r/pmfGJWv9gkwniYjQYONFLnCOOGAz2BATHRS oEcZNMeydg1Uuatj804og+mYMfR/Sd6zP4/fLalUOt2td7ELi6siA3QjyvucAKte HcfadLTbekBiTlBC1tfG4qL6zCa4CfpfKNGLzlAV2cBRJdwhlKawsY+1w8wmhZSK 16KtuyE8bzpj3+M/Gy6q5TOpma8Rl4kVJk5JxhZlDP8amtoQOZej95IwJJWcNFog JnAk+pwqzzY6kvjxXztdQj7iwG96EFWnZLf1e3qWaInmQApDK6U= =8Ff5 -----END PGP SIGNATURE----- Merge mariadb-10.5.19 into 10.5	2023-02-06 17:55:01 +02:00
Sergei Golubchik	d6e3d89c80	MDEV-29668 SUPER should not allow actions that have fine-grained dedicated privileges SUPER privilege used to allow various actions that were alternatively allowed by one of BINLOG ADMIN, BINLOG MONITOR, BINLOG REPLAY, CONNECTION ADMIN, FEDERATED ADMIN, REPL MASTER ADMIN, REPL SLAVE ADMIN, SET USER, SLAVE MONITOR. Now SUPER no longer does that, one has to grant one of the fine-grained privileges above to be to perform corresponding actions. On upgrade from MariaDB versions 10.11 and below all the privileges above are granted automatically if the user has SUPER. As a side-effect, such an upgrade will allow SUPER-user to run SHOW BINLOG EVENTS, SHOW RELAYLOG EVENTS, SHOW SLAVE HOSTS, even if he wasn't able to do it before the upgrade.	2023-02-06 14:31:48 +01:00
Sergei Golubchik	0ac5132505	MDEV-29227 deprecate explicit_defaults_for_timestamp=0	2023-02-05 22:02:30 +01:00
Sergei Golubchik	760d149067	MDEV-30128 remove support for 5.1- replication events including patches from Andrei Elkin	2023-02-05 22:02:30 +01:00
Vicențiu Ciorbaru	addcf08d0f	Revert test changes from "Fixed debug_sync timeout in deadlock_drop_table" After introducing multiple signals possible for debug_sync, the test changes are no longer necessary. Revert them to the original state.	2023-02-03 16:27:16 +02:00
Vicențiu Ciorbaru	8885225de6	Implement multiple-signal debug_sync The patch is inspired from MySQL. Instead of using a single String to hold the current active debug_sync signal, use a Hash_set to store LEX_STRINGS. This patch ensures that a signal can not be lost, by being overwritten by another thread via set DEBUG_SYNC = '... SIGNAL ...'; All signals are kepts "alive" until they are consumed by a wait event. This requires updating test cases that assume the GLOBAL signal is never consumed. Follow-up work needed: Port the additional syntax that allows one to set multiple signals and also conditionally deactivate signals when waiting.	2023-02-03 16:27:16 +02:00
Monty	5e0832e132	Updated some tests for --valgrind - Increased timeout for binlog_mysqlbinlog_raw_flush.test. The old timeout was not enough when running with --valgrind - Disabled ssl_timeout for --valgrind as it times out - Disabled binlog_truncate_multi_engine for --valgrind as it does restarts	2023-02-03 14:33:37 +03:00
Sergei Petrunia	dba78f3c32	Stabilize engines/iuds.type_bit_iuds test Make sure the queries use the intended query plan	2023-02-03 13:26:21 +03:00
Sergei Petrunia	0fcc32f864	Remove mysql-test/suite/versioning/r/select,trx_id.rdiff which is empty This seems to confuse windows.	2023-02-03 13:26:08 +03:00
Monty	1f4a9f086a	Removed "<select expression> INTO <destination>" deprication. This was done after discussions with Igor, Sanja and Bar. The main reason for removing the deprication was to ensure that MariaDB is always backward compatible whenever possible. Other things: - Added statistics counters, mainly for the feedback plugin. - INTO OUTFILE - INTO variable - If INTO is using the old syntax (end of query)	2023-02-03 11:57:50 +03:00
Monty	b74d2623eb	Removed diff dates from rdiff files	2023-02-03 11:57:45 +03:00
Monty	02f6ba571e	Changed some startup warnings to notes - Changed 'WARNING' of type "You need to use --log-bin to make ... work" to 'Note' - Only print startup Notes if log_warnings >= 4	2023-02-03 11:26:07 +03:00
Sergei Petrunia	6c4076fac4	MDEV-30032: EXPLAIN FORMAT=JSON output: part #2 : print 'loops'.	2023-02-03 11:22:17 +03:00
Sergei Petrunia	ffe0beca25	MDEV-30032: EXPLAIN FORMAT=JSON output: print costs Basic printout for join and table execution costs.	2023-02-03 11:01:24 +03:00
Monty	0dd9ec97d0	Changed a rule to be cost based in test_if_cheaper_ordering - Simplified test by setting read_time=DBL_MAX at start of loop if FORCE INDEX is used - No need to test for 'group by' as the cost compare should handle it. - Only one test change where index scan was replaced with table scan (correct)	2023-02-03 10:57:02 +03:00
Monty	d645025e87	Change default of histogram_type to JSON_HB	2023-02-03 10:56:40 +03:00
Monty	98879f8d43	Version change to 11.0	2023-02-03 10:56:23 +03:00
Monty	66dde8a54e	Added rowid_filter support to Aria This includes: - cleanup and optimization of filtering and pushdown engine code. - Adjusted costs for rowid filters (based on extensive testing and profiling). This made a small two changes to the handler_rowid_filter_is_active() API: - One should not call it with a zero pointer! - One does not need to call handler_rowid_filter_is_active() for every row anymore. It is enough to check if filter is active by calling it call it during index_init() or when handler::rowid_filter_changed() is called The changes was to avoid unnecessary function calls and checks if pushdown conditions and rowid_filter is not used. Updated costs for rowid_filter_lookup() to be closer to reality. The old cost was based only on rowid_compare_cost. This is now changed to take into account the overhead in checking the rowid. Changed the Range_rowid_filter class to use DYNAMIC_ARRAY directly instead of Dynamic_array<>. This was done to be able to use the new append_dynamic() functions which gives a notable speed improvment compared to the old code. Removing the abstraction also makes the code easier to understand. The cost of filtering is now slightly lower than before, which is reflected in some test cases that is now using rowid filters.	2023-02-03 10:42:28 +03:00
Monty	727491b72a	Added test cases for preceding test This includes all test changes from "Changing all cost calculation to be given in milliseconds" and forwards. Some of the things that caused changes in the result files: - As part of fixing tests, I added 'echo' to some comments to be able to easier find out where things where wrong. - MATERIALIZED has now a higher cost compared to X than before. Because of this some MATERIALIZED types have changed to DEPENDEND SUBQUERY. - Some test cases that required MATERIALIZED to repeat a bug was changed by adding more rows to force MATERIALIZED to happen. - 'Filtered' in SHOW EXPLAIN has in many case changed from 100.00 to something smaller. This is because now filtered also takes into account the smallest possible ref access and filters, even if they where not used. Another reason for 'Filtered' being smaller is that we now also take into account implicit filtering done for subqueries using FIRSTMATCH. (main.subselect_no_exists_to_in) This is caluculated in best_access_path() and stored in records_out. - Table orders has changed because more accurate costs. - 'index' and 'ALL' for small tables has changed to use 'range' or 'ref' because of optimizer_scan_setup_cost. - index can be changed to 'range' as 'range' optimizer assumes we don't have to read the blocks from disk that range optimizer has already read. This can be confusing in the case where there is no obvious where clause but instead there is a hidden 'key_column > NULL' added by the optimizer. (main.subselect_no_exists_to_in) - Scan on primary clustered key does not report 'Using Index' anymore (It's a table scan, not an index scan). - For derived tables, the number of rows is now 100 instead of 2, which can be seen in EXPLAIN. - More tests have "Using index for group by" as the cost of this optimization is now more correct (lower). - A primary key could be preferred for a normal key, even if it would access more rows, as it's faster to do 1 lokoup and 3 'index_next' on a clustered primary key than one lookup trough a secondary. (main.stat_tables_innodb) Notes: - There was a 4.7% more calls to best_extension_by_limited_search() in the main.greedy_optimizer test. However examining the test results it looked that the plans where slightly better (eq_ref where more chained together) so I assume this is ok. - I have verified a few test cases where there was notable/unexpected changes in the plan and in all cases the new optimizer plans where faster. (main.greedy_optimizer and some others)	2023-02-03 00:00:35 +03:00
Monty	013ba37ae2	Fix cost calculation in test_if_cheaper_ordering() to be cost based The original code was mostly rule based and preferred clustered or covering indexed independent of cost. There where a few test changes: - Some test changed from using filesort to index or table scan. This happened when most of the rows had to be sorted and the ORDER BY could use covering or a clustered index (innodb_mysql, create_spatial_index). - Some test changed range to filesort. This where mainly because the range was scanning most of the rows or using index scan + row lookup and filesort with table scan is cheaper. (order_by). - Change in join_cache was because sorting 2 rows is faster than retrieving 10 rows. - In selectivity_innodb.test one test changed to use a cheaper index.	2023-02-02 23:08:23 +03:00
Monty	5e5a8eda16	Derived tables and union can now create distinct keys The idea is that instead of marking all select_lex's with DISTINCT, we only mark those that really need distinct result. Benefits of this change: - Temporary tables used with derived tables, UNION, IN are now smaller as duplicates are removed already on the insert phase. - The optimizer can now produce better plans with EQ_REF. This can be seen from the tests where several queries does not anymore materialize derived tables twice. - Queries affected by 'in_predicate_conversion_threshold' where large IN lists are converted to sub query produces better plans. Other things: - Removed on duplicate call to sel->init_select() in LEX::add_primary_to_query_expression_body() - I moved the testing of tab->table->pos_in_table_list->is_materialized_derived() in join_read_const_table() to the caller as it caused problems for derived tables that could be proven to be const tables. This also is likely to fix some bugs as if join_read_const_table() was aborted, the table was left marked as JT_CONST, which cannot be good. I added an ASSERT there for now that can be removed when the code has been properly tested.	2023-02-02 22:32:57 +03:00
Monty	5e651c9aea	Make the most important optimizer constants user variables Variables added: - optimizer_index_block_copy_cost - optimizer_key_copy_cost - optimizer_key_next_find_cost - optimizer_key_compare_cost - optimizer_row_copy_cost - optimizer_where_compare_cost Some rename of defines was done to make the internal defines similar to the visible ones: TIME_FOR_COMPARE -> WHERE_COST; WHERE_COST was also "inverted" to be a number between 0 and 1 that is multiply with accepted records (similar to other optimizer variables). TIME_FOR_COMPARE_IDX -> KEY_COMPARE_COST. This is also inverted, similar to TIME_FOR_COMPARE. TIME_FOR_COMPARE_ROWID -> ROWID_COMPARE_COST. This is also inverted, similar to TIME_FOR_COMPARE. All default costs are identical to what they where before this patch. Other things: - Compare factor in get_merge_buffers_cost() was inverted. - Changed namespace to static in filesort_utils.cc	2023-02-02 21:44:00 +03:00
Monty	b6215b9b20	Update row and key fetch cost models to take into account data copy costs Before this patch, when calculating the cost of fetching and using a row/key from the engine, we took into account the cost of finding a row or key from the engine, but did not consistently take into account index only accessed, clustered key or covered keys for all access paths. The cost of the WHERE clause (TIME_FOR_COMPARE) was not consistently considered in best_access_path(). TIME_FOR_COMPARE was used in calculation in other places, like greedy_search(), but was in some cases (like scans) done an a different number of rows than was accessed. The cost calculation of row and index scans didn't take into account the number of rows that where accessed, only the number of accepted rows. When using a filter, the cost of index_only_reads and cost of accessing and disregarding 'filtered rows' where not taken into account, which made filters cost less than there actually where. To remedy the above, the following key & row fetch related costs has been added: - The cost of fetching and using a row is now split into different costs: - key + Row fetch cost (as before) but multiplied with the variable 'optimizer_cache_cost' (default to 0.5). This allows the user to tell the optimizer the likehood of finding the key and row in the engine cache. - ROW_COPY_COST, The cost copying a row from the engine to the sql layer or creating a row from the join_cache to the record buffer. Mostly affects table scan costs. - ROW_LOOKUP_COST, the cost of fetching a row by rowid. - KEY_COPY_COST the cost of finding the next key and copying it from the engine to the SQL layer. This is used when we calculate the cost index only reads. It makes index scans more expensive than before if they cover a lot of rows. (main.index_merge_myisam) - KEY_LOOKUP_COST, the cost of finding the first key in a range. This replaces the old define IDX_LOOKUP_COST, but with a higher cost. - KEY_NEXT_FIND_COST, the cost of finding the next key (and rowid). when doing a index scan and comparing the rowid to the filter. Before this cost was assumed to be 0. All of the above constants/variables are now tuned to be somewhat in proportion of executing complexity to each other. There is tuning need for these in the future, but that can wait until the above are made user variables as that will make tuning much easier. To make the usage of the above easy, there are new (not virtual) cost calclation functions in handler: - ha_read_time(), like read_time(), but take optimizer_cache_cost into account. - ha_read_and_copy_time(), like ha_read_time() but take into account ROW_COPY_TIME - ha_read_and_compare_time(), like ha_read_and_copy_time() but take TIME_FOR_COMPARE into account. - ha_rnd_pos_time(). Read row with row id, taking ROW_COPY_COST into account. This is used with filesort where we don't need to execute the WHERE clause again. - ha_keyread_time(), like keyread_time() but take optimizer_cache_cost into account. - ha_keyread_and_copy_time(), like ha_keyread_time(), but add KEY_COPY_COST. - ha_key_scan_time(), like key_scan_time() but take optimizer_cache_cost nto account. - ha_key_scan_and_compare_time(), like ha_key_scan_time(), but add KEY_COPY_COST & TIME_FOR_COMPARE. I also added some setup costs for doing different types of scans and creating temporary tables (on disk and in memory). This encourages the optimizer to not use these for simple 'a few row' lookups if there are adequate key lookup strategies. - TABLE_SCAN_SETUP_COST, cost of starting a table scan. - INDEX_SCAN_SETUP_COST, cost of starting an index scan. - HEAP_TEMPTABLE_CREATE_COST, cost of creating in memory temporary table. - DISK_TEMPTABLE_CREATE_COST, cost of creating an on disk temporary table. When calculating cost of fetching ranges, we had a cost of IDX_LOOKUP_COST (0.125) for doing a key div for a new range. This is now replaced with 'io_cost * KEY_LOOKUP_COST (1.0) * optimizer_cache_cost', which matches the cost we use for 'ref' and other key lookups. The effect is that the cost is now a bit higher when we have many ranges for a key. Allmost all calculation with TIME_FOR_COMPARE is now done in best_access_path(). 'JOIN::read_time' now includes the full cost for finding the rows in the table. In the result files, many of the changes are now again close to what they where before the "Update cost for hash and cached joins" commit, as that commit didn't fix the filter cost (too complex to do everything in one commit). The above changes showed a lot of a lot of inconsistencies in optimizer cost calculation. The main objective with the other changes was to do calculation as similar (and accurate) as possible and to make different plans more comparable. Detailed list of changes: - Calculate index_only_cost consistently and correctly for all scan and ref accesses. The row fetch_cost and index_only_cost now takes into account clustered keys, covered keys and index only accesses. - cost_for_index_read now returns both full cost and index_only_cost - Fixed cost calculation of get_sweep_read_cost() to match other similar costs. This is bases on the assumption that data is more often stored on SSD than a hard disk. - Replaced constant 2.0 with new define TABLE_SCAN_SETUP_COST. - Some scan cost estimates did not take into account TIME_FOR_COMPARE. Now all scan costs takes this into account. (main.show_explain) - Added session variable optimizer_cache_hit_ratio (default 50%). By adjusting this on can reduce or increase the cost of index or direct record lookups. The effect of the default is that key lookups is now a bit cheaper than before. See usage of 'optimizer_cache_cost' in handler.h. - JOIN_TAB::scan_time() did not take into account index only scans, which produced a wrong cost when index scan was used. Changed JOIN_TAB:::scan_time() to take into consideration clustered and covered keys. The values are now cached and we only have to call this function once. Other calls are changed to use the cached values. Function renamed to JOIN_TAB::estimate_scan_time(). - Fixed that most index cost calculations are done the same way and more close to 'range' calculations. The cost is now lower than before for small data sets and higher for large data sets as we take into account how many keys are read (main.opt_trace_selectivity, main.limit_rows_examined). - Ensured that index_scan_cost() == range(scan_of_all_rows_in_table_using_one_range) + MULTI_RANGE_READ_INFO_CONST. One effect of this is that if there is choice of doing a full index scan and a range-index scan over almost the whole table then index scan will be preferred (no range-read setup cost). (innodb.innodb, main.show_explain, main.range) - Fixed the EQ_REF and REF takes into account clustered and covered keys. This changes some plans to use covered or clustered indexes as these are much cheaper. (main.subselect_mat_cost, main.state_tables_innodb, main.limit_rows_examined) - Rowid filter setup cost and filter compare cost now takes into account fetching and checking the rowid (KEY_NEXT_FIND_COST). (main.partition_pruning heap.heap_btree main.log_state) - Added KEY_NEXT_FIND_COST to Range_rowid_filter_cost_info::lookup_cost to account of the time to find and check the next key value against the container - Introduced ha_keyread_time(rows) that takes into account finding the next row and copying the key value to 'record' (KEY_COPY_COST). - Introduced ha_key_scan_time() for calculating an index scan over all rows. - Added IDX_LOOKUP_COST to keyread_time() as a startup cost. - Added index_only_fetch_cost() as a convenience function to OPT_RANGE. - keyread_time() cost is slightly reduced to prefer shorter keys. (main.index_merge_myisam) - All of the above caused some index_merge combinations to be rejected because of cost (main.index_intersect). In some cases 'ref' where replaced with index_merge because of the low cost calculation of get_sweep_read_cost(). - Some index usage moved from PRIMARY to a covering index. (main.subselect_innodb) - Changed cost calculation of filter to take KEY_LOOKUP_COST and TIME_FOR_COMPARE into account. See sql_select.cc::apply_filter(). filter parameters and costs are now written to optimizer_trace. - Don't use matchings_records_in_range() to try to estimate the number of filtered rows for ranges. The reason is that we want to ensure that 'range' is calculated similar to 'ref'. There is also more work needed to calculate the selectivity when using ranges and ranges and filtering. This causes filtering column in EXPLAIN EXTENDED to be 100.00 for some cases where range cannot use filtering. (main.rowid_filter) - Introduced ha_scan_time() that takes into account the CPU cost of finding the next row and copying the row from the engine to 'record'. This causes costs of table scan to slightly increase and some test to changed their plan from ALL to RANGE or ALL to ref. (innodb.innodb_mysql, main.select_pkeycache) In a few cases where scan time of very small tables have lower cost than a ref or range, things changed from ref/range to ALL. (main.myisam, main.func_group, main.limit_rows_examined, main.subselect2) - Introduced ha_scan_and_compare_time() which is like ha_scan_time() but also adds the cost of the where clause (TIME_FOR_COMPARE). - Added small cost for creating temporary table for materialization. This causes some very small tables to use scan instead of materialization. - Added checking of the WHERE clause (TIME_FOR_COMPARE) of the accepted rows to ROR costs in get_best_ror_intersect() - Removed '- 0.001' from 'join->best_read' and optimize_straight_join() to ensure that the 'Last_query_cost' status variable contains the same value as the one that was calculated by the optimizer. - Take avg_io_cost() into account in handler::keyread_time() and handler::read_time(). This should have no effect as it's 1.0 by default, except for heap that overrides these functions. - Some 'ref_or_null' accesses changed to 'range' because of cost adjustments (main.order_by) - Added scan type "scan_with_join_cache" for optimizer_trace. This is just to show in the trace what kind of scan was used. - When using 'scan_with_join_cache' take into account number of preceding tables (as have to restore all fields for all previous table combination when checking the where clause) The new cost added is: (row_combinations * ROW_COPY_COST * number_of_cached_tables). This increases the cost of join buffering in proportion of the number of tables in the join buffer. One effect is that full scans are now done earlier as the cost is then smaller. (main.join_outer_innodb, main.greedy_optimizer) - Removed the usage of 'worst_seeks' in cost_for_index_read as it caused wrong plans to be created; It prefered JT_EQ_REF even if it would be much more expensive than a full table scan. A related issue was that worst_seeks only applied to full lookup, not to clustered or index only lookups, which is not consistent. This caused some plans to use index scan instead of eq_ref (main.union) - Changed federated block size from 4096 to 1500, which is the typical size of an IO packet. - Added costs for reading rows to Federated. Needed as there is no caching of rows in the federated engine. - Added ha_innobase::rnd_pos_time() cost function. - A lot of extra things added to optimizer trace - More costs, especially for materialization and index_merge. - Make lables more uniform - Fixed a lot of minor bugs - Added 'trace_started()' around a lot of trace blocks. - When calculating ORDER BY with LIMIT cost for using an index the cost did not take into account the number of row retrivals that has to be done or the cost of comparing the rows with the WHERE clause. The cost calculated would be just a fraction of the real cost. Now we calculate the cost as we do for ranges and 'ref'. - 'Using index for group-by' is used a bit more than before as now take into account the WHERE clause cost when comparing with 'ref' and prefer the method with fewer row combinations. (main.group_min_max). Bugs fixed: - Fixed that we don't calculate TIME_FOR_COMPARE twice for some plans, like in optimize_straight_join() and greedy_search() - Fixed bug in save_explain_data where we could test for the wrong index when displaying 'Using index'. This caused some old plans to show 'Using index'. (main.subselect_innodb, main.subselect2) - Fixed bug in get_best_ror_intersect() where 'min_cost' was not updated, and the cost we compared with was not the one that was used. - Fixed very wrong cost calculation for priority queues in check_if_pq_applicable(). (main.order_by now correctly uses priority queue) - When calculating cost of EQ_REF or REF, we added the cost of comparing the WHERE clause with the found rows, not all row combinations. This made ref and eq_ref to be regarded way to cheap compared to other access methods. - FORCE INDEX cost calculation didn't take into account clustered or covered indexes. - JT_EQ_REF cost was estimated as avg_io_cost(), which is half the cost of a JT_REF key. This may be true for InnoDB primary key, but not for other unique keys or other engines. Now we use handler function to calculate the cost, which allows us to handle consistently clustered, covered keys and not covered keys. - ha_start_keyread() didn't call extra_opt() if keyread was already enabled but still changed the 'keyread' variable (which is wrong). Fixed by not doing anything if keyread is already enabled. - multi_range_read_info_cost() didn't take into account io_cost when calculating the cost of ranges. - fix_semijoin_strategies_for_picked_join_order() used the wrong record_count when calling best_access_path() for SJ_OPT_FIRST_MATCH and SJ_OPT_LOOSE_SCAN. - Hash joins didn't provide correct best_cost to the upper level, which means that the cost for hash_joins more expensive than calculated in best_access_path (a difference of 10x * TIME_OF_COMPARE). This is fixed in the new code thanks to that we now include TIME_OF_COMPARE cost in 'read_time'. Other things: - Added some 'if (thd->trace_started())' to speed up code - Removed not used function Cost_estimate::is_zero() - Simplified testing of HA_POS_ERROR in get_best_ror_intersect(). (No cost changes) - Moved ha_start_keyread() from join_read_const_table() to join_read_const() to enable keyread for all types of JT_CONST tables. - Made a few very short functions inline in handler.h Notes: - In main.rowid_filter the join order of order and lineitem is swapped. This is because the cost of doing a range fetch of lineitem(98 rows) is almost as big as the whole join of order,lineitem. The filtering will also ensure that we only have to do very small key fetches of the rows in lineitem. - main.index_merge_myisam had a few changes where we are now using less keys for index_merge. This is because index scans are now more expensive than before. - handler->optimizer_cache_cost is updated in ha_external_lock(). This ensures that it is up to date per statements. Not an optimal solution (for locked tables), but should be ok for now. - 'DELETE FROM t1 WHERE t1.a > 0 ORDER BY t1.a' does not take cost of filesort into consideration when table scan is chosen. (main.myisam_explain_non_select_all) - perfschema.table_aggregate_global_* has changed because an update on a table with 1 row will now use table scan instead of key lookup. TODO in upcomming commits: - Fix selectivity calculation for ranges with and without filtering and when there is a ref access but scan is chosen. For this we have to store the lowest known value for 'accepted_records' in the OPT_RANGE structure. - Change that records_read does not include filtered rows. - test_if_cheaper_ordering() needs to be updated to properly calculate costs. This will fix tests like main.order_by_innodb, main.single_delete_update - Extend get_range_limit_read_cost() to take into considering cost_for_index_read() if there where no quick keys. This will reduce the computed cost for ORDER BY with LIMIT in some cases. (main.innodb_ext_key) - Fix that we take into account selectivity when counting the number of rows we have to read when considering using a index table scan to resolve ORDER BY. - Add new calculation for rnd_pos_time() where we take into account the benefit of reading multiple rows from the same page.	2023-02-02 21:43:30 +03:00
Monty	956980971f	Update cost for hash and cached joins The old code did not't correctly add TIME_FOR_COMPARE to rows that are part of the scan that will be compared with the attached where clause. Now the cost calculation for hash join and full join cache join are identical except for HASH_FANOUT (10%) The cost for a join with keys is now also uniform. The total cost for a using a key for lookup is calculated in one place as: (cost_of_finding_rows_through_key(records) + records/TIME_FOR_COMPARE)* record_count_of_previous_row_combinations + startup_cost startup_cost is the cost of a creating a temporary table (if needed) Best_cost now includes the cost of comparing all WHERE clauses and also cost of joining with previous row combinations. Other things: - Optimizer trace is now printing the total costs, including testing the WHERE clause (TIME_FOR_COMPARE) and comparing with all previous rows. - In optimizer trace, include also total cost of query together with the final join order. This makes it easier to find out where the cost was calculated. - Old code used filter even if the cost for it was higher than not using a filter. This is not corrected. - When rebasing on 10.11, I noticed some changes to access_cost_factor calculation. These changes was not picked as the coming changes to filtering will make that code obsolete.	2023-02-02 20:49:35 +03:00
Monty	6fa7451759	Adjust costs for doing index scan in cost_group_min_max() The idea is that when doing a tree dive (once per group), we need to compare key values, which is fast. For each new group, we have to compare the full where clause for the row. Compared to original code, the cost of group_min_max() has slightly increased which affects some test with only a few rows. main.group_min_max and main.distinct have been modified to show the effect of the change. The patch also adjust the number of groups in case of quick selects: - For simple WHERE clauses, ensure that we have at least as many groups as we have conditions on the used group-by key parts. The assumption is that each condition will create at least one group. - Ensure that there are no more groups than rows found by quick_select Test changes: - For some small tables there has been a change of Using index for group-by -> Using index for group-by (scanning) Range -> Index and Using index for group-by -> Using index	2023-02-02 20:25:25 +03:00
Monty	b67144893a	Update matching_candidates_in_table() to treat all conditions similar Fixed also that the 'with_found_constraint parameter' to matching_candidates_in_table() is as documented: It is now true only if there is a reference to a previous table in the WHERE condition for the current examined table (as it was originally documented) Changes in test results: - Filtered was 25% smaller for some queries (expected). - Some join order changed (probably because the tables had very few rows). - Some more table scans, probably because there would be fewer returned rows. - Some tests exposes a bug that if there is more filtered rows, then the cost for table scan will be higher. This will be fixed in a later commit.	2023-02-02 20:19:32 +03:00
Oleksandr Byelkin	cafba8761a	Merge branch '10.10' into 10.11	2023-02-01 18:28:03 +01:00
Oleksandr Byelkin	cc8b9bcee3	Merge branch '10.9' into 10.10	2023-02-01 17:53:45 +01:00
Julius Goryavsky	e3e72644cf	MDEV-30452: ssl error: unexpected EOF while reading This commit contains fixes for error codes, which are needed because OpenSSL 3.x and recent versions of GnuTLS have changed the indication of error codes when the peer does not send close_notify before closing the connection.	2023-02-01 17:50:29 +01:00
Oleksandr Byelkin	260f1fe7c3	Merge branch '10.8' into 10.9	2023-02-01 17:21:42 +01:00
Oleksandr Byelkin	d4310eb96a	Merge branch '10.7' into 10.8	2023-02-01 17:19:48 +01:00
Oleksandr Byelkin	bc656c4fa5	Merge branch '10.6' into 10.7	2023-02-01 16:29:16 +01:00
Julius Goryavsky	4c79e15cc3	MDEV-30536: no expected deadlock in galera_insert_bulk test Unstable test (galera_insert_bulk) temporarily disabled.	2023-02-01 15:57:22 +01:00
Marko Mäkelä	1c926b6263	MDEV-30527 Assertion !m_freed_pages in mtr_t::start() on DROP TEMPORARY TABLE mtr_t::commit(): Add special handling of innodb_immediate_scrub_data_uncompressed for TEMPORARY TABLE. This fixes a regression that was caused by commit `de4030e4d4` (MDEV-30400).	2023-02-01 10:55:49 +02:00
Oleksandr Byelkin	c7c415734d	Merge branch '10.10' into 10.11	2023-01-31 11:07:08 +01:00
Oleksandr Byelkin	76bcea3154	Merge branch '10.9' into 10.10	2023-01-31 11:01:48 +01:00
Oleksandr Byelkin	de2d089942	Merge branch '10.8' into 10.9	2023-01-31 10:37:31 +01:00
Oleksandr Byelkin	638625278e	Merge branch '10.7' into 10.8	2023-01-31 09:57:52 +01:00
Oleksandr Byelkin	b923b80cfd	Merge branch '10.6' into 10.7	2023-01-31 09:33:58 +01:00
Oleksandr Byelkin	c3a5cf2b5b	Merge branch '10.5' into 10.6	2023-01-31 09:31:42 +01:00
Andrei	f8a85af8ca	MDEV-30940: Revert "binlog.innodb_rc_insert_before_delete is disabled with MDEV-30490" This reverts commit `b2ea57e899`, as well as edits binlog.innodb_rc_insert_before_delete.test to be safely runnable with any preceding test. Note: manual 10.5 -> 10.6 merge is required to the test.	2023-01-30 21:28:21 +01:00
Monty	7d0bef6cd7	Fixed bug in SQL_SELECT_LIMIT We where comparing costs when we should be comparing number of rows that will be examined	2023-01-30 15:24:15 +02:00
Monty	87d4d7232c	Limit calculated rows to the number of rows in the table The result file changes are mainly that number of rows is one smaller for some queries with DISTINCT or GROUP BY	2023-01-30 15:22:20 +02:00
Monty	c443dbff0e	Ensure that test_quick_select doesn't return more rows than in the table Other changes: - In test_quick_select(), assume that if table->used_stats_records is 0 then the table has 0 rows. - Fixed prepare_simple_select() to populate table->used_stat_records - Enusre that set_statistics_for_tables() doesn't cause used_stats_records to be 0 when using stat_tables. - To get blackhole to work with replication, set stats.records to 2 so that test_quick_select() doesn't assume the table is empty.	2023-01-30 15:22:20 +02:00
Andrei	b2ea57e899	binlog.innodb_rc_insert_before_delete is disabled with MDEV-30490	2023-01-30 13:25:26 +01:00
Oleksandr Byelkin	db8019ef00	Merge branch '10.4' into 10.5	2023-01-30 13:25:02 +01:00
Jan Lindström	b05218e08f	MDEV-30473 : Do not allow GET_LOCK() / RELEASE_LOCK() in cluster Following tests do not test anymore what they intended to test deleted: suite/galera/t/MDEV-24143.test deleted: suite/galera/t/galera_bf_abort_get_lock.test	2023-01-30 08:55:35 +02:00
Oleksandr Byelkin	a977054ee0	Merge branch '10.3' into 10.4	2023-01-28 18:22:55 +01:00
Andrei	6173a4a15b	binlog.innodb_rc_insert_before_delete is disabled with MDEV-30490	2023-01-28 17:10:42 +02:00
Andrei	c73985f2ce	MDEV-30010 post-push: fixing test results.	2023-01-28 15:21:23 +02:00
Oleksandr Byelkin	7fa02f5c0b	Merge branch '10.4' into 10.5	2023-01-27 13:54:14 +01:00
Jan Lindström	49ee18eb42	MDEV-30473 : Do not allow GET_LOCK() / RELEASE_LOCK() in cluster In 10.5 If WSREP_ON=ON do not allow RELEASE_ALL_LOCKS function. Instead print clear error message.	2023-01-27 10:40:07 +02:00
Jan Lindström	696562ce55	MDEV-30473 : Do not allow GET_LOCK() / RELEASE_LOCK() in cluster If WSREP_ON=ON do not allow GET_LOCK and RELEASE_LOCK functions. Instead print clear error message.	2023-01-27 10:34:06 +02:00
Jan Lindström	844ddb1109	MDEV-30473 : Do not allow GET_LOCK() / RELEASE_LOCK() in cluster If WSREP_ON=ON do not allow GET_LOCK and RELEASE_LOCK functions. Instead print clear error message.	2023-01-27 08:39:32 +02:00
Jan Lindström	015fb54d45	MDEV-25037 : SIGSEGV in MDL_lock::hog_lock_types_bitmap We should not call mdl_context.release_explicit_locks() in Wsrep_client_service::bf_rollback() if client is quiting because it will be done again in THD::cleanup(). Note that problem with GET_LOCK() / RELEASE_LOCK() will be fixed on MDEV-30473.	2023-01-27 08:38:27 +02:00
Oleksandr Byelkin	dd24fa3063	Merge branch '10.3' into 10.4	2023-01-26 10:34:26 +01:00
Marko Mäkelä	82b18a8361	MDEV-29374 fixup: Suppress an error in a test	2023-01-25 10:56:07 +02:00
Marko Mäkelä	75c78316d6	Merge 10.11 into 11.0	2023-01-25 10:17:54 +02:00
Jan Lindström	509c7f66bd	MDEV-27977 : galera.galera_UK_conflict fails with wrong result Add wait_condition so that all rows expected are really replicated before we check it.	2023-01-25 09:08:15 +02:00
Andrei	7fe932444d	MDEV-30323 Some DDLs like ANALYZE can complete on parallel slave out of order ANALYZE was observed to race over a preceding in binlog order DML in updating the binlog and slave gtid states. Tagging ANALYZE and other admin class commands in binlog by the fixes of MDEV-17515 left a flaw allowing such race leading to the gtid mode out-of-order error. This is fixed now to observe by ADMIN commands the ordered access to the slave gtid status variables and binlog.	2023-01-24 20:18:03 +02:00
Andrei	3aa04c0deb	MDEV-30010 Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964 This commit merely adds is a Read-Committed version MDEV-30225 test solely to prove the RC isolation yields ROW binlog format as it is supposed to per docs.	2023-01-24 19:39:44 +02:00
Brandon Nesterenko	d69e835787	MDEV-29639: Seconds_Behind_Master is incorrect for Delayed, Parallel Replicas Problem ======== On a parallel, delayed replica, Seconds_Behind_Master will not be calculated until after MASTER_DELAY seconds have passed and the event has finished executing, resulting in potentially very large values of Seconds_Behind_Master (which could be much larger than the MASTER_DELAY parameter) for the entire duration the event is delayed. This contradicts the documented MASTER_DELAY behavior, which specifies how many seconds to withhold replicated events from execution. Solution ======== After a parallel replica idles, the first event after idling should immediately update last_master_timestamp with the time that it began execution on the primary. Reviewed By =========== Andrei Elkin <andrei.elkin@mariadb.com>	2023-01-24 08:11:35 -07:00
Marko Mäkelä	10635c2833	Merge 10.10 into 10.11	2023-01-24 15:17:39 +02:00
Marko Mäkelä	51fc6b91d2	Merge 10.9 into 10.10	2023-01-24 15:17:10 +02:00
Marko Mäkelä	4d9fe4032b	Merge 10.8 into 10.9	2023-01-24 14:59:42 +02:00
Marko Mäkelä	fa543a0f62	Merge 10.7 into 10.8	2023-01-24 14:52:25 +02:00
Marko Mäkelä	cea50896d2	Merge 10.6 into 10.7	2023-01-24 14:35:36 +02:00
Marko Mäkelä	de4030e4d4	MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT This also fixes part of MDEV-29835 Partial server freeze which is caused by violations of the latching order that was defined in https://dev.mysql.com/worklog/task/?id=6326 (WL#6326: InnoDB: fix index->lock contention). Unless the current thread is holding an exclusive dict_index_t::lock, it must acquire page latches in a strict parent-to-child, left-to-right order. Not all cases of MDEV-29835 are fixed yet. Failure to follow the correct latching order will cause deadlocks of threads due to lock order inversion. As part of these changes, the BTR_MODIFY_TREE mode is modified so that an Update latch (U a.k.a. SX) will be acquired on the root page, and eXclusive latches (X) will be acquired on all pages leading to the leaf page, as well as any left and right siblings of the pages along the path. The DEBUG_SYNC test innodb.innodb_wl6326 will be removed, because at the time the DEBUG_SYNC point is hit, the thread is actually holding several page latches that will be blocking a concurrent SELECT statement. We also remove double bookkeeping that was caused due to excessive information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo store information of latched pages, and ensure that mtr_memo_slot_t::object is never a null pointer. The tree_blocks[] and tree_savepoints[] were redundant. buf_page_get_low(): If innodb_change_buffering_debug=1, to avoid a hang, do not try to evict blocks if we are holding a latch on a modified page. The test innodb.innodb-change-buffer-recovery will be removed, because change buffering may no longer be forced by debug injection when the change buffer comprises multiple pages. Remove a debug assertion that could fail when innodb_change_buffering_debug=1 fails to evict a page. For other cases, the assertion is redundant, because we already checked that right after the got_block: label. The test innodb.innodb-change-buffering-recovery will be removed, because due to this change, we will be unable to evict the desired page. mtr_t::lock_register(): Register a change of a page latch on an unmodified buffer-fixed block. mtr_t::x_latch_at_savepoint(), mtr_t::sx_latch_at_savepoint(): Replaced by the use of mtr_t::upgrade_buffer_fix(), which now also handles RW_S_LATCH. mtr_t::set_modified(): For temporary tables, invoke buf_page_t::set_modified() here and not in mtr_t::commit(). We will never set the MTR_MEMO_MODIFY flag on other than persistent data pages, nor set mtr_t::m_modifications when temporary data pages are modified. mtr_t::commit(): Only invoke the buf_flush_note_modification() loop if persistent data pages were modified. mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo. This avoids many redundant entries in mtr_t::m_memo, as well as redundant calls to buf_page_get_gen() for blocks that had already been looked up in a mini-transaction. btr_get_latched_root(): Return a pointer to an already latched root page. This replaces btr_root_block_get() in cases where the mini-transaction has already latched the root page. btr_page_get_parent(): Fetch a parent page that was already latched in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched(). If needed, upgrade the root page U latch to X. This avoids bloating mtr_t::m_memo as well as performing redundant buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for B-tree defragmentation, we will invoke btr_cur_search_to_nth_level(). btr_cur_search_to_nth_level(): This will only be used for non-leaf (level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be removed altogether, or retained for the case of CHECK TABLE without QUICK. btr_cur_t::left_block: Remove. btr_pcur_move_backward_from_page() can retrieve the left sibling from the end of mtr_t::m_memo. btr_cur_t::open_leaf(): Some clean-up. btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level() for searches to level=0 (the leaf level). We will never release parent page latches before acquiring leaf page latches. If we need to temporarily release the level=1 page latch in the BTR_SEARCH_PREV or BTR_MODIFY_PREV latch_mode, we will reposition the cursor on the child node pointer so that we will land on the correct leaf page. btr_cur_t::pessimistic_search_leaf(): Implement new BTR_MODIFY_TREE latching logic in the case that page splits or merges will be needed. The parent pages (and their siblings) should already be latched on the first dive to the leaf and be present in mtr_t::m_memo; there should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost suffices; it must be revised in MDEV-29835 and work-arounds removed for cases where mtr_t::get_already_latched() fails to find a block. rtr_search_to_nth_level(): A SPATIAL INDEX version of btr_search_to_nth_level() that can search to any level (including the leaf level). rtr_search_leaf(), rtr_insert_leaf(): Wrappers for rtr_search_to_nth_level(). rtr_search(): Replaces rtr_pcur_open(). rtr_latch_leaves(): Replaces btr_cur_latch_leaves(). Note that unlike in the B-tree code, there is no error handling in case the sibling pages are corrupted. rtr_cur_restore_position(): Remove an unused constant parameter. btr_pcur_open_on_user_rec(): Remove the constant parameter mode=PAGE_CUR_GE. row_ins_clust_index_entry_low(): Use a new mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC. BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove. BTR_CONT_MODIFY_TREE: Note that this is only used by rtr_search_to_nth_level(). btr_pcur_optimistic_latch_leaves(): Replaces btr_cur_optimistic_latch_leaves(). ibuf_delete_rec(): Acquire exclusive ibuf.index->lock in order to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV). btr_blob_log_check_t(): Acquire a U latch on the root page, so that btr_page_alloc() in btr_store_big_rec_extern_fields() will avoid a deadlock. btr_store_big_rec_extern_fields(): Assert that the root page latch is being held. Tested by: Matthias Leich Reviewed by: Vladislav Lesin	2023-01-24 14:09:21 +02:00
Denis Protivensky	39f4674599	MDEV-24623 Replicate bulk insert as table-level exclusive key - introduce table key construction function in wsrep service interface - don't add row keys when replicating bulk insert - don't start bulk insert on applier or when transaction is not active - don't start bulk insert on system versioned tables - implement actual bulk insert table-level key replication Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-24 11:54:25 +02:00
Andrei	dc646c2389	MDEV-30423 Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions The user XA commit execution branch was caught not have been covered with MDEV-21953 fixes. The XA involved deadlock is resolved now to apply the former fixes pattern. Along the fixes the following changes have been implemented. - MDL lock attribute correction - dissociation of the externally completed XA from the current thread's xid_state in the error branches - cleanup_context() preseves the prepared XA - wait_for_prior_commit() is relocated to satisfy both the binlog ON (log-slave-updates and skip-log-bin) and OFF slave execution branches.	2023-01-23 19:01:48 +02:00
Sergei Petrunia	f18c2b6c8a	MDEV-15178: Filesort::make_sortorder: Assertion `pos->field != __null \| (Initial patch by Varun Gupta. Amended and added comments). When the query has both 1. Aggregate functions that require sorting data by group, and 2. Window functions we need to use two temporary tables. The first temp.table will hold the join output. Then it is passed to filesort(). Reading it in sorted order allows to compute the aggregate functions. Then, we need to write their values into the second temp. table. Then, Window Function computation step can pass that to filesort() and read them in the order it needs. Failure to create the second temp. table would cause an assertion failure: window function could would not find where to get the values of the aggregate functions.	2023-01-23 18:22:21 +02:00
Marko Mäkelä	e41fb3697c	Revert "MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT" This reverts commit `f9cac8d2cb` which was accidentally pushed prematurely.	2023-01-23 14:52:49 +02:00
Marko Mäkelä	851c56771e	Merge 10.5 into 10.6	2023-01-23 13:15:41 +02:00
Marko Mäkelä	1bbf37e0db	MDEV-515: Improve test coverage Cover dict_index_t::clear() for TEMPORARY TABLE	2023-01-23 13:05:52 +02:00
Thirunarayanan Balathandayuthapani	647a7232ff	MDEV-30438 innodb.undo_truncate,4k fails when innodb-immediate-scrub-data-uncompressed is enabled - InnoDB fails to clear the freed ranges during truncation of innodb undo log tablespace. During shutdown, InnoDB flushes the freed page ranges and throws the out of bound error. mtr_t::commit_shrink(): clear the freed ranges while doing undo tablespace truncation	2023-01-23 09:55:49 +05:30
Daniel Black	26ef4875e6	MDEV-6339 deprecate log_slow_admin_statements log_slow_filter=admin as been available for a long time. Uses can migrate from log_slow_statements_statements=OFF by removing 'admin' from the default log_slow_filter variable setting.	2023-01-23 08:39:41 +11:00
Oleksandr Byelkin	f4e023ae7f	Change maturity	2023-01-20 19:31:41 +01:00
Sergei Golubchik	db50919f97	MDEV-27631 Assertion `global_status_var.global_memory_used == 0' failed in mysqld_exit plugin_vars_free_values() was walking plugin sysvars and thus did not free memory of plugin PLUGIN_VAR_NOSYSVAR vars. * change it to walk all plugin vars * add the pluginname_ prefix to NOSYSVARS var names too, so that plugin_vars_free_values() would be able to find their bookmarks	2023-01-20 15:44:15 +01:00
Daniele Sciascia	c4f5128d46	Correct assert_grep.inc params in galera gcache tests	2023-01-20 07:17:28 +02:00
Marko Mäkelä	f9cac8d2cb	MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT This also fixes part of MDEV-29835 Partial server freeze which is caused by violations of the latching order that was defined in https://dev.mysql.com/worklog/task/?id=6326 (WL#6326: InnoDB: fix index->lock contention). Unless the current thread is holding an exclusive dict_index_t::lock, it must acquire page latches in a strict parent-to-child, left-to-right order. Not all cases are fixed yet. Failure to follow the correct latching order will cause deadlocks of threads due to lock order inversion. As part of these changes, the BTR_MODIFY_TREE mode is modified so that an Update latch (U a.k.a. SX) will be acquired on the root page, and eXclusive latches (X) will be acquired on all pages leading to the leaf page, as well as any left and right siblings of the pages along the path. The test innodb.innodb_wl6326 will be removed, because at the time the DEBUG_SYNC point is hit, the thread is actually holding several page latches that will be blocking a concurrent SELECT statement. We also remove double bookkeeping that was caused due to excessive information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo store information of latched pages, and ensure that mtr_memo_slot_t::object is never a null pointer. The tree_blocks[] and tree_savepoints[] were redundant. mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo. This avoids many redundant entries in mtr_t::m_memo, as well as redundant calls to buf_page_get_gen() for blocks that had already been looked up in a mini-transaction. btr_get_latched_root(): Return a pointer to an already latched root page. This replaces btr_root_block_get() in cases where the mini-transaction has already latched the root page. btr_page_get_parent(): Fetch a parent page that was already latched in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched(). If needed, upgrade the root page U latch to X. This avoids bloating mtr_t::m_memo as well as redundant buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for B-tree defragmentation, we will invoke btr_cur_search_to_nth_level(). btr_cur_search_to_nth_level(): This will only be used for non-leaf (level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be removed altogether, or retained for the case of CHECK TABLE without QUICK. btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level() for searches to level=0 (the leaf level). btr_cur_t::pessimistic_search_leaf(): Implement the new BTR_MODIFY_TREE latching logic in the case that page splits or merges will be needed. The parent pages (and their siblings) should already be latched on the first dive to the leaf and be present in mtr_t::m_memo; there should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost suffices; MDEV-29835 will have to revise it and remove work-arounds where mtr_t::get_already_latched() fails to find a block. rtr_search_to_nth_level(): A SPATIAL INDEX version of btr_search_to_nth_level() that can search to any level (including the leaf level). rtr_search_leaf(), rtr_insert_leaf(): Wrappers for rtr_search_to_nth_level(). rtr_search(): Replaces rtr_pcur_open(). rtr_cur_restore_position(): Remove an unused constant parameter. btr_pcur_open_on_user_rec(): Remove the constant parameter mode=PAGE_CUR_GE. btr_cur_latch_leaves(): Update a pre-existing mtr_t::m_memo entry for the current leaf page. row_ins_clust_index_entry_low(): Use a new mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC. btr_cur_t::open_leaf(): Some clean-up. mtr_t::lock_register(): Register a page latch on a buffer-fixed block. BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove. BTR_CONT_MODIFY_TREE: Note that this is only used by rtr_search_to_nth_level(). btr_pcur_optimistic_latch_leaves(): Replaces btr_cur_optimistic_latch_leaves(). ibuf_delete_rec(): Acquire ibuf.index->lock.u_lock() in order to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV). Tested by: Matthias Leich	2023-01-19 17:19:18 +02:00
Alexander Barkov	0ddbec40fb	MDEV-23335 MariaBackup Incremental Does Not Reflect Dropped/Created Databases	2023-01-19 17:18:06 +04:00
Sergei Golubchik	c37ebaf6c2	MDEV-30153 ad hoc client versions are confusing try to make them less confusing for users. Hopefully, if the version string will be changed like - mariadb Ver 15.1 Distrib 10.11.2-MariaDB for Linux (x86_64) + mariadb from 10.11.2-MariaDB, client 15.1 for Linux (x86_64) users will be less inclined to reply "15.1" to the question "what mariadb version are you using?"	2023-01-19 12:39:37 +01:00
Daniele Sciascia	eeb8ebb152	MDEV-29774 BF abort no longer wakes up debug_sync waiters Since commit `d7d3ad698a`, "hard" kill is required to interrupt debug sync waits. Affected the following tests: - galera_var_retry_autocommit, - galera_bf_abort_at_after_statement - galera_parallel_apply_3nodes Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-19 08:24:41 +02:00
Oleksandr Byelkin	66bd8cd6c3	Merge branch '10.10' into 10.11	2023-01-18 16:58:28 +01:00
Oleksandr Byelkin	45087dd0b3	Merge branch '10.9' into 10.10	2023-01-18 16:45:59 +01:00
Oleksandr Byelkin	08d4968404	Merge branch '10.8' into 10.9	2023-01-18 16:39:11 +01:00
Oleksandr Byelkin	26d8485244	Merge branch '10.7' into 10.8	2023-01-18 16:37:40 +01:00
Oleksandr Byelkin	795ff0daf0	Merge branch '10.6' into 10.7	2023-01-18 16:36:13 +01:00
Marko Mäkelä	a8c5635cf1	Merge 10.5 into 10.6	2023-01-17 20:02:29 +02:00
Sergei Golubchik	a5eff044cb	MDEV-22602 Disable UPDATE CASCADE for SQL constraints fix it for named constraints too	2023-01-17 15:28:56 +01:00
Jan Lindström	107d54600e	Stabilize tests galera_gcache_recover and galera_gcache_recover_manytrx grepping on error log is not always successful as messages might be in different order or contain different values galera_vote_sr We need to make sure required table creation has replicated as we use WSREP_ON=off	2023-01-17 14:08:41 +02:00
Daniele Sciascia	9ec475c376	MDEV-29171 changing the value of wsrep_gtid_domain_id with full cluster restart fails on some nodes Fix `wsrep_init_gtid()` to avoid overwriting the domain id received during state transfer. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-17 14:08:28 +02:00
Jan Lindström	d1a4f6e627	Merge 10.6 into 10.7	2023-01-17 11:41:29 +02:00
sjaakola	95de5248c7	MDEV-26391 BF abortable mariabackup execution This commit changes backup execution (namely the block ddl phase), so that node is not paused from cluster. Instead, the following backup execution is declared as vulnerable for possible cluster level conflicts, especially with DDL statement applying. With this, the mariabackup execution may be aborted, if DDL statements happen during backup execution. This abortable backup execution is optional feature and may be enabled/disabled by wsrep_mode: BF_ABORT_MARIABACKUP. Note that old style node desync and pause, despite of WSREP_MODE_BF_MARIABACKUP is needed if node is operating as SST donor. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-17 10:03:05 +02:00
Jan Lindström	179c283372	Merge branch 10.4 into 10.5	2023-01-14 08:25:57 +02:00
Monty	981a6b7044	MDEV-30395 Wrong result with semijoin and Federated as outer table The problem was that federated engine does not support comparable rowids which was not taken into account by semijoin code. Fixed by checking that we don't use semijoin with tables that does not support comparable rowids. Other things: - Fixed some typos in the code comments	2023-01-13 16:23:21 +02:00
sjaakola	68cfcf9cb6	MDEV-29512 deadlock between commit monitor and THD::LOCK_thd_data mutex This commit contains only a mtr test for reproducing the issue in MDEV-29512 The actual fix will be pushed in wsrep-lib repository The hanging in MDEV-29512 happens when binlog purging is attempted, and there is one local BF aborted transaction waiting for commit monitor. The test will launch two node cluster and enable binlogging with expire log days, to force binlog purging to happen. A local transaction is executed so that will become BF abort victim, and has advanced to replication stage waiting for commit monitor for final cleanup (to mark position in innodb) after that, applier is released to complete the BF abort and due to binlog configuration, starting the binlog purging. This is where the hanging would occur, if code is buggy Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-13 13:11:03 +02:00
sjaakola	cd97523dcf	MDEV-30317 Transaction savepoint may cause failure in galera replaying Created mtr test for reproducing the crash Developed actual fix for the issue. Setting THD::system_thread_info.rpl_sql_info for replayer thread, same way as it is handled for appliers. Recorded test result, with the fix Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-13 13:11:03 +02:00
Marko Mäkelä	44dce3b207	MDEV-29986 Set innodb_undo_tablespaces=3 by default Starting with commit `baf276e6d4` (MDEV-19229) the parameter innodb_undo_tablespaces can be increased from its previous default value 0 while allowing an upgrade from old databases. We will change the default setting to innodb_undo_tablespaces=3 so that the space occupied by possible bursts of undo log records can be reclaimed after SET GLOBAL innodb_undo_log_truncate=ON. We will not enable innodb_undo_log_truncate by default, because it causes some observable performance degradation. Special thanks to Thirunarayanan Balathandayuthapani for diagnosing and fixing a number of bugs related to this new default setting. Tested by: Matthias Leich, Axel Schwenke, Vladislav Vaintroub (with both values of innodb_undo_log_truncate)	2023-01-13 12:46:30 +02:00
Marko Mäkelä	d6d85c92ee	Merge 10.11 into 11.0	2023-01-13 12:33:12 +02:00
Marko Mäkelä	bb3a63903e	Merge 10.10 into 10.11	2023-01-13 12:22:30 +02:00
Marko Mäkelä	6ffe9ad0d4	Merge 10.9 into 10.10	2023-01-13 11:45:57 +02:00
Marko Mäkelä	5d5735c181	Merge 10.8 into 10.9	2023-01-13 11:22:29 +02:00
Marko Mäkelä	88c35781cc	Merge 10.7 into 10.8	2023-01-13 11:11:04 +02:00
Marko Mäkelä	1e04cafcba	Merge 10.6 into 10.7	2023-01-13 10:47:56 +02:00
Marko Mäkelä	3386b30975	Merge 10.5 into 10.6	2023-01-13 10:45:41 +02:00
Marko Mäkelä	73ecab3d26	Merge 10.4 into 10.5	2023-01-13 10:18:30 +02:00
Marko Mäkelä	71e8e4934d	Merge 10.3 into 10.4	2023-01-13 09:28:25 +02:00
Nikita Malyavin	7a98d232e4	MDEV-30378 Versioned REPLACE succeeds with ON DELETE RESTRICT constraint node->is_delete was incorrectly set to NO_DELETE for a set of operations. In general we shouldn't rely on sql_command and look for more abstract ways to control the behavior. trg_event_map seems to be a suitable way. To mind replica nodes, it is ORed with slave_fk_event_map, which stores trg_event_map when replica has triggers disabled.	2023-01-12 21:51:48 +03:00
Brandon Nesterenko	b194c83b7b	MDEV-25277: mysqlbinlog --verbose cannot read row events with compressed columns: Don't know how to handle column type: 140 Problem: ======= Mysqlbinlog cannot show the type of a compressed column when two levels of verbosity is provided. Solution: ======== Extend the log event printing logic to handle and tag compressed types. Behavioral Changes: ================== Old: When mysqlbinlog is called in verbose mode and the database uses compressed columns, an error is returned to the user. New: The output will append “ COMPRESSED” on the type of compressed columns Reviewed By =========== Andrei Elkin <andrei.elkin@mariadb.com>	2023-01-11 10:37:49 -07:00
Marko Mäkelä	f27e9c8947	MDEV-29694 Remove the InnoDB change buffer The purpose of the change buffer was to reduce random disk access, which could be useful on rotational storage, but maybe less so on solid-state storage. When we wished to (1) insert a record into a non-unique secondary index, (2) delete-mark a secondary index record, (3) delete a secondary index record as part of purge (but not ROLLBACK), and the B-tree leaf page where the record belongs to is not in the buffer pool, we inserted a record into the change buffer B-tree, indexed by the page identifier. When the page was eventually read into the buffer pool, we looked up the change buffer B-tree for any modifications to the page, applied these upon the completion of the read operation. This was called the insert buffer merge. We remove the change buffer, because it has been the source of various hard-to-reproduce corruption bugs, including those fixed in commit `5b9ee8d819` and commit `165564d3c3` but not limited to them. A downgrade will fail with a clear message starting with commit `db14eb16f9` (MDEV-30106). buf_page_t::state: Merge IBUF_EXIST to UNFIXED and WRITE_FIX_IBUF to WRITE_FIX. buf_pool_t::watch[]: Remove. trx_t: Move isolation_level, check_foreigns, check_unique_secondary, bulk_insert into the same bit-field. The only purpose of trx_t::check_unique_secondary is to enable bulk insert into an empty table. It no longer enables insert buffering for UNIQUE INDEX. btr_cur_t::thr: Remove. This field was originally needed for change buffering. Later, its use was extended to cover SPATIAL INDEX. Much of the time, rtr_info::thr holds this field. When it does not, we will add parameters to SPATIAL INDEX specific functions. ibuf_upgrade_needed(): Check if the change buffer needs to be updated. ibuf_upgrade(): Merge and upgrade the change buffer after all redo log has been applied. Free any pages consumed by the change buffer, and zero out the change buffer root page to mark the upgrade completed, and to prevent a downgrade to an earlier version. dict_load_tablespaces(): Renamed from dict_check_tablespaces_and_store_max_id(). This needs to be invoked before ibuf_upgrade(). btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics. The change buffer merge does not need this function anymore. btr_page_alloc(): Renamed from btr_page_alloc_low(). We no longer allocate any change buffer pages. btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics. The change buffer merge does not need this function anymore. row_search_index_entry(), btr_lift_page_up(): Add a parameter thr for the SPATIAL INDEX case. rtr_page_split_and_insert(): Specialized from btr_page_split_and_insert(). rtr_root_raise_and_insert(): Specialized from btr_root_raise_and_insert(). Note: The support for upgrading from the MySQL 3.23 or MySQL 4.0 change buffer format that predates the MySQL 4.1 introduction of the option innodb_file_per_table was removed in MySQL 5.6.5 as part of mysql/mysql-server@69b6241a79 and MariaDB 10.0.11 as part of `1d0f70c2f8`. In the tests innodb.log_upgrade and innodb.log_corruption, we create valid (upgraded) change buffer pages. Tested by: Matthias Leich	2023-01-11 17:59:36 +02:00

... 3 4 5 6 7 ...

17321 commits