Enable unusable key notes for non-equality predicates:
<, <=, >=, >, BETWEEN, IN, LIKE
Note that in some scenarios it displays duplicate notes, e.g.
for queries with ORDER BY:
SELECT * FROM t1
WHERE indexed_string_column >= 10
ORDER BY indexed_string_column
LIMIT 5;
This should be tolerable. Getting rid of the duplicate note
completely would need a much more complex patch, which is
not desirable in 10.6.
Details:
- Changing RANGE_OPT_PARAM::note_unusable_keys from bool
to a new data type Item_func::Bitmap, so the caller can
choose with a better granularity which predicates
should raise unusable key notes inside the range optimizer:
a. all predicates (=, <=>, <, <=, >=, >, BETWEEN, IN, LIKE)
b. all predicates except equality (=, <=>)
c. none of the predicates
"b." is needed because in some scenarios equality predicates (=, <=>)
send unusable key notes at an earlier stage, before the range optimizer,
during update_ref_and_keys(). Calling the range optimizer with
"all predicates" would produce duplicate notes for = and <=> in such cases.
- Fixing get_quick_record_count() to call the range optimizer
with "all predicates except equality" instead of "none of the predicates".
Before this change the range optimizer suppressed all notes for
non-equality predicates: <, <=, >=, >, BETWEEN, IN, LIKE.
This actually fixes the reported problem.
- Fixing JOIN::make_range_rowid_filters() to call the range optimizer
with "all predicates except equality" instead of "all predicates".
Before this change the range optimizer produced duplicate notes
for = and <=> during a rowid_filter optimization.
- Cleanup:
Adding the op_collation argument to Field::raise_note_cannot_use_key_part()
and displaying the operation collation rather than the argument collation
in the unusable key note. This is important for operations with more than
two arguments: BETWEEN and IN, e.g.:
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
BETWEEN 'a' AND 'b' COLLATE utf8mb3_unicode_ci;
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
IN ('a', 'b' COLLATE utf8mb3_unicode_ci);
The note for 'a' now prints utf8mb3_unicode_ci as the collation,
which is the collation of the entire operation:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_unicode_ci`
Before this change it printed the collation of 'a',
so the note was confusing:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_general_ci`"
LooseScan code set opt_range_condition_rows to be the
MIN(loose_scan_plan->records, table->records)
totally ignoring possible quick range selects. If there was a quick
select $QUICK on another index with
$QUICK->records < loose_scan_plan->records
this would create a situation where
opt_range_condition_rows > $QUICK->records
which causes an assert in 10.6+ and potentially wrong query plan
choice in 10.5.
Fixed by making opt_range_condition_rows be the minimum #rows
of any quick select.
Approved-by: Monty <monty@mariadb.org>
When a query does implicit grouping and join operation produces an empty
result set, a NULL-complemented row combination is generated.
However, constant table fields still show non-NULL values.
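A hypothetical example of the symptom (table and column names are made up):
CREATE TABLE t1 (a INT);
INSERT INTO t1 VALUES (1);     -- a single row makes t1 a constant table
CREATE TABLE t2 (b INT);
INSERT INTO t2 VALUES (1);
-- Implicit grouping (an aggregate without GROUP BY) and a WHERE clause
-- that matches no rows: the result row should be NULL-complemented,
-- but t1.a was returned as 1 instead of NULL.
SELECT t1.a, MAX(t2.b) FROM t1, t2 WHERE t2.b > 100;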
What happens is that end_send_group() is called with a
const row but without any rows matching the WHERE clause.
This last part is shown by 'join->first_record' not being set.
This causes item->no_rows_in_result() to be called for all items to reset
all sum functions to their initial state. However, fields are not set
to NULL.
The fix used is to produce NULL-complemented records for constant tables
as well. Also, reset the constant table's records back in case we're
in a subquery which may get re-executed.
An alternative fix would have item->no_rows_in_result() also work
with Item_field objects.
There are some other issues with the code:
- join->no_rows_in_result_called is used but never set.
- Tables that are used with group functions are not properly marked as
maybe_null, which is required if the table rows should be regarded as
null-complemented (not existing).
- The code that tries to detect if mixed_implicit_grouping should be set
didn't take into account all usage of fields and sum functions.
- Item_func::restore_to_before_no_rows_in_result() called the wrong
function.
- join->clear() does not use a table_map argument to clear_tables(),
which caused it to ignore constant tables.
- unclear_tables() does not correctly restore the status to what it
was before clear_tables().
Main bug fix was to always use a table_map argument to clear_tables() and
always use join->clear() and clear_tables() together with unclear_tables().
Other fixes:
- Fixed Item_func::restore_to_before_no_rows_in_result()
- Set 'join->no_rows_in_result_called' when no_rows_in_result_set()
is called.
- Removed unused argument from setup_end_select_func().
- More code comments
- Ensure that end_send_group() modifies the same fields as are in the
result set.
- Changed return_zero_rows() to use pointers instead of references,
similar to the rest of the code.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
A GROUP BY query which uses "MIN(pk)" and has "pk<>const" in the
WHERE clause would produce wrong result when handled with "Using index
for group-by". Here "pk" column is the table's primary key.
The problem was introduced by the fix for MDEV-23634. It made the range
optimizer not produce ranges for conditions of the form "pk != const".
However, LooseScan code requires that the optimizer is able to
convert the condition on the MIN/MAX column into an equivalent range.
The range is used to locate the row that has the MIN/MAX value.
LooseScan checks this in check_group_min_max_predicates(). This fix
makes the code in that function take into account that "pk != const"
does not produce a range.
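A hypothetical example of the affected query shape (InnoDB, made-up names;
the secondary index on (a) implicitly ends with the primary key):
CREATE TABLE t1 (pk INT PRIMARY KEY, a INT, KEY(a)) ENGINE=InnoDB;
-- 'Using index for group-by' must turn the condition on the MIN/MAX
-- column into a range to locate the row with MIN(pk), but after
-- MDEV-23634 "pk != const" no longer produces such a range.
SELECT a, MIN(pk) FROM t1 WHERE pk <> 0 GROUP BY a;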
The problem was that when storing rows into a temporary table,
MIN/MAX items that were marked as constants (as their value had
been computed at the start of the query) would be reset.
Fixed by not resetting MIN/MAX items that are marked as const in
Item_sum_min_max::clear().
The problem was that "group_min_max optimization" does not work if
some aggregate functions, like COUNT(*), are used.
The function get_best_group_min_max() is using the join->sum_funcs
array to check which aggregate functions are used.
The bug was that aggregates in HAVING were not yet added to
join->sum_funcs at the time get_best_group_min_max() was called.
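For example, in a hypothetical query like the following, COUNT(*) appears
only in the HAVING clause, so it was not in join->sum_funcs when
get_best_group_min_max() checked which aggregates are used:
CREATE TABLE t1 (a INT, b INT, KEY(a,b));
SELECT a, MIN(b) FROM t1 GROUP BY a HAVING COUNT(*) > 1;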
Fixed by populating join->sum_funcs already in prepare, which means that
all sum functions will be in join->sum_funcs in get_best_group_min_max().
A benefit of this approach is that we can remove several calls to
make_sum_func_list() from the code and simplify the function.
I removed some wrong setting of 'sort_and_group'.
This variable is set when alloc_group_fields() is called, as part
of allocating the cache needed by end_send_group() and does not need
to be set by other functions.
One problematic thing was that Spider is using *join->sum_funcs to detect
at which stage the optimizer is and do internal calculations of aggregate
functions. Updating join->sum_funcs early caused Spider to fail when trying
to find min/max values in opt_sum_query().
Fixed by temporarily resetting sum_funcs during opt_sum_query().
Reviewer: Sergei Petrunia
The problem was that get_best_group_min_max() did not check if fields used
by the "group_min_max optimization" where used in sub queries.
Because of this, it did not detect that a key (b,a) was used in the WHERE
clause for the statement:
SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 ).
Fixed by also traversing the subqueries when checking if a field is used.
This disables group_min_max_optimization for the above query.
Reviewer: Sergei Petrunia
The issue was that calc_cond_selectivity_for_table preferred ranges with
many parts when deciding which selectivity to use.
Fixed by going through ranges according to the number of rows in the range.
This ensures that selectivity from ranges with few rows will be preferred
over ranges with many rows for indexes that use the same columns.
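A hypothetical illustration (made-up schema): two indexes cover the same
column 'a', and the selectivity for 'a' should come from whichever range
has the fewer estimated rows, not from the range with more key parts:
CREATE TABLE t1 (a INT, b INT, KEY k1(a), KEY k2(a,b));
SELECT * FROM t1 WHERE a = 1 AND b BETWEEN 1 AND 10;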
Handle "col<>const" in the same way that MDEV-21958 did for
"col NOT IN(const-list)": do not use the condition for range/index_merge
accesses if there is a UNIQUE KEY(col).
The testcase is in main/range.test. The rest of the test updates are
due to widespread use of 'pk<>1' in the testsuite. Changed the test
to use different but equivalent forms of the conditions.
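A hypothetical example of the handled pattern:
CREATE TABLE t1 (pk INT PRIMARY KEY, a INT);
-- "pk <> 1" excludes at most one row of a unique key, so it is no longer
-- used to build range/index_merge accesses.
SELECT * FROM t1 WHERE pk <> 1;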
A heuristic in best_access_path says that if, for an index, ref access
uses at least as many key parts as range access does, then range access
should not be considered.
The assumption made by this heuristic does not hold when
the range optimizer opted to use the group-by min-max optimization.
So the fix here would be to not consider the heuristic if
the range optimizer picked the usage of group-by min-max
optimization.
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
not 1.0. In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed bug when using test_if_cheaper_ordering where we didn't use
keyread if index was changed
- Fixed a bug where we didn't use index only read when using order-by-index
- Added keyread_time() to HEAP.
The default keyread_time() was optimized for blocks and not suitable for
HEAP. The effect was that HEAP preferred table scans over ranges for btree
indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have same cost for simple ranges
Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
we favor ref over range for simple queries.
- Fixed that matching_candidates_in_table() uses same number of records
as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
This was to ensure that handler::keyread_time() doesn't give
higher cost for heap tables than for normal tables. One effect of
this is that heap and derived tables stored in heap will prefer
key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
multi_range_read_info_const(). All index cost calculation is now
done through one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
so that for '=' ranges, 'ref' is preferred over 'range'.
- scan_time() now takes avg_io_costs() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
Change the defaults:
-histogram_size=0
+histogram_size=254
-histogram_type=SINGLE_PREC_HB
+histogram_type=DOUBLE_PREC_HB
Adjust the testcases:
- Some have ignorable changes in EXPLAIN outputs and
more counter increments due to EITS table reads.
- Testcases that meaningfully depend on the old defaults
are changed to use the old values (see the example below).
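A sketch of how such a testcase can restore the old defaults explicitly:
-- keep the pre-change histogram settings for this test only
SET @@histogram_size=0;
SET @@histogram_type='SINGLE_PREC_HB';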
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
How the pushdown is performed on the example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.
This patch contains a full implementation of the optimization
that allows using in-memory rowid / primary key filters built for range
conditions over indexes. In many cases usage of such filters reduces
the number of disk seeks spent on fetching table rows.
In this implementation the choice of which filter to apply
(if any) is made purely on cost-based considerations.
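One typical query shape that can benefit (hypothetical tables and keys):
t2 is accessed through ref on t2.a while the range condition on t2.b
builds an in-memory rowid filter that is checked before fetching t2 rows:
CREATE TABLE t1 (a INT, KEY(a));
CREATE TABLE t2 (a INT, b INT, KEY(a), KEY(b));
SELECT * FROM t1 JOIN t2 ON t2.a = t1.a WHERE t2.b BETWEEN 1 AND 100;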
This implementation re-architected the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.
Besides, this patch contains a better implementation of the generic
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.
This patch supports the feature for InnoDB and MyISAM.