Workaround patch: Do not remove GROUP BY clause when it has
subquer(ies) in it.
remove_redundant_subquery_clauses() removes redundant GROUP BY clause
from queries in form:
expr IN (SELECT no_aggregates GROUP BY ...)
expr {CMP} {ALL|ANY|SOME} (SELECT no_aggregates GROUP BY ...)
This hits problems when the GROUP BY clause itself has subquer(y/ies).
This patch is just a workaround: it disables removal of GROUP BY clause
if the clause has one or more subqueries in it.
Tests:
- subselect_elimination.test has all known crashing cases.
- subselect4.result, insert_select.result are updated.
Note that in some cases results of SELECT are changed too (not just
EXPLAINs). These are caused by non-deterministic SQL: when running a
query like:
x > ANY( SELECT col1 FROM t1 GROUP BY constant_expression)
without removing the GROUP BY, the executor is free to pick the value
of t1.col1 from any row in the GROUP BY group (denote it $COL1_VAL).
Then, it computes x > ANY(SELECT $COL1_VAL).
When running the same query and removing the GROUP BY:
x > ANY( SELECT col1 FROM t1)
the executor will actually check all rows of t1.
If a query has a HAVING clause that contains a predicate with a constant
IN subquery whose lef part in its turn is a subquery and the predicate is
subject to pushdown from HAVING to WHERE then execution of the query could
cause a crash of the server.
The cause of the problem was the missing implementation of the walk()
method for the class Item_in_optimizer. As a result in some cases the left
operand of the Item_in_optimizer condition could be traversed twice by
the walk procedure. For many call-back functions used as an argument of
this procedure it does not matter. Yet it matters for the call-back
function cleanup_excluding_immutables_processor() used in pushdown of
predicates from HAVING to WHERE. If the processed item is marked with
the IMMUTABLE_FL flag then the processor just removes this flag, otherwise
it performs cleanup of the item making it unfixed. If an item is marked
with an the IMMUTABLE_FL and it traversed with this processor twice then
it becomes unfixed after the second traversal though the flag indicates
that the item should not be cleaned up.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Enable unusable key notes for non-equality predicates:
<, <=, =>, >, BETWEEN, IN, LIKE
Note, in some scenarios it displays duplicate notes, e.g.
for queries with ORDER BY:
SELECT * FROM t1
WHERE indexed_string_column >= 10
ORDER BY indexed_string_column
LIMIT 5;
This should be tolarable. Getting rid of the diplicate note
completely would need a much more complex patch, which is
not desiable in 10.6.
Details:
- Changing RANGE_OPT_PARAM::note_unusable_keys from bool
to a new data type Item_func::Bitmap, so the caller can
choose with a better granuality which predicates
should raise unusable key notes inside the range optimizer:
a. all predicates (=, <=>, <, <=, =>, >, BETWEEN, IN, LIKE)
b. all predicates except equality (=, <=>)
c. none of the predicates
"b." is needed because in some scenarios equality predicates (=, <=>)
send unusable key notes at an earlier stage, before the range optimizer,
during update_ref_and_keys(). Calling the range optimizer with
"all predicates" would produce duplicate notes for = and <=> in such cases.
- Fixing get_quick_record_count() to call the range optimizer
with "all predicates except equality" instead of "none of the predicates".
Before this change the range optimizer suppressed all notes for
non-equality predicates: <, <=, =>, >, BETWEEN, IN, LIKE.
This actually fixes the reported problem.
- Fixing JOIN::make_range_rowid_filters() to call the range optimizer
with "all predicates except equality" instead of "all predicates".
Before this change the range optimizer produced duplicate notes
for = and <=> during a rowid_filter optimization.
- Cleanup:
Adding the op_collation argument to Field::raise_note_cannot_use_key_part()
and displaying the operation collation rather than the argument collation
in the unusable key note. This is important for operations with more than
two arguments: BETWEEN and IN, e.g.:
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
BETWEEN 'a' AND 'b' COLLATE utf8mb3_unicode_ci;
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
IN ('a', 'b' COLLATE utf8mb3_unicode_ci);
The note for 'a' now prints utf8mb3_unicode_ci as the collation.
which is the collation of the entire operation:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_unicode_ci`
Before this change it printed the collation of 'a',
so the note was confusing:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_general_ci`"
The code inside Item_subselect::fix_fields() could fail to check
that left expression had an Item_row, like this:
(('x', 1.0) ,1) IN (SELECT 'x', 1.23 FROM ... UNION ...)
In order to hit the failure, the first SELECT of the subquery had
to be a degenerate no-tables select. In this case, execution will
not enter into Item_in_subselect::create_row_in_to_exists_cond()
and will not check if left_expr is composed of scalars.
But the subquery is a UNION so as a whole it is not degenerate.
We try to create an expression cache for the subquery.
We create a temp.table from left_expr columns. No field is created
for the Item_row. Then, we crash when trying to add an index over a
non-existent field.
Fixed by moving the left_expr cardinality check to a point in
check_and_do_in_subquery_rewrites() which gets executed for all
cases.
It's better to make the check early so we don't have to care about
subquery rewrite code hitting Item_row in left_expr.
ANALYZE FORMAT=JSON output now includes table.r_engine_stats which
has the engine statistics. Only non-zero members are printed.
Internally: EXPLAIN data structures Explain_table_acccess and
Explain_update now have handler* handler_for_stats pointer.
It is used to read statistics from handler_for_stats->handler_stats.
The following applies only to 10.9+, backport doesn't use it:
Explain data structures exist after the tables are closed. We avoid
walking invalid pointers using this:
- SQL layer calls Explain_query::notify_tables_are_closed() before
closing tables.
- After that call, printing of JSON output is disabled. Non-JSON output
can be printed but we don't access handler_for_stats when doing that.
MDEV-30668 Set function aggregated in outer select used in view definition
This patch fixes two bugs concerning views whose specifications contain
subqueries with set functions aggregated in outer selects.
Due to the first bug those such views that have implicit grouping were
considered as mergeable. This led to wrong result sets for selects from
these views.
Due to the second bug the aggregation select was determined incorrectly and
this led to bogus error messages.
The patch added several test cases for these two bugs and for four other
duplicate bugs.
The patch also enables view-protocol for many other test cases.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This bug affected some queries with an IN/ALL/ANY predicand or an EXISTS
predicate whose subquery contained a GROUP BY clause that could be
eliminated. If this clause used a IN/ALL/ANY predicand whose left operand
was a single-value subquery then execution of the query caused a crash of
the server after invokation of remove_redundant_subquery_clauses().
The crash was caused by an attempt to exclude the unit for the single-value
subquery from the query tree for the second time by the function
Item_subselect::eliminate_subselect_processor().
This bug had been masked by the bug MDEV-28617 until a fix for the latter
that properly excluded units was pushed into 10.3.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This bug could cause a crash of the server when executing queries containing
ANY/ALL predicands with redundant subqueries in GROUP BY clauses.
These subqueries are eliminated by remove_redundant_subquery_clause()
together with elimination of GROUP BY list containing these subqueries.
However the references to the elements of the GROUP BY remained in the
JOIN::all_fields list of the right operand of of the ALL/ANY predicand.
Later these references confused make_aggr_tables_info() when forming
proper execution structures after ALL/ANY predicands had been replaced
with expressions containing MIN/MAX set functions.
The patch just removes these references from JOIN::all_fields list used
by the subquery of the ALL/ANY predicand when its GROUP BY clause is
eliminated.
Approved by Oleksandr Byelkin <sanja@mariadb.com>