It was found that unnecessary work of building Ordered_key structures
is being done when processing NULL-aware materialization for IN predicates
having only one column. In fact, the logic for that simplified case can be
expressed as follows.
Say we have predicate left_expr IN (SELECT <subq1>), where left_expr is
scalar(not a tuple).
Then
if (left_expr is NULL) {
if (subq1 produced any rows) {
// note that we don't care if subq1 has produced
// NULLs or not.
NULL IN (<some values>) -> UNKNOWN, i.e. NULL.
} else {
NULL IN ({empty-set}) -> FALSE.
}
} else {
// left_expr is a non-NULL value
if (subq1 output has a match for left_expr) {
left_expr IN (..., left_expr ...) -> TRUE
} else {
// no "known" matches.
if (subq1 output has a NULL) {
left_expr IN ( ... NULL ...) ->
(NULL could have been a match or not)
-> NULL.
} else {
// subq1 didn't produce any "UNKNOWNs" so
// we're positive there weren't any matches
-> FALSE.
}
}
}
This commit introduces subselect_single_column_partial_engine class
implementing the logic described.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
Item_exists_subselect::fix_length_and_dec() sets explicit_limit to 1.
In the exists2in transformation it resets select_limit to NULL. For
consistency we should reset explicity_limit too.
This fixes a bug where spider table returns wrong results for queries
that gets through exists2in transformation when semijoin is off.
This commits adds the "materialization" block to the output of
EXPLAIN/ANALYZE FORMAT=JSON when materialized subqueries are involved
into processing. In the case of ANALYZE additional runtime information
is displayed, such as:
- chosen strategy of materialization
- number of partial match/index lookup loops
- sizes of partial match buffers
JOIN_CACHE has a light-weight initialization mode that's targeted at
EXPLAINs. In that mode, JOIN_CACHE objects are not able to execute.
Light-weight mode was used whenever the statement was an EXPLAIN. However
the EXPLAIN can execute subqueries, provided they enumerate less than
@@expensive_subquery_limit rows.
Make sure we use light-weight initialization mode only when the select is
more expensive @@expensive_subquery_limit.
Also add an assert into JOIN_CACHE::put_record() which prevents its use
if it was initialized for EXPLAIN only.
The bug is fixed by the patch ported from MySQL. See the comprehensive
description below.
commit 455c4e8810c76430719b1a08a63ca0f69f44678a
Author: Guilhem Bichot <guilhem.bichot@oracle.com>
Date: Fri Mar 13 17:51:27 2015 +0100
Bug#17668844: CRASH/ASSERT AT ITEM_TYPE_HOLDER::VAL_STR IN ITEM.C
We have a predicate of the form:
literal_row <=> (a UNION)
The subquery is constant, so Item_cache objects are used for its
SELECT list.
In order, this happens:
- Item_subselect::fix_fields() calls select_lex_unit::prepare,
where we create Item_type_holder's
(appended to unit->types list), create the tmp table (using type info
found in unit->types), and call fill_item_list() to put the
Item_field's of this table into unit->item_list.
- Item_subselect::fix_length_and_dec() calls set_row() which
makes Item_cache's of the subquery wrap the Item_type_holder's
- When/if a first result row is found for the subquery,
Item_cache's are re-pointed to unit->item_list
(i.e. Item_field objects which reference the UNION's tmp table
columns) (see call to Item_singlerow_subselect::store()).
- In our subquery, no result row is found, so the Item_cache's
still wrap Item_type_holder's; evaluating '<=>' reads the
value of those, but Item_type_holder objects are not expected to be
evaluated.
Fix: instead of putting unit->types into Item_cache, and later
replacing with unit->item_list, put unit->item_list in Item_cache from
the start.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This is the follow-up patch that removes explicit use of thd->stmt_arena
for memory allocation and replaces it with call of the method
THD::active_stmt_arena_to_use()
Additionally, this patch adds extra DBUG_ASSERT to check that right
query arena is in use.
A subquery in form "(SELECT not_null_value LIMIT 1 OFFSET 1)" will
produce no rows which will translate into scalar SQL NULL value.
The code in Item_singlerow_subselect::fix_length_and_dec() failed to
take the LIMIT/OFFSET clause into account and used to set
item_subselect->maybe_null=0, despite that SQL NULL will be produced.
If such subselect was used in ORDER BY, this would cause a crash in
filesort() code when it would get a NULL value for a not-nullable item.
also made subselect_engine::no_tables() const function.
The code inside Item_subselect::fix_fields() could fail to check
that left expression had an Item_row, like this:
(('x', 1.0) ,1) IN (SELECT 'x', 1.23 FROM ... UNION ...)
In order to hit the failure, the first SELECT of the subquery had
to be a degenerate no-tables select. In this case, execution will
not enter into Item_in_subselect::create_row_in_to_exists_cond()
and will not check if left_expr is composed of scalars.
But the subquery is a UNION so as a whole it is not degenerate.
We try to create an expression cache for the subquery.
We create a temp.table from left_expr columns. No field is created
for the Item_row. Then, we crash when trying to add an index over a
non-existent field.
Fixed by moving the left_expr cardinality check to a point in
check_and_do_in_subquery_rewrites() which gets executed for all
cases.
It's better to make the check early so we don't have to care about
subquery rewrite code hitting Item_row in left_expr.
This bug affected EXPLAIN EXTENDED command for single-table DELETE that
used an IN subquery in its WHERE clause. A crash happened if the optimizer
chose to employ index_subquery or unique_subquery access when processing
such command.
The crash happened when the command tried to print the transformed query.
In the current code of 10.4 for single-table DELETE statements the output
of any explain command is produced after the join structures of all used
subqueries have been destroyed. JOIN::destroy() sets the field tab of the
JOIN_TAB structures created for subquery tables to NULL. As a result
subselect_indexsubquery_engine::print(), subselect_indexsubquery_engine()
cannot use this field to get the alias name of the joined table.
This patch suggests to use the field TABLE_LIST::TAB that can be accessed
from JOIN_TAB::tab_list to get the alias name of the joined table.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This patch allowed transformation of EXISTS subqueries into equivalent
IN predicands at the top level of WHERE conditions for multi-table UPDATE
and DELETE statements. There was no reason to prohibit the transformation
for such statements. The transformation provides more opportunities of
using semi-join optimizations.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Nowdays subquery in a UNION's ORDER BY placed correctly in fake select,
the only problem was incorrect Name_resolution_contect is fixed by this
patch in parsing, so we do not need scanning/reseting of ORDER BY of
a union.
This bug affected some queries with an IN/ALL/ANY predicand or an EXISTS
predicate whose subquery contained a GROUP BY clause that could be
eliminated. If this clause used a IN/ALL/ANY predicand whose left operand
was a single-value subquery then execution of the query caused a crash of
the server after invokation of remove_redundant_subquery_clauses().
The crash was caused by an attempt to exclude the unit for the single-value
subquery from the query tree for the second time by the function
Item_subselect::eliminate_subselect_processor().
This bug had been masked by the bug MDEV-28617 until a fix for the latter
that properly excluded units was pushed into 10.3.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The cause of crash:
remove_redundant_subquery_clauses() removes redundant item expressions.
The primary goal of this is to remove the subquery items.
The removal process unlinks the subquery from SELECT_LEX tree, but does
not remove it from SELECT_LEX:::ref_pointer_array or from JOIN::all_fields.
Then, setup_subquery_caches() tries to wrap the subquery item in an
expression cache, which fails, the first reason for failure being that
the item doesn't have a query plan.
Solution: do not wrap eliminated items with expression cache.
(also added an assert to check that we do not attempt to execute them).
This may look like an incomplete fix: why don't we remove any mention
of eliminated item everywhere? The difficulties here are:
* items can be "un-removed" (see set_fake_select_as_master_processor)
* it's difficult to remove an element from ref_pointer_array: Item_ref
objects refer to elements of that array, so one can't shift elements in
it. Replacing eliminated subselect with a dummy Item doesn't look like a
good idea, either.
When single-row subquery fails with "Subquery reutrns more than 1 row"
error, it will raise an error and return NULL.
On the other hand, Item_singlerow_subselect sets item->maybe_null=0
for table-less subqueries like "(SELECT not_null_value)" (*)
This discrepancy (item with maybe_null=0 returning NULL) causes the
code in Type_handler_decimal_result::make_sort_key_part() to crash.
Fixed this by allowing inference (*) only when the subquery is NOT a
UNION.
[Patch idea by Igor Babaev]
Symptom: for IN (SELECT ...) subqueries using IN-to-EXISTS transformation,
the optimizer was unable to make inferences using multiple equalities.
The cause is code Item_in_subselect::inject_in_to_exists_cond() which may
break invariants that Multiple-Equality code relies on. In particular, it
may produce a WHERE condition with an empty Item_cond::m_cond_equal.
Fixed this by making Item_cond::m_cond_equal.
This patch reverts the fixes of the bugs MDEV-24454 and MDEV-25631 from
the commit 3690c549c6.
It leaves the changes in plugin/feedback/feedback.cc and corresponding
test files introduced in this commit intact.
Proper fixes for the bug MDEV-24454 and MDEV-25631 will follow immediately.
If the optimizer decides to rewrites a NOT IN predicand of the form
outer_expr IN (SELECT inner_col FROM ... WHERE subquery_where)
into the EXISTS subquery
EXISTS (SELECT 1 FROM ... WHERE subquery_where AND
(outer_expr=inner_col OR inner_col IS NULL))
then the pushed equality predicate outer_expr=inner_col can be used for
ref[or_null] access if inner_col is a reference to an indexed column.
In this case if there is a selective range condition over this column then
a Rowid filter may be employed coupled the with ref[or_null] access. The
filter is 'pushed' into the engine and in InnoDB currently it cannot be
used with index look-ups by primary key. The ref[or_null] access can be
used only when outer_expr is not NULL. Otherwise the original predicand
is evaluated to TRUE only if the result set returned by the query
SELECT 1 FROM ... WHERE subquery_where
is empty. When performing this evaluation the executor switches to the
table scan by primary key. Before this patch the pushed filter still
remained marked as active and the engine tried to apply the filter. This
was incorrect and in InnoDB this attempt to use the filter led to an
assertion failure.
This patch fixes the problem by disabling usage of the filter when
outer_expr is evaluated to NULL.
Use in_sum_func (and so nest_level) only in LEX to which SELECT lex belong to
Reduce usage of current_select (because it does not always point on the correct
SELECT_LEX, for example with prepare.
Change context for all classes inherited from Item_ident (was only for Item_field) in case of pushing down it to HAVING.
Now name resolution context have to have SELECT_LEX reference if the context is present.
Fixed feedback plugin stack usage.