This commit adds the "materialization" block to the output of
EXPLAIN/ANALYZE FORMAT=JSON when materialized subqueries are involved
in query processing. In the case of ANALYZE, additional runtime
information is displayed, such as:
- chosen strategy of materialization
- number of partial match/index lookup loops
- sizes of partial match buffers
A condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or on fields that are equal to grouping fields.
Conditions that contain aggregate functions can't be pushed down.
How the pushdown is performed, by example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
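
The equivalence of the two queries above can be checked on a small data set.
The following is only a sketch: it uses Python's bundled SQLite as a stand-in
engine (not MariaDB, and not the optimizer code of this commit), and the table
contents are hypothetical values chosen for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INTEGER, b INTEGER, c INTEGER)")

# Hypothetical sample rows (a, b, c); not taken from the commit's test file.
sample_rows = [(1, 5, 20), (2, 7, 30), (3, 1, 10),
               (3, 4, 15), (4, 9, 11), (5, 2, 40)]
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", sample_rows)

# Query as written by the user: everything is in HAVING.
original = """
    SELECT t1.a, MAX(t1.b) FROM t1
    GROUP BY t1.a
    HAVING (t1.a > 2) AND (MAX(c) > 12)
    ORDER BY t1.a
"""

# Query after the pushdown: the condition on the grouping field is in WHERE.
rewritten = """
    SELECT t1.a, MAX(t1.b) FROM t1
    WHERE (t1.a > 2)
    GROUP BY t1.a
    HAVING (MAX(c) > 12)
    ORDER BY t1.a
"""

original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
print(original_rows == rewritten_rows)
```

Both forms return the same rows; the rewritten form lets rows with a<=2 be
discarded before grouping instead of after it.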
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities).
2. Save cond as a condition that can be pushed into the WHERE clause
of the select.
3. Remove cond from the HAVING clause if possible.
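
The three steps above can be sketched on a toy model. This is an illustration
under simplifying assumptions, not the server code: the HAVING clause is
modelled as a flat list of AND-ed conjuncts, and the names Conjunct,
split_having and equal_to_group are hypothetical. The real implementation works
on Item trees and also has to deal with OR formulas, which is why a condition
cannot always be removed from HAVING.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conjunct:
    """One AND-ed part of a HAVING clause (hypothetical toy model)."""
    text: str
    fields: frozenset            # fields the conjunct refers to
    has_aggregate: bool = False  # True if it contains an aggregate function


def split_having(conjuncts, group_by_fields, equal_to_group=()):
    """Split conjuncts into (pushable into WHERE, left in HAVING).

    A conjunct is pushable if it has no aggregate function and refers only
    to grouping fields or to fields known to be equal to grouping fields.
    """
    allowed = frozenset(group_by_fields) | frozenset(equal_to_group)
    pushable, remaining = [], []
    for conjunct in conjuncts:
        if not conjunct.has_aggregate and conjunct.fields <= allowed:
            pushable.append(conjunct)   # steps 1 and 2: extract and save
        else:
            remaining.append(conjunct)  # step 3: only this stays in HAVING
    return pushable, remaining


having = [
    Conjunct("t1.a>2", frozenset({"t1.a"})),
    Conjunct("MAX(c)>12", frozenset({"c"}), has_aggregate=True),
]
to_where, new_having = split_having(having, ["t1.a"])
print([c.text for c in to_where], [c.text for c in new_having])
```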
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
A new test file, having_cond_pushdown.test, is added.
The function and_new_conditions_to_optimized_cond() incorrectly handled
WHERE conditions with one multiple equality and one IN subquery predicate
that could be converted into a jtbm semi-join. This could cause crashes.
The fix was prepared by Galina Shalygina.
This was a bug in the code of MDEV-12387 "Push conditions into materialized
subqueries". The bug manifested itself in rather rare situations. An
affected query must contain an IN subquery predicate whose left operand
is an outer field of a mergeable derived table or view and whose right
operand is a materialized subquery.
The erroneous code stripped off the Item_direct_ref wrapper from
the left operand of the IN subquery predicate when building the equalities
produced by the conversion of the predicate into a semi-join. As a result,
the left operand was no longer considered an outer reference and
used_tables() was calculated incorrectly. This caused a crash in the
function optimize_keyuse().
The problem is caused by the pushdown of a non-pushable condition 'cond'
into the materialized derived table/view. To prevent the pushdown, the map
of tables that are used in 'cond' must be updated. The call that performs
this update went missing with the MDEV-12387 changes. The call is now added
in the setup_jtbm_semi_joins() method.
The bug appeared because in MDEV-12387 the setup_jtbm_semi_joins() procedure
was divided into two functions, one called before the optimization of the
WHERE clause and the other after it. When the second function was called for
a degenerate jtbm semi-join, the equalities connecting the subselect and
the parent select were created, but the invocation of fix_fields() for these
equalities was missing.
The logic and the implementation scheme are similar to those of
MDEV-9197 "Pushdown conditions into non-mergeable views/derived tables".
How the pushdown is performed, by example:
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y) from t2 group by x);
-->
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y)
from t2
where x>3
group by x
having max(y)>10);
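
The equivalence of the two queries above can be checked on a small data set.
This is only a sketch: Python's bundled SQLite serves as a stand-in engine
(row-value IN predicates need SQLite 3.15 or later), it does not exercise the
MariaDB optimizer, and the table contents are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INTEGER, b INTEGER)")
conn.execute("CREATE TABLE t2 (x INTEGER, y INTEGER)")

# Hypothetical sample rows chosen for illustration.
conn.executemany("INSERT INTO t1 VALUES (?, ?)",
                 [(4, 20), (5, 8), (2, 30), (6, 12), (4, 5)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)",
                 [(4, 20), (4, 3), (5, 8), (2, 30), (6, 12), (6, 11), (3, 50)])

# Query as written by the user.
original = """
    SELECT * FROM t1
    WHERE a > 3 AND b > 10 AND
          (a, b) IN (SELECT x, MAX(y) FROM t2 GROUP BY x)
    ORDER BY a, b
"""

# Query after the pushdown: a>3 becomes x>3 in WHERE of the subquery,
# b>10 becomes MAX(y)>10 in HAVING of the subquery.
rewritten = """
    SELECT * FROM t1
    WHERE a > 3 AND b > 10 AND
          (a, b) IN (SELECT x, MAX(y)
                     FROM t2
                     WHERE x > 3
                     GROUP BY x
                     HAVING MAX(y) > 10)
    ORDER BY a, b
"""

original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
print(original_rows == rewritten_rows)
```

Both forms return the same rows; the rewritten subquery materializes fewer
groups because x<=3 rows and groups with max(y)<=10 are filtered out early.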
The implementation scheme:
1. Search for the condition cond that depends only on the fields
from the left part of the IN subquery (left_part).
2. Find the fields F_group in the select of the right part of the
IN subquery (right_part) that are used in the GROUP BY.
3. Extract from cond the condition cond_where that depends only on the
fields of left_part that occupy the same positions in left_part
(have the same indexes) as the F_group fields in the projection of
right_part.
4. Transform cond_where so that it can be pushed into the WHERE clause of
right_part, and delete cond_where from cond.
5. Transform cond so that it can be pushed into the HAVING clause of right_part.
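
The five steps above can be sketched on a toy model. This is an illustration
under simplifying assumptions, not the server code: conditions are modelled as
a flat list of AND-ed conjuncts, each given as a format template plus the set
of left_part fields it uses, and the function name split_for_in_subquery is
hypothetical.

```python
def split_for_in_subquery(conjuncts, left_part, projection, group_by):
    """Return (conditions for WHERE, conditions for HAVING) of right_part.

    conjuncts  -- list of (template, fields), e.g. ("{a}>3", {"a"})
    left_part  -- fields on the left of IN, e.g. ["a", "b"]
    projection -- select list of the subquery, e.g. ["x", "max(y)"]
    group_by   -- grouping fields of the subquery, e.g. ["x"]
    """
    # Pair every left_part field with the projection item at the same index.
    by_position = dict(zip(left_part, projection))
    # Steps 2 and 3: left_part fields whose counterpart is a grouping field.
    grouping_left = {left for left, proj in by_position.items()
                     if proj in group_by}

    where_conds, having_conds = [], []
    for template, fields in conjuncts:
        if not fields <= set(by_position):
            continue  # step 1: uses something outside left_part, not pushable
        # Steps 4 and 5: rewrite in terms of the subquery's own expressions.
        pushed = template.format(**by_position)
        if fields <= grouping_left:
            where_conds.append(pushed)   # cond_where goes into WHERE
        else:
            having_conds.append(pushed)  # the rest of cond goes into HAVING
    return where_conds, having_conds


conjuncts = [("{a}>3", {"a"}), ("{b}>10", {"b"}), ("{c}=1", {"c"})]
where_conds, having_conds = split_for_in_subquery(
    conjuncts, ["a", "b"], ["x", "max(y)"], ["x"])
print(where_conds, having_conds)
```

For the example query this yields x>3 for the WHERE clause and max(y)>10 for
the HAVING clause; the conjunct on c is left alone because c is not part of
left_part.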
The optimization is implemented in
Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the
variable condition_pushdown_for_subquery.
A new test file, in_subq_cond_pushdown.test, is added.
There are also some changes to setup_jtbm_semi_joins().
It is now split into two procedures: setup_degenerate_jtbm_semi_joins(),
which is called before optimize_cond() for cond, and setup_jtbm_semi_joins(),
which is called after optimize_cond().
The new setup_jtbm_semi_joins() is written so that its result is
the same as if it had been called before optimize_cond().
The code that is common to the pushdown into materialized derived tables and
into materialized IN subqueries is factored out into pushdown_cond_for_derived(),
Item_in_subselect::pushdown_cond_for_in_subquery() and
st_select_lex::pushdown_cond_into_where_clause().