Analysis:
The wrong result is a consquence of sorting the subquery
result and then selecting only the first row due to the
artificial LIMIT 1 introduced by the fix_fields phase.
Normally, if there is an ORDER BY in a subquery, the ORDER
is removed (Item_in_subselect::select_in_like_transformer),
however if a GROUP BY is transformed into ORDER, this happens
later, after the removal of the ORDER clause of subqueries, so
we end up with a subquery with an ORDER clause, and an artificially
added LIMIT 1.
The reason why the same works in the main 5.3 without MWL#89, is
that the 5.3 performs all subquery transformations, including
IN->EXISTS before JOIN::optimize(). The beginning of JOIN::optimize
does:
if (having || (select_options & OPTION_FOUND_ROWS))
select_limit= HA_POS_ERROR;
which sets the limit back to infinity, thus 5.3 sorts the whole
subquery result, and IN performs the lookup into all subquery result
rows.
Solution:
Sorting of subqueries without LIMIT is meaningless. Since LIMIT in
subqueries is not supported, the patch removes sorting by setting
join->skip_sort_order= true
for each subquery JOIN object. This improves a number of execution
plans to not perform unnecessary sorting at all.
- "Using MRR" is no longer shown with range access.
- Instead, both range and BKA accesses will show one of the following:
= "Rowid-ordered scan"
= "Key-ordered scan"
= "Key-ordered Rowid-ordered scan"
depending on whether DS-MRR implementation will do scan keys in order, rowids in order,
or both.
- The patch also introduces a way for other storage engines/MRR implementations to
pass information to EXPLAIN output about the properties of employed MRR scans.
- Auto-merge with 5.3 main.
- Changed the test for LP BUG#719198 so that
an two more queries were added, and removed a
query that produces a wrong result due to an
unrelated problem. The wrong result is submitted
as a separate bug.
Analysis (BUG#719198):
The assert failed because the execution code for
partial matching is designed with the assumption that
NULLs on the left side are detected as early as possible,
and a NULL result is returned before any lookups are
performed at all.
However, in the case of an Item_cache object on the left
side, null was not detected properly, because detection
was done via Item::is_null(), which is not implemented at
all for Item_cache, and resolved to the default Item::is_null()
which always returns FALSE.
Solution:
Imlpement Item::is_null().
******
Analysis (BUG#730604):
The method Item_field::is_null() determines if an item is NULL from its
Item_field::field object. However, for Item_fields that represent internal
temporary tables, Item_field::field represents the field of the original
table that was the source for the temporary table (in this case t1.f3).
Both in the committed test case, and in the original bug report the current
value of t1.f3 is not NULL. This results in an incorrect count of NULLs
for this column. As a consequence, all related Ordered_key buffers are
allocated with incorrect sizes. Depending on the exact query and data,
these incorrect sizes result in various crashes or failed asserts.
Solution:
The correct value of the current field of the internal temp table is
in Item_field::result_field. This value is determined by
Item::is_null_result().
- In join buffering code, call join_tab_execution_startup() (#1) before we call join_tab_scan->open() (#2).
This is important with SJ-Materialization because #1 fills the materialized table, while
#2 will actually try to read the first row. Attempt to read the first row before we have
populated the materialized table would cause zero rows to be returned when actually there were matches.
Fix MySQL BUG#52344 - Subquery materialization: Assertion if subquery in on-clause of outer join
Original fix and comments from Oysten, adjusted for the different
subquery optimization in MariaDB.
"
Problem: If tables of an outer join are constant tables,
the associated on-clause will be evaluated in the optimization
phase. If the on-clause contains a query that is to be
executed with subquery materialization, this will not work
since the infrastructure for such execution is not yet set up.
Solution: Do not evaluate on-clause in optimization phase if
is_expensive() returns true for this clause. This is how the
problem is currently avoided for where-clauses. This works
because, Item_in_subselect::is_expensive_processor returns true
if query is to be executed with subquery materialization.
"
In addition, after MWL#89, in MariaDB if the IN-EXISTS strategy
is chosen, the in-to-exists predicates are insterted after
join_read_const_table() is called, resulting in evaluation of
the subquery without the in-to-exists predicates.
Merge 5.3-mwl89 into 5.3 main.
There is one remaining test failure in this merge:
innodb_mysql_lock2. All other tests have been checked to
deliver the same results/explains as 5.3-mwl89, including
the few remaining wrong results.
The bug was a result of missing logic to handle the case
when there are 'expensive' predicates that are not evaluated
during constant table optimization. Such is the case for
the IN predicate, which is considered expensive if it is
computed via materialization. In general this bug can be
triggered with any expensive predicate instead of IN.
When FALSE constant predicates are not evaluated during constant
optimization, the execution path changes so that instead of
setting JOIN::zero_result_cause after make_join_select, and
exiting JOIN::exec via the call to return_zero_rows(), execution
ends in JOIN::exec in the branch:
if (join->tables == join->const_tables)
{
...
else if (join->send_row_on_empty_set())
...
rc= join->result->send_data(*columns_list);
}
Unlike return_zero_rows(), this branch didn't evaluate the
having clause of the query.
The patch adds a call to evaluate the HAVING clause of a query even
when all tables are constant, because even for an empty result set
some aggregate functions may produce a NULL value.
- Added more tests to the MWL#89 specific test, and made the test more modular.
- Updated test files.
- Fixed a memory leak.
- More comments.
mysql-test/r/subselect_mat.result:
- Updated the test file to reflect the new optimizer switches related to
materialized subquery execution.
- Added one extra test to test all cases that expose BUG#40037 (this is an old bug from 5.x).
- Updated the test result with correct results that expose BUG#40037.
mysql-test/t/subselect_mat.test:
- Updated the test file to reflect the new optimizer switches related to
materialized subquery execution.
- Added one extra test to test all cases that expose BUG#40037 (this is an old bug from 5.x).
- Updated the test result with correct results that expose BUG#40037.
sql/sql_select.cc:
Fixed a memory leak reported by Valgrind.
was not cleaned up between PS re-executions. The reason was two-fold:
- a merge with mysql-6.0 missed select_union::cleanup() that should
have cleaned up the temp table, and
- the subclass of select_union used by materialization didn't call
the base class cleanup() method.
1. Changed the lazy optimization for subqueries that can be
materialized into bottom-up optimization during the optimization of
the main query.
The main change is implemented by the method
Item_in_subselect::setup_engine.
All other changes were required to correct problems resulting from
changing the order of optimization. Most of these problems followed
the same pattern - there are some shared structures between a
subquery and its parent query. Depending on which one is optimized
first (parent or child query), these shared strucutres may get
different values, thus resulting in an inconsistent query plan.
2. Changed the code-generation for subquery materialization to be
performed in runtime memory for each (re)execution, instead of in
statement memory (once per prepared statement).
- Item_in_subselect::setup_engine() no longer creates materialization
related objects in statement memory.
- Merged subselect_hash_sj_engine::init_permanent and
subselect_hash_sj_engine::init_runtime into
subselect_hash_sj_engine::init, which is called for each
(re)execution.
- Fixed deletion of the temp table accordingly.
mysql-test/r/subselect_mat.result:
Adjusted changed EXPLAIN because of earlier optimization of subqueries.
- Unify EXPLAIN printout for <subqueryN> tables with regular tables
- Update test results for <subqueryN> tables:
s/unique_key/distinct_key/g
s/1.0/100.0/ for "filtered" column
- Change "SUBQUERY#n" to "<subquery{n}>" in EXPLAIN output. We need to it to be
lowercase so that EXPLAIN results do not differ in case between systems with
case-sensitive and case-insensitive filesystems.
- Remove garbage comments, add better comments.
- Code cleanu.
- Make MWL#90 code require @@optimizer_switch='semijoin=on'
- Update test results with the above
- Fork subselect_mat.test - we want to check both semi-join materialization,
which now has broader scope and non-semijoin materialization.
was not cleaned up between PS re-executions. The reason was two-fold:
- a merge with mysql-6.0 missed select_union::cleanup() that should
have cleaned up the temp table, and
- the subclass of select_union used by materialization didn't call
the base class cleanup() method.
- Add Item_in_subselect::get_identifier() that returns subquery's id
- Change select_describe() to produce output in new format
- Update test results (checked)
Bug#48213 Materialized subselect crashes if using GEOMETRY type
The problem occurred because during semi-join a materialized table
was created which contained a GEOMETRY column, which is a specialized
BLOB column. This caused an segmentation fault because such tables will
have extra columns, and the semi-join code was not prepared for that.
The solution is to disable materialization when Blob/Geometry columns would
need to be materialized. Blob columns cannot be used for index look-up
anyway, so it does not makes sense to use materialization.
This fix implies that it is detected earlier that subquery materialization
can not be used. The result of that is that in->exist optimization may
be performed for such queries. Hence, extended query plans for such
queries had to be updated.
mysql-test/r/subselect_mat.result:
Update extended query plan for subqueries that cannot use materialization
due to Blobs.
mysql-test/r/subselect_sj.result:
Updated result file.
mysql-test/r/subselect_sj_jcl6.result:
Update result file.
mysql-test/t/subselect_sj.test:
Add test case for Bug#48213 that verifies that semi-join works when subquery select list contain Blob columns. Also verify that materialization is not
used.
sql/opt_subselect.cc:
Disable materialization for semi-join/subqueries when the subquery select list
contain Blob columns.
BUG#50019: Wrong result for IN-subquery with materialization
- Fix equality substitution in presense of semi-join materialization, lookup and scan variants
(started off from fix by Evgen Potemkin, then modified it to work in all cases)