- Item_in_subselect::inject_in_to_exists_cond() should not call
((Item_cond*)join->conds)->argument_list()->concat(join->cond_equal->current_level)
as that makes two lists share their tail, and the cond_equal list
will end up containing non-Item_equal objects when substitute_for_best_equal_field()
walks through join->conds and replaces all Item_equal objects with Item_func_eq objects.
- So, instead of using List::concat(), manually copy entries from one list to another.
- setup_sj_materialization() code failed to take into account that it can be that
the first [in join order ordering] table inside semi-join-materialization nest
is also an inner table wrt an outer join (that is embedded in the semi-join).
This can happen when all of the tables that are inside the semi-join but not inside
the outer join are constant.
- Made a trivial to not assume that table's embedding join nest is the semi-join
nest: instead, walk up the outer join nests until we reach the semi-join nest.
Analysis:
Partial matching is used even when there are no NULLs in
a materialized subquery, as long as the left NOT IN operand
may contain NULL values.
This case was not handled correctly in two different places.
First, the implementation of parital matching did not clear
the set of matching columns when the merge process advanced
to the next row.
Second, there is no need to perform partial matching at all
when the left operand has no NULLs.
Solution:
First fix subselect_rowid_merge_engine::partial_match() to
properly cleanup the bitmap of matching keys when advancing
to the next row.
Second, change subselect_partial_match_engine::exec() so
that when the materialized subquery doesn't contain any
NULLs, and the left operand of [NOT] IN doesn't contain
NULLs either, the method returns without doing any
unnecessary partial matching. The correct result in this
case is in Item::in_value.
Identified all test cases in the MySQL file subquery_mat.inc that are
not present in MariaDB. In total found 8 test cases for the following
MySQL bugs:
* BUG#49630 - not a bug in MariaDB, added test case
* BUG#52538 - not a bug in MariaDB, added test case (checked with VG)
* BUG#53103 - not a bug in MariaDB, added test case
* BUG#54511 - not a bug in MariaDB, added test case
* BUG#56367 - not a bug in MariaDB, added test case
* BUG#59833 - not a bug in MariaDB, added test case
* BUG#11852644 - not a bug in MariaDB, added test case
* BUG#12668294 - not a bug in MariaDB, added test case
All of these MySQL bugs are not present in MariaDB 5.3.
The comparison was based on the following version of
mysql-trunk:
revno: 3350 [merge]
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-trunk
timestamp: Mon 2011-08-08 12:42:09 +0300
message:
Merge mysql-5.5 to mysql-trunk.
The function matching_cond should take into account that
there may be always false constant conjunctive conditions
that has not been evaluated yet,for example, conjunctive
conditions with non-correlated subqueries.
- Set the default
- Adjust the testcases so that 'new' tests are run with optimizations turned on.
- Pull out relevant tests from "irrelevant" tests and run them with optimizations on.
- Run range.test and innodb.test with both mrr=on and mrr=off
semijoin=on,firstmatch=on,loosescan=on
to
semijoin=off,firstmatch=off,loosescan=off
Adjust the testcases:
- Modify subselect*.test and join_cache.test so that all tests
use the same execution paths as before (i.e. optimizations that
are being tested are enabled)
- Let all other test files run with the new default settings (i.e.
with new optimizations disabled)
- Copy subquery testcases from these files into t/subselect_extra.test
which will run them with new optimizations enabled.
Resolved all conflicts, bad merges and fixed a few minor bugs in the code.
Commented out the queries from multi_update, view, subselect_sj, func_str,
derived_view, view_grant that failed either with crashes in ps-protocol or
with wrong results.
The failures are clear indications of some bugs in the code and these bugs
are to be fixed.
Analysis:
During optimization of the subquery, in the call chain:
update_ref_and_keys -> add_key_fields ->
merge_key_fields -> Item_direct_ref::is_null -> Item_cache::is_null
The call to Item_cache::is_null() returns TRUE, which is wrong.
This results in Item_null replacing the field 'f3' in the KEY_FIELD,
then this Item_null is used for index access, producing a wrong result.
The reason why Item_cache::is_null returns wrong result is that
this Item_cache object is a cache of the left operand of IN, and was
updated in Item_in_optimizer::val_int. In MWL#89 the latter method is
called during the execution phase, which is after we optimize the subquery.
Therefore during the optization phase the left operand cache of IN was
not updated.
Solution:
Update the left operand cache during optimization if it is a constant.
This bug fix also discoveres and fixes a wrong IF statement in
convert_constant_item().
Analysis:
The wrong result is a consquence of sorting the subquery
result and then selecting only the first row due to the
artificial LIMIT 1 introduced by the fix_fields phase.
Normally, if there is an ORDER BY in a subquery, the ORDER
is removed (Item_in_subselect::select_in_like_transformer),
however if a GROUP BY is transformed into ORDER, this happens
later, after the removal of the ORDER clause of subqueries, so
we end up with a subquery with an ORDER clause, and an artificially
added LIMIT 1.
The reason why the same works in the main 5.3 without MWL#89, is
that the 5.3 performs all subquery transformations, including
IN->EXISTS before JOIN::optimize(). The beginning of JOIN::optimize
does:
if (having || (select_options & OPTION_FOUND_ROWS))
select_limit= HA_POS_ERROR;
which sets the limit back to infinity, thus 5.3 sorts the whole
subquery result, and IN performs the lookup into all subquery result
rows.
Solution:
Sorting of subqueries without LIMIT is meaningless. Since LIMIT in
subqueries is not supported, the patch removes sorting by setting
join->skip_sort_order= true
for each subquery JOIN object. This improves a number of execution
plans to not perform unnecessary sorting at all.
- "Using MRR" is no longer shown with range access.
- Instead, both range and BKA accesses will show one of the following:
= "Rowid-ordered scan"
= "Key-ordered scan"
= "Key-ordered Rowid-ordered scan"
depending on whether DS-MRR implementation will do scan keys in order, rowids in order,
or both.
- The patch also introduces a way for other storage engines/MRR implementations to
pass information to EXPLAIN output about the properties of employed MRR scans.
- Auto-merge with 5.3 main.
- Changed the test for LP BUG#719198 so that
an two more queries were added, and removed a
query that produces a wrong result due to an
unrelated problem. The wrong result is submitted
as a separate bug.
Analysis (BUG#719198):
The assert failed because the execution code for
partial matching is designed with the assumption that
NULLs on the left side are detected as early as possible,
and a NULL result is returned before any lookups are
performed at all.
However, in the case of an Item_cache object on the left
side, null was not detected properly, because detection
was done via Item::is_null(), which is not implemented at
all for Item_cache, and resolved to the default Item::is_null()
which always returns FALSE.
Solution:
Imlpement Item::is_null().
******
Analysis (BUG#730604):
The method Item_field::is_null() determines if an item is NULL from its
Item_field::field object. However, for Item_fields that represent internal
temporary tables, Item_field::field represents the field of the original
table that was the source for the temporary table (in this case t1.f3).
Both in the committed test case, and in the original bug report the current
value of t1.f3 is not NULL. This results in an incorrect count of NULLs
for this column. As a consequence, all related Ordered_key buffers are
allocated with incorrect sizes. Depending on the exact query and data,
these incorrect sizes result in various crashes or failed asserts.
Solution:
The correct value of the current field of the internal temp table is
in Item_field::result_field. This value is determined by
Item::is_null_result().
- In join buffering code, call join_tab_execution_startup() (#1) before we call join_tab_scan->open() (#2).
This is important with SJ-Materialization because #1 fills the materialized table, while
#2 will actually try to read the first row. Attempt to read the first row before we have
populated the materialized table would cause zero rows to be returned when actually there were matches.
Fix MySQL BUG#52344 - Subquery materialization: Assertion if subquery in on-clause of outer join
Original fix and comments from Oysten, adjusted for the different
subquery optimization in MariaDB.
"
Problem: If tables of an outer join are constant tables,
the associated on-clause will be evaluated in the optimization
phase. If the on-clause contains a query that is to be
executed with subquery materialization, this will not work
since the infrastructure for such execution is not yet set up.
Solution: Do not evaluate on-clause in optimization phase if
is_expensive() returns true for this clause. This is how the
problem is currently avoided for where-clauses. This works
because, Item_in_subselect::is_expensive_processor returns true
if query is to be executed with subquery materialization.
"
In addition, after MWL#89, in MariaDB if the IN-EXISTS strategy
is chosen, the in-to-exists predicates are insterted after
join_read_const_table() is called, resulting in evaluation of
the subquery without the in-to-exists predicates.
Merge 5.3-mwl89 into 5.3 main.
There is one remaining test failure in this merge:
innodb_mysql_lock2. All other tests have been checked to
deliver the same results/explains as 5.3-mwl89, including
the few remaining wrong results.
The bug was a result of missing logic to handle the case
when there are 'expensive' predicates that are not evaluated
during constant table optimization. Such is the case for
the IN predicate, which is considered expensive if it is
computed via materialization. In general this bug can be
triggered with any expensive predicate instead of IN.
When FALSE constant predicates are not evaluated during constant
optimization, the execution path changes so that instead of
setting JOIN::zero_result_cause after make_join_select, and
exiting JOIN::exec via the call to return_zero_rows(), execution
ends in JOIN::exec in the branch:
if (join->tables == join->const_tables)
{
...
else if (join->send_row_on_empty_set())
...
rc= join->result->send_data(*columns_list);
}
Unlike return_zero_rows(), this branch didn't evaluate the
having clause of the query.
The patch adds a call to evaluate the HAVING clause of a query even
when all tables are constant, because even for an empty result set
some aggregate functions may produce a NULL value.
- Added more tests to the MWL#89 specific test, and made the test more modular.
- Updated test files.
- Fixed a memory leak.
- More comments.
mysql-test/r/subselect_mat.result:
- Updated the test file to reflect the new optimizer switches related to
materialized subquery execution.
- Added one extra test to test all cases that expose BUG#40037 (this is an old bug from 5.x).
- Updated the test result with correct results that expose BUG#40037.
mysql-test/t/subselect_mat.test:
- Updated the test file to reflect the new optimizer switches related to
materialized subquery execution.
- Added one extra test to test all cases that expose BUG#40037 (this is an old bug from 5.x).
- Updated the test result with correct results that expose BUG#40037.
sql/sql_select.cc:
Fixed a memory leak reported by Valgrind.
was not cleaned up between PS re-executions. The reason was two-fold:
- a merge with mysql-6.0 missed select_union::cleanup() that should
have cleaned up the temp table, and
- the subclass of select_union used by materialization didn't call
the base class cleanup() method.
1. Changed the lazy optimization for subqueries that can be
materialized into bottom-up optimization during the optimization of
the main query.
The main change is implemented by the method
Item_in_subselect::setup_engine.
All other changes were required to correct problems resulting from
changing the order of optimization. Most of these problems followed
the same pattern - there are some shared structures between a
subquery and its parent query. Depending on which one is optimized
first (parent or child query), these shared strucutres may get
different values, thus resulting in an inconsistent query plan.
2. Changed the code-generation for subquery materialization to be
performed in runtime memory for each (re)execution, instead of in
statement memory (once per prepared statement).
- Item_in_subselect::setup_engine() no longer creates materialization
related objects in statement memory.
- Merged subselect_hash_sj_engine::init_permanent and
subselect_hash_sj_engine::init_runtime into
subselect_hash_sj_engine::init, which is called for each
(re)execution.
- Fixed deletion of the temp table accordingly.
mysql-test/r/subselect_mat.result:
Adjusted changed EXPLAIN because of earlier optimization of subqueries.