- Create/use do_copy_nullable_row_to_notnull() function for ref access, which is used
when copying from not-NULL field in table that can be NULL-complemented to not-NULL field.
fixed several defects in the greedy optimization:
1) The greedy optimizer calculated the 'compare-cost' (CPU-cost)
for iterating over the partial plan result at each level in
the query plan as 'record_count / (double) TIME_FOR_COMPARE'
This cost was only used locally for 'best' calculation at each
level, and *not* accumulated into the total cost for the query plan.
This fix added the 'CPU-cost' of processing 'current_record_count'
records at each level to 'current_read_time' *before* it is used as
'accumulated cost' argument to recursive
best_extension_by_limited_search() calls. This ensured that the
cost of a huge join-fanout early in the QEP was correctly
reflected in the cost of the final QEP.
To get identical cost for a 'best' optimized query and a
straight_join with the same join order, the same change was also
applied to optimize_straight_join() and get_partial_join_cost()
2) Furthermore to get equal cost for 'best' optimized query and a
straight_join the new code substrcated the same '0.001' in
optimize_straight_join() as it had been already done in
best_extension_by_limited_search()
3) When best_extension_by_limited_search() aggregated the 'best' plan a
plan was 'best' by the check :
'if ((search_depth == 1) || (current_read_time < join->best_read))'
The term '(search_depth == 1' incorrectly caused a new best plan to be
collected whenever the specified 'search_depth' was reached - even if
this partial query plan was more expensive than what we had already
found.
- Correctly handle plan refinement stage for LooseScan plans: run create_ref_for_key() if LooseScan
plan includes a ref access, and if we don't have any fixed key components, switch to a full index scan.
The function setup_sj_materialization_part1() forgot to set the value
of TABLE::map for any materialized IN subquery.
This could lead to wrong results for queries with subqueries that were
converted to queries with semijoins.
This bug in the function Loose_scan_opt::check_ref_access_part1 could lead
to choosing an invalid execution plan employing a loose scan access to a
semi-join table even in the cases when such access could not be used at all.
This could result in wrong answers for some queries with IN subqueries.
If the optimizer switch 'semijoin_with_cache' is set to 'off' then
join cache cannot be used to join inner tables of a semijoin.
Also fixed a bug in the function check_join_cache_usage() that led
to wrong output of the EXPLAIN commands for some test cases.
- when create_ref_for_key() is constructing a ref access for
a table that's inside a SJ-Materialization nest, it may not
use references to fields of tables that are unside the nest (because
these fields will not yet have values when ref access will be used)
The check was performed in the first of create_ref_for_key's loops (the
one which counts how many key parts are usable) but not in the second
(the one which actually fills the TABLE_REF structure).
of the 5.3 code line after a merge with 5.2 on 2010-10-28
in order not to allow the cost to access a joined table to be equal
to 0 ever.
Expanded data sets for many test cases to get the same execution plans
as before.
- convert_subq_to_jtbm() didn't check that subuqery optimization was successful. If it wasn't (in this
example because of @@max_join_size violation), it would proceed further and eventually crash when
trying to execute the un-optimized subquery.
- Provide fix_after_pullout() function for Item_in_optimizer and other Item_XXX classes (basically, all of them
that have eval_not_null_tables, which means they have special rules for calculating not_null_tables_cache value)
- If convert_join_subqueries_to_semijoins() decides to wrap Item_in_subselect in Item_in_optimizer,
it should do so in prep_on_expr/prep_where, too, as long as they are present.
There seems to be two possibilities of how we arrive in this function:
- prep_on_expr/prep_where==NULL, and will be set later by simplify_joins()
- prep_on_expr/prep_where!=NULL, and it is a copy_and_or_structure()-made copy of on_expr/where.
the latter can happen for some (but not all!) nested joins. This bug was that we didn't handle this case.
- Let join buffering code correctly take into account rowids needed
by DuplicateElimination when it is calculating minimum record sizes.
- In JOIN_CACHE::write_record_data, added asserts that prevent us from
writing beyond the end of the buffer.
- The problem was that the code that made the check whether the subquery is an AND-part of the WHERE
clause didn't work correctly for nested subqueries. In particular, grand-child subquery in HAVING was
treated as if it was in the WHERE, which eventually caused an assert when replace_where_subcondition
looked for the subquery predicate in the WHERE and couldn't find it there.
- The fix: Removed implementation of "thd_marker approach". thd->thd_marker was used to determine the
location of subquery predicate: setup_conds() would set accordingly it when making the
{where|on_expr}->fix_fields(...)
call so that AND-parts of the WHERE/ON clauses can determine they are the AND-parts.
Item_cond_or::fix_fields(), Item_func::fix_fields(), Item_subselect::fix_fields (this one was missed),
and all other items-that-contain-items had to reset thd->thd_marker before calling fix_fields() for
their children items, so that the children can see they are not AND-parts of WHERE/ON.
- The "thd_marker approach" required that a lot of code in different locations maintains correct value of
thd->thd_marker, so it was replaced with:
- The new approach with mark_as_condition_AND_part does not keep context in thd->thd_marker. Instead,
setup_conds() now calls
{where|on_expr}->mark_as_condition_AND_part()
and implementations of that function make sure that:
- parts of AND-expressions get the mark_as_condition_AND_part() call
- Item_in_subselect objects record that they are AND-parts of WHERE/ON
- Make simplify_joins() set maybe_null=FALSE for tables that were on the
inner sides of inner joins and then were moved to the inner sides of semi-joins.
(This is not a real fix for this bug, even though it makes it to no longer repeat)
- Semi-join subquery predicates, i.e. ... WHERE outer_expr IN (SELECT ...) may have null-rejecting properties,
may allow to convert outer joins into inner.
- When convert_subq_to_sj() injected IN-equality into parent's WHERE/ON clause, it didn't call
$new_cond->top_level_item(), which would cause null-rejecting properties to be lost.
- Fixed, now the mentioned outer-to-inner conversion will really take place.
- Set the default
- Adjust the testcases so that 'new' tests are run with optimizations turned on.
- Pull out relevant tests from "irrelevant" tests and run them with optimizations on.
- Run range.test and innodb.test with both mrr=on and mrr=off
semijoin=on,firstmatch=on,loosescan=on
to
semijoin=off,firstmatch=off,loosescan=off
Adjust the testcases:
- Modify subselect*.test and join_cache.test so that all tests
use the same execution paths as before (i.e. optimizations that
are being tested are enabled)
- Let all other test files run with the new default settings (i.e.
with new optimizations disabled)
- Copy subquery testcases from these files into t/subselect_extra.test
which will run them with new optimizations enabled.
- JOIN::prepare would have set JOIN::table_count to incorrect value (bad merge of MWL 106)
- optimize_keyuse() would use table-bit as table number
(the change in optimize_keyuse is also the reason for query plan changes. Not
expected to have much effect because only handles cases of no index statistics)
- st_select_lex::register_dependency_item() ignored the fact that some of the
selects on the dependency paths could have been merged to their parents (because they
were mergeable VIEWs)
- Undo the incorrect fix in Item_subselect::recalc_used_tables(): do not call
fix_after_pullout() for Item_subselect::Ref_to_outside members.
- The crash was because a NOT NULL table column inside the subquery was considered NULLable
because the code thought it was on the inner side of an outer join nest.
- Fixed by making correct distinction between tables inside outer join nests and inside semi-join nests.
- Update test results
- Fix a problem with PS:
= convert_subq_to_sj() should not save where to prep_where or on_expr to prep_on_expr.
= After an unmerged subquery predicate has been pulled, it should call fix_after_pullout() for
outer_refs.
The code that added semi-join transformations missed checking
the state of the fixed flag for the items built with the
and_items function before calls of the fix_fields method.
This could lead to an abort failure when the first argument
of and_items() happened to be NULL.
- Don't attempt to construct FirstMatch access method if we've
just figured three lines above that it can't be used (because join
prefix doesn't have the needed tables), and so have set
pos->first_firstmatch_table= MAX_TABLES
Attempts to analyze join->positions[MAX_TABLES] caused valgrind warnings
- in advance_sj_state(), remember join->cur_dups_producing_tables in
pos->prefix_dups_producing_tables *before* we modify it, so that
restore_prev_sj_state() restores cur_dups_producing_tables in all cases.
- Updated test results in subselect_sj2[_jcl6].result (the original EXPLAIN
was invalid there)