The method JOIN_CACHE::init may fail (return 1) if some conditions on the
used join buffer is not satisfied. For example it fails if join_buffer_size
is greater than join_buffer_space_limit. The conditions should be checked
when running the EXPLAIN command for the query. That's why the method
JOIN_CACHE::init has to be called for EXPLAIN commands as well.
- The problem was that JOIN::prepare() tried to set TABLE::maybe_null
for a table in join. Non-merged semi-join tables 1) are present as
join's base tables on second EXECUTE, but 2) do not yet have a TABLE
object.
Worked around the problem by putting mixed_implicit_grouping into JOIN
object, and then passing it to JTBM tables in setup_jtbm_semi_joins().
- Don't pull out a table out of a semi-join if it is on the inner side of an outer join.
- Make join->sort_by_table= get_sort_by_table(...) call after const table detection
is done. That way, the value of join->sort_by_table will match the actual execution.
Which will allow the code in setup_semijoin_dups_elimination() (search for
"Make sure that possible sorting of rows from the head table is not to be employed."
to see that "Using filesort" is going to be used together with Duplicate Elimination (
and change it to Using temporary + Using filesort)
Apply the patch from Patryk Pomykalski:
- create_internal_tmp_table_from_heap() will now return information whether
the last row that we tried to write was a duplicate row.
(mysql-5.6 also has this change)
- Call tmp_having->update_used_tables() *before* we have call JOIN::cleanup().
Making the call after join::cleanup() is not allowed, because subquery
predicate items walk parent join's JOIN_TAB structures. Which can be
invalidated by JOIN::cleanup().
reached by fix_fields() (via reference) before row which it belongs to (on the second execution)
and fix_field for row did not follow usual protocol for Items with argument
(first check that the item fixed then call fix_fields).
Item_row::fix_field fixed.
- In JOIN::exec(), make the having->update_used_tables() call before we've
made the JOIN::cleanup(full=true) call. The latter frees SJ-Materialization
structures, which correlated subquery predicate items attempt to walk afterwards.
- Let fix_semijoin_strategies_for_picked_join_order() set
POSITION::prefix_record_count for POSITION records that it copies from
SJ_MATERIALIZATION_INFO::tables.
(These records do not have prefix_record_count set, because they are optimized
as joins-inside-semijoin-nests, without full advance_sj_state() processing).
Part#1: make EXPLAIN's plan match the one by actual execution:
Item_subselect::used_tables() should return the same value irrespectively
of whether we're running an EXPLAIN or a SELECT.
- The problem was that convert_subq_to_jtbm() attached the semi-join
TABLE_LIST object into the wrong list: they used to attach it to the
end of parent_lex->leaf_tables.head()->next_local->...->next_local.
This was apparently inccorect, as one can construct an example where
JTBM nest is attached to a table that is inside some mergeable VIEW, which
breaks (causes crash) for name resolution on the subsequent statement
re-execution.
- Solution: Attach to the "right" list. The "wording" was copied from
st_select_lex::handle_derived.
- If LooseScan is used with quick select, require that quick select produces
data in key order (this disables use of MRR, which can return data in arbitrary order).
not be reproduced in the latest release of mariadb-5.3 as it was was fixed
by Sergey Petrunia when working on the problems concerning outer joins within
in subqueries converted to semi-joins.
- Disable use of join cache when we're using FirstMatch strategy, and the join
order is such that subquery's inner tables are interleaved with outer. Join
buffering code is incapable of handling such join orders.
- The testcase requires use of @@debug_optimizer_prefer_join_prefix to hit the bug,
but I'm pushing it anyway (including the mention of the variable in .test file),
so that it can be found and enabled when/if we get something comparable in the
main tree.
The problem was that LooseScan execution code assumed that tab->key holds
the index used for looseScan. This is only true when range or full index
scan are used. In case of ref access, the index is in tab->ref.key (and
tab->index==0 which explains how LooseScan passed tests with ref access: they
used one index)
Fixed by setting/using loosescan_key, which always the correct index#.
of mysql-5.6 code line. The bugs could not be reproduced in the latest release
of mariadb-5.3 as they were fixed either when the code of subquery optimization
was back-ported from mysql-6.0 or later when some other bugs were fixed.
of mysql-5.6 code line. The bugs could not be reproduced in the latest release
of mariadb-5.3 as they were fixed either when the code of subquery optimization
was back-ported from mysql-6.0 or later when some other bugs were fixed.
- Create/use do_copy_nullable_row_to_notnull() function for ref access, which is used
when copying from not-NULL field in table that can be NULL-complemented to not-NULL field.
fixed several defects in the greedy optimization:
1) The greedy optimizer calculated the 'compare-cost' (CPU-cost)
for iterating over the partial plan result at each level in
the query plan as 'record_count / (double) TIME_FOR_COMPARE'
This cost was only used locally for 'best' calculation at each
level, and *not* accumulated into the total cost for the query plan.
This fix added the 'CPU-cost' of processing 'current_record_count'
records at each level to 'current_read_time' *before* it is used as
'accumulated cost' argument to recursive
best_extension_by_limited_search() calls. This ensured that the
cost of a huge join-fanout early in the QEP was correctly
reflected in the cost of the final QEP.
To get identical cost for a 'best' optimized query and a
straight_join with the same join order, the same change was also
applied to optimize_straight_join() and get_partial_join_cost()
2) Furthermore to get equal cost for 'best' optimized query and a
straight_join the new code substrcated the same '0.001' in
optimize_straight_join() as it had been already done in
best_extension_by_limited_search()
3) When best_extension_by_limited_search() aggregated the 'best' plan a
plan was 'best' by the check :
'if ((search_depth == 1) || (current_read_time < join->best_read))'
The term '(search_depth == 1' incorrectly caused a new best plan to be
collected whenever the specified 'search_depth' was reached - even if
this partial query plan was more expensive than what we had already
found.
- Correctly handle plan refinement stage for LooseScan plans: run create_ref_for_key() if LooseScan
plan includes a ref access, and if we don't have any fixed key components, switch to a full index scan.
The function setup_sj_materialization_part1() forgot to set the value
of TABLE::map for any materialized IN subquery.
This could lead to wrong results for queries with subqueries that were
converted to queries with semijoins.
This bug in the function Loose_scan_opt::check_ref_access_part1 could lead
to choosing an invalid execution plan employing a loose scan access to a
semi-join table even in the cases when such access could not be used at all.
This could result in wrong answers for some queries with IN subqueries.
If the optimizer switch 'semijoin_with_cache' is set to 'off' then
join cache cannot be used to join inner tables of a semijoin.
Also fixed a bug in the function check_join_cache_usage() that led
to wrong output of the EXPLAIN commands for some test cases.
- when create_ref_for_key() is constructing a ref access for
a table that's inside a SJ-Materialization nest, it may not
use references to fields of tables that are unside the nest (because
these fields will not yet have values when ref access will be used)
The check was performed in the first of create_ref_for_key's loops (the
one which counts how many key parts are usable) but not in the second
(the one which actually fills the TABLE_REF structure).
of the 5.3 code line after a merge with 5.2 on 2010-10-28
in order not to allow the cost to access a joined table to be equal
to 0 ever.
Expanded data sets for many test cases to get the same execution plans
as before.
- convert_subq_to_jtbm() didn't check that subuqery optimization was successful. If it wasn't (in this
example because of @@max_join_size violation), it would proceed further and eventually crash when
trying to execute the un-optimized subquery.
- Provide fix_after_pullout() function for Item_in_optimizer and other Item_XXX classes (basically, all of them
that have eval_not_null_tables, which means they have special rules for calculating not_null_tables_cache value)
- If convert_join_subqueries_to_semijoins() decides to wrap Item_in_subselect in Item_in_optimizer,
it should do so in prep_on_expr/prep_where, too, as long as they are present.
There seems to be two possibilities of how we arrive in this function:
- prep_on_expr/prep_where==NULL, and will be set later by simplify_joins()
- prep_on_expr/prep_where!=NULL, and it is a copy_and_or_structure()-made copy of on_expr/where.
the latter can happen for some (but not all!) nested joins. This bug was that we didn't handle this case.
- Let join buffering code correctly take into account rowids needed
by DuplicateElimination when it is calculating minimum record sizes.
- In JOIN_CACHE::write_record_data, added asserts that prevent us from
writing beyond the end of the buffer.