- Provide fix_after_pullout() function for Item_in_optimizer and other Item_XXX classes (basically, all of them
that have eval_not_null_tables, which means they have special rules for calculating not_null_tables_cache value)
- If convert_join_subqueries_to_semijoins() decides to wrap Item_in_subselect in Item_in_optimizer,
it should do so in prep_on_expr/prep_where, too, as long as they are present.
There seems to be two possibilities of how we arrive in this function:
- prep_on_expr/prep_where==NULL, and will be set later by simplify_joins()
- prep_on_expr/prep_where!=NULL, and it is a copy_and_or_structure()-made copy of on_expr/where.
the latter can happen for some (but not all!) nested joins. This bug was that we didn't handle this case.
- Let join buffering code correctly take into account rowids needed
by DuplicateElimination when it is calculating minimum record sizes.
- In JOIN_CACHE::write_record_data, added asserts that prevent us from
writing beyond the end of the buffer.
- The problem was that the code that made the check whether the subquery is an AND-part of the WHERE
clause didn't work correctly for nested subqueries. In particular, grand-child subquery in HAVING was
treated as if it was in the WHERE, which eventually caused an assert when replace_where_subcondition
looked for the subquery predicate in the WHERE and couldn't find it there.
- The fix: Removed implementation of "thd_marker approach". thd->thd_marker was used to determine the
location of subquery predicate: setup_conds() would set accordingly it when making the
{where|on_expr}->fix_fields(...)
call so that AND-parts of the WHERE/ON clauses can determine they are the AND-parts.
Item_cond_or::fix_fields(), Item_func::fix_fields(), Item_subselect::fix_fields (this one was missed),
and all other items-that-contain-items had to reset thd->thd_marker before calling fix_fields() for
their children items, so that the children can see they are not AND-parts of WHERE/ON.
- The "thd_marker approach" required that a lot of code in different locations maintains correct value of
thd->thd_marker, so it was replaced with:
- The new approach with mark_as_condition_AND_part does not keep context in thd->thd_marker. Instead,
setup_conds() now calls
{where|on_expr}->mark_as_condition_AND_part()
and implementations of that function make sure that:
- parts of AND-expressions get the mark_as_condition_AND_part() call
- Item_in_subselect objects record that they are AND-parts of WHERE/ON
- Make simplify_joins() set maybe_null=FALSE for tables that were on the
inner sides of inner joins and then were moved to the inner sides of semi-joins.
(This is not a real fix for this bug, even though it makes it to no longer repeat)
- Semi-join subquery predicates, i.e. ... WHERE outer_expr IN (SELECT ...) may have null-rejecting properties,
may allow to convert outer joins into inner.
- When convert_subq_to_sj() injected IN-equality into parent's WHERE/ON clause, it didn't call
$new_cond->top_level_item(), which would cause null-rejecting properties to be lost.
- Fixed, now the mentioned outer-to-inner conversion will really take place.
- Set the default
- Adjust the testcases so that 'new' tests are run with optimizations turned on.
- Pull out relevant tests from "irrelevant" tests and run them with optimizations on.
- Run range.test and innodb.test with both mrr=on and mrr=off
semijoin=on,firstmatch=on,loosescan=on
to
semijoin=off,firstmatch=off,loosescan=off
Adjust the testcases:
- Modify subselect*.test and join_cache.test so that all tests
use the same execution paths as before (i.e. optimizations that
are being tested are enabled)
- Let all other test files run with the new default settings (i.e.
with new optimizations disabled)
- Copy subquery testcases from these files into t/subselect_extra.test
which will run them with new optimizations enabled.
- JOIN::prepare would have set JOIN::table_count to incorrect value (bad merge of MWL 106)
- optimize_keyuse() would use table-bit as table number
(the change in optimize_keyuse is also the reason for query plan changes. Not
expected to have much effect because only handles cases of no index statistics)
- st_select_lex::register_dependency_item() ignored the fact that some of the
selects on the dependency paths could have been merged to their parents (because they
were mergeable VIEWs)
- Undo the incorrect fix in Item_subselect::recalc_used_tables(): do not call
fix_after_pullout() for Item_subselect::Ref_to_outside members.
- The crash was because a NOT NULL table column inside the subquery was considered NULLable
because the code thought it was on the inner side of an outer join nest.
- Fixed by making correct distinction between tables inside outer join nests and inside semi-join nests.
- Update test results
- Fix a problem with PS:
= convert_subq_to_sj() should not save where to prep_where or on_expr to prep_on_expr.
= After an unmerged subquery predicate has been pulled, it should call fix_after_pullout() for
outer_refs.
The code that added semi-join transformations missed checking
the state of the fixed flag for the items built with the
and_items function before calls of the fix_fields method.
This could lead to an abort failure when the first argument
of and_items() happened to be NULL.
- Don't attempt to construct FirstMatch access method if we've
just figured three lines above that it can't be used (because join
prefix doesn't have the needed tables), and so have set
pos->first_firstmatch_table= MAX_TABLES
Attempts to analyze join->positions[MAX_TABLES] caused valgrind warnings
- in advance_sj_state(), remember join->cur_dups_producing_tables in
pos->prefix_dups_producing_tables *before* we modify it, so that
restore_prev_sj_state() restores cur_dups_producing_tables in all cases.
- Updated test results in subselect_sj2[_jcl6].result (the original EXPLAIN
was invalid there)
- Make optimize_wo_join_buffering() handle cases where position->records_read=0 (this
happens for outer joins that have constant tables inside them). The number of
0 is not correct (should be 1 because outer join will produce at least a NULL-complemented
record) but for now we just make it work with incorrect number.
- Let "mysqld --help --verbose" list all optimizer options
- Make it possible to add new @@optimizer_switch flags w/o causing .result
changes all over the testsuite:
= Remove "select @@optimizer_switch" from tests that do not need all switches
= Move @@optimizer_switch-specific tests to t/optimizer_switch.test
Bug#48623: Multiple subqueries are optimized incorrectly
The function setup_semijoin_dups_elimination() has a major loop that
goes through every table in the JOIN object. Usually, there is a normal
"plus one" increment in the for loop that implements this, but each semijoin
nest is treated as one entity and there is another increment that skips past
the semijoin nest to the next table in the JOIN object. However, when
combining these two increments, the next joined table is skipped, and if that
happens to be the start of another semijoin nest, the correct processing
for that nest will not be carried out.
mysql-test/r/subselect_sj.result:
Added test results for bug#48623
mysql-test/r/subselect_sj_jcl6.result:
Added test results for bug#48623
mysql-test/t/subselect_sj.test:
Added test case for bug#48623
sql/opt_subselect.cc:
Omitted the "plus one" increment in the for loop, added "plus one"
in the remaining switch case, fixed coding style issue in remaining
increment operations.
Fix two problems:
1. Let optimize_semijoin_nests() reset sj_nest->sjmat_info irrespectively
of value of optimizer_flag. We need this in case somebody has turned optimization
off between reexecutions of the same statement.
2. Do not pull out constant tables out of semi-join nests. The problem is that pullout
operation is not undoable, and if a table is constant because it is 1/0-row table it
may cease to be constant on the next execution. Note that tables that are constant
because of possible eq_ref(const) access will still be pulled out as they are
considered functionally-dependent.
Bug#48213 Materialized subselect crashes if using GEOMETRY type
The problem occurred because during semi-join a materialized table
was created which contained a GEOMETRY column, which is a specialized
BLOB column. This caused an segmentation fault because such tables will
have extra columns, and the semi-join code was not prepared for that.
The solution is to disable materialization when Blob/Geometry columns would
need to be materialized. Blob columns cannot be used for index look-up
anyway, so it does not makes sense to use materialization.
This fix implies that it is detected earlier that subquery materialization
can not be used. The result of that is that in->exist optimization may
be performed for such queries. Hence, extended query plans for such
queries had to be updated.
mysql-test/r/subselect_mat.result:
Update extended query plan for subqueries that cannot use materialization
due to Blobs.
mysql-test/r/subselect_sj.result:
Updated result file.
mysql-test/r/subselect_sj_jcl6.result:
Update result file.
mysql-test/t/subselect_sj.test:
Add test case for Bug#48213 that verifies that semi-join works when subquery select list contain Blob columns. Also verify that materialization is not
used.
sql/opt_subselect.cc:
Disable materialization for semi-join/subqueries when the subquery select list
contain Blob columns.
BUG#50019: Wrong result for IN-subquery with materialization
- Fix equality substitution in presense of semi-join materialization, lookup and scan variants
(started off from fix by Evgen Potemkin, then modified it to work in all cases)
Re-worked fix of Tor Didriksen:
The problem was that fix_after_pullout() after semijoin conversion
wasn't propagated from the view to the underlying table.
On subesequent executions of the prepared statement,
we would mark the underlying table as 'dependent' and the predicate
anlysis would lead to a different (and illegal) execution plan.