Analysis:
Partial matching is used even when there are no NULLs in
a materialized subquery, as long as the left NOT IN operand
may contain NULL values.
This case was not handled correctly in two different places.
First, the implementation of parital matching did not clear
the set of matching columns when the merge process advanced
to the next row.
Second, there is no need to perform partial matching at all
when the left operand has no NULLs.
Solution:
First fix subselect_rowid_merge_engine::partial_match() to
properly cleanup the bitmap of matching keys when advancing
to the next row.
Second, change subselect_partial_match_engine::exec() so
that when the materialized subquery doesn't contain any
NULLs, and the left operand of [NOT] IN doesn't contain
NULLs either, the method returns without doing any
unnecessary partial matching. The correct result in this
case is in Item::in_value.
When the WHERE/HAVING condition of a subquery has been transformed
by the optimizer the pointer stored the 'where'/'having' field
of the SELECT_LEX structure used for the subquery must be updated
accordingly. Otherwise the pointer may refer to an invalid item.
This can lead to the reported assertion failure for some queries
with correlated subqueries
The bug is a duplicate of MySQL's Bug#11764086,
however MySQL's fix is incomplete for MariaDB, so
this fix is slightly different.
In addition, this patch renames
Item_func_not_all::top_level() to is_top_level_item()
to make it in line with the analogous methods of
Item_in_optimizer, and Item_subselect.
Analysis:
It is possible to determine whether a predicate is
NULL-rejecting only if it is a top-level one. However,
this was not taken into account for Item_in_optimizer.
As a result, a NOT IN predicate was erroneously
considered as NULL-rejecting, and the NULL-complemented
rows generated by the outer join were rejected before
being checked by the NOT IN predicate.
Solution:
Change Item_in_optimizer to be considered as
NULL-rejecting only if it a top-level predicate.
- add_ref_to_table_cond() should not just overwrite pre_idx_push_select_cond
with the contents tab->select_cond.
pre_idx_push_select_cond exists precisely for the reason that it may contain
a condition that is a strict superset of what is in tab->select_cond.
The fix is to inject generated equality into pre_idx_push_select_cond.
- Make simplify_joins() set maybe_null=FALSE for tables that were on the
inner sides of inner joins and then were moved to the inner sides of semi-joins.
Identified all test cases in the MySQL file subquery.inc that are
not present in MariaDB. This patch adds the test cases that are:
- not present in MySQL 5.5, and
- already fixed in MariaDB 5.3
The patch adds test cases for the following mysql-trunk bugs:
- Bug#12763207 - not a bug, mysql-trunk, added test case
- BUG#50257 - not a bug, mysql-trunk, added test case
- Bug 11765699 - not a bug, mysql-trunk, added test case
- BUG#12616253 - not a bug, mysql-trunk, added test case
The comparison was based on the following version of
mysql-trunk:
revno: 3350 [merge]
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-trunk
timestamp: Mon 2011-08-08 12:42:09 +0300
message:
Merge mysql-5.5 to mysql-trunk.
The method Item_ref::not_null_tables() returned incorrect bitmap
for outer references to view columns. This could cause an invalid
conversion of an outer join into an inner join that could lead
to a wrong result set for a query with a correlated subquery over
an outer join whose where condition had an outer reference to a view.
The method Item_func_isnull::update_used_tables() erroneously did not
update cached values stored in the fields used_tables_cache and
const_item_cache of the Item_func_isnull objects. As a result the
Item_func_isnull::used_tables() returned wrong bitmaps and, as a
consequence, push-down predicates could be attached to wrong tables.
This fixes a bug that when you use mysqldump --no-create-info to generate a dump that you want to merge with an existing table,
you can get an innodb table with duplicated unique keys.
Patch orignally by Eric Bergen.
client/mysqldump.c:
Only use UNIQUE_CHECKS=0 for tables that are created.
This solves the issue that you can't get duplicate unique keys when merging two dumps.
mysql-test/r/mysqldump.result:
Test for mysqldump --no-create-info
Identified all test cases in the MySQL file subquery_mat.inc that are
not present in MariaDB. In total found 8 test cases for the following
MySQL bugs:
* BUG#49630 - not a bug in MariaDB, added test case
* BUG#52538 - not a bug in MariaDB, added test case (checked with VG)
* BUG#53103 - not a bug in MariaDB, added test case
* BUG#54511 - not a bug in MariaDB, added test case
* BUG#56367 - not a bug in MariaDB, added test case
* BUG#59833 - not a bug in MariaDB, added test case
* BUG#11852644 - not a bug in MariaDB, added test case
* BUG#12668294 - not a bug in MariaDB, added test case
All of these MySQL bugs are not present in MariaDB 5.3.
The comparison was based on the following version of
mysql-trunk:
revno: 3350 [merge]
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-trunk
timestamp: Mon 2011-08-08 12:42:09 +0300
message:
Merge mysql-5.5 to mysql-trunk.
This bug is a special case of lp:813447.
Analysis:
Constant optimization finds that the condition t2.a = 1
can be used to access the primary key of table 't2'. As
a result both outer table t1,t2 are considered as constant
when we reach the execution phase. At the same time, during
constant optimization, the IN predicate is not evaluated
because it is expensive.
When execution of the outer query reaches do_select(),
control flow enter the branch:
if (join->table_count == join->const_tables)
{ ... }
This branch checks only the WHERE and HAVING clauses,
but doesn't check the ON clauses of the query. Since the
IN predicate was not evaluated during optimization, it is
not evaluated at all, thus execution doesn't detect that
the ON clause is FALSE.
Solution:
Similar to the patch for bug lp:813447, exclude system
tables from constant substitution based on unique key
lookups if there is an expensive ON condition on the
inner table.
revno: 2876.47.174
revision-id: jorgen.loland@oracle.com-20110519120355-qn7eprkad9jqwu5j
parent: mayank.prasad@oracle.com-20110518143645-bdxv4udzrmqsjmhq
committer: Jorgen Loland <jorgen.loland@oracle.com>
branch nick: mysql-trunk-11765831
timestamp: Thu 2011-05-19 14:03:55 +0200
message:
BUG#11765831: 'RANGE ACCESS' MAY INCORRECTLY FILTER
AWAY QUALIFYING ROWS
The problem was that the ranges created when OR'ing two
conditions could be incorrect. Without the bugfix,
"I <> 6 OR (I <> 8 AND J = 5)" would create these ranges:
"NULL < I < 6",
"6 <= I <= 6 AND 5 <= J <= 5",
"6 < I < 8",
"8 <= I <= 8 AND 5 <= J <= 5",
"8 < I"
While the correct ranges is
"NULL < I < 6",
"6 <= I <= 6 AND 5 <= J <= 5",
"6 < I"
The problem occurs when key_or() ORs
(1) "NULL < I < 6, 6 <= I <= 6 AND 5 <= J <= 5, 6 < I" with
(2) "8 < I AND 5 <= J <= 5"
The reason for the bug is that in key_or(), SEL_ARG *tmp is
used to point to the range in (1) above that is merged with
(2) while key1 points to the root of the red-black tree of
(1). When merging (1) and (2), tmp refers to the "6 < I"
part whereas the root is the "6 <= ... AND 5 <= J <= 5" part.
key_or() decides that the tmp range needs to be split into
"6 < I < 8, 8 <= I <= 8, 8 < I", in which next_key_part of the
second range should be that of tmp. However, next_key_part is
set to key1->next_key_part ("5 <= J <= 5") instead of
tmp->next_key_part (empty). Fixing this gives the correct but
not optimal ranges:
"NULL < I < 6",
"6 <= I <= 6 AND 5 <= J <= 5",
"6 < I < 8",
"8 <= I <= 8",
"8 < I"
A second problem can be seen above: key_or() may create
adjacent ranges that could be replaced with a single range.
Fixes for this is also included in the patch so that the range
above becomes correct AND optimal:
"NULL < I < 6",
"6 <= I <= 6 AND 5 <= J <= 5",
"6 < I"
Merging adjacent ranges like this gives a slightly lower cost
estimate for the range access.
This problem could be observed for queries with nested outer joins
for which the not_exist optimization were applicable.
The problem was caused by the code of the patch for bug #49322
that erroneously forced the return to the previous nested loop
level when the join algorithm successfully builds a partial record
for an embedded outer to which the not_exist optimization could be
applied.
Actually the immediate return to the previous nested loops level
is correct only if this partial record is rejected by a predicate
pushed down to one of the inner tables of this outer join. Otherwise
attempts to find extensions of this record must be made.
In case of two views with subqueries it is dificult to decide about order of injected ORDER BY clauses.
A simple solution is just prohibit ORDER BY injection if there is other order by.
mysql-test/r/view.result:
New test added, old test changed.
mysql-test/t/view.test:
New test aded.
sql/share/errmsg.txt:
new warning added.
sql/sql_view.cc:
Inject ORDER BY only if there is no other one.
Warning about ignoring ORDER BY in this case for EXPLAIN EXTENDED.
There are 2 volatile condition constructions AND/OR constructions and fields(references) when first
good supported to be top elements of conditions because it is normal practice
(see copy_andor_structure for example) fields without any expression in the condition is really rare
and mostly useless case however it could lead to problems when optimiser changes/moves them unaware
of other variables referring to them. An easy solution of this problem is just to replace single field
in a condition with equivalent expression well supported by the server (<field> -> <field> != 0).
mysql-test/r/view.result:
New test added.
mysql-test/t/view.test:
New test added.
sql/sql_parse.cc:
<field> -> <field> != 0
sql/sql_yacc.yy:
<field> -> <field> != 0
An aggregating query over an empty set of a join of two tables
with a rejecting HAVING clause erroneously could return a row.
It could happen in the cases when the optimizer made a conclusion
that the aggregating set was empty.
Wrong results were produced because the server missed initial
setting for aggregation functions in the mentioned cases.
The function matching_cond should take into account that
there may be always false constant conjunctive conditions
that has not been evaluated yet,for example, conjunctive
conditions with non-correlated subqueries.
ALL subquery should return TRUE if subquery rowa set is empty independently
of left part. The problem was that Item_func_(eq,ne,gt,ge,lt,le) do not
call execution of second argument if first is NULL no in this case subquery
will not be executed and when Item_func_not_all calls any_value() of the
subquery or aggregation function which report that there was rows. So for
NULL < ALL (SELECT...) result was FALSE instead of TRUE.
Fix is just swapping of arguments of Item_func_(eq,ne,gt,ge,lt,le) (with
changing the operation if it is needed) so that result will be the same
(for examole a < b is equal to b > a). This fix exploit the fact that
first argument will be executed in any case.
mysql-test/r/subselect.result:
The test suite added.
mysql-test/r/subselect_no_mat.result:
The test suite added.
mysql-test/r/subselect_no_opts.result:
The test suite added.
mysql-test/r/subselect_no_semijoin.result:
The test suite added.
mysql-test/r/subselect_scache.result:
The test suite added.
mysql-test/t/subselect.test:
The test suite added.
sql/item_cmpfunc.cc:
Swap arguments creation methods added.
sql/item_cmpfunc.h:
Swap arguments creation methods added.
sql/item_subselect.cc:
Swap arguments of the comparison.
reset_nj_counters() used to rely on the fact that join nests have
table->table==NULL. This ceased to be true wit new derived table
optimizations. Use test for table->nested_join!=NULL instead.
The problem was that optimizer removes some outer references (it they are
constant for example) and the list of outer items built during prepare phase is
not actual during execution phase when we need it as the cache parameters.
First solution was use pointer on pointer on outer reference Item and
initialize temporary table on demand. This solved most problem except case
when optimiser also reduce Item which contains outer references ('OR' in
this bug test suite).
The solution is to build the list of outer reference items on execution
phase (after optimization) on demand (just before temporary table creation)
by walking Item tree and finding outer references among Item_ident
(Item_field/Item_ref) and Item_sum items.
Removed depends_on list (because it is not neede any mnore for the cache, in the place where it was used it replaced with upper_refs).
Added processor (collect_outer_ref_processor) and get_cache_parameters() methods to collect outer references (or other expression parameters in future).
mysql-test/r/subselect_cache.result:
A new test added.
mysql-test/r/subselect_scache.result:
Changes in creating the cache and its paremeters order or adding arguments of aggregate function (which is a parameter also, but this has no influence on the result).
mysql-test/t/subselect_cache.test:
Added a new test.
sql/item.cc:
depends_on removed.
Added processor (collect_outer_ref_processor) and get_cache_parameters() methods to collect outer references.
Item_cache_wrapper collect parameters befor initialization of its cache.
sql/item.h:
depends_on removed.
Added processor (collect_outer_ref_processor) and get_cache_parameters() methods to collect outer references.
sql/item_cmpfunc.cc:
depends_on removed.
Added processor (collect_outer_ref_processor) to collect outer references.
sql/item_cmpfunc.h:
Added processor (collect_outer_ref_processor) to collect outer references.
sql/item_subselect.cc:
depends_on removed.
Added processor get_cache_parameters() method to collect outer references.
sql/item_subselect.h:
depends_on removed.
Added processor get_cache_parameters() method to collect outer references.
sql/item_sum.cc:
Added processor (collect_outer_ref_processor) method to collect outer references.
sql/item_sum.h:
Added processor (collect_outer_ref_processor) and get_cache_parameters() methods to collect outer references.
sql/opt_range.cc:
depends_on removed.
sql/sql_base.cc:
depends_on removed.
sql/sql_class.h:
New iterator added.
sql/sql_expression_cache.cc:
Build of list of items resolved in outer query done just before creating expression cache on the first execution of the subquery which removes influence of optimizer removing items (all optimization already done).
sql/sql_expression_cache.h:
Build of list of items resolved in outer query done just before creating expression cache on the first execution of the subquery which removes influence of optimizer removing items (all optimization already done).
sql/sql_lex.cc:
depends_on removed.
sql/sql_lex.h:
depends_on removed.
sql/sql_list.h:
Added add_unique method to add only unique elements to the list.
sql/sql_select.cc:
Support of new Item list added.
sql/sql_select.h:
Support of new Item list added.
Analysis:
Both the wrong result and the valgrind warning were a result
of incomplete cleanup of the MIN/MAX subquery rewrite. At the
first execution of the query, the non-aggregate subquery is
transformed into an aggregate MIN/MAX subquery. During the
fix_fields phase of the MIN/MAX function, it sets the property
st_select_lex::with_sum_func to true.
The second execution of the query finds this flag to be ON.
When optimization reaches the same MIN/MAX subquery
transformation, it tests if the subquery is an aggregate or not.
Since select_lex->with_sum_func == true from the previous
execution, the transformation executes the second branch that
handles aggregate subqueries. This substitutes the subquery
Item into a Item_maxmin_subselect. At the same time elsewhere
it is assumed that the subquery Item is of type
Item_allany_subselect. Ultimately this results in casting the
actual object to the wrong class, and calling the wrong
any_value() method from empty_underlying_subquery().
Solution:
Cleanup the st_select_lex::with_sum_func property in the case
when the MIN/MAX transformation was performed for a non-aggregate
subquery, so that the transformation can be repeated.
This bug could lead to wrong result sets for a query over a
materialized derived table or view accessed by a multi-component
key.
It happened because the function get_next_field_for_derived_key
was supposed to update its argument, and it did not do it.
Also:
1. simplified the code of the function mysql_derived_merge_for_insert.
2. moved merge of views/dt for multi-update/delete to the prepare stage.
3. the list of the references to the candidates for semi-join now is
allocated in the statement memory.
(This is not a real fix for this bug, even though it makes it to no longer repeat)
- Semi-join subquery predicates, i.e. ... WHERE outer_expr IN (SELECT ...) may have null-rejecting properties,
may allow to convert outer joins into inner.
- When convert_subq_to_sj() injected IN-equality into parent's WHERE/ON clause, it didn't call
$new_cond->top_level_item(), which would cause null-rejecting properties to be lost.
- Fixed, now the mentioned outer-to-inner conversion will really take place.
- Added an initial set of feature-specific test cases
- Handled the special case where the materialized subquery of an
IN predicates consists of only NULL values.
- Fixed a bug where making Item_in_subselect a constant,
didn't respect its null_value value.
Analysis:
For some of the re-executions of the correlated subquery the
where clause is false. In these cases the execution of the
subquery detects that it must generate a NULL row because of
implicit grouping. In this case the subquery execution reaches
the following code in do_select():
while ((table= li++))
mark_as_null_row(table->table);
This code marks all rows in the table as complete NULL rows.
In the example, when evaluating the field t2.f10 for the second
row, all bits of Field::null_ptr[0] are set by the previous call
to mark_as_null_row(). Then the call to Field::is_null()
returns true, resulting in a NULL for the MAX function.
Thus the lines above are not suitable for subquery re-execution
because mark_as_null_row() changes the NULL bits of each table
field, and there is no logic to restore these fields.
Solution:
The call to mark_as_null_row() was added by the fix for bug
lp:613029. Therefore removing the fix for lp:613029 corrects
this wrong result. At the same time the test for lp:613029
behaves correctly because the changes of MWL#89 result in a
different execution path where:
- the constant subquery is evaluated via JOIN::exec_const_cond
- detecting that it has an empty result triggers the branch
if (zero_result_cause)
return_zero_rows()
- return_zero_rows() calls mark_as_null_row().