Statements affected by this bug need all the following to be true
1) a derived table table or view whose specification contains a set
operation at the top level.
2) a grouping operator (group by/having) operating on a column alias
other than in the first select of the union/intersect
3) an outer condition that will be pushed into all selects in this
union/intersect, either into the where or having clause
When pushing a condition into all selects of a unit with more than one
select, pushdown_cond_for_derived() renames items so we can re-use the
condition being pushed.
These names need to be saved and reset for correct name resolution on
second execution of prepared statements.
Reviewed by Igor Babaev (igor@mariadb.com)
For queries with derived tables populated having some side-effect, we
will fill such a derived table more than once, but without clearing
its rows. Consequently it will have duplicate rows.
An example query exhibiting the problem is
SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;
Since mysql_derived_fill will, for UNCACHEABLE_DEPENDENT tables, drop
all rows and repopulate, we relax the condition at line 1204: rather
than assume all uncacheable values prevent early return, we now
allow an early return for uncacheable values other than
UNCACHEABLE_DEPENDENT. In general, we only populate derived tables
once unless they're dependent tables.
This bug led to wrong result sets returned by the second execution of
prepared statements from selects using mergeable derived tables pushed
into external engine. Such derived tables are always materialized. The
decision that they have to be materialized is taken late in the function
mysql_derived_optimized(). For regular derived tables this decision is
usually taken at the prepare phase. However in some cases for some derived
tables this decision is made in mysql_derived_optimized() too. It can be
seen in the code of mysql_derived_fill() that for such a derived table it's
critical to change its translation table to tune it to the fields of the
temporary table used for materialization of the derived table and this
must be done after each refill of the derived table. The same actions are
needed for derived tables pushed into external engines.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Consider this query
SELECT t1.* FROM t1, (SELECT t2.b FROM t2 WHERE NOT EXISTS
(SELECT 1 FROM t3) GROUP BY b) sq where sq.b = t1.a;
If SELECT 1 FROM t3 is expensive, for example t3 has >
thd->variables.expensive_subquery_limit, first evaluation is deferred to
mysql_derived_fill(). There it is noted that, in the above case
NOT EXISTS (SELECT 1 FROM t3) is constant and false.
This causes the join variable zero_result_cause to be set to
"Impossible WHERE noticed after reading const tables" and the handler
for this join is never "opened" via handler::ha_open.
When mysql_derived_fill() is called for the next group of results, this
unopened handler is not taken into account.
reviewed by Igor Babaev (igor@mariadb.com)
Statements affected by this bug have all the following
1) select statements with a sub-query
2) that sub-query includes a group-by clause
3) that group-by clause contains an expression
4) that expression has a reference to view
When a view is used in a group by expression, and that group by can be
eliminated in a sub-query simplification as part of and outer condition
that could be in, exists, > or <, then the table structure left behind
will have a unit that contains a null select_lex pointer.
If this happens as part of a prepared statement, or execute in a stored
procedure for the second time, then, when the statement is executed, the table
list entry for that, now eliminated, view is "opened" and "reinit"ialized.
This table entry's unit no longer has a select_lex pointer.
Prior to MDEV-31995 this was of little consequence, but now following this
null pointer will cause a crash.
Reviewed by Igor Babaev (igor@mariadb.com)
This commit addresses column naming issues with CTEs in the use of prepared
statements and stored procedures. Usage of either prepared statements or
procedures with Common Table Expressions and column renaming may be affected.
There are three related but different issues addressed here.
1) First execution issue. Consider the following
prepare s from "with cte (col1, col2) as (select a as c1, b as c2 from t
order by c1) select col1, col2 from cte";
execute s;
After parsing, items in the select are named (c1,c2), order by (and group by)
resolution is performed, then item names are set to (col1, col2).
When the statement is executed, context analysis is again performed, but
resolution of elements in the order by statement will not be able to find c1,
because it was renamed to col1 and remains this way.
The solution is to save the names of these items during context resolution
before they have been renamed. We can then reset item names back to those after
parsing so first execution can resolve items referred to in order and group by
clauses.
2) Second Execution Issue
When the derived table contains more than one select 'unioned' together we could
reasonably think that dealing with only items in the first select (which
determines names in the resultant table) would be sufficient. This can lead to
a different problem. Consider
prepare st from "with cte (c1,c2) as
(select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
select * from cte where c1=1";
When the optimizer (only run during the first execution) pushes the outside
condition "c1=1" into every select in the derived table union, it renames the
items to make the condition valid. In this example, this leaves the first item
in the second select named 'c1'. The second execution will now fail 'group by'
resolution.
Again, the solution is to save the names during context analysis, resetting
before subsequent resolution, but making sure that we save/reset the item
names in all the selects in this union.
3) Memory Leak
During parsing Item::set_name() is used to allocate memory in the statement
arena. We cannot use this call during statement execution as this represents
a memory leak. We directly set the item list names to those in the column list
of this CTE (also allocated during parsing).
Approved by Igor Babaev <igor@mariadb.com>
MDEV-30668 Set function aggregated in outer select used in view definition
This patch fixes two bugs concerning views whose specifications contain
subqueries with set functions aggregated in outer selects.
Due to the first bug those such views that have implicit grouping were
considered as mergeable. This led to wrong result sets for selects from
these views.
Due to the second bug the aggregation select was determined incorrectly and
this led to bogus error messages.
The patch added several test cases for these two bugs and for four other
duplicate bugs.
The patch also enables view-protocol for many other test cases.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This bug manifested itself when the server processed a query containing
a derived table over union whose ORDER BY clause included a subquery
with unresolvable column reference. For such a query the server crashed
when trying to resolve column references in the ORDER BY clause used by
union.
For any union with ORDER BY clause an extra SELECT_LEX structure is created
and it is attached to SELECT_LEX_UNIT structure of the union via the field
fake_select_lex. The outer context for fake_select_lex must be the same as
for other selects of the union. If the union is used in the FROM list of
a derived table then the outer context for fake_select_lex must be set to
NULL in line with other selects of the union. It was not done and it
caused a crash when searching for possible resolution of an unresolvable
column reference occurred in a subquery used in the ORDER BY clause.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Deallocation of TABLE_LIST::dt_handler and TABLE_LIST::pushdown_derived
was performed in multiple places if code. This not only made the code
more difficult to maintain but also led to memory leaks and
ASAN heap-use-after-free errors.
This commit puts deallocation of TABLE_LIST::dt_handler and
TABLE_LIST::pushdown_derived to the single point - JOIN::cleanup()
Consider a query of the form:
select ... from (select item2 as COL1) as T where COL1=123
Condition pushdown into derived table will try to push "COL1=123" condition
down into table T.
The process of pushdown involves "substituting" the item, that is,
replacing Item_field("T.COL1") with its "producing item" item2.
In order to use item2, one needs to clone it (call Item::build_clone).
If the item is not cloneable (e.g. Item_func_sp is not), the pushdown
process will fail and nothing at all will be pushed.
Fixed by introducing transform_condition_or_part() which will try to apply
the transformation for as many parts of condition as possible. The parts of
condition that couldn't be transformed are dropped.
This bug affected queries with views / derived_tables / CTEs whose
specifications were of the form
(SELECT ... LIMIT <n>) ORDER BY ...
Units representing such specifications contains one SELECT_LEX structure
for (SELECT ... LIMIT <n>) and additionally SELECT_LEX structure for
fake_select_lex. This fact should have been taken into account in the
function mysql_derived_fill().
This patch has to be applied to 10.2 and 10.3 only.
This bug affected queries with views / derived_tables / CTEs whose
specifications were of the form
(SELECT ... LIMIT <n>) ORDER BY ...
Units representing such specifications contains one SELECT_LEX structure
for (SELECT ... LIMIT <n>) and additionally SELECT_LEX structure for
fake_select_lex. This fact should have been taken into account in the
function mysql_derived_fill().
This patch has to be applied to 10.2 and 10.3 only.
When you only need view structure, don't call handle_derived with
DT_CREATE and rely on its internal hackish check to skip DT_CREATE.
Because handle_derived is called from many different places,
and this internal hackish check is indiscriminative.
Instead, just don't ask handle_derived to do DT_CREATE
if you don't want it to do DT_CREATE.
When you only need view structure, don't call handle_derived with
DT_CREATE and rely on its internal hackish check to skip DT_CREATE.
Because handle_derived is called from many different places,
and this internal hackish check is indiscriminative.
Instead, just don't ask handle_derived to do DT_CREATE
if you don't want it to do DT_CREATE.
Before this patch mergeable derived tables / view used in a multi-table
update / delete were merged before the preparation stage.
When the merge of a derived table / view is performed the on expression
attached to it is fixed and ANDed with the where condition of the select S
containing this derived table / view. It happens after the specification of
the derived table / view has been merged into S. If the ON expression refers
to a non existing field an error is reported and some other mergeable derived
tables / views remain unmerged. It's not a problem if the multi-table
update / delete statement is standalone. Yet if it is used in a stored
procedure the select with incompletely merged derived tables / views may
cause a problem for the second call of the procedure. This does not happen
for select queries using derived tables / views, because in this case their
specifications are merged after the preparation stage at which all ON
expressions are fixed.
This patch makes sure that merging of the derived tables / views used in a
multi-table update / delete statement is performed after the preparation
stage.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Before this patch mergeable derived tables / view used in a multi-table
update / delete were merged before the preparation stage.
When the merge of a derived table / view is performed the on expression
attached to it is fixed and ANDed with the where condition of the select S
containing this derived table / view. It happens after the specification of
the derived table / view has been merged into S. If the ON expression refers
to a non existing field an error is reported and some other mergeable derived
tables / views remain unmerged. It's not a problem if the multi-table
update / delete statement is standalone. Yet if it is used in a stored
procedure the select with incompletely merged derived tables / views may
cause a problem for the second call of the procedure. This does not happen
for select queries using derived tables / views, because in this case their
specifications are merged after the preparation stage at which all ON
expressions are fixed.
This patch makes sure that merging of the derived tables / views used in a
multi-table update / delete statement is performed after the preparation
stage.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This patch sets the proper name resolution context for outer references
used in a subquery from an ON clause. Usually this context is more narrow
than the name resolution context of the parent select that were used before
this fix.
This fix revealed another problem that concerned ON expressions used in
from clauses of specifications of derived tables / views / CTEs. The name
resolution outer context for such ON expression must be set to NULL to
prevent name resolution beyond the derived table where it is used.
The solution to resolve this problem applied in sql_derived.cc was provided
by Sergei Petrunia <sergey@mariadb.com>.
The change in sql_parse.cc is not good for 10.4+. A corresponding diff for
10.4+ will be provided in JIRA entry for this bug.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The issue here was we were trying to push an extracted condition for a view into the
underlying table value constructor inside the view.
The fix would be to not push conditions into table value constructors.
in queries like
create view v1 as select 2 like 1 escape (3 in (select 0 union select 1));
select 2 union select * from v1;
Item_func_like::escape was left uninitialized, because
Item_in_optimizer is const_during_execution()
but not actually const_item() during execution.
It's not, because const subquery evaluation was disabled for derived.
Practically it only needs to be disabled for multi-update
that runs fix_fields() before all tables are locked.
This bug could cause a crash when executing queries that used mutually
recursive CTEs with system variable big_tables set to 1. It happened due
to several bugs in the code that handled recursive table references
referred mutually recursive CTEs. For each recursive table reference a
temporary table is created that contains all rows generated for the
corresponding recursive CTE table on the previous step of recursion.
This temporary table should be created in the same way as the temporary
table created for a regular materialized derived table using the
method select_union::create_result_table(). In this case when the
temporary table is created it uses the select_union::TMP_TABLE_PARAM
structure as the parameter for the table construction. However the
code created the temporary table using just the function create_tmp_table()
and passed pointers to certain fields of the TMP_TABLE_PARAM structure
used for accumulation of rows of the recursive CTE table as parameters
for update. This was a mistake because now different temporary tables
cannot share some TMP_TABLE_PARAM fields in a general case. Besides,
depending on how mutually recursive CTE tables were defined and which
of them were referred in the executed query the select_union object
allocated for a recursive table reference could be allocated again after
the the temporary table had been created. In this case the TMP_TABLE_PARAM
object associated with the temporary table created for the recursive
table reference contained unassigned fields needed for execution when
Aria engine is employed as the engine for temporary tables.
This patch ensures that
- select_union object is created only once for any recursive table
reference
- any temporary table created for recursive CTEs uses its own
TMP_TABLE_PARAM structure
The patch also fixes a problem caused by incomplete cleanup of join tables
associated with recursive table references.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
A bogus error message was issued when a condition was pushed into a
materialized derived table or view specified as union of selects with
aggregation when the corresponding columns of the selects had different
names. This happened because the expression pushed into having clauses of
the selects was adjusted for the names of the first select of the union.
The easiest solution was to rename the columns of the other selects to be
name compatible with the columns of the first select.
Approved by Oleksandr Byelkin <sanja@mariadb.com>