Statements affected by this bug need all the following to be true
1) a derived table table or view whose specification contains a set
operation at the top level.
2) a grouping operator (group by/having) operating on a column alias
other than in the first select of the union/intersect
3) an outer condition that will be pushed into all selects in this
union/intersect, either into the where or having clause
When pushing a condition into all selects of a unit with more than one
select, pushdown_cond_for_derived() renames items so we can re-use the
condition being pushed.
These names need to be saved and reset for correct name resolution on
second execution of prepared statements.
Reviewed by Igor Babaev (igor@mariadb.com)
Improve performance of queries like
SELECT * FROM t1 WHERE field = NAME_CONST('a', 4);
by, in this example, replacing the WHERE clause with field = 4
in the case of ref access.
The rewrite is done during fix_fields and we disambiguate this
case from other cases of NAME_CONST by inspecting where we are
in parsing. We rely on THD::where to accomplish this. To
improve performance there, we change the type of THD::where to
be an enumeration, so we can avoid string comparisons during
Item_name_const::fix_fields. Consequently, this patch also
changes all usages of THD::where to conform likewise.
For queries with derived tables populated having some side-effect, we
will fill such a derived table more than once, but without clearing
its rows. Consequently it will have duplicate rows.
An example query exhibiting the problem is
SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;
Since mysql_derived_fill will, for UNCACHEABLE_DEPENDENT tables, drop
all rows and repopulate, we relax the condition at line 1204: rather
than assume all uncacheable values prevent early return, we now
allow an early return for uncacheable values other than
UNCACHEABLE_DEPENDENT. In general, we only populate derived tables
once unless they're dependent tables.
This bug led to wrong result sets returned by the second execution of
prepared statements from selects using mergeable derived tables pushed
into external engine. Such derived tables are always materialized. The
decision that they have to be materialized is taken late in the function
mysql_derived_optimized(). For regular derived tables this decision is
usually taken at the prepare phase. However in some cases for some derived
tables this decision is made in mysql_derived_optimized() too. It can be
seen in the code of mysql_derived_fill() that for such a derived table it's
critical to change its translation table to tune it to the fields of the
temporary table used for materialization of the derived table and this
must be done after each refill of the derived table. The same actions are
needed for derived tables pushed into external engines.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Consider this query
SELECT t1.* FROM t1, (SELECT t2.b FROM t2 WHERE NOT EXISTS
(SELECT 1 FROM t3) GROUP BY b) sq where sq.b = t1.a;
If SELECT 1 FROM t3 is expensive, for example t3 has >
thd->variables.expensive_subquery_limit, first evaluation is deferred to
mysql_derived_fill(). There it is noted that, in the above case
NOT EXISTS (SELECT 1 FROM t3) is constant and false.
This causes the join variable zero_result_cause to be set to
"Impossible WHERE noticed after reading const tables" and the handler
for this join is never "opened" via handler::ha_open.
When mysql_derived_fill() is called for the next group of results, this
unopened handler is not taken into account.
reviewed by Igor Babaev (igor@mariadb.com)
Statements affected by this bug have all the following
1) select statements with a sub-query
2) that sub-query includes a group-by clause
3) that group-by clause contains an expression
4) that expression has a reference to view
When a view is used in a group by expression, and that group by can be
eliminated in a sub-query simplification as part of and outer condition
that could be in, exists, > or <, then the table structure left behind
will have a unit that contains a null select_lex pointer.
If this happens as part of a prepared statement, or execute in a stored
procedure for the second time, then, when the statement is executed, the table
list entry for that, now eliminated, view is "opened" and "reinit"ialized.
This table entry's unit no longer has a select_lex pointer.
Prior to MDEV-31995 this was of little consequence, but now following this
null pointer will cause a crash.
Reviewed by Igor Babaev (igor@mariadb.com)
This commit addresses column naming issues with CTEs in the use of prepared
statements and stored procedures. Usage of either prepared statements or
procedures with Common Table Expressions and column renaming may be affected.
There are three related but different issues addressed here.
1) First execution issue. Consider the following
prepare s from "with cte (col1, col2) as (select a as c1, b as c2 from t
order by c1) select col1, col2 from cte";
execute s;
After parsing, items in the select are named (c1,c2), order by (and group by)
resolution is performed, then item names are set to (col1, col2).
When the statement is executed, context analysis is again performed, but
resolution of elements in the order by statement will not be able to find c1,
because it was renamed to col1 and remains this way.
The solution is to save the names of these items during context resolution
before they have been renamed. We can then reset item names back to those after
parsing so first execution can resolve items referred to in order and group by
clauses.
2) Second Execution Issue
When the derived table contains more than one select 'unioned' together we could
reasonably think that dealing with only items in the first select (which
determines names in the resultant table) would be sufficient. This can lead to
a different problem. Consider
prepare st from "with cte (c1,c2) as
(select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
select * from cte where c1=1";
When the optimizer (only run during the first execution) pushes the outside
condition "c1=1" into every select in the derived table union, it renames the
items to make the condition valid. In this example, this leaves the first item
in the second select named 'c1'. The second execution will now fail 'group by'
resolution.
Again, the solution is to save the names during context analysis, resetting
before subsequent resolution, but making sure that we save/reset the item
names in all the selects in this union.
3) Memory Leak
During parsing Item::set_name() is used to allocate memory in the statement
arena. We cannot use this call during statement execution as this represents
a memory leak. We directly set the item list names to those in the column list
of this CTE (also allocated during parsing).
Approved by Igor Babaev <igor@mariadb.com>
Allow queries of multiple SELECTs combined together with
UNIONs/EXCEPTs/INTERSECTs to be pushed down to foreign engines.
If the foreign engine provides an interface method "create_unit"
and the UNIT is a top-level unit of the SQL query then the server
tries to push the whole SELECT_LEX_UNIT down to the engine for execution.
The engine should perform necessary checks and if they succeed,
execute the query. If the engine is unable to execute the whole unit,
then another attempt is made to push down SELECTs composing the unit
separately using the "create_select" interface method. In this case
the results of separate SELECTs are combined at the server side
thus composing the final result
MDEV-30668 Set function aggregated in outer select used in view definition
This patch fixes two bugs concerning views whose specifications contain
subqueries with set functions aggregated in outer selects.
Due to the first bug those such views that have implicit grouping were
considered as mergeable. This led to wrong result sets for selects from
these views.
Due to the second bug the aggregation select was determined incorrectly and
this led to bogus error messages.
The patch added several test cases for these two bugs and for four other
duplicate bugs.
The patch also enables view-protocol for many other test cases.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The problem was the mysql_derived_prepare() did not correctly set
'distinct' when creating a temporary derivated table.
Fixed by separating checking for distinct for queries with and without
UNION.
Other things:
- Fixed bug in generate_derived_keys_for_table() where we set the wrong
bit for join_tab->keys
- Cleaned up JOIN::drop_unused_derived_keys()
- Changed TABLE::use_index() to keep unique keys and update
share->key_parts
Author: Sergei Petrunia <sergey@mariadb.com>, monty@mariadb.org
The idea is that instead of marking all select_lex's with DISTINCT, we
only mark those that really need distinct result.
Benefits of this change:
- Temporary tables used with derived tables, UNION, IN are now smaller
as duplicates are removed already on the insert phase.
- The optimizer can now produce better plans with EQ_REF. This can be
seen from the tests where several queries does not anymore materialize
derived tables twice.
- Queries affected by 'in_predicate_conversion_threshold' where large IN
lists are converted to sub query produces better plans.
Other things:
- Removed on duplicate call to sel->init_select() in
LEX::add_primary_to_query_expression_body()
- I moved the testing of
tab->table->pos_in_table_list->is_materialized_derived()
in join_read_const_table() to the caller as it caused problems for
derived tables that could be proven to be const tables.
This also is likely to fix some bugs as if join_read_const_table()
was aborted, the table was left marked as JT_CONST, which cannot
be good. I added an ASSERT there for now that can be removed when
the code has been properly tested.
This bug manifested itself when the server processed a query containing
a derived table over union whose ORDER BY clause included a subquery
with unresolvable column reference. For such a query the server crashed
when trying to resolve column references in the ORDER BY clause used by
union.
For any union with ORDER BY clause an extra SELECT_LEX structure is created
and it is attached to SELECT_LEX_UNIT structure of the union via the field
fake_select_lex. The outer context for fake_select_lex must be the same as
for other selects of the union. If the union is used in the FROM list of
a derived table then the outer context for fake_select_lex must be set to
NULL in line with other selects of the union. It was not done and it
caused a crash when searching for possible resolution of an unresolvable
column reference occurred in a subquery used in the ORDER BY clause.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Deallocation of TABLE_LIST::dt_handler and TABLE_LIST::pushdown_derived
was performed in multiple places if code. This not only made the code
more difficult to maintain but also led to memory leaks and
ASAN heap-use-after-free errors.
This commit puts deallocation of TABLE_LIST::dt_handler and
TABLE_LIST::pushdown_derived to the single point - JOIN::cleanup()