prepare of "fake_select" for union made in JOIN::prepare only if
we do not execute it then before reset, i.e it was for PS prepare
and now required for CREATE VIEW to make global ORDER BY which
belongs to "fake_select" prepared.
GROUP BY
Issue 1:
--------
This problem occurs in the following conditions:
1) A UNION is present in the subquery of select list and
handles multiple columns.
2) Query has a GROUP BY.
A temporary table is created to handle the UNION.
Item_field objects are based on the expressions of the
result of the UNION (ie. the fake_select_lex). While
checking validity of the columns in the GROUP BY list, the
columns of the temporary table are checked in
Item_ident::local_column. But the Item_field objects
created for the temporary table don't have information like
the Name_resolution_context that they belong to or whether
they are dependent on an outer query. Since these members
are null, incorrect behavior is caused.
This can happen when such Item objects are cached to apply
the IN-to-EXISTS transform for Item_row.
Solution to Issue 1:
--------------------
Context information of the first select in the UNION will
be assigned to the new Item_field objects.
Issue 2:
--------
This problem occurs in the following conditions:
1) A UNION is present in the subquery of select list.
2) A column in the UNION's first SELECT refers to a table
in the outer-query making it a dependent union.
3) GROUP BY column refers to the outer-referencing column.
While resolving the select list with an outer-reference, an
Item_outer_ref object is created to handle the
outer-query's GROUP BY list. The Item_outer_ref object
replaces the Item_field object in the item tree.
Item_outer_ref::fix_fields will be called only while fixing
the inner references of the outer query.
Before resolving the outer-query, an Item_type_holder
object needs to be created to handle the UNION. But as
explained above, the Item_outer_ref object has not been
fixed yet. Having a fixed Item object is a pre-condition
for creating an Item_type_holder.
Solution to Issue 2:
--------------------
Use the reference (real_item()) of an Item_outer_ref object
instead of the object itself while creating an
Item_type_holder.
When the rows produced on the current iteration are sent to the
temporary table T of the UNION type created for CTE the rows
that were not there simultaneously are sent to the temporary
table D that contains rows for the next iteration. The test
whether a row was in T checks the return code of writing into T.
If just a HEAP table is used for T then the return code is
HA_ERR_FOUND_DUPP_KEY, but if an ARIA table is used for T then
the return code is HA_ERR_FOUND_DUPP_UNIQUE.
The implementation of select_union_recursive::send_data()
erroneously checked only for the first return code. So if an Aria
table was used for T then all rows produced by the current iteration
went to D and and in most cases D grew with each iteration.
Whether T has reached stabilization is detected by
checking whether D is empty. So as a result, the iterations were
never stopped unless a limit for them was set.
Fixed by checking for both HA_ERR_FOUND_DUPP_KEY and
HA_ERR_FOUND_DUPP_UNIQUE as return codes returned by
the function writing a row into the temporary table T.
This patch fixed some problems that occurred with subqueries that
contained directly or indirectly recursive references to recursive CTEs.
1. A [NOT] IN predicate with a constant left operand and a non-correlated
subquery as the right operand used in the specification of a recursive CTE
was considered as a constant predicate and was evaluated only once.
Now such a predicate is re-evaluated after every iteration of the process
that produces the records of the recursive CTE.
2. The Exists-To-IN transformation could be applied to [NOT] IN predicates
with recursive references. This opened a possibility of materialization
for the subqueries used as right operands. Yet, materialization
is prohibited for the subqueries if they contain a recursive reference.
Now the Exists-To-IN transformation cannot be applied for subquery
predicates with recursive references.
The function st_select_lex::check_subqueries_with_recursive_references()
is called now only for the first execution of the SELECT.
The function st_select_lex_unit::exec_recursive() incorrectly determined
that a CTE mutually recursive with some others was stabilized in the case
when the non-recursive part of the CTE returned an empty set. As a result
the server fell into an infinite loop when executing a query using
this CTE.
The temporary tables created for recursive table references
should be closed in close_thread_tables(), because they might
be used in the statements like ANALYZE WITH r AS (...) SELECT * from r
where r is defined through recursion.
1. The rows of a recursive CTE at some point may overflow
the HEAP temporary table containing them. At this point
the table is converted to a MyISAM temporary table and the
new added rows are placed into this MyISAM table.
A bug in the of select_union_recursive::send_data prevented
the server from writing the row that caused the overflow
into the temporary table used for the result of the iteration
steps. This could lead, in particular,to a premature end
of the iterations.
2. The method TABLE::insert_all_rows_into() that was used
to copy all rows of one temporary table into another
did not take into account that the destination temporary
table must be converted to a MyISAM table at some point.
This patch fixed this problem. It also renamed the method
into TABLE::insert_all_rows_into_tmp_table() and added
an extra parameter needed for the conversion.
The idea of this fix was taken from the patch by Roy Lyseng
for mysql-5.6 bug iBug#14740889: "Wrong result for aggregate
functions when executing query through cursor".
Here's Roy's comment for his patch:
"
The problem was that a grouped query did not behave properly when
executed using a cursor. On further inspection, the query used one
intermediate temporary table for the grouping.
Then, Select_materialize::send_result_set_metadata created a temporary
table for storing the query result. Notice that get_unit_column_types()
is used to retrieve column meta-data for the query. The items contained
in this list are later modified so that their result_field points to
the row buffer of the materialized temporary table for the cursor.
But prior to this, these result_field objects have been prepared for
use in the grouping operation, by JOIN::make_tmp_tables_info(), hence
the grouping operation operates on wrong column buffers.
The problem is solved by using the list JOIN::fields when copying data
to the materialized table. This list is set by JOIN::make_tmp_tables_info()
and points to the columns of the last intermediate temporary table of
the executed query. For a UNION, it points to the temporary table
that is the result of the UNION query.
Notice that we have to assign a value to ::fields early in JOIN::optimize()
in case the optimization shortcuts due to a const plan detection.
A more optimal solution might be to avoid creating the final temporary
table when the query result is already stored in a temporary table.
"
The patch does not contain a test case, but the description of the
problem corresponds exactly what could be observed in the test
case for mdev-11081.
Added comments.
Added reaction for exceeding maximum number of elements in with clause.
Added a test case to check this reaction.
Added a test case where the specification of a recursive table
uses two non-recursive with tables.
Moved checking whether the limit set for the number of iterations
when executing a recursive query has been reached from
st_select_lex_unit::exec_recursive to TABLE_LIST::fill_recursive.
Changed the name of the system variable max_recursion_level for
max_recursive_iterations.
Adjusted test cases.
- Tabular EXPLAIN now prints "RECURSIVE UNION".
- There is a basic implementation of EXPLAIN FORMAT=JSON.
- it produces "recursive_union" JSON struct
- No other details or ANALYZE support, yet.
Temporary tables created for recursive CTE
were instantiated at the prepare phase. As
a result these temporary tables missed
indexes for look-ups and optimizer could not
use them.
Actually mutually recursive CTE were not functional. Now the code
for mutually recursive CTE looks like functional, but still needs
re-writing.
Added many new test cases for mutually recursive CTE.
Added test cases to check the fix.
Fixed the problem of wrong types of recursive tables when the type of anchor part does not coincide with the
type of recursive part.
Prevented usage of marerialization and subquery cache for subqueries with recursive references.
Introduced system variables 'max_recursion_level'.
Added a test case to test usage of this variable.
Added the check whether there are set functions in the specifications of recursive CTE.
Added the check whether there are recursive references in subqueries.
Introduced boolean system variable 'standards_compliant_cte'. By default it's set to 'on'.
When it's set to 'off' non-standard compliant CTE can be executed.
filesort and init_read_record() for the same table.
This will simplify code for WINDOW FUNCTIONS (MDEV-6115)
- Filesort_info renamed to SORT_INFO and moved to filesort.h
- filesort now returns SORT_INFO
- init_read_record() now takes a SORT_INFO parameter.
- unique declaration is moved to uniques.h
- subselect caching of buffers is now more explicit than before
- filesort_buffer is now reusable even if rec_length has changed.
- filsort_free_buffers() and free_io_cache() calls are removed
- Remove one malloc() when using get_addon_fields()
Other things:
- Added --debug-assert-on-not-freed-memory option to make it easier to
debug some not-freed-memory issues.
"Re-factor the code for post-join operations".
The patch mainly contains the code ported from mysql-5.6 and
created for two essential architectural changes:
1. WL#5558: Resolve ORDER BY execution method at the optimization stage
2. WL#6071: Inline tmp tables into the nested loops algorithm
The first task was implemented for mysql-5.6 by Ole John Aske.
It allows to make all decisions on ORDER BY operation at the optimization
stage.
The second task implemented for mysql-5.6 by Evgeny Potemkin adds JOIN_TAB
nodes for post-join operations that require temporary tables. It allows
to execute these operations within the nested loops algorithm that used to
be used before this task only for join queries. Besides these task moves
all planning on the execution of these operations from the execution phase
to the optimization phase.
Some other re-factoring changes of mysql-5.6 were pulled in, mainly because
it was easier to pull them in than roll them back. In particular all
changes concerning Ref_ptr_array were incorporated.
The port required some changes in the MariaDB code that concerned the
functionality of EXPLAIN and ANALYZE. This was done mainly by Sergey
Petrunia.
EXPLAIN INSERT ... SELECT tried to use SELECT's execution path. This
caused a collection of problems:
- SELECT_DESCRIBE flag was not put into select_lex->options, which
means it was not in JOIN::select_options either (except for the first
member of the UNION).
- This caused UNION members to be executed. They would attempt to write
join output rows to the output.
- (Actual cause of the crash) second join sibling would call
result->send_eof() when finished execution. Then,
Explain_query::print_explain would attempt to write to query output
again, and cause an assertion due to non-empty query output.
- Added mem_root to all calls to new Item
- Added private method operator new(size_t size) to Item to ensure that
we always use a mem_root when creating an item.
This saves use once call to current_thd per Item creation
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd
Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)
Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
examined_rows -> join_examined_rows
record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()
Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
This is MDEV-7601, including it's sub tasks MDEV-7594, MDEV-7555, MDEV-7590, MDEV-7581, MDEV-7589
The problem was that select_lex->non_agg_fields was not properly reset for re-execution and this caused an overwrite of a random memory position.
The fix was move non_agg_fields from select_lext to JOIN, which is properly reset.
Added THD argument to select_result and all derivative classes.
This reduces number of pthread_getspecific calls from 796 to 776 per OLTP RO
transaction.