This bug appeared after the patch for bug MDEV-23886. Due to this bug
execution of queries with CTEs used the same CTE at least twice via
prepared statements or with stored procedures caused crashes of the server.
It happened because the select created for any of not the first usage of
a CTE erroneously was not included into all_selects_list.
This patch corrects the patch applied to fix the bug MDEV-26108.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
with missing RECURSIVE
If a table reference r used inthe specification of a CTE whose definition
is contained in the WITH clause where RECURSIVE is omitted then this table
reference cannot be considered as a recursive table reference even if it is
used in the query that specifies CTE whose name is r. It can be considered
only as a reference to an embedding CTE or to a temporary table or to
a base table/view. If there is no such object with name r then an error
message must be reported.
This patch fixes the code that actually in some cases resolved r as a
reference to the CTE whose specification contained r if its name was r
in spite of the fact that r was not considered as a recursive CTE.
This happened in the cases when the definition of r was used in the
specification of another CTE. Such wrong name resolution for r led to an
infinite recursive invocations of the parser that ultimately crashed the
server.
This bug is a result of the fix for mdev-13780 that was not quite correct.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
If the first token of the body of a stored procedure was 'WITH' then
the beginning of the body was determined incorrectly and that token was
missing in the string representing the body of the SP in mysql.proc. As a
resultnany call of such procedure failed as the string representing the
body could not be parsed.
The patch corrects the code of the functions get_tok_start() and
get_cpp_tok_start() of the class Lex_input_stream to make them take into
account look ahead tokens. The patch is needed only for 10.2 as this
problem has neen resolved in 10.3+.
In the code existed just before this patch binding of a table reference to
the specification of the corresponding CTE happens in the function
open_and_process_table(). If the table reference is not the first in the
query the specification is cloned in the same way as the specification of
a view is cloned for any reference of the view. This works fine for
standalone queries, but does not work for stored procedures / functions
for the following reason.
When the first call of a stored procedure/ function SP is processed the
body of SP is parsed. When a query of SP is parsed the info on each
encountered table reference is put into a TABLE_LIST object linked into
a global chain associated with the query. When parsing of the query is
finished the basic info on the table references from this chain except
table references to derived tables and information schema tables is put
in one hash table associated with SP. When parsing of the body of SP is
finished this hash table is used to construct TABLE_LIST objects for all
table references mentioned in SP and link them into the list of such
objects passed to a pre-locking process that calls open_and_process_table()
for each table from the list.
When a TABLE_LIST for a view is encountered the view is opened and its
specification is parsed. For any table reference occurred in
the specification a new TABLE_LIST object is created to be included into
the list for pre-locking. After all objects in the pre-locking have been
looked through the tables mentioned in the list are locked. Note that the
objects referenced CTEs are just skipped here as it is impossible to
resolve these references without any info on the context where they occur.
Now the statements from the body of SP are executed one by one that.
At the very beginning of the execution of a query the tables used in the
query are opened and open_and_process_table() now is called for each table
reference mentioned in the list of TABLE_LIST objects associated with the
query that was built when the query was parsed.
For each table reference first the reference is checked against CTEs
definitions in whose scope it occurred. If such definition is found the
reference is considered resolved and if this is not the first reference
to the found CTE the the specification of the CTE is re-parsed and the
result of the parsing is added to the parsing tree of the query as a
sub-tree. If this sub-tree contains table references to other tables they
are added to the list of TABLE_LIST objects associated with the query in
order the referenced tables to be opened. When the procedure that opens
the tables comes to the TABLE_LIST object created for a non-first
reference to a CTE it discovers that the referenced table instance is not
locked and reports an error.
Thus processing non-first table references to a CTE similar to how
references to view are processed does not work for queries used in stored
procedures / functions. And the main problem is that the current
pre-locking mechanism employed for stored procedures / functions does not
allow to save the context in which a CTE reference occur. It's not trivial
to save the info about the context where a CTE reference occurs while the
resolution of the table reference cannot be done without this context and
consequentially the specification for the table reference cannot be
determined.
This patch solves the above problem by moving resolution of all CTE
references at the parsing stage. More exactly references to CTEs occurred in
a query are resolved right after parsing of the query has finished. After
resolution any CTE reference it is marked as a reference to to derived
table. So it is excluded from the hash table created for pre-locking used
base tables and view when the first call of a stored procedure / function
is processed.
This solution required recursive calls of the parser. The function
THD::sql_parser() has been added specifically for recursive invocations of
the parser.
This bug manifested itself when executing queries with multiple reference
to a CTE specified by a query expression with union and having its
column names explicitly declared. In this case the server returned a bogus
error message about unknown column name. It happened because while for the
first reference to the CTE the names of the columns returned by the CTE
specification were properly changed to match the CTE definition for the
other references it was not done. This was a consequence of not quite
complete code of the function With_element::clone_parsed_spec() that forgot
to set the reference to the CTE definition for unit structures representing
non-first CTE references.
Approved by dmitry.shulga@mariadb.com
For table references to CTEs the field TABLE_LIST::db must be set to
an empty string as it's done for table references to derived tables in
order CTEs to be processed similar to how derived tables are processed.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
When there is a WITH clause we postpone check for tables without
database for later stages when tables in WITH will be defined.
But we should not try to open such tables as temporary tables because
temporary tables always belong to a some database.
With INFORMATION_SCHEMA set as the default database the check that a table
referred in the processed query is defined in INORMATION_SCHEMA must
be postponed until all CTE names can be identified.
This patch fills a serious flaw in the implementation of common table
expressions. Before this patch an attempt to prepare a statement from
a query with a parameter marker in a CTE that was used more than once
in the query ended up with a bogus error message. Similarly if a statement
in a stored procedure contained a CTE whose specification used a
local variables and this CTE was referred to more than once in the
statement then the server failed to execute the stored procedure returning
a bogus error message on a non-existing field.
The problems appeared due to incorrect handling of parameter markers /
local variables in CTEs that were referred more than once.
This patch fixes the problems by differentiating between the original
occurrences of a parameter marker / local variable used in the
specification of a CTE and the corresponding occurrences used
in copies of this specification. These copies are substituted
instead of non-first references to the CTE.
The idea of the fix and even some code were taken from the MySQL
implementation of the common table expressions.
The previous correction of the patch for mdev-16473 did not work
correctly for the databases whose names started with '*'.
Added a test case with a database named "*".
Before this patch if no default database was set the server threw
an error for any table name reference that was not fully qualified by
database name. In particular it happened for table names referenced
CTE tables. This was incorrect.
The error message was thrown at the parser stage when the names referencing
different tables were not resolved yet.
Now if no default database is set and a with clause is used in the
processed statement any table reference is just supplied with a dummy
database name "*none*" at the parser stage. Later after a call
of check_dependencies_in_with_clauses() when the names for CTE tables
can be resolved error messages are thrown only for those names that
refer to non-CTE tables. This is done in open_and_process_table().
This bug caused crashes for queries with unreferenced non-recursive
CTEs specified by unions.It happened because the function
st_select_lex_unit::prepare() tried to use the value of the field 'derived'
that could not be set for unferenced CTEs as there was no derived
table associated with an unreferenced CTE.
the non-recursive CTE defined with UNION
The problem appears as the columns of the non-recursive CTE weren't renamed.
The renaming procedure was called for recursive CTEs only.
To fix it in the procedure st_select_lex_unit::prepare
With_element::rename_columns_of_derived_unit is called now for both CTEs:
recursive and non-recursive.
the non-recursive CTE via prepared statement
The problem appears as the column names of the CTE were allocated on the
wrong MEMROOT and after the preparation of the statement they disappear.
To fix it in the procedure With_element::rename_columns_of_derived_unit
the CTE column names are now allocated in the permanent MEMROOT for the
prepared statements and stored procedures.
does not return error
Corrected the code of st_select_lex::find_table_def_in_with_clauses() for
a proper identification of CTE references used in embedded CTEs.
When identifying a table name the following should be taken into account:
a CTE name cannot be qualified with a database name, otherwise the table
name is considered as the name of a non-CTE table.
caused an error
The function subselect_single_select_engine::print() did not print
the WITH clause attached to a subselect with single select engine.
As a result views using suqueries with attached WITH clauses lost
these clauses when saved in frm files.
If the specification of a CTE contains a reference to a temporary table
then THD::open_temporary_table() must be called for this reference for
any occurrence of the CTE in the query. By mistake this was done only
for the first occurrences of CTEs.
The patch fixes this problem in With_element::clone_parsed_spec().
It also moves there the call of check_dependencies_in_with_clauses()
to its proper place before the call of check_table_access().
Additionally the patch optimizes the number of calls of the
function check_dependencies_in_with_clauses().
Other things, mainly to get
create_mysqld_error_find_printf_error tool to work:
- Added protection to not include mysqld_error.h twice
- Include "unireg.h" instead of "mysqld_error.h" in server
- Added protection if ER_XX messages are already defined
- Removed wrong calls to my_error(ER_OUTOFMEMORY) as
my_malloc() and my_alloc will do this automatically
- Added missing %s to ER_DUP_QUERY_NAME
- Removed old and wrong calls to my_strerror() when using
MY_ERROR_ON_RENAME (wrong merge)
- Fixed deadlock error message from Galera. Before the extra
information given to ER_LOCK_DEADLOCK was missing because
ER_LOCK_DEADLOCK doesn't provide any extra information.
I kept #ifdef mysqld_error_find_printf_error_used in sql_acl.h
to make it easy to do this kind of check again in the future
for a query that uses CTE
The first reference to a CTE in the processed query uses the unit
built by the parser for the CTE specification. This unit is
considered as the specification of the derived table created for
the first reference of the CTE. This requires some transformation of
the original query tree: the unit of the specification must be moved
to a new position as a slave of the select where the first reference
to the CTE occurs. The transformation is performed by the function
st_select_lex_node::move_as_slave(). There was an obvious bug in this
function. As a result of this bug in many cases the moved unit turned
out to be lost in the query tree. This could cause different problems.
In particular the prepared statements for queries that used CTEs could
miss cleanup for some selects that was performed at the end of the
preparation/execution of the PSs. If such cleanup is not done for a PS
the next execution of the PS causes an assertion abort or a crash.
The support of embedded CTEs was not correct in the cases when
embedded CTEs were used multiple times. The problems occurred with
both non-recursive (bug mdev-13780) and recursive (bug mdev-14184)
embedded CTEs.
A reference to a CTE may occur not in the master of the CTE
specification. In this case if the reference to the CTE is
the first one the specification should be detached from its
master and attached to the referencing select.
Also fixed the TYPE column in the lines of the EXPLAIN output
created for CTE tables.
The fix was in the call the open_normal_and_derived_tables() from
the function mysql_test_select() and it was similar to those from
the patch for mdev-13107.
Added explicit PREPARE statements that failed in --ps-protocol.
The problems were in the code of sql_show.cc. There the tables
could be opened in such a way that mysql_derived_init() never
worked for CTE tables. As a result they were not marked as
derived and mysql_handle_derived() were not called for derived
tables used in their specifications.
In the current code temporary tables we identified and opened before
other tables. CTE tables are identified in the same procedure as
regular tables. When a temporary table and a CTE table have the same
name T any reference to T that is in the scope of the CTE declaration
must be associated with this CTE. Yet it was not done properly.
When a reference to T was found in the scope of the declaration
of CTE T a pointer to this CTE was set in the reference. No check
that the reference had been already associated with a temporary table
was done. As a result, if the temporary table T had been created then
the reference to T was considered simultaneously as reference to the CTE
named T and as a reference to the temporary table named T. This
confused the code that were executed later and caused a crash of
the server.
Now when a table reference is associated with a CTE any previous
association with a temporary table is dropped.
This problem could be easily avoided if the temporary tables were
not identified prematurely.
as reference to CTE named T and
When a CTE referring to another CTE from the same with clause
was used twice then the server could not find the second CTE and
reported a bogus error message.
This happened because for any unit that was created as a clone of
a CTE specification the pointer to the WITH clause that owned this CTE
was not set.
The bug was caused by a wrong order of statements in With_clause::print().
As a result any view definition containing WITH clause with several
CTE specifications was put the frm file in a syntactically incorrect
form.
When a query containing a WITH clause is printed by EXPLAIN
EXTENDED command there should not be any data expansion in
the query specifications of the WITH elements of this WITH
clause.
The code for st_select_lex::find_table_def_in_with_clauses()
did not take into account the fact that the specs for mergeable
CTEs were cloned and were not processed by the function
With_element::check_dependencies_in_spec().
Make the new (CTE-related) code in set_explain_type to take into
account that some JOIN_TABs are non-merged semi-joins, and do not
have a TABLE object.
Added comments.
Added reaction for exceeding maximum number of elements in with clause.
Added a test case to check this reaction.
Added a test case where the specification of a recursive table
uses two non-recursive with tables.
explain for the query containing WITH clause
with an unreferenced CTE caused a crash.
Added a test covered this case.
Also added a test for usage CTE in different parts of union.
The patch for bug mdev-9937 actually did not fix the problem
of name resolution for tables used in views referred in queries
with WITH clauses. This fix corrects the patch.
Added test cases to check the fix.
Fixed the problem of wrong types of recursive tables when the type of anchor part does not coincide with the
type of recursive part.
Prevented usage of marerialization and subquery cache for subqueries with recursive references.
Introduced system variables 'max_recursion_level'.
Added a test case to test usage of this variable.