and to related methods and their parameters:
- The return value of Spvar_definition::m_column_type_ref()
- The parameter of Spvar_definition::set_column_type_ref()
- The method Qualified_column_ident::resolve_type_ref()
- The parameter of LEX::sp_variable_declarations_column_type_finalize()
The crash happened with an indexed virtual column whose
value is evaluated using a function that has a different meaning
in sql_mode='' vs sql_mode=ORACLE:
- DECODE()
- LTRIM()
- RTRIM()
- LPAD()
- RPAD()
- REPLACE()
- SUBSTR()
For example:
CREATE TABLE t1 (
b VARCHAR(1),
g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL,
KEY g(g)
);
So far we had replacement XXX_ORACLE() functions for all mentioned function,
e.g. SUBSTR_ORACLE() for SUBSTR(). So it was possible to correctly re-parse
SUBSTR_ORACLE() even in sql_mode=''.
But it was not possible to re-parse the MariaDB version of SUBSTR()
after switching to sql_mode=ORACLE. It was erroneously mis-interpreted
as SUBSTR_ORACLE().
As a result, this combination worked fine:
SET sql_mode=ORACLE;
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode='';
INSERT ...
But the other way around it crashed:
SET sql_mode='';
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode=ORACLE;
INSERT ...
At CREATE time, SUBSTR was instantiated as Item_func_substr and printed
in the FRM file as substr(). At re-open time with sql_mode=ORACLE, "substr()"
was erroneously instantiated as Item_func_substr_oracle.
Fix:
The fix proposes a symmetric solution. It provides a way to re-parse reliably
all sql_mode dependent functions to their original CREATE TABLE time meaning,
no matter what the open-time sql_mode is.
We take advantage of the same idea we previously used to resolve sql_mode
dependent data types.
Now all sql_mode dependent functions are printed by SHOW using a schema
qualifier when the current sql_mode differs from the function sql_mode:
SET sql_mode='';
CREATE TABLE t1 ... SUBSTR(a,b,c) ..;
SET sql_mode=ORACLE;
SHOW CREATE TABLE t1; -> mariadb_schema.substr(a,b,c)
SET sql_mode=ORACLE;
CREATE TABLE t2 ... SUBSTR(a,b,c) ..;
SET sql_mode='';
SHOW CREATE TABLE t1; -> oracle_schema.substr(a,b,c)
Old replacement names like substr_oracle() are still understood for
backward compatibility and used in FRM files (for downgrade compatibility),
but they are not printed by SHOW any more.
This commit addresses column naming issues with CTEs in the use of prepared
statements and stored procedures. Usage of either prepared statements or
procedures with Common Table Expressions and column renaming may be affected.
There are three related but different issues addressed here.
1) First execution issue. Consider the following
prepare s from "with cte (col1, col2) as (select a as c1, b as c2 from t
order by c1) select col1, col2 from cte";
execute s;
After parsing, items in the select are named (c1,c2), order by (and group by)
resolution is performed, then item names are set to (col1, col2).
When the statement is executed, context analysis is again performed, but
resolution of elements in the order by statement will not be able to find c1,
because it was renamed to col1 and remains this way.
The solution is to save the names of these items during context resolution
before they have been renamed. We can then reset item names back to those after
parsing so first execution can resolve items referred to in order and group by
clauses.
2) Second Execution Issue
When the derived table contains more than one select 'unioned' together we could
reasonably think that dealing with only items in the first select (which
determines names in the resultant table) would be sufficient. This can lead to
a different problem. Consider
prepare st from "with cte (c1,c2) as
(select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
select * from cte where c1=1";
When the optimizer (only run during the first execution) pushes the outside
condition "c1=1" into every select in the derived table union, it renames the
items to make the condition valid. In this example, this leaves the first item
in the second select named 'c1'. The second execution will now fail 'group by'
resolution.
Again, the solution is to save the names during context analysis, resetting
before subsequent resolution, but making sure that we save/reset the item
names in all the selects in this union.
3) Memory Leak
During parsing Item::set_name() is used to allocate memory in the statement
arena. We cannot use this call during statement execution as this represents
a memory leak. We directly set the item list names to those in the column list
of this CTE (also allocated during parsing).
Approved by Igor Babaev <igor@mariadb.com>
New Feature:
============
This patch extends the START SLAVE UNTIL command with options
SQL_BEFORE_GTIDS and SQL_AFTER_GTIDS to allow user control of
whether the replica stops before or after a provided GTID state. Its
syntax is:
START SLAVE UNTIL (SQL_BEFORE_GTIDS|SQL_AFTER_GTIDS)=”<gtid_list>”
When providing SQL_BEFORE_GTIDS=”<gtid_list>”, for each domain
specified in the gtid_list, the replica will execute transactions up
to the GTID found, and immediately stop processing events in that
domain (without executing the transaction of the specified GTID).
Once all domains have stopped, the replica will stop. Events
originating from domains that are not specified in the list are not
replicated.
START SLAVE UNTIL SQL_AFTER_GTIDS=”<gtid_list>” is an alias to the
default behavior of START SLAVE UNTIL master_gtid_pos=”<gtid_list>”.
That is, the replica will only execute transactions originating from
domain ids provided in the list, and will stop once all transactions
provided in the UNTIL list have all been executed.
Example:
=========
If a primary server has a binary log consisting of the following GTIDs:
0-1-1
1-1-1
0-1-2
1-1-2
0-1-3
1-1-3
If a fresh replica (i.e. one with an empty GTID position,
@@gtid_slave_pos='') is started with SQL_BEFORE_GTIDS, i.e.
START SLAVE UNTIL SQL_BEFORE_GTIDS=”1-1-2”
The resulting gtid_slave_pos of the replica will be “1-1-1”.
This is because the replica will execute only events from domain 1
until it sees the transaction with sequence number 2, and
immediately stop without executing it.
If the replica is started with SQL_AFTER_GTIDS, i.e.
START SLAVE UNTIL SQL_AFTER_GTIDS=”1-1-2”
then the resulting gtid_slave_pos of the replica will be “1-1-2”.
This is because it will only execute events from domain 1 until it
has executed the provided GTID.
Reviewed By:
============
Kristian Nielson <knielsen@knielsen-hq.org>
An "ITERATE innerLoop" did not work properly inside
a WHILE loop, which itself is inside an outer FOR loop:
outerLoop:
FOR
...
innerLoop:
WHILE
...
ITERATE innerLoop;
...
END WHILE;
...
END FOR;
It erroneously generated an integer increment code for the outer FOR loop.
There were two problems:
1. "ITERATE innerLoop" worked like "ITERATE outerLoop"
2. It was always integer increment, even in case of FOR cursor loops.
Background:
- A FOR loop automatically creates a dedicated sp_pcontext stack entry,
to put the iteration and bound variables on it.
- Other loop types (LOOP, WHILE, REPEAT), do not generate a dedicated
slack entry.
The old code erroneously assumed that sp_pcontext::m_for_loop
either describes the most inner loop (in case the inner loop is FOR),
or is empty (in case the inner loop is not FOR).
But in fact, sp_pcontext::m_for_loop is never empty inside a FOR loop:
it describes the closest FOR loop, even if this FOR loop has nested
non-FOR loops inside.
So when we're near the ITERATE statement in the above script,
sp_pcontext::m_for_loop is not empty - it stores information about
the FOR loop labeled as "outrLoop:".
Fix:
- Adding a new member sp_pcontext::Lex_for_loop::m_start_label,
to remember the explicit or the auto-generated label correspoding
to the start of the FOR body. It's used during generation
of "ITERATE loop_label" code to check if "loop_label" belongs
to the current FOR loop pointed by sp_pcontext::m_for_loop,
or belongs to a non-FOR nested loop.
- Adding LEX methods sp_for_loop_intrange_iterate() and
sp_for_loop_cursor_iterate() to reuse the code between
methods handling:
* ITERATE
* END FOR
- Adding a test for Lex_for_loop::is_for_loop_cursor()
and generate a code either a cursor fetch, or for an integer increment.
Before this change, it always erroneously generated an integer increment
version.
- Cleanup: Initialize Lex_for_loop_st::m_cursor_offset inside
Lex_for_loop_st::init(), to avoid not initialized members.
- Cleanup: Removing a redundant method:
Lex_for_loop_st::init(const Lex_for_loop_st &other)
Using Lex_for_loop_st::operator(const Lex_for_loop_st &other) instead.
The function setup_windows() called at the prepare phase of processing a
select builds a list of all window specifications used in the select. This list
is built on the statement memory and it must be done only once.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
- Removing two copies of the drop_routine.
Adding a shared and much simplified version.
- Removing LEX metods:
bool stmt_drop_function(const DDL_options_st &options,
const Lex_ident_sys_st &db,
const Lex_ident_sys_st &name);
bool stmt_drop_function(const DDL_options_st &options,
const Lex_ident_sys_st &name);
bool stmt_drop_procedure(const DDL_options_st &options,
sp_name *name);
The code inside the methods was very similar.
Adding one method instead:
bool stmt_drop_routine(const Sp_handler *sph,
const DDL_options_st &options,
const Lex_ident_sys_st &db,
const Lex_ident_sys_st &name);
- Adding a new virtual method Sp_handler:sqlcom_drop().
It helped to unify the code inside the new stmt_drop_routine().
Problem:
Under terms of MDEV-27490, we'll update Unicode version used
to compare identifiers to 14.0.0. Unlike in the old Unicode version,
in the new version a string can grow during lower-case. We cannot
perform check_db_name() inplace any more.
Change summary:
- Allocate memory to store lower-cased identifiers in memory root
- Removing check_db_name() performing both in-place lower-casing and validation
at the same time. Splitting it into two separate stages:
* creating a memory-root lower-cased copy of an identifier
(using new MEM_ROOT functions and Query_arena wrapper methods)
* performing validation on a constant string
(using Lex_ident_fs methods)
Implementation details:
- Adding a mysys helper function to allocate lower-cased strings on MEM_ROOT:
lex_string_casedn_root()
and a Query_arena wrappers for it:
make_ident_casedn()
make_ident_opt_casedn()
- Adding a Query_arena method to perform both MEM_ROOT lower-casing and
database name validation at the same time:
to_ident_db_internal_with_error()
This method is very close to the old (pre-11.3) check_db_name(),
but performs lower-casing to a newly allocated MEM_ROOT
memory (instead of performing lower-casing the original string in-place).
- Adding a Table_ident method which additionally handles derived table names:
to_ident_db_internal_with_error()
- Removing the old check_db_name()
Changing LEX_CSTRING* parameters of LEX::make_sp_name() to Lex_ident_sys_st.
This makes the code clear because a value of Lex_ident_sys_st has
some guaranteed additional constraints over a base LEX_CSTRING:
- Its LEX_CSTRING::str is not NULL (sql_yacc.yy would abort otherwise)
- Its LEX_CSTRING::str is 0-terminated
- Its a valid utf8 string
- The string pointed by LEX_CSTRING::str was created on THD::mem_root
Also changing "pass by pointer" to "pass by reference",
as these parameters can never be NULL - they are Bison stack variables.
This patch is the second part of implementation for cursor's statement
re-parsing. The patch does the following changes:
- on re-parsing a failed SP instruction that does need to get access
to LEX a new lex is instantiated for every SP instruction except
cursor relating SP instructions.
- items created on re-parsing a cursor relating statement are moved
to the free_list of sp_lex_cursor.
Added re-parsing of failed statements inside a stored routine.
General idea of the patch is to install an instance of the class
Reprepare_observer before executing a next SP instruction and
re-parse a statement of this SP instruction in case of
its execution failure.
To implement the described approach the class sp_lex_keeper
has been extended with the method validate_lex_and_exec_core()
that is just a wrapper around the method reset_lex_and_exec_core()
with additional setting/resetting an instance of the class
Reprepare_observer on each iteration of SP instruction
execution.
If reset_lex_and_exec_core() returns error and an instance
of the class Reprepare_observer is installed before running
a SP instruction then a number of attempts to re-run the SP
instruction is checked against a max. limit and in case it doesn't
reach the limit a statement for the failed SP instruction is re-parsed.
Re-parsing of a statement for the failed SP instruction is implemented
by the new method sp_le_inst::parse_expr() that prepends
a SP instruction's statement with the clause 'SELECT' and parse it.
Own SP instruction MEM_ROOT and a separate free_list is used for
parsing of a SP statement. On successful re-parsing of SP instruction's
statement the virtual methods adjust_sql_command() and
on_after_expr_parsing() of the class sp_lex_instr is called
to update the SP instruction state with a new data created
on parsing the statement.
Few words about reason for prepending a SP instruction's statement
with the clause 'SELECT' - this is required step to produce a valid
SQL statement, since for some SP instructions the instructions statement
is not a valid SQL statement. Wrapping such text into 'SELECT ( )'
produces a correct operator from SQL syntax point of view.
For those SP instructions that need to get access to ia LEX object
on execution, added storing of their original sql expressions inside
classes derived from the class sp_lex_instr.
A stored sql expression is returned by the abstract method
sp_lex_instr::get_expr_query
redefined in derived classes.
Since an expression constituting a SP instruction can be invalid
SQL statement in general case (not parseable statement), the virtual
method sp_lex_instr::get_query() is introduced to return a valid string
for a statement that corresponds to the given instruction.
Additionally, introduced the rule remember_start_opt in the grammar.
The new rule intended to get correct position of a current
token taking into attention the fact whether lookahead was done or not.
This is the prerequisite patch to move the data member
LEX::trg_table_fields to the class sp_head and rename it as
m_trg_table_fields.
This data member is used for handling OLD/NEW pseudo-rows inside
a trigger body and in order to be able to re-parse a trigger body
the data member must be moved from the struct LEX to the class sp_head.
This patch adds a way to override default collations
(or "character set collations") for desired character sets.
The SQL standard says:
> Each collation known in an SQL-environment is applicable to one
> or more character sets, and for each character set, one or more
> collations are applicable to it, one of which is associated with
> it as its character set collation.
In MariaDB, character set collations has been hard-coded so far,
e.g. utf8mb4_general_ci has been a hard-coded character set collation
for utf8mb4.
This patch allows to override (globally per server, or per session)
character set collations, so for example, uca1400_ai_ci can be set as a
character set collation for Unicode character sets
(instead of compiled xxx_general_ci).
The array of overridden character set collations is stored in a new
(session and global) system variable @@character_set_collations and
can be set as a comma separated list of charset=collation pairs, e.g.:
SET @@character_set_collations='utf8mb3=uca1400_ai_ci,utf8mb4=uca1400_ai_ci';
The variable is empty by default, which mean use the hard-coded
character set collations (e.g. utf8mb4_general_ci for utf8mb4).
The variable can also be set globally by passing to the server startup command
line, and/or in my.cnf.
- Moving the code from a public function trim_whitespaces()
to the class Lex_cstring as methods. This code may
be useful in other contexts, and also this code becomes
visible inside sql_class.h
- Adding a helper method THD::strmake_lex_cstring_trim_whitespaces()
- Unifying the way how CREATE PROCEDURE/CREATE FUNCTION and
CREATE PACKAGE/CREATE PACKAGE BODY work:
a) Now CREATE PACKAGE/CREATE PACKAGE BODY also calls
Lex->sphead->set_body_start() to remember the cpp body start inside
an sp_head member.
b) adding a "const char *cpp_body_end" parameter to
sp_head::set_stmt_end().
These changes made it possible to reuse sp_head::set_stmt_end() inside
LEX::create_package_finalize() and remove the duplucate code.
- Renaming sp_head::m_body_begin to m_cpp_body_begin and adding a comment
to make it clear that this member is used only during parsing, and
points to a fragment inside the cpp buffer.
- Changed sp_head::set_body_start() and sp_head::set_stmt_end()
to skip the calls related to "body_utf8" in cases when m_parent is not NULL.
A non-NULL m_parent means that we're inside a package routine.
"body_utf8" in such case belongs not to the current sphead itself,
but to parent (the package) sphead.
So an sphead instance of a package routine should neither initialize,
nor finalize, nor change in any other ways the "body_utf8" related
members of Lex_input_stream, and should not take over or copy "body_utf8"
data from Lex_input_stream to "this".
The new statistics is enabled by adding the "engine", "innodb" or "full"
option to --log-slow-verbosity
Example output:
# Pages_accessed: 184 Pages_read: 95 Pages_updated: 0 Old_rows_read: 1
# Pages_read_time: 17.0204 Engine_time: 248.1297
Page_read_time is time doing physical reads inside a storage engine.
(Writes cannot be tracked as these are usually done in the background).
Engine_time is the time spent inside the storage engine for the full
duration of the read/write/update calls. It uses the same code as
'analyze statement' for calculating the time spent.
The engine statistics is done with a generic interface that should be
easy for any engine to use. It can also easily be extended to provide
even more statistics.
Currently only InnoDB has counters for Pages_% and Undo_% status.
Engine_time works for all engines.
Implementation details:
class ha_handler_stats holds all engine stats. This class is included
in handler and THD classes.
While a query is running, all statistics is updated in the handler. In
close_thread_tables() the statistics is added to the THD.
handler::handler_stats is a pointer to where statistics should be
collected. This is set to point to handler::active_handler_stats if
stats are requested. If not, it is set to 0.
handler_stats has also an element, 'active' that is 1 if stats are
requested. This is to allow engines to avoid doing any 'if's while
updating the statistics.
Cloned or partition tables have the pointer set to the base table if
status are requested.
There is a small performance impact when using --log-slow-verbosity=engine:
- All engine calls in 'select' will be timed.
- IO calls for InnoDB reads will be timed.
- Incrementation of counters are done on local variables and accesses
are inline, so these should have very little impact.
- Statistics has to be reset for each statement for the THD and each
used handler. This is only 40 bytes, which should be neglectable.
- For partition tables we have to loop over all partitions to update
the handler_status as part of table_init(). Can be optimized in the
future to only do this is log-slow-verbosity changes. For this to work
we have to update handler_status for all opened partitions and
also for all partitions opened in the future.
Other things:
- Added options 'engine' and 'full' to log-slow-verbosity.
- Some of the new files in the test suite comes from Percona server, which
has similar status information.
- buf_page_optimistic_get(): Do not increment any counter, since we are
only validating a pointer, not performing any buf_pool.page_hash lookup.
- Added THD argument to save_explain_data_intern().
- Switched arguments for save_explain_.*_data() to have
always THD first (generates better code as other functions also have THD
first).
Fake_select_lex->join was prepared at the unit execution stage so
the validation of fake_select_lex before the unit pushdown
was incomplete. That caused pushing down of statements having
an incorrect ORDER BY clause.
This commit moves preparation of the fake_select_lex->join to the unit
prepare() method, before initializing of the pushdown handler,
so incorrect clauses error out before being pushed down
When CURSOR parameters get parsed, their sp_assignment_lex instances
(one instance per parameter) get collected to List<sp_assignment_lex>.
These instances get linked to sphead only in the end of the list.
If a syntax error happened in the middle of the parameter list,
these instances were not deleted, which caused memory leaks.
Fix:
using a Bison %destructor to free rules of the <sp_assignment_lex_list>
type (on syntax errors).
Afte the fix these sp_assignment_lex instances from CURSOR parameters
deleted as follows:
- If the CURSOR statement was fully parsed, then these instances
get properly linked to sp_head structures, so they are deleted
during ~sp_head (this did not change)
- If the CURSOR statement failed on a syntax error, then by Bison's
%destructor (this is being added in the current patch).
During st_select_lex_unit::prepare() the member select_unit*
st_select_lex_unit::union_result is being assigned to an instance
of one of the following classes:
- select_unit
- select_unit_ext
- select_unit_recursive
- select_union_direct
Select_union_direct used to pass the result of the query directly to
the receiving select_result without filling a temporary table. This class
wraps a select_result object and is currently used to process UNION ALL
queries. Other select_unit_* classes involve some additional result processing.
Pushed down units are processed on the engine side so the results must be
also passed directly to a select_result object. So in the case when
the unit pushdown is employed st_select_lex_unit::union_result must be
assigned to an instance of select_union_direct.
Allow queries of multiple SELECTs combined together with
UNIONs/EXCEPTs/INTERSECTs to be pushed down to foreign engines.
If the foreign engine provides an interface method "create_unit"
and the UNIT is a top-level unit of the SQL query then the server
tries to push the whole SELECT_LEX_UNIT down to the engine for execution.
The engine should perform necessary checks and if they succeed,
execute the query. If the engine is unable to execute the whole unit,
then another attempt is made to push down SELECTs composing the unit
separately using the "create_select" interface method. In this case
the results of separate SELECTs are combined at the server side
thus composing the final result