- Using array_elements() instead of a constant to iterate through an array
- Adding some comments
- Adding new-line function comments
- Using STRING_WITH_LEN instead of C_STRING_WITH_LEN
Merging the following features from sql_yacc.yy to sql_yacc_ora.yy:
- system versioning
- column compression
- table value constructor
- spatial predicate WITHIN
- DELETE_DOMAIN_ID
The current code does not support recursive CTEs whose specifications
contain a mix of ALL UNION and DISTINCT UNION operations.
This patch catches such specifications and reports errors for them.
In this issue we hit the assert because we are adding addition fields to the field JOIN::all_fields list. This
is done because HEAP tables can't index BIT fields so we need to use an additional hidden field for grouping because later it will be
converted to a LONG field. Original field will remain of the BIT type and will be returned. This happens when we convert DISTINCT to
GROUP BY.
The solution is to take into account the number of such hidden fields that would be added to the field
JOIN::all_fields list while calculating the size of the ref_pointer_array.
Make sure that SELECT_LEX_UNIT::derived, behaves as documented
(points to the "TABLE_LIST representing this union in the
embedding select"). For recursive CTE this was not necessarily
the case, it could've pointed to the TABLE_LIST inside the CTE,
not in the embedding select.
To fix:
* don't update unit->derived in mysql_derived_prepare(), pass derived
as an argument to st_select_lex_unit::prepare()
* prefer to set unit->derived in TABLE_LIST::init_derived()
to the TABLE_LIST in the embedding select, not to the recursive
reference. Fail if there are many TABLE_LISTs in the embedding
select with conflicting FOR SYSTEM_TIME clauses.
cleanup:
* remove redundant THD* argument from st_select_lex_unit::prepare()
1. Adding THD::convert_string(LEX_CSTRING *to,...) as a wrapper
for convert_string(LEX_STRING *to,...), as LEX_CSTRING
is now frequently used for conversion purpose.
This reduced duplicate code in TEXT_STRING_sys,
TEXT_STRING_literal, TEXT_STRING_filesystem grammar rules in *.yy
2. Adding yet another THD::convert_string() with an extra parameter
"bool simple_copy_is_possible". This even more reduced
repeatable code in the mentioned grammar rules in *.yy
3. Deriving Lex_ident_cli_st from Lex_string_with_metadata_st,
as they have very similar functionality. Moving m_quote
from Lex_ident_cli_st to Lex_string_with_metadata_st,
as m_quote will be used later to optimize string literals anyway
(e.g. avoid redundant copying on the tokenizer stage).
Adjusting Lex_input_stream::get_text() accordingly.
4. Moving the reminders of the code in TEXT_STRING_sys, TEXT_STRING_literal,
TEXT_STRING_filesystem grammar rules as new methods in THD:
- make_text_string_sys()
- make_text_string_connection()
- make_text_string_filesystem()
and changing *.yy to use these new methods.
This reduced the amount of similar code in
sql_yacc.yy and sql_yacc_ora.yy.
5. Removing duplicate code in Lex_input_stream::body_utf8_append_ident():
by reusing THD::make_text_string_sys(). Thanks to #3 and #4.
6. Making THD members charset_is_system_charset,
charset_is_collation_connection, charset_is_character_set_filesystem
private, as they are not needed externally any more.
The code in the "sp_tail" rule in sql_yacc.yy always
used YYLIP->get_cpp_tok_start() as the start of the body,
and did not check for possible lookahead which happens
for keywords "FOR", "VALUES" and "WITH" for LALR(2)
resolution in Lex_input_stream::lex_token().
In case of the lookahead token presence,
get_tok_start_prev() should have been used instead
of get_cpp_tok_start() as the beginning of the SP body.
Change summary:
This patch hides the implementation of the lookahead
token completely inside Lex_input_stream.
The users of Lex_input_stream now just get token-by-token
transparently and should not care about lookahead any more.
Now external users of Lex_input_stream
are not aware of the lookahead token at all.
Change details:
- Moving Lex_input_stream::has_lookahead() into the "private" section.
- Removing Lex_input_stream::get_tok_start_prev() and
Lex_input_stream::get_cpp_start_prev().
- Fixing the external code to call get_tok_start() and get_cpp_tok_start()
in all places where get_tok_start_prev() and get_cpp_start_prev()
where used.
- Adding a test for has_lookahead() right inside
get_tok_start() and get_cpp_tok_start().
If there is a lookahead token, these methods now
return the position of the previous token automatically:
const char *get_tok_start()
{
return has_lookahead() ? m_tok_start_prev : m_tok_start;
}
const char *get_cpp_tok_start()
{
return has_lookahead() ? m_cpp_tok_start_prev : m_cpp_tok_start;
}
- Fixing the internal code inside Lex_input_stream methods
to use m_tok_start and m_cpp_tok_start directly,
instead of calling get_tok_start() and get_cpp_tok_start(),
to make sure to access to the *current* token position
(independently of a lookahead token presence).
Reasoning:
- Shorter and clearer code
- Better encapsulation
(a fair number of Lex_input_stream methods and members were
moved to the private section)
New methods:
int lex_token(union YYSTYPE *yylval, THD *thd);
bool consume_comment(int remaining_recursions_permitted);
int lex_one_token(union YYSTYPE *yylval, THD *thd);
int find_keyword(Lex_ident_cli_st *str, uint len, bool function);
LEX_CSTRING get_token(uint skip, uint length);
Additional changes:
- Removing Lex_input_stream::yylval.
In the original code it was just an alias
for the "yylval" passed to lex_one_token().
This coding style is bug prone and is hard to follow.
In the new reduction "yylval" (or its components) is passed to
the affected methods as a parameter.
- Moving the code in sql_lex.h up and down between "private" and "public"
sections (sorry if this made the diff somewhat harder to read)
The code passing positions in the query to constructors of
Rewritable_query_parameter descendants (e.g. Item_splocal)
was not reliable. It used various Lex_input_stream methods:
- get_tok_start()
- get_tok_start_prev()
- get_tok_end()
- get_ptr()
to find positions of the recently scanned tokens.
The challenge was mostly to choose between get_tok_start()
and get_tok_start_prev(), taking into account to the current
grammar (depending if lookahead takes place before
or after we read the positions in every particular rule).
But this approach did not work at all in combination
with token contractions, when MYSQLlex() translates
two tokens into one token ID, for example:
WITH ROLLUP -> WITH_ROLLUP_SYM
As a result, the tokenizer is already one more token ahead.
So in query fragment:
"GROUP BY d, spvar WITH ROLLUP"
get_tok_start() points to "ROLLUP".
get_tok_start_prev() points to "WITH".
As a result, it was "WITH" who was erroneously replaced
to NAME_CONST() instead of "spvar".
This patch modifies the code to do it a different way.
Changes:
1. For keywords and identifiers, the tokenizer now
returns LEX_CTRING pointing directly to the query
fragment. So query positions are now just available using:
- $1.str - for the beginning of a token
- $1.str+$1.length - for the end of a token
2. Identifiers are not allocated on the THD memory root
in the tokenizer any more. Allocation is now done
on later stages, in methods like LEX::create_item_ident().
3. Two LEX_CSTRING based structures were added:
- Lex_ident_cli_st - used to store the "client side"
identifier representation, pointing to the
query fragment. Note, these identifiers
are encoded in @@character_set_client
and can have broken byte sequences.
- Lex_ident_sys_st - used to store the "server side"
identifier representation, pointing to the
THD allocated memory. This representation
guarantees that the identifier was checked
for being well-formed, and is encoded in utf8.
4. To distinguish between two identifier types
in the grammar, two Bison types were added:
<ident_cli> and <ident_sys>
5. All non-reserved keywords were marked as
being of the type <ident_cli>.
All reserved keywords are still of the type NONE.
6. All curly brackets in rules collecting
non-reserved keywords into non-terminal
symbols were removed, e.g.:
Was:
keyword_sp_data_type:
BIT_SYM {}
| BOOLEAN_SYM {}
Now:
keyword_sp_data_type:
BIT_SYM
| BOOLEAN_SYM
This is important NOT to have brackets here!!!!
This is needed to make sure that the underlying
Lex_ident_cli_ststructure correctly passes up to
the calling rule.
6. The code to scan identifiers and keywords
was moved from lex_one_token() into new
Lex_input_stream methods:
scan_ident_sysvar()
scan_ident_start()
scan_ident_middle()
scan_ident_delimited()
This was done to:
- get rid of enormous amount of references to &yylval->lex_str
- and remove a lot of references like lip->xxx
7. The allocating functionality which puts identifiers on the
THD memory root now resides in methods of Lex_ident_sys_st,
and in THD::to_ident_sys_alloc().
get_quoted_token() was removed.
8. Cleanup: check_simple_select() was moved as a method to LEX.
9. Cleanup: Some more functionality was moved from *.yy
to new methods were added to LEX:
make_item_colon_ident_ident()
make_item_func_call_generic()
create_item_qualified_asterisk()
vers_setup_conds() used to AND all conditions on row_start/row_end
columns and store it either in the WHERE clause or in the ON
clause for some table. In some cases this caused ON clause
to have conditions for tables that aren't part of that ON's join.
Fixed to put a table's condition always in the ON clause of the
corresponding table.
Removed unnecessary ... `OR row_end IS NULL` clause, it's not needed
in the ON clause.
Simplified handling on PS and SP.
Main reason was to make it easier to print the above structures in
a debugger. Additional benefits is that I was able to use same
defines for both structures, which simplifes some code.
Most of the code is just removing Alter_info:: and Alter_inplace_info::
from alter table flags.
Following renames was done:
HA_ALTER_FLAGS -> alter_table_operations
CHANGE_CREATE_OPTION -> ALTER_CHANGE_CREATE_OPTION
Alter_info::ADD_INDEX -> ALTER_ADD_INDEX
DROP_INDEX -> ALTER_DROP_INDEX
ADD_UNIQUE_INDEX -> ALTER_ADD_UNIQUE_INDEX
DROP_UNIQUE_INDEx -> ALTER_DROP_UNIQUE_INDEX
ADD_PK_INDEX -> ALTER_ADD_PK_INDEX
DROP_PK_INDEX -> ALTER_DROP_PK_INDEX
Alter_info:ALTER_ADD_COLUMN -> ALTER_PARSE_ADD_COLUMN
Alter_info:ALTER_DROP_COLUMN -> ALTER_PARSE_DROP_COLUMN
Alter_inplace_info::ADD_INDEX -> ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX
Alter_inplace_info::DROP_INDEX -> ALTER_DROP_NON_UNIQUE_NON_PRIM_INDEX
Other things:
- Added typedef alter_table_operatons for alter table flags
- DROP CHECK CONSTRAINT can now be done online
- Added checks for Aria tables in alter_table_online.test
- alter_table_flags now takes an ulonglong as argument.
- Don't support online operations if checksum option is used.
- sql_lex.cc doesn't add ALTER_ADD_INDEX if index is not created
The problem resided in this branch of the "option_value_no_option_type" rule:
| '@' '@' opt_var_ident_type internal_variable_name equal set_expr_or_default
Summary:
1. internal_variable_name initialized tmp.var to trg_new_row_fake_var (0x01).
2. The condition "if (tmp.var == NULL)" did not check
the special case with trg_new_row_fake_var,
so Lex->set_system_variable(&tmp, $3, $6) was
called with tmp.var pointing to trg_new_row_fake_var,
which created a sys_var instance pointing to 0x01 instead of
a real system variable.
3. Later, at the trigger invocation time, this method was called:
sys_var::do_deprecated_warning (this=0x1, thd=0x7ffe6c000a98)
Notice, "this" is equal to trg_new_row_fake_var (0x01)
Solution:
The old implementation with separate rules
internal_variable_name (in sql_yacc.yy and sql_yacc_ora.yy) and
internal_variable_name_directly_assignable (in sql_yacc_ora.yy only)
was too complex and hard to follow.
Rewriting the code in a more straightforward way.
1. Changing LEX::set_system_variable()
from:
bool set_system_variable(struct sys_var_with_base *, enum_var_type, Item *);
to:
bool set_system_variable(enum_var_type, sys_var *, const LEX_CSTRING *, Item *);
2. Adding new methods in LEX, which operate with variable names:
bool set_trigger_field(const LEX_CSTRING *, const LEX_CSTRING *, Item *);
bool set_system_variable(enum_var_type var_type, const LEX_CSTRING *name,
Item *val);
bool set_system_variable(THD *thd, enum_var_type var_type,
const LEX_CSTRING *name1,
const LEX_CSTRING *name2,
Item *val);
bool set_default_system_variable(enum_var_type var_type,
const LEX_CSTRING *name,
Item *val);
bool set_variable(const LEX_CSTRING *name, Item *item);
3. Changing the grammar to call the new methods directly
in option_value_no_option_type,
Removing rules internal_variable_name and
internal_variable_name_directly_assignable.
4. Removing "struct sys_var_with_base" and trg_new_row_fake_var.
Good side effect:
- The code in /sql reduced from 314 to 183 lines.
- MDEV-15615 Unexpected syntax error instead of "Unknown system variable" ...
was also fixed automatically
Backporting from bb-10.2-compatibility to bb-10.2-ext
Version: 2018-01-26
- CREATE PACKAGE [BODY] statements are now
entirely written to mysql.proc with type='PACKAGE' and type='PACKAGE BODY'.
- CREATE PACKAGE BODY now supports IF NOT EXISTS
- DROP PACKAGE BODY now supports IF EXISTS
- CREATE OR REPLACE PACKAGE [BODY] is now supported
- CREATE PACKAGE [BODY] now support the DEFINER clause:
CREATE DEFINER user@host PACKAGE pkg ... END;
CREATE DEFINER user@host PACKAGE BODY pkg ... END;
- CREATE PACKAGE [BODY] now supports SQL SECURITY and COMMENT clauses, e.g.:
CREATE PACKAGE p1 SQL SECURITY INVOKER COMMENT "comment" AS ... END;
- Package routines are now created from the package CREATE PACKAGE BODY
statement and don't produce individual records in mysql.proc.
- CREATE PACKAGE BODY now supports package-wide variables.
Package variables can be read and set inside package routines.
Package variables are stored in a separate sp_rcontext,
which is cached in THD on the first packate routine call.
- CREATE PACKAGE BODY now supports the initialization section.
- All public routines (i.e. declared in CREATE PACKAGE)
must have implementations in CREATE PACKAGE BODY
- Only public package routines are available outside of the package
- {CREATE|DROP} PACKAGE [BODY] now respects CREATE ROUTINE and ALTER ROUTINE
privileges
- "GRANT EXECUTE ON PACKAGE BODY pkg" is now supported
- SHOW CREATE PACKAGE [BODY] is now supported
- SHOW PACKAGE [BODY] STATUS is now supported
- CREATE and DROP for PACKAGE [BODY] now works for non-current databases
- mysqldump now supports packages
- "SHOW {PROCEDURE|FUNCTION) CODE pkg.routine" now works for package routines
- "SHOW PACKAGE BODY CODE pkg" now works (the package initialization section)
- A new package body level MDL was added
- Recursive calls for package procedures are now possible
- Routine forward declarations in CREATE PACKATE BODY are now supported.
- Package body variables now work as SP OUT parameters
- Package body variables now work as SELECT INTO targets
- Package body variables now support ROW, %ROWTYPE, %TYPE
- CREATE PACKAGE [BODY] statements are now
entirely written to mysql.proc with type='PACKAGE' and type='PACKAGE BODY'.
- CREATE PACKAGE BODY now supports IF NOT EXISTS
- DROP PACKAGE BODY now supports IF EXISTS
- CREATE OR REPLACE PACKAGE [BODY] is now supported
- CREATE PACKAGE [BODY] now support the DEFINER clause:
CREATE DEFINER user@host PACKAGE pkg ... END;
CREATE DEFINER user@host PACKAGE BODY pkg ... END;
- CREATE PACKAGE [BODY] now supports SQL SECURITY and COMMENT clauses, e.g.:
CREATE PACKAGE p1 SQL SECURITY INVOKER COMMENT "comment" AS ... END;
- Package routines are now created from the package CREATE PACKAGE BODY
statement and don't produce individual records in mysql.proc.
- CREATE PACKAGE BODY now supports package-wide variables.
Package variables can be read and set inside package routines.
Package variables are stored in a separate sp_rcontext,
which is cached in THD on the first packate routine call.
- CREATE PACKAGE BODY now supports the initialization section.
- All public routines (i.e. declared in CREATE PACKAGE)
must have implementations in CREATE PACKAGE BODY
- Only public package routines are available outside of the package
- {CREATE|DROP} PACKAGE [BODY] now respects CREATE ROUTINE and ALTER ROUTINE
privileges
- "GRANT EXECUTE ON PACKAGE BODY pkg" is now supported
- SHOW CREATE PACKAGE [BODY] is now supported
- SHOW PACKAGE [BODY] STATUS is now supported
- CREATE and DROP for PACKAGE [BODY] now works for non-current databases
- mysqldump now supports packages
- "SHOW {PROCEDURE|FUNCTION) CODE pkg.routine" now works for package routines
- "SHOW PACKAGE BODY CODE pkg" now works (the package initialization section)
- A new package body level MDL was added
- Recursive calls for package procedures are now possible
- Routine forward declarations in CREATE PACKATE BODY are now supported.
- Package body variables now work as SP OUT parameters
- Package body variables now work as SELECT INTO targets
- Package body variables now support ROW, %ROWTYPE, %TYPE
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
Counter for select numbering made stored with the statement (before was global)
So now it does have always accurate value which does not depend on
interruption of statement prepare by errors like lack of table in
a view definition.
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
TRASH was mapped to TRASH_FREE and was supposed to be used for memory
that should not be accessed anymore, while TRASH_ALLOC() is to be
used for uninitialized but to-be-used memory.
But sometimes TRASH() was used in the latter sense.
Remove TRASH() macro, always use explicit TRASH_ALLOC() or TRASH_FREE().
Other changes done to get this to work:
- Added 'internal_tables' to TABLE object to list which sequence tables
is needed to use the table.
- Mark any expression using DEFAULT() with LEX->default_used.
This is needed when deciding if we should open internal sequence
tables when a table is opened (we don't need to open sequence tables
if the main table is only used with SELECT).
- Create_and_open_temporary_table() can now also open all internal
sequence tables.
- Added option MYSQL_LOCK_USE_MALLOC to mysql_lock_tables()
to force memory allocation to be used with malloc instead of
memroot.
- Added flag to MYSQL_LOCK to remember if allocation was done with
malloc or memroot (makes code simpler and safer).
- init_one_table_for_prelocking() now takes argument for what lock to
use instead of it's a routine or something else.
- Renamed prelocking placeholders to make them more understandable as
they are now used in more code.
- Changed test in check_lock_and_start_stmt() if found table has correct
locks. The old test didn't work for tables that has lock
TL_WRITE_ALLOW_WRITE, which is what sequence tables are using.
- Added VCOL_NOT_VIRTUAL option to ensure that sequence functions can't
be used with virtual columns
- More sequence tests
This commit implements aggregate stored functions. The basic idea behind
the feature is:
* Implement a special instruction FETCH GROUP NEXT ROW that will pause
the execution of the stored function. When the instruction is reached,
execution of the initial query resumes "as if" the function returned.
This gives the server the opportunity to advance to the next row in the
result set.
* Stored aggregates behave like regular aggregate functions. The
implementation of thus resides in the class Item_sum_sp. Because it is
an aggregate function, for each new row in the group, the
Item_sum_sp::add() method will be called. This is when execution resumes
and the function does another iteration to "add" one extra element to
the final result.
* When the end of group is reached, val_xxx() method will be called for
the item. This case is handled by another execute step for the stored
function, only with a special flag to force a call to the return
handler. See Item_sum_sp::execute() for details.
To allow this pause and resume semantic, we must preserve the function
context across executions. This is stored in Item_sp::sp_query_arena only for
aggregate stored functions, but has no impact for regular functions.
We also enforce aggregate functions to include the "FETCH GROUP NEXT ROW"
instruction.
Signed-off-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>