This patch always provides columns of the temporary table used for
materialization of a table value constructor with some names.
Before this patch these names were always borrowed from the items
of the first row of the table value constructor. When this row
contained expressions and expressions were not named then it could cause
different kinds of problems. In particular if the TVC is used as the
specification of a derived table this could cause a crash.
The names given to the expressions used in a TVC are the same as those
given to the columns of the result set from the corresponding SELECT.
An Item_name_const should never be typecast to Item_field, because we
always expect it to be a constant value.
So instead of checking type(), it is better to introduce a function in the
Item class, get_item_field(), which returns the Item_field object for items
of type FIELD_ITEM.
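A minimal sketch of the get_item_field() idea, using simplified stand-in
classes rather than the real server ones. The delegation of type() to the
wrapped item and the helper field_name_or_empty() are assumptions made for
illustration only:

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the server classes; only the shape matters here.
class Item_field;

class Item
{
public:
  enum Type { FIELD_ITEM, CONST_ITEM };
  virtual ~Item() {}
  virtual Type type() const = 0;
  // Default: this item is not a field, so there is nothing to return.
  virtual Item_field *get_item_field() { return nullptr; }
};

class Item_field : public Item
{
public:
  std::string field_name;
  explicit Item_field(const std::string &name) : field_name(name) {}
  Type type() const override { return FIELD_ITEM; }
  Item_field *get_item_field() override { return this; }
};

// Assumption: a name_const-like wrapper whose type() reports the type of the
// wrapped item. A cast guarded only by type() == FIELD_ITEM would be unsafe.
class Item_name_const : public Item
{
  Item *value_item;
public:
  explicit Item_name_const(Item *value) : value_item(value) {}
  Type type() const override { return value_item->type(); }
  // get_item_field() is deliberately not overridden: it stays nullptr.
};

// Hypothetical caller: asks the item itself instead of casting by type().
inline std::string field_name_or_empty(Item *item)
{
  Item_field *field = item->get_item_field();
  return field ? field->field_name : std::string();
}
```

With this shape the caller needs no static_cast, and a wrapper item can
never be mistaken for a field.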
1. Adding LEX::make_item_sysvar() and reusing it
in sql_yacc.yy and sql_yacc_ora.yy.
Removing the "opt_component" rule.
2. Renaming rules to better reflect their purpose:
- keyword to keyword_ident
- keyword_sp to keyword_label
- keyword_sp_not_data_type to keyword_sp_var_and_label
Also renaming:
- sp_decl_ident_keyword to keyword_sp_decl for naming consistency
- keyword_alias to keyword_table_alias,
for consistency with ident_table_alias
- keyword_sp_data_type to keyword_data_type,
as it has nothing SP-specific.
3. Moving GLOBAL_SYM, LOCAL_SYM, SESSION_SYM from
keyword_sp_var_and_label to a separate rule keyword_sysvar_type.
We don't have system variables with these names anyway.
Adding ident_sysvar_name and using it in the grammar that needs
a system variable name instead of ident_or_text.
This removed a number of shift/reduce conflicts
between GLOBAL_SYM/LOCAL_SYM/SESSION_SYM as a variable scope and
as a variable name.
4. Moving keywords BEGIN_SYM, END (in both *.yy files)
and EXCEPTION_SYM (in sql_yacc_ora.yy) into a separate
rule keyword_sp_block_section, because in Oracle verb keywords
(COMMIT, DO, HANDLER, OPEN, REPAIR, ROLLBACK, SAVEPOINT, SHUTDOWN, TRUNCATE)
are good variable names and can appear in e.g. DECLARE,
while block keywords (BEGIN, END, EXCEPTION) are not good variable names
and cannot appear in DECLARE.
5. Further splitting keyword_directly_not_assignable in sql_yacc_ora.yy:
moving keyword_sp_verb_clause out. Renaming the rest of
keyword_directly_not_assignable to keyword_sp_head,
which represents keywords that can appear in optional
clauses in CREATE PROCEDURE/FUNCTION/TRIGGER.
6. Renaming keyword_sp_verb_clause to keyword_verb_clause,
as now it does not contain anything SP-specific.
As a result of #4,#5,#6, the rule keyword_directly_not_assignable
was replaced by three separate rules:
- keyword_sp_block
- keyword_sp_head
- keyword_verb_clause
Adding the same rules in sql_yacc.yy, for unification.
6. Adding keyword_sp_head and keyword_verb_clause into keyword_sp_decl.
This fixes MDEV-16244.
7. Reorganizing the rest of keyword related rules into two groups:
a. Rules defining a list of keywords and consisting of only terminal symbols:
- keyword_sp_var_not_label
- keyword_sp_head
- keyword_sp_verb_clause
- keyword_sp_block_section
- keyword_sysvar_type
b. Rules that combine the above lists into keyword places:
- keyword_table_alias
- keyword_ident
- keyword_label
- keyword_sysvar_name
- keyword_sp_decl
Rules from the group "b" use on the right side only rules
from the group "a" (with optional terminal symbols added).
Rules from the group "b" DO NOT mutually use each other any more.
This makes them easier to read (and see the difference between them).
Sorting the right sides of the group "b" keyword rules alphabetically,
for even better readability.
Merging the following features from sql_yacc.yy to sql_yacc_ora.yy:
- system versioning
- column compression
- table value constructor
- spatial predicate WITHIN
- DELETE_DOMAIN_ID
The current code does not support recursive CTEs whose specifications
contain a mix of ALL UNION and DISTINCT UNION operations.
This patch catches such specifications and reports errors for them.
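The check can be pictured roughly as follows. This is only a sketch:
Union_kind and has_mixed_union_kinds() are hypothetical names, and the real
patch works on the parsed SELECT units rather than on a vector:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: the set operation that links each SELECT of a
// recursive CTE specification to the previous one.
enum class Union_kind { ALL, DISTINCT };

// Returns true when the specification mixes UNION ALL and UNION DISTINCT,
// i.e. when an error should be reported.
inline bool has_mixed_union_kinds(const std::vector<Union_kind> &ops)
{
  for (std::size_t i = 1; i < ops.size(); i++)
    if (ops[i] != ops[0])
      return true;
  return false;
}
```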
Fixing shift/reduce conflicts introduced by the new system versioning
syntax.
Additionally, fixing old shift/reduce conflicts:
In PREVIOUS/NEXT being identifiers or sequence operations:
SELECT PREVIOUS FROM t1;
SELECT PREVIOUS VALUE FOR t1;
In TIME/DATE/TIMESTAMP being literals, functions or identifiers:
SELECT TIMESTAMP'2001-01-01 10:20:30' FROM t1;
SELECT TIMESTAMP('2001-01-01 10:20:30') FROM t1;
SELECT TIMESTAMP FROM t1;
We'll soon be fixing shift-reduce conflicts introduced in the new
10.3 syntax (see MDEV-15818 for details) by defining precedence for
a number of tokens (e.g. TIMESTAMP, TRANSACTION_SYM, TEXT_STRING)
and adding "%prec" directives.
Before doing this, it's better to have the existing precedences set properly,
for easier readability and maintainability.
Details:
- Changing precedence of NOT to its proper position (between AND and IS).
It was wrong. It worked fine only because the relevant grammar resides
in different separate rules (expr and predicate).
- Moving NOT2_SYM and BINARY to the same line with NEG and ~
It worked fine because operators !, BINARY, ~ do not conflict
with each other.
- Fixing associativity of NOT_SYM, NOT2_SYM, BINARY, COLLATE_SYM
from "right" to "left". They are not dyadic (operate on a single expression
only). So "%left" or "%right" is needed only to set precedence,
while associativity does not matter.
Note, it would be better to use "%precedence" for these tokens
instead of "%left" though, but we use an old version of Bison on Windows,
which may not support %precedence yet.
This patch does not change behavior. The generated sql_yacc.cc and
sql_yacc_ora.cc are exactly the same before and after this change.
MDEV-16100 FOR SYSTEM_TIME erroneously resolves string user variables as transaction IDs
Problem:
Vers_history_point::resolve_unit() tested item->result_type() before
item->fix_fields() was called.
- Item_func_get_user_var::result_type() returned REAL_RESULT by default.
This caused MDEV-16100.
- Item_func_sp::result_type() crashed on assert.
This caused MDEV-16094.
Changes:
1. Adding item->fix_fields() into Vers_history_point::resolve_unit()
before using data type specific properties of the history point
expression.
2. Adding a new virtual method Type_handler::Vers_history_point_resolve_unit()
3. Implementing type-specific
Type_handler_xxx::Type_handler::Vers_history_point_resolve_unit()
so as to:
a. resolve temporal and general purpose string types to TIMESTAMP
b. resolve BIT and general purpose INT types to TRANSACTION
c. disallow use of non-relevant data type expressions in FOR SYSTEM_TIME
Note, DOUBLE and DECIMAL data types are disallowed intentionally.
- DOUBLE does not have enough precision to hold huge BIGINT UNSIGNED values
- DECIMAL rounds on conversion to INT
Both lack of precision and rounding might potentially lead to
very unpredictable results when a wrong transaction ID would be chosen.
If one really wants dangerous use of DOUBLE and DECIMAL, explicit CAST
can be used:
FOR SYSTEM_TIME AS OF CAST(double_or_decimal AS UNSIGNED)
QQ: perhaps DECIMAL(N,0) could still be allowed.
4. Adding a new virtual method Item::type_handler_for_system_time(),
to make HEX hybrids and bit literals work as TRANSACTION rather
than TIMESTAMP.
5. sql_yacc.yy: replacing the rule temporal_literal with "TIMESTAMP TEXT_STRING".
Other temporal literals now resolve to TIMESTAMP through the new
Type_handler methods. No special grammar needed. This removed
a few shift/reduce conflicts.
(TIMESTAMP related conflicts in "history_point:" will be removed separately)
6. Removing the "timestamp_only" parameter from
vers_select_conds_t::resolve_units() and Vers_history_point::resolve_unit().
It was a hint telling that a table did not have any TRANSACTION-aware
system time columns, so it's OK to resolve to TIMESTAMP in case of uncertainty.
In the new version it works as follows:
- the decision between TIMESTAMP and TRANSACTION is first made
based only on the expression data type
- then, if the expression resolved to TRANSACTION, the table
is checked for whether TRANSACTION-aware columns really exist.
This way is safer against possible ALTER TABLE statements changing
ROW START and ROW END columns from "BIGINT UNSIGNED" to "TIMESTAMP(x)"
or the other way around.
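The two-step decision described above can be sketched like this. The enums
and function names are hypothetical simplifications (the server dispatches
through Type_handler classes), and rejecting a TRANSACTION expression on a
table without TRANSACTION-aware columns is an assumption about what the
table check does:

```cpp
#include <cassert>

// Hypothetical, simplified model of expression data types and units.
enum class Expr_type { T_TEMPORAL, T_STRING, T_INT, T_BIT, T_DOUBLE, T_DECIMAL };
enum class Vers_unit { TIMESTAMP, TRANSACTION, NOT_ALLOWED };

// Step 1: decide the unit from the expression data type only.
inline Vers_unit unit_from_type(Expr_type type)
{
  switch (type)
  {
  case Expr_type::T_TEMPORAL:
  case Expr_type::T_STRING:
    return Vers_unit::TIMESTAMP;
  case Expr_type::T_INT:
  case Expr_type::T_BIT:
    return Vers_unit::TRANSACTION;
  case Expr_type::T_DOUBLE:
  case Expr_type::T_DECIMAL:
    break;                       // intentionally disallowed
  }
  return Vers_unit::NOT_ALLOWED;
}

// Step 2: only after the type-based decision is the table consulted.
// Returns false when the combination must be rejected.
inline bool resolve_unit(Expr_type type, bool table_has_trx_columns,
                         Vers_unit *unit)
{
  *unit = unit_from_type(type);
  if (*unit == Vers_unit::NOT_ALLOWED)
    return false;
  if (*unit == Vers_unit::TRANSACTION && !table_has_trx_columns)
    return false;                // assumption: reported as an error
  return true;
}
```

Because the table is consulted only after the type-based decision, a later
ALTER TABLE cannot silently change how the same expression is interpreted.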
Don't use hidden system time in versioning,
but keep the system time logic in THD
to work around the low-resolution system clock and
replication from non-versioned to versioned.
This reverts MDEV-14788 (System versioning cannot
be based on local timestamps, as it is now).
Versioning is based on local timestamps again,
but timestamps are protected by MDEV-15923
(option to control who can set session @@timestamp).
1. Adding THD::convert_string(LEX_CSTRING *to,...) as a wrapper
for convert_string(LEX_STRING *to,...), as LEX_CSTRING
is now frequently used for conversion purpose.
This reduced duplicate code in TEXT_STRING_sys,
TEXT_STRING_literal, TEXT_STRING_filesystem grammar rules in *.yy
2. Adding yet another THD::convert_string() with an extra parameter
"bool simple_copy_is_possible". This further reduced
repeated code in the mentioned grammar rules in *.yy
3. Deriving Lex_ident_cli_st from Lex_string_with_metadata_st,
as they have very similar functionality. Moving m_quote
from Lex_ident_cli_st to Lex_string_with_metadata_st,
as m_quote will be used later to optimize string literals anyway
(e.g. avoid redundant copying on the tokenizer stage).
Adjusting Lex_input_stream::get_text() accordingly.
4. Moving the remainders of the code in TEXT_STRING_sys, TEXT_STRING_literal,
TEXT_STRING_filesystem grammar rules as new methods in THD:
- make_text_string_sys()
- make_text_string_connection()
- make_text_string_filesystem()
and changing *.yy to use these new methods.
This reduced the amount of similar code in
sql_yacc.yy and sql_yacc_ora.yy.
5. Removing duplicate code in Lex_input_stream::body_utf8_append_ident():
by reusing THD::make_text_string_sys(). Thanks to #3 and #4.
6. Making THD members charset_is_system_charset,
charset_is_collation_connection, charset_is_character_set_filesystem
private, as they are not needed externally any more.
The code in the "sp_tail" rule in sql_yacc.yy always
used YYLIP->get_cpp_tok_start() as the start of the body,
and did not check for possible lookahead which happens
for keywords "FOR", "VALUES" and "WITH" for LALR(2)
resolution in Lex_input_stream::lex_token().
In case of the lookahead token presence,
get_tok_start_prev() should have been used instead
of get_cpp_tok_start() as the beginning of the SP body.
Change summary:
This patch hides the implementation of the lookahead
token completely inside Lex_input_stream.
The users of Lex_input_stream now just get token-by-token
transparently and should not care about lookahead any more.
Now external users of Lex_input_stream
are not aware of the lookahead token at all.
Change details:
- Moving Lex_input_stream::has_lookahead() into the "private" section.
- Removing Lex_input_stream::get_tok_start_prev() and
Lex_input_stream::get_cpp_start_prev().
- Fixing the external code to call get_tok_start() and get_cpp_tok_start()
in all places where get_tok_start_prev() and get_cpp_start_prev()
were used.
- Adding a test for has_lookahead() right inside
get_tok_start() and get_cpp_tok_start().
If there is a lookahead token, these methods now
return the position of the previous token automatically:
const char *get_tok_start()
{
  return has_lookahead() ? m_tok_start_prev : m_tok_start;
}
const char *get_cpp_tok_start()
{
  return has_lookahead() ? m_cpp_tok_start_prev : m_cpp_tok_start;
}
- Fixing the internal code inside Lex_input_stream methods
to use m_tok_start and m_cpp_tok_start directly,
instead of calling get_tok_start() and get_cpp_tok_start(),
to make sure they access the *current* token position
(independently of a lookahead token presence).
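To illustrate the encapsulation, here is a hypothetical miniature lexer, not
the real Lex_input_stream. Mini_lexer, next_token() and scan_word() are
made-up names; only has_lookahead(), get_tok_start(), m_tok_start and
m_tok_start_prev follow the text:

```cpp
#include <cassert>

class Mini_lexer
{
public:
  explicit Mini_lexer(const char *query)
    : m_ptr(query), m_tok_start(query), m_tok_start_prev(query),
      m_has_lookahead(false) {}

  // Advance to the next token. If need_lookahead is true, the token after
  // it is scanned as well, similar to what is done for FOR, VALUES and WITH.
  void next_token(bool need_lookahead)
  {
    if (m_has_lookahead)
      m_has_lookahead = false;   // the pre-scanned token becomes current
    else
      scan_word();
    if (need_lookahead)
    {
      m_tok_start_prev = m_tok_start;
      scan_word();
      m_has_lookahead = true;
    }
  }

  // Start of the token last returned to the caller, lookahead or not.
  const char *get_tok_start() const
  {
    return has_lookahead() ? m_tok_start_prev : m_tok_start;
  }

private:
  bool has_lookahead() const { return m_has_lookahead; }

  // Internal code works on the raw members, not on the public getters.
  void scan_word()
  {
    while (*m_ptr == ' ')
      m_ptr++;
    m_tok_start = m_ptr;
    while (*m_ptr && *m_ptr != ' ')
      m_ptr++;
  }

  const char *m_ptr;
  const char *m_tok_start;
  const char *m_tok_start_prev;
  bool m_has_lookahead;
};
```

External callers only ever see get_tok_start(); whether a lookahead token
has been scanned is invisible to them.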
Reasoning:
- Shorter and clearer code
- Better encapsulation
(a fair number of Lex_input_stream methods and members were
moved to the private section)
New methods:
int lex_token(union YYSTYPE *yylval, THD *thd);
bool consume_comment(int remaining_recursions_permitted);
int lex_one_token(union YYSTYPE *yylval, THD *thd);
int find_keyword(Lex_ident_cli_st *str, uint len, bool function);
LEX_CSTRING get_token(uint skip, uint length);
Additional changes:
- Removing Lex_input_stream::yylval.
In the original code it was just an alias
for the "yylval" passed to lex_one_token().
This coding style is bug prone and is hard to follow.
In the new version "yylval" (or its components) is passed to
the affected methods as a parameter.
- Moving the code in sql_lex.h up and down between "private" and "public"
sections (sorry if this made the diff somewhat harder to read)
As thd->alloc() and new automatically call my_error(ER_OUTOFMEMORY),
there is no reason to call mem_alloc_error().
Other things:
- Fixed bug in mysql_unpack_partition() where lex.part_info was
changed even if it would be a null pointer
Change all my_strcasecmp() calls that use lexical keywords to use
lex_string_eq(). This is faster as we only call strcasecmp() for
strings of the same length.
Removed unused function lex_string_syseq()
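A sketch of the length-first comparison idea. Lex_cstring_sketch and
lex_string_eq_sketch() are hypothetical names, and the ASCII-only loop
stands in for strcasecmp() to keep the example portable:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical stand-in for LEX_CSTRING: a pointer plus an explicit length.
struct Lex_cstring_sketch
{
  const char *str;
  std::size_t length;
};

// Case-insensitive (ASCII) equality that compares lengths first, so the
// character-by-character work happens only for strings of equal length.
inline bool lex_string_eq_sketch(const Lex_cstring_sketch &a,
                                 const Lex_cstring_sketch &b)
{
  if (a.length != b.length)
    return false;
  for (std::size_t i = 0; i < a.length; i++)
  {
    int ca = std::tolower(static_cast<unsigned char>(a.str[i]));
    int cb = std::tolower(static_cast<unsigned char>(b.str[i]));
    if (ca != cb)
      return false;
  }
  return true;
}
```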
The code passing positions in the query to constructors of
Rewritable_query_parameter descendants (e.g. Item_splocal)
was not reliable. It used various Lex_input_stream methods:
- get_tok_start()
- get_tok_start_prev()
- get_tok_end()
- get_ptr()
to find positions of the recently scanned tokens.
The challenge was mostly to choose between get_tok_start()
and get_tok_start_prev(), taking into account the current
grammar (depending if lookahead takes place before
or after we read the positions in every particular rule).
But this approach did not work at all in combination
with token contractions, when MYSQLlex() translates
two tokens into one token ID, for example:
WITH ROLLUP -> WITH_ROLLUP_SYM
As a result, the tokenizer is already one more token ahead.
So in query fragment:
"GROUP BY d, spvar WITH ROLLUP"
get_tok_start() points to "ROLLUP".
get_tok_start_prev() points to "WITH".
As a result, it was "WITH" that was erroneously replaced
with NAME_CONST() instead of "spvar".
This patch modifies the code to do it a different way.
Changes:
1. For keywords and identifiers, the tokenizer now
returns LEX_CSTRING pointing directly to the query
fragment. So query positions are now just available using:
- $1.str - for the beginning of a token
- $1.str+$1.length - for the end of a token
2. Identifiers are not allocated on the THD memory root
in the tokenizer any more. Allocation is now done
on later stages, in methods like LEX::create_item_ident().
3. Two LEX_CSTRING based structures were added:
- Lex_ident_cli_st - used to store the "client side"
identifier representation, pointing to the
query fragment. Note, these identifiers
are encoded in @@character_set_client
and can have broken byte sequences.
- Lex_ident_sys_st - used to store the "server side"
identifier representation, pointing to the
THD allocated memory. This representation
guarantees that the identifier was checked
for being well-formed, and is encoded in utf8.
4. To distinguish between two identifier types
in the grammar, two Bison types were added:
<ident_cli> and <ident_sys>
5. All non-reserved keywords were marked as
being of the type <ident_cli>.
All reserved keywords are still of the type NONE.
6. All curly brackets in rules collecting
non-reserved keywords into non-terminal
symbols were removed, e.g.:
Was:
keyword_sp_data_type:
BIT_SYM {}
| BOOLEAN_SYM {}
Now:
keyword_sp_data_type:
BIT_SYM
| BOOLEAN_SYM
It is important NOT to have brackets here!
This is needed to make sure that the underlying
Lex_ident_cli_st structure correctly passes up to
the calling rule.
6. The code to scan identifiers and keywords
was moved from lex_one_token() into new
Lex_input_stream methods:
scan_ident_sysvar()
scan_ident_start()
scan_ident_middle()
scan_ident_delimited()
This was done to:
- get rid of an enormous amount of references to &yylval->lex_str
- and remove a lot of references like lip->xxx
7. The allocating functionality which puts identifiers on the
THD memory root now resides in methods of Lex_ident_sys_st,
and in THD::to_ident_sys_alloc().
get_quoted_token() was removed.
8. Cleanup: check_simple_select() was moved as a method to LEX.
9. Cleanup: Some more functionality was moved from *.yy
to new methods added to LEX:
make_item_colon_ident_ident()
make_item_func_call_generic()
create_item_qualified_asterisk()
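The benefit of tokens that point straight into the query text (the $1.str
and $1.str+$1.length positions mentioned in the changes above) can be
sketched as follows. Token_ref and replace_token() are hypothetical names,
and the NAME_CONST text in the usage note is just an example replacement
string:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a client-side identifier token: it points into
// the original query text instead of owning a copy.
struct Token_ref
{
  const char *str;       // beginning of the token inside the query
  std::size_t length;    // str + length is the end of the token
};

// Rebuild the query with the token's bytes replaced. The positions come from
// the token itself, so lookahead or token contraction in the lexer cannot
// shift them. The token must point into `query`.
inline std::string replace_token(const std::string &query,
                                 const Token_ref &tok,
                                 const std::string &replacement)
{
  std::size_t offset = static_cast<std::size_t>(tok.str - query.c_str());
  return query.substr(0, offset) + replacement +
         query.substr(offset + tok.length);
}
```

For the fragment "GROUP BY d, spvar WITH ROLLUP", a token referencing "spvar"
is replaced in place, regardless of how far ahead the tokenizer has already
read.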
Element_type& Bounds_checked_array<Element_type>::operator[]
(size_t) [with Element_type = Item*; size_t = long unsigned int]
In sql_yacc.yy the semantic actions for the MEDIAN window function
lacked a call of st_select_lex::prepare_add_window_spec().
This function saves the head of the thd->lex->order_list into
lex->save_order_list so that this head can be restored in
st_select_lex::add_window_spec after the specification of the
window function has been parsed.
Without a call of prepare_add_window_spec(), when add_window_spec()
was called the head of an empty list was copied into
thd->lex->order_list (instead of the saved head of this list, as assumed).
This made the list thd->lex->order_list invalid and potentially
could cause many different problems.
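A simplified model of the save/restore protocol, using std::vector instead
of the server's intrusive lists. Select_sketch is a hypothetical name; only
the method names and the order_list/save_order_list pair follow the text:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Select_sketch
{
  std::vector<std::string> order_list;       // list the parser appends to
  std::vector<std::string> save_order_list;  // copy saved by prepare

  // Must be called before the window specification is parsed.
  void prepare_add_window_spec()
  {
    save_order_list = order_list;
    order_list.clear();
  }

  // Takes the ORDER BY items collected for the window and restores the
  // previously saved list.
  std::vector<std::string> add_window_spec()
  {
    std::vector<std::string> window_order = order_list;
    order_list = save_order_list;
    return window_order;
  }
};
```

If prepare_add_window_spec() is skipped, add_window_spec() restores from a
save_order_list that was never filled, and the outer ORDER BY list is lost.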
Corrected the result set in the test case for MDEV-15899 that
used the MEDIAN window function and could not be correct
without this fix.
This is done to get more free flag bits for alter_info->flags
Renamed all ALTER PARTITION defines to start with ALTER_PARTITION_
Renamed ALTER_PARTITION to ALTER_PARTITION_INFO
Renamed ALTER_TABLE_REORG to ALTER_PARTITION_TABLE_REORG
Other things:
- Shifted some ALTER_xxx defines to get empty bits at end
Main reason was to make it easier to print the above structures in
a debugger. An additional benefit is that I was able to use the same
defines for both structures, which simplifies some code.
Most of the code is just removing Alter_info:: and Alter_inplace_info::
from alter table flags.
The following renames were done:
HA_ALTER_FLAGS -> alter_table_operations
CHANGE_CREATE_OPTION -> ALTER_CHANGE_CREATE_OPTION
Alter_info::ADD_INDEX -> ALTER_ADD_INDEX
DROP_INDEX -> ALTER_DROP_INDEX
ADD_UNIQUE_INDEX -> ALTER_ADD_UNIQUE_INDEX
DROP_UNIQUE_INDEX -> ALTER_DROP_UNIQUE_INDEX
ADD_PK_INDEX -> ALTER_ADD_PK_INDEX
DROP_PK_INDEX -> ALTER_DROP_PK_INDEX
Alter_info::ALTER_ADD_COLUMN -> ALTER_PARSE_ADD_COLUMN
Alter_info::ALTER_DROP_COLUMN -> ALTER_PARSE_DROP_COLUMN
Alter_inplace_info::ADD_INDEX -> ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX
Alter_inplace_info::DROP_INDEX -> ALTER_DROP_NON_UNIQUE_NON_PRIM_INDEX
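A sketch of what sharing one set of defines between the two structures looks
like. The bit positions, the struct names with the _sketch suffix and the
helper has_any() are hypothetical; only the flag names and the typedef name
come from the renames above:

```cpp
#include <cassert>

// One flag type shared by the parser-side and the in-place ALTER structures.
typedef unsigned long long alter_table_operations;

// Bit positions are made up for this sketch.
static const alter_table_operations ALTER_ADD_INDEX = 1ULL << 0;
static const alter_table_operations ALTER_DROP_INDEX = 1ULL << 1;
static const alter_table_operations ALTER_PARSE_ADD_COLUMN = 1ULL << 2;

struct Alter_info_sketch
{
  alter_table_operations flags = 0;
};

struct Alter_inplace_info_sketch
{
  alter_table_operations handler_flags = 0;
};

// The same helper works for both structures because the flag type and the
// constants are no longer nested in either class.
inline bool has_any(alter_table_operations flags, alter_table_operations mask)
{
  return (flags & mask) != 0;
}
```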
Other things:
- Added typedef alter_table_operations for alter table flags
- DROP CHECK CONSTRAINT can now be done online
- Added checks for Aria tables in alter_table_online.test
- alter_table_flags now takes an ulonglong as argument.
- Don't support online operations if checksum option is used.
- sql_lex.cc doesn't add ALTER_ADD_INDEX if index is not created
fill_alter_table() always thought that index was changed because of
a wrong check of block_size. Some engines had code to correct this
that should not be needed; Aria didn't, and because of this some online
operations didn't work.
This code fixes the comparison of block_size to only compare if it's set.
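A minimal sketch of the corrected comparison. Key_sketch, its
uses_block_size member and block_size_changed() are hypothetical; how the
server actually records that block_size was set is not stated here:

```cpp
#include <cassert>

// Hypothetical index description.
struct Key_sketch
{
  unsigned block_size;
  bool uses_block_size;   // assumption: true only if block_size was set
};

// Reports a change only when the new definition actually sets block_size.
// An unset block_size never makes the index look modified.
inline bool block_size_changed(const Key_sketch &old_key,
                               const Key_sketch &new_key)
{
  if (!new_key.uses_block_size)
    return false;
  return old_key.block_size != new_key.block_size;
}
```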
Main changes:
- Changing the constructor to accept a CHARSET_INFO pointer, instead of an Item pointer
- Updating the bison grammar accordingly
Additional cleanups:
- Simplifying Item_func_set_collation::eq() by reusing Item_func::eq()
- Removing unused binary_keyword