In collaboration with Sergey Vojtovich <svoj@mariadb.org>
The COMPRESSED clause is now a part of the data type and goes immediately
after the data type and length, but before the CHARACTER SET clause,
and before column attributes such as DEFAULT, COLLATE, ON UPDATE,
SYSTEM VERSIONING, and engine-specific column attributes.
In the old syntax, the COMPRESSED clause was a column attribute.
New syntax:
<varchar or text data type> <length> <compression> <character set> <column attributes>
<varbinary or blob data type> <length> <compression> <column attributes>
New syntax examples:
VARCHAR(1000) COMPRESSED CHARACTER SET latin1 DEFAULT ''
BLOB COMPRESSED DEFAULT ''
Deprecated syntax examples:
VARCHAR(1000) CHARACTER SET latin1 COMPRESSED DEFAULT ''
TEXT CHARACTER SET latin1 DEFAULT '' COMPRESSED
VARBINARY(1000) DEFAULT '' COMPRESSED
As a side effect:
- COMPRESSED is not valid as an SP label name in SQL/PSM routines any more
(but it's still valid as an SP label name in sql_mode=ORACLE)
- COMPRESSED is now allowed in combination with GENERATED ALWAYS AS:
TEXT COMPRESSED GENERATED ALWAYS AS REPEAT('a',1000)
query with VALUES()
A table value constructor can be used in all contexts where a select
can be used. In particular an ORDER BY clause or a LIMIT clause or both
of them can be attached to a table value constructor to produce a new
query. Unfortunately execution of such queries was not supported.
This patch fixes the problem.
MDEV-17660 sql_mode=ORACLE: Some keywords do not work as label names: history, system, versioning, without
MDEV-17661 Add sql_mode specific tokens for the keyword DECODE
This patch always provides columns of the temporary table used for
materialization of a table value constructor with some names.
Before this patch these names were always borrowed from the items
of the first row of the table value constructor. When this row
contained expressions and expressions were not named then it could cause
different kinds of problems. In particular if the TVC is used as the
specification of a derived table this could cause a crash.
The names given to the expressions used in a TVC are the same as those
given to the columns of the result set from the corresponding SELECT.
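The naming rule described above can be sketched roughly as follows.
This is only an illustration, not the server code: RowItemSketch and
column_names() are hypothetical names, and the fallback to the
expression text is an assumption based on how SELECT names unnamed
result columns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one expression in the first row of a TVC.
struct RowItemSketch {
  std::string name;       // item name, empty if the expression is unnamed
  std::string expr_text;  // source text of the expression
};

// Gives every column a non-empty name: the item name when there is one,
// otherwise the expression text (assumed to match what SELECT would use).
static std::vector<std::string>
column_names(const std::vector<RowItemSketch> &first_row) {
  std::vector<std::string> names;
  names.reserve(first_row.size());
  for (const RowItemSketch &item : first_row) {
    assert(!item.expr_text.empty());
    names.push_back(item.name.empty() ? item.expr_text : item.name);
  }
  return names;
}
```

With this rule a row such as (1+1, a) always yields two named columns,
so the temporary table never ends up with an unnamed column.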
1. Adding LEX::make_item_sysvar() and reusing it
in sql_yacc.yy and sql_yacc_ora.yy.
Removing the "opt_component" rule.
2. Renaming rules to better reflect their purpose:
- keyword to keyword_ident
- keyword_sp to keyword_label
- keyword_sp_not_data_type to keyword_sp_var_and_label
Also renaming:
- sp_decl_ident_keyword to keyword_sp_decl for naming consistency
- keyword_alias to keyword_table_alias,
for consistency with ident_table_alias
- keyword_sp_data_type to keyword_data_type,
as it has nothing SP-specific.
3. Moving GLOBAL_SYM, LOCAL_SYM, SESSION_SYM from
keyword_sp_var_and_label to a separate rule keyword_sysvar_type.
We don't have system variables with these names anyway.
Adding ident_sysvar_name and using it in the grammar that needs
a system variable name instead of ident_or_text.
This removed a number of shift/reduce conflicts
between GLOBAL_SYM/LOCAL_SYM/SESSION_SYM as a variable scope and
as a variable name.
4. Moving keywords BEGIN_SYM, END (in both *.yy files)
and EXCEPTION_SYM (in sql_yacc_ora.yy) into a separate
rule keyword_sp_block_section, because in Oracle verb keywords
(COMMIT, DO, HANDLER, OPEN, REPAIR, ROLLBACK, SAVEPOINT, SHUTDOWN, TRUNCATE)
are good variable names and can appear in e.g. DECLARE,
while block keywords (BEGIN, END, EXCEPTION) are not good variable names
and cannot appear in DECLARE.
5. Further splitting keyword_directly_not_assignable in sql_yacc_ora.yy:
moving keyword_sp_verb_clause out. Renaming the rest of
keyword_directly_not_assignable to keyword_sp_head,
which represents keywords that can appear in optional
clauses in CREATE PROCEDURE/FUNCTION/TRIGGER.
6. Renaming keyword_sp_verb_clause to keyword_verb_clause,
as it no longer contains anything SP-specific.
As a result of #4, #5, #6, the rule keyword_directly_not_assignable
was replaced with three separate rules:
- keyword_sp_block_section
- keyword_sp_head
- keyword_verb_clause
Adding the same rules in sql_yacc.yy, for unification.
7. Adding keyword_sp_head and keyword_verb_clause into keyword_sp_decl.
This fixes MDEV-16244.
8. Reorganizing the rest of keyword related rules into two groups:
a. Rules defining a list of keywords and consisting of only terminal symbols:
- keyword_sp_var_not_label
- keyword_sp_head
- keyword_sp_verb_clause
- keyword_sp_block_section
- keyword_sysvar_type
b. Rules that combine the above lists into keyword places:
- keyword_table_alias
- keyword_ident
- keyword_label
- keyword_sysvar_name
- keyword_sp_decl
Rules from the group "b" use on the right side only rules
from the group "a" (with optional terminal symbols added).
Rules from the group "b" DO NOT mutually use each other any more.
This makes them easier to read (and see the difference between them).
Sorting the right sides of the group "b" keyword rules alphabetically,
for even better readability.
Merging the following features from sql_yacc.yy to sql_yacc_ora.yy:
- system versioning
- column compression
- table value constructor
- spatial predicate WITHIN
- DELETE_DOMAIN_ID
Fixing shift/reduce conflicts introduced by the new system versioning
syntax.
Additionally, fixing old shift/reduce conflicts:
In PREVIOUS/NEXT being identifiers or sequence operations:
SELECT PREVIOUS FROM t1;
SELECT PREVIOUS VALUE FOR t1;
In TIME/DATE/TIMESTAMP being literals, functions or identifiers:
SELECT TIMESTAMP'2001-01-01 10:20:30' FROM t1;
SELECT TIMESTAMP('2001-01-01 10:20:30') FROM t1;
SELECT TIMESTAMP FROM t1;
We'll soon be fixing shift/reduce conflicts introduced in the new
10.3 syntax (see MDEV-15818 for details) by defining precedence for
a number of tokens (e.g. TIMESTAMP, TRANSACTION_SYM, TEXT_STRING)
and adding "%prec" directives.
Before doing this, it's better to have the existing precedences set properly,
for easier readability and maintainability.
Details:
- Changing precedence of NOT to its proper position (between AND and IS).
It was wrong. It worked fine only because the relevant grammar resides
in different separate rules (expr and predicate).
- Moving NOT2_SYM and BINARY to the same line with NEG and ~
It worked fine because operators !, BINARY, ~ do not conflict
with each other.
- Fixing associativity of NOT_SYM, NOT2_SYM, BINARY, COLLATE_SYM
from "right" to "left". They are not dyadic (operate on a single expression
only). So "%left" or "%right" is needed only to set precedence,
while associativity does not matter.
Note: it would be better to use "%precedence" for these tokens
instead of "%left", but we use an old version of Bison on Windows,
which may not support %precedence yet.
This patch does not change behavior. The generated sql_yacc.cc and
sql_yacc_ora.cc are exactly the same before and after this change.
1. Adding THD::convert_string(LEX_CSTRING *to,...) as a wrapper
for convert_string(LEX_STRING *to,...), as LEX_CSTRING
is now frequently used for conversion purposes.
This reduced duplicate code in TEXT_STRING_sys,
TEXT_STRING_literal, TEXT_STRING_filesystem grammar rules in *.yy
2. Adding yet another THD::convert_string() with an extra parameter
"bool simple_copy_is_possible". This further reduced
repeated code in the mentioned grammar rules in *.yy
3. Deriving Lex_ident_cli_st from Lex_string_with_metadata_st,
as they have very similar functionality. Moving m_quote
from Lex_ident_cli_st to Lex_string_with_metadata_st,
as m_quote will be used later to optimize string literals anyway
(e.g. avoid redundant copying on the tokenizer stage).
Adjusting Lex_input_stream::get_text() accordingly.
4. Moving the remainders of the code in TEXT_STRING_sys, TEXT_STRING_literal,
TEXT_STRING_filesystem grammar rules as new methods in THD:
- make_text_string_sys()
- make_text_string_connection()
- make_text_string_filesystem()
and changing *.yy to use these new methods.
This reduced the amount of similar code in
sql_yacc.yy and sql_yacc_ora.yy.
5. Removing duplicate code in Lex_input_stream::body_utf8_append_ident()
by reusing THD::make_text_string_sys(). Thanks to #3 and #4.
6. Making THD members charset_is_system_charset,
charset_is_collation_connection, charset_is_character_set_filesystem
private, as they are not needed externally any more.
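The idea behind the "simple_copy_is_possible" parameter in #2 can be
sketched as below. This is a standalone illustration, not the
THD::convert_string() implementation: convert_string_sketch() and
latin1_to_utf8() are hypothetical names, and Latin-1 to UTF-8 is used
only as a stand-in for a real character set conversion.

```cpp
#include <cassert>
#include <string>

// Stand-in for a real charset conversion: ISO-8859-1 bytes to UTF-8.
static std::string latin1_to_utf8(const std::string &from) {
  std::string to;
  to.reserve(from.size() * 2);
  for (unsigned char c : from) {
    if (c < 0x80) {
      to.push_back(static_cast<char>(c));
    } else {
      to.push_back(static_cast<char>(0xC0 | (c >> 6)));
      to.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
  }
  return to;
}

// When the caller already knows that source and target character sets
// are compatible, the bytes are copied as is; otherwise they are converted.
static std::string convert_string_sketch(const std::string &from,
                                         bool simple_copy_is_possible) {
  if (simple_copy_is_possible)
    return from;
  return latin1_to_utf8(from);
}
```

Passing the decision in as a flag lets each grammar rule supply its own
"are the character sets the same" check while sharing one code path.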
As thd->alloc() and new automatically call my_error(ER_OUTOFMEMORY),
there is no reason to call mem_alloc_error().
Other things:
- Fixed bug in mysql_unpack_partition() where lex.part_info was
changed even if it would be a null pointer
Change all my_strcasecmp() calls that use lexical keywords to use
lex_string_eq. This is faster as we only call strcasecmp() for
strings of the same length.
Removed unused function lex_string_syseq()
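The length-first comparison can be sketched as below. This is a
minimal standalone illustration, not the server's lex_string_eq():
LexStrSketch and lex_str_eq_sketch() are hypothetical names, and the
case folding here is ASCII-only.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical stand-in for a pointer + length string such as LEX_CSTRING.
struct LexStrSketch {
  const char *str;
  std::size_t length;
};

// Returns true if both strings are equal ignoring ASCII case.
// The cheap length check runs first, so the byte-by-byte comparison
// is only done for strings of the same length.
static bool lex_str_eq_sketch(const LexStrSketch &a, const LexStrSketch &b) {
  assert(a.str != nullptr && b.str != nullptr);
  if (a.length != b.length)
    return false;
  for (std::size_t i = 0; i < a.length; i++) {
    if (std::tolower(static_cast<unsigned char>(a.str[i])) !=
        std::tolower(static_cast<unsigned char>(b.str[i])))
      return false;
  }
  return true;
}
```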
The code passing positions in the query to constructors of
Rewritable_query_parameter descendants (e.g. Item_splocal)
was not reliable. It used various Lex_input_stream methods:
- get_tok_start()
- get_tok_start_prev()
- get_tok_end()
- get_ptr()
to find positions of the recently scanned tokens.
The challenge was mostly to choose between get_tok_start()
and get_tok_start_prev(), taking into account the current
grammar (depending if lookahead takes place before
or after we read the positions in every particular rule).
But this approach did not work at all in combination
with token contractions, when MYSQLlex() translates
two tokens into one token ID, for example:
WITH ROLLUP -> WITH_ROLLUP_SYM
As a result, the tokenizer is already one more token ahead.
So in query fragment:
"GROUP BY d, spvar WITH ROLLUP"
get_tok_start() points to "ROLLUP".
get_tok_start_prev() points to "WITH".
As a result, it was "WITH" that was erroneously replaced
with NAME_CONST() instead of "spvar".
This patch modifies the code to do it a different way.
Changes:
1. For keywords and identifiers, the tokenizer now
returns LEX_CSTRING pointing directly to the query
fragment. So query positions are now just available using:
- $1.str - for the beginning of a token
- $1.str+$1.length - for the end of a token
2. Identifiers are not allocated on the THD memory root
in the tokenizer any more. Allocation is now done
on later stages, in methods like LEX::create_item_ident().
3. Two LEX_CSTRING based structures were added:
- Lex_ident_cli_st - used to store the "client side"
identifier representation, pointing to the
query fragment. Note, these identifiers
are encoded in @@character_set_client
and can have broken byte sequences.
- Lex_ident_sys_st - used to store the "server side"
identifier representation, pointing to the
THD allocated memory. This representation
guarantees that the identifier was checked
for being well-formed, and is encoded in utf8.
4. To distinguish between two identifier types
in the grammar, two Bison types were added:
<ident_cli> and <ident_sys>
5. All non-reserved keywords were marked as
being of the type <ident_cli>.
All reserved keywords are still of the type NONE.
6. All curly brackets in rules collecting
non-reserved keywords into non-terminal
symbols were removed, e.g.:
Was:
keyword_sp_data_type:
BIT_SYM {}
| BOOLEAN_SYM {}
Now:
keyword_sp_data_type:
BIT_SYM
| BOOLEAN_SYM
It is important NOT to have curly brackets here!
This is needed to make sure that the underlying
Lex_ident_cli_st structure is correctly passed up to
the calling rule.
7. The code to scan identifiers and keywords
was moved from lex_one_token() into new
Lex_input_stream methods:
scan_ident_sysvar()
scan_ident_start()
scan_ident_middle()
scan_ident_delimited()
This was done to:
- get rid of an enormous number of references to &yylval->lex_str
- and remove a lot of references like lip->xxx
8. The allocating functionality which puts identifiers on the
THD memory root now resides in methods of Lex_ident_sys_st,
and in THD::to_ident_sys_alloc().
get_quoted_token() was removed.
9. Cleanup: check_simple_select() was moved as a method to LEX.
10. Cleanup: Some more functionality was moved from *.yy
to new methods added to LEX:
make_item_colon_ident_ident()
make_item_func_call_generic()
create_item_qualified_asterisk()
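The effect of tokens pointing directly into the query text can be
sketched as below. This is a simplified standalone illustration, not
the server code: TokenSketch, find_token() and replace_token() are
hypothetical names, and a plain substring search stands in for the
tokenizer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical analogue of a "client side" identifier: it points into
// the query buffer and owns no memory.
struct TokenSketch {
  const char *str;
  std::size_t length;
};

// Stand-in for the tokenizer: returns a token for the first occurrence
// of `word`, pointing into the buffer of `query`.
static TokenSketch find_token(const std::string &query,
                              const std::string &word) {
  std::size_t pos = query.find(word);
  assert(pos != std::string::npos);
  return TokenSketch{query.data() + pos, word.size()};
}

// Rewrites the query using only tok.str and tok.str + tok.length as
// the begin and end positions, independent of any lookahead state.
static std::string replace_token(const std::string &query,
                                 const TokenSketch &tok,
                                 const std::string &replacement) {
  std::size_t begin = static_cast<std::size_t>(tok.str - query.data());
  std::size_t end = begin + tok.length;
  assert(end <= query.size());
  std::string out;
  out.append(query, 0, begin);
  out.append(replacement);
  out.append(query, end, std::string::npos);
  return out;
}
```

Because the positions travel with the token itself, it no longer
matters how far ahead the tokenizer has read, which is what broke
the "WITH ROLLUP" case described above.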
This is done to get more free flag bits for alter_info->flags
Renamed all ALTER PARTITION defines to start with ALTER_PARTITION_
Renamed ALTER_PARTITION to ALTER_PARTITION_INFO
Renamed ALTER_TABLE_REORG to ALTER_PARTITION_TABLE_REORG
Other things:
- Shifted some ALTER_xxx defines to get empty bits at end
The main reason was to make it easier to print the above structures in
a debugger. An additional benefit is that I was able to use the same
defines for both structures, which simplifies some code.
Most of the code is just removing Alter_info:: and Alter_inplace_info::
from alter table flags.
The following renames were done:
HA_ALTER_FLAGS -> alter_table_operations
CHANGE_CREATE_OPTION -> ALTER_CHANGE_CREATE_OPTION
Alter_info::ADD_INDEX -> ALTER_ADD_INDEX
DROP_INDEX -> ALTER_DROP_INDEX
ADD_UNIQUE_INDEX -> ALTER_ADD_UNIQUE_INDEX
DROP_UNIQUE_INDEX -> ALTER_DROP_UNIQUE_INDEX
ADD_PK_INDEX -> ALTER_ADD_PK_INDEX
DROP_PK_INDEX -> ALTER_DROP_PK_INDEX
Alter_info::ALTER_ADD_COLUMN -> ALTER_PARSE_ADD_COLUMN
Alter_info::ALTER_DROP_COLUMN -> ALTER_PARSE_DROP_COLUMN
Alter_inplace_info::ADD_INDEX -> ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX
Alter_inplace_info::DROP_INDEX -> ALTER_DROP_NON_UNIQUE_NON_PRIM_INDEX
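The benefit of sharing one set of defines between both structures can
be sketched as below. This is only an illustration: the type and
struct names are hypothetical, and the bit positions are made up and
do not match the real values in the server headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit flag type, in the spirit of alter_table_operations.
typedef std::uint64_t alter_ops_sketch;

// Illustrative bit values only; the real defines use other positions.
static const alter_ops_sketch SKETCH_ALTER_ADD_INDEX = 1ULL << 0;
static const alter_ops_sketch SKETCH_ALTER_DROP_INDEX = 1ULL << 1;
static const alter_ops_sketch SKETCH_ALTER_ADD_UNIQUE_INDEX = 1ULL << 2;

// Two simplified structures that use the same flag namespace,
// so no Alter_info:: or Alter_inplace_info:: prefix is needed.
struct AlterInfoSketch {
  alter_ops_sketch flags;
};
struct AlterInplaceInfoSketch {
  alter_ops_sketch handler_flags;
};

static bool has_operation(alter_ops_sketch ops, alter_ops_sketch flag) {
  assert(flag != 0);
  return (ops & flag) != 0;
}
```

With global defines, the same constant can be tested against either
structure, and a debugger prints both fields as plain integers.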
Other things:
- Added typedef alter_table_operations for alter table flags
- DROP CHECK CONSTRAINT can now be done online
- Added checks for Aria tables in alter_table_online.test
- alter_table_flags now takes an ulonglong as argument.
- Don't support online operations if checksum option is used.
- sql_lex.cc doesn't add ALTER_ADD_INDEX if index is not created
fill_alter_table() always thought that the index was changed because
of a wrong check of block_size. Some engines had code to correct this,
which should not be needed; Aria didn't, and because of this some
online operations didn't work.
This code fixes the comparison of block_size to only compare if it's set.
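The corrected check can be sketched as below. This is a guess at the
shape of the logic, not the actual patch: KeyOptsSketch and
block_size_changed() are hypothetical names, and it is an assumption
that "set" means a non-zero block_size in the new key definition.

```cpp
#include <cassert>

// Hypothetical subset of key options; 0 is assumed to mean "not set".
struct KeyOptsSketch {
  unsigned block_size;
};

// Reports a change only when the new definition sets block_size
// explicitly and it differs from the existing one. An unset value
// is not treated as a change.
static bool block_size_changed(const KeyOptsSketch &old_key,
                               const KeyOptsSketch &new_key) {
  if (new_key.block_size == 0)
    return false;
  return new_key.block_size != old_key.block_size;
}
```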
Main changes:
- Changing the constructor to accept a CHARSET_INFO pointer, instead of an Item pointer
- Updating the bison grammar accordingly
Additional cleanups:
- Simplifying Item_func_set_collation::eq() by reusing Item_func::eq()
- Removing unused binary_keyword