Problem:
========
The test now fails with the following trace:
CURRENT_TEST: rpl.rpl_parallel_temptable
--- /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.result
+++ /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.reject
@@ -194,7 +194,6 @@
30 conservative
31 conservative
32 optimistic
-33 optimistic
Analysis:
=========
The part of the test which fails with a result content mismatch is given below.
CREATE TEMPORARY TABLE t4 (a INT PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO t4 VALUES (32);
INSERT INTO t4 VALUES (33);
INSERT INTO t1 SELECT a, "optimistic" FROM t4;
slave_parallel_mode=optimistic
The expectation of the above test script is that the INSERT ... SELECT should
read both 32 and 33 and populate table 't1'. But this expectation fails
occasionally.
All three INSERT statements are handed over to three different slave parallel
workers. Temporary tables are not safe for parallel replication. They were
designed to be visible to one thread only, so have no table locking. Thus there
is no protection against two conflicting transactions committing in parallel and
things like that.
So, when using parallel replication, anything that uses temporary tables is
serialized with anything before it by means of a "wait_for_prior_commit"
function call. This ensures that each such transaction is executed sequentially.
But there exists a code path in which the above wait doesn't happen. Because of
this, at times the INSERT ... SELECT doesn't wait for the INSERT (33) to
complete; it finishes its execution and enters the commit stage. Hence only
row 32 is found in those cases, resulting in the test failure.
The wait needs to be added within the "open_temporary_table" call. The code
within "open_temporary_table" looks like this.
Each thread tries to open a temporary table in 3 different ways:
case 1: Find a temporary table which is already in use by using
find_temporary_table(tl) && wait_for_prior_commit()
case 2: If the above failed, then try to look for a temporary table which is
marked free for reuse. This internally calls "wait_for_prior_commit()" if a
table is found.
find_and_use_tmp_table(tl, &table)
case 3: If none of the above worked, open a new table handle from the table share.
if (!table && (share= find_tmp_table_share(tl)))
{ table= open_temporary_table(share, tl->get_table_name(), true); }
At present "wait_for_prior_commit" happens only in cases 1 & 2.
Fix:
====
On the slave, add a call to "wait_for_prior_commit" for case 3.
The above wait on the slave will solve the issue. A more thorough fix would be
to mark temporary tables as not safe for parallel execution on the master side.
In order to do that, on the master side, mark the Gtid_log_event specific flag
FL_TRANSACTIONAL as false all the time, so that such events are never
scheduled in parallel.
query with VALUES()
A table value constructor can be used in all contexts where a select
can be used. In particular an ORDER BY clause or a LIMIT clause or both
of them can be attached to a table value constructor to produce a new
query. Unfortunately execution of such queries was not supported.
This patch fixes the problem.
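For example, a query of the following shape (values made up for illustration)
failed at execution before this patch and now works:
VALUES (7), (3), (5)
ORDER BY 1
LIMIT 2;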
If a derived table has SELECT DISTINCT, provide index statistics for it so that the join optimizer in the
upper select knows that ref access to the table will produce one row.
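For illustration, assuming tables t1 and t2 that both have a column a, in a
query like the following the optimizer can now rely on dt producing at most
one row per ref lookup on dt.a:
SELECT *
FROM t1, (SELECT DISTINCT a FROM t2) AS dt
WHERE t1.a = dt.a;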
The server always required the UPDATE privilege on views, not being able to
detect when a view was not actually updated in a multi-update.
Fix: instead of marking all tables as "updating" by default,
only set "updating" on tables that will actually be updated
by the multi-update, and mark the view "updating" if any of the
view's tables is.
A syntax error was reported for any INSERT statement with explicit
partition selection if it used a column list.
Fixed by saving the parsing place before parsing the clause for explicit
partition selection and restoring it when the clause has been parsed.
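For example, assuming a table t1 partitioned with a partition p0 and columns
a and b (names made up), a statement of the following shape used to fail with
a syntax error:
INSERT INTO t1 PARTITION (p0) (a, b) VALUES (1, 2);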
This bug is caused by pushdown from HAVING into WHERE.
It appears because the condition that was pushed wasn't fixed.
It was also discovered that condition pushdown from HAVING into
WHERE was done incorrectly: there is no need to build clones for some
conditions that can be pushed, they can simply be moved from HAVING
into WHERE without cloning.
The build_pushable_cond_for_having_pushdown() and
remove_pushed_top_conjuncts_for_having() methods are changed.
It was found that there was no transformation made for the fields of
the pushed condition.
field_transformer_for_having_pushdown transformer is added.
New tests are added. Some comments are changed.
The patch features an optional shutdown behavior to hold on until
all connected slaves have been sent the last binlogged event.
A connected slave is one whose START SLAVE has been acknowledged and
that has not been stopped since, though it could technically be
reconnecting in the background.
The solution therefore disallows killing the dump thread until it has
found EOF of the latest binlog file. It is up to the shutdown
requester (DBA) to set up a sufficiently large shutdown timeout value
for the shutdown to wait patiently until lagging slaves have been
synchronized. On the other hand, if a specific slave needs exclusion
from synchronization, the DBA would have to stop it manually, which
would terminate its dump thread.
`mysqladmin shutdown' is extended with a `--wait_for_all_slaves' option
which translates to the `SHUTDOWN WAIT FOR ALL SLAVES' sql query
to enable the feature on the client side.
The patch also performs a small refactoring of the server shutdown
around close_connections() to introduce kill-thread phases, of which
there are currently two.
Part#2 (final): rewriting the code to pass the correct enum_sp_aggregate_type
to the sp_head constructor, so sp_head never changes its aggregation type
later on. The grammar has been simplified and defragmented.
This allowed checking aggregate-specific instructions right after
a routine body has been scanned, by calling new LEX methods:
sp_body_finalize_{procedure|function|trigger|event}()
Moving some C++ code from *.yy to a few new helper methods in LEX.
1. Always drop the merged_for_insert flag on cleanup (there could be errors
which prevent the TABLE from being assigned)
2. Make the cleanup of the select parts that were touched more precise
st_select_lex::handle_derived() and mysql_handle_list_of_derived() had
exactly the same implementations.
- Adding a new method LEX::handle_list_of_derived() instead
- Removing public function mysql_handle_list_of_derived()
- Reusing LEX::handle_list_of_derived() in st_select_lex::handle_derived()
post-merge changes:
* handle password expiration on old tables like everything else -
make changes in memory, even if they cannot be done on disk
* merge "debug" tests with non-debug tests, they don't use dbug anyway
* only run rpl password expiration in MIXED mode, it doesn't replicate
anything, so no need to repeat it thrice
* restore update_user_table_password() prototype, it should not change
ACL_USER, this is done in acl_user_update()
* don't parse json twice in get_password_lifetime and get_password_expired
* remove LEX_USER::is_changing_password, see if there was any auth instead
* avoid overflow in expiration calculations
* don't initialize Account_options in the constructor, it's bzero-ed later
* don't create ulong sysvars - they're not portable, prefer uint or ulonglong
* misc simplifications
This patch adds support for expiring user passwords.
The following statements are extended:
CREATE USER user@localhost PASSWORD EXPIRE [option]
ALTER USER user@localhost PASSWORD EXPIRE [option]
If no option is specified, the password is expired with immediate
effect. If option is DEFAULT, global policy applies according to
the default_password_lifetime system var (if 0, password never
expires, if N, password expires every N days). If option is NEVER,
the password never expires and if option is INTERVAL N DAY, the
password expires every N days.
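For illustration, the option forms described above (user and interval values
are made up):
CREATE USER foo@localhost PASSWORD EXPIRE;                 -- immediate expiration
ALTER USER foo@localhost PASSWORD EXPIRE DEFAULT;          -- follow default_password_lifetime
ALTER USER foo@localhost PASSWORD EXPIRE NEVER;            -- never expires
ALTER USER foo@localhost PASSWORD EXPIRE INTERVAL 30 DAY;  -- expires every 30 days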
The feature also supports the disconnect_on_expired_password system
var and the --connect-expired-password client option.
Closes #1166
* inject portion of time updates into mysql_delete main loop
* triggered case emits delete+insert, no updates
* PORTION OF `SYSTEM_TIME` is forbidden
* `DELETE HISTORY .. FOR PORTION OF ...` is forbidden as well
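For illustration, a sketch of the now-supported form, assuming a table t1
with an application-time period app_time (names made up):
DELETE FROM t1
FOR PORTION OF app_time
FROM '2000-01-01' TO '2001-01-01';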
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
An example of how the pushdown is performed:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.
1. Renaming Type_handler_json to Type_handler_json_longtext
There will be other JSON handlers soon, e.g. Type_handler_json_varchar.
2. Making the code more symmetric for data types:
- Adding a new virtual method
Type_handler::Column_definition_validate_check_constraint()
- Moving JSON-specific code from sql_yacc.yy to
Type_handler_json_longtext::Column_definition_validate_check_constraint()
3. Adding new files sql_type_json.cc and sql_type_json.h
and moving Type_handler+JSON related code into these files.
Move account options from LEX to the Account_options structure,
namely mqh and ssl_*.
Also, use LEX_CSTRING for the ssl_*/x509_* strings and move
the setting of ACL_USER::account_locked to where it belongs.
Add server support for user account locking.
This patch extends the ALTER/CREATE USER statements for
denying a user's subsequent login attempts:
ALTER USER
user [, user2] ACCOUNT [LOCK | UNLOCK]
CREATE USER
user [, user2] ACCOUNT [LOCK | UNLOCK]
The SHOW CREATE USER statement was updated to display the
locking state of a user.
Closes #1006
When creating a field of type JSON, it will be automatically
converted to TEXT with CHECK (json_valid(`a`)), if there wasn't any
previous check for the column.
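For example (the exact column attributes in the output may differ):
CREATE TABLE t1 (a JSON);
SHOW CREATE TABLE t1;
-- the column is shown as a TEXT-type column with CHECK (json_valid(`a`))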
Additional things:
- Added two bug fixes that were found while testing JSON. These bug
fixes have also been pushed to 10.3 (with a test case), but as they
were minimal and needed to get this task done and tested, the fixes
are repeated here.
- CREATE TABLE ... SELECT drops constraints for columns that
are both in the create and select parts.
- If one has both a default expression and a check constraint for a
column, one can get the error "Expression for field `a` is refering
to uninitialized field `a`".
- Removed some duplicate MYSQL_PLUGIN_IMPORT symbols
MDEV-17631 select_handler for a full query pushdown
Interfaces + Proof of Concept for federatedx with test cases.
The interfaces have been developed for integration with the ColumnStore engine.
The problem was originally stated in
http://bugs.mysql.com/bug.php?id=82212
The size of a base64-encoded Rows_log_event is about 4/3 times that of
its vanilla byte representation.
When a binlogged event's size is about 1GB, mysqlbinlog generates
a BINLOG query that can't be sent out due to its size.
It is fixed by fragmenting the BINLOG argument C-string into
(approximate) halves when the base64-encoded event is over 1GB in size.
In such a case mysqlbinlog puts out
SET @binlog_fragment_0='base64-encoded-fragment_0';
SET @binlog_fragment_1='base64-encoded-fragment_1';
BINLOG @binlog_fragment_0, @binlog_fragment_1;
to represent a big BINLOG.
For prompt memory release, the BINLOG handler is made to reset the BINLOG
argument user variables in the middle of processing, as if
@binlog_fragment_{0,1} = NULL were assigned.
Notice that 2 fragments are enough, though the client and server may still
need to tweak their @@max_allowed_packet to accommodate the fragment
size (which they would have to do anyway with a greater number of
fragments, should that be desired).
On the lower level the following changes are made:
Log_event::print_base64()
still calls the encoder and stores the encoded data into a cache, but
now *without* doing any formatting. The latter is left for the time
when the cache is copied to an output file (e.g. mysqlbinlog output).
The no-formatting behavior is also reflected by the change in the meaning
of the last argument, which specifies whether to cache the encoded data.
Rows_log_event::print_helper()
is made to invoke a specialized fragmented cache-to-file copying function,
which is
copy_cache_to_file_wrapped()
that takes care of fragmenting and optionally wraps the encoded
strings (fragments) into SQL stanzas.
my_b_copy_to_file()
is refactored into my_b_copy_all_to_file(). The former function
is generalized
to accept a limit argument to constrain the copying, and does
not reinitialize the cache into reading mode anymore.
The limit has no effect on a fully read cache.
Part of MDEV-5336 Implement LOCK FOR BACKUP
- Changed check of Global_only_lock to also include BACKUP lock.
- We store latest MDL_BACKUP_DDL lock in thd->mdl_backup_ticket to be able
to downgrade lock during copy_data_between_tables()
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
The test and also rpl_gtid_delete_domain failed on the PPC64 platform
due to an incorrectly specified actual key for searching
in a gtid domain system hash. While the correct size is 32 bits,
the supplied value was 8 bytes, the long int size on the platform.
The problem became evident thanks to the big endianness which
cut off the *least* significant part of the value field.
Fixed by correcting a dynamic array initialization to now hold
uint32 values, as well as the value extraction for
searching in the gtid domain system hash.
A newly added test ensures no overflowed values are accepted
for deletion, which prevents inadvertent action. Notice though:
MariaDB [test]> set @@session.gtid_domain_id=(1 << 32) + 1;
MariaDB [test]> show warnings;
+---------+------+--------------------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------------------+
| Warning | 1292 | Truncated incorrect gtid_domain_id value: '4294967297' |
+---------+------+--------------------------------------------------------+
MariaDB [test]> select @@session.gtid_domain_id;
+--------------------------+
| @@session.gtid_domain_id |
+--------------------------+
| 4294967295 |
+--------------------------+
This patch fixes a serious flaw in the implementation of common table
expressions. Before this patch an attempt to prepare a statement from
a query with a parameter marker in a CTE that was used more than once
in the query ended up with a bogus error message. Similarly, if a statement
in a stored procedure contained a CTE whose specification used
local variables and this CTE was referred to more than once in the
statement, then the server failed to execute the stored procedure, returning
a bogus error message about a non-existing field.
The problems appeared due to incorrect handling of parameter markers /
local variables in CTEs that were referred to more than once.
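A sketch of the first failing pattern (table and variable names are made up),
where the CTE containing a parameter marker is referred to twice:
PREPARE stmt FROM
'WITH cte AS (SELECT a FROM t1 WHERE a > ?)
SELECT * FROM cte AS r1, cte AS r2';
EXECUTE stmt USING @val;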
This patch fixes the problems by differentiating between the original
occurrences of a parameter marker / local variable used in the
specification of a CTE and the corresponding occurrences used
in copies of this specification. These copies are substituted
for non-first references to the CTE.
The idea of the fix and even some code were taken from the MySQL
implementation of the common table expressions.
Before this patch, if no default database was set the server threw
an error for any table name reference that was not fully qualified by
a database name. In particular this happened for table names referencing
CTE tables. This was incorrect.
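For example, with no default database selected, a statement like the
following (made up for illustration) was rejected with an error even though
it does not refer to any real table:
WITH cte AS (SELECT 1 AS a) SELECT a FROM cte;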
The error message was thrown at the parser stage, when the names referencing
different tables were not resolved yet.
Now, if no default database is set and a WITH clause is used in the
processed statement, any table reference is just supplied with a dummy
database name "*none*" at the parser stage. Later, after a call
of check_dependencies_in_with_clauses(), when the names for CTE tables
can be resolved, error messages are thrown only for those names that
refer to non-CTE tables. This is done in open_and_process_table().
MDEV-10581 sql_mode=ORACLE: Explicit cursor FOR LOOP
MDEV-12098 sql_mode=ORACLE: Implicit cursor FOR loop
Cleanup changes:
- Removing sp_lex_cursor::m_cursor_name
- Adding sp_instr_cursor_copy_struct::m_cursor (the cursor global index)
- Fixing sp_instr_cursor_copy_struct::print() to access the cursor
name using m_ctx and m_cursor (like other cursor related instructions do)
instead of m_cursor_name.
This change is needed to unify sp_assignment_lex and sp_cursor_lex later,
to fix this problem easier:
MDEV-16558 Parenthesized expression does not work as a lower FOR loop bound
materialized derived table/view that uses aliases is done
The problem appears when a column alias inside the materialized derived
table/view t1 definition coincides with the column name used in the
GROUP BY clause of t1. If the condition that can be pushed into t1
uses that ambiguous column name, the column is resolved as the column
used in the GROUP BY clause instead of the alias used in the projection
list of t1. That causes a wrong result.
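A sketch of the scenario (names made up), where the alias a in the projection
list of v1 coincides with column t1.a used in the GROUP BY clause:
CREATE TABLE t1 (a INT, b INT);
CREATE VIEW v1 AS SELECT b AS a FROM t1 GROUP BY a;
SELECT * FROM v1 WHERE a > 1;
-- the pushed condition (a > 1) must be resolved against the alias a
-- (that is, against t1.b), not against the GROUP BY column t1.a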
To prevent this, resolve_ref_in_select_and_group() was changed.
The problem described in the bug report happened because the code
did not test check_cols(1) after fix_fields() in a few places.
Additionally, fix_fields() could be called multiple times for SP variables,
because they are all fixed at an early stage in append_for_log().
Solution:
1. Adding a few helper methods
- fix_fields_if_needed()
- fix_fields_if_needed_for_scalar()
- fix_fields_if_needed_for_bool()
- fix_fields_if_needed_for_order_by()
and using them in many cases instead of fix_fields() where
the "fixed" status is not definitely known to be "false".
2. Adding DBUG_ASSERT(!fixed) into Item_splocal*::fix_fields()
to catch double execution.
3. Adding tests.
As a good side effect, the patch removes a lot of duplicate code (~60 lines):
if (!item->fixed &&
item->fix_fields(..) &&
item->check_cols(1))
return true;
1. Adding LEX::make_item_sysvar() and reusing it
in sql_yacc.yy and sql_yacc_ora.yy.
Removing the "opt_component" rule.
2. Renaming rules to better reflect their purpose:
- keyword to keyword_ident
- keyword_sp to keyword_label
- keyword_sp_not_data_type to keyword_sp_var_and_label
Also renaming:
- sp_decl_ident_keyword to keyword_sp_decl for naming consistency
- keyword_alias to keyword_table_alias,
for consistency with ident_table_alias
- keyword_sp_data_type to keyword_data_type,
as it has nothing SP-specific.
3. Moving GLOBAL_SYM, LOCAL_SYM, SESSION_SYM from
keyword_sp_var_and_label to a separate rule keyword_sysvar_type.
We don't have system variables with these names anyway.
Adding ident_sysvar_name and using it in the grammar that needs
a system variable name instead of ident_or_text.
This removed a number of shift/reduce conflicts
between GLOBAL_SYM/LOCAL_SYM/SESSION_SYM as a variable scope and
as a variable name.
4. Moving keywords BEGIN_SYM, END (in both *.yy files)
and EXCEPTION_SYM (in sql_yacc_ora.yy) into a separate
rule keyword_sp_block_section, because in Oracle verb keywords
(COMMIT, DO, HANDLER, OPEN, REPAIR, ROLLBACK, SAVEPOINT, SHUTDOWN, TRUNCATE)
are good variable names and can appear in e.g. DECLARE,
while block keywords (BEGIN, END, EXCEPTION) are not good variable names
and cannot appear in DECLARE.
5. Further splitting keyword_directly_not_assignable in sql_yacc_ora.yy:
moving keyword_sp_verb_clause out. Renaming the rest of
keyword_directly_not_assignable to keyword_sp_head,
which represents keywords that can appear in optional
clauses in CREATE PROCEDURE/FUNCTION/TRIGGER.
6. Renaming keyword_sp_verb_clause to keyword_verb_clause,
as now it does not contain anything SP-specific.
As a result of #4, #5, #6, the rule keyword_directly_not_assignable
was replaced with three separate rules:
- keyword_sp_block
- keyword_sp_head
- keyword_verb_clause
Adding the same rules in sql_yacc.yy, for unification.
7. Adding keyword_sp_head and keyword_verb_clause into keyword_sp_decl.
This fixes MDEV-16244.
8. Reorganizing the rest of keyword related rules into two groups:
a. Rules defining a list of keywords and consisting of only terminal symbols:
- keyword_sp_var_not_label
- keyword_sp_head
- keyword_sp_verb_clause
- keyword_sp_block_section
- keyword_sysvar_type
b. Rules that combine the above lists into keyword places:
- keyword_table_alias
- keyword_ident
- keyword_label
- keyword_sysvar_name
- keyword_sp_decl
Rules from the group "b" use on the right side only rules
from the group "a" (with optional terminal symbols added).
Rules from the group "b" DO NOT mutually use each other any more.
This makes them easier to read (and see the difference between them).
Sorting the right sides of the group "b" keyword rules alphabetically,
for yet better readability.
- Using array_elements() instead of a constant to iterate through an array
- Adding some comments
- Adding new-line function comments
- Using STRING_WITH_LEN instead of C_STRING_WITH_LEN
Merging the following features from sql_yacc.yy to sql_yacc_ora.yy:
- system versioning
- column compression
- table value constructor
- spatial predicate WITHIN
- DELETE_DOMAIN_ID
The current code does not support recursive CTEs whose specifications
contain a mix of UNION ALL and UNION DISTINCT operations.
This patch catches such specifications and reports errors for them.
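A sketch of a specification that is now rejected (made up for illustration),
mixing UNION and UNION ALL in one recursive CTE:
WITH RECURSIVE cte(n) AS (
  SELECT 1
  UNION
  SELECT 2
  UNION ALL
  SELECT n + 1 FROM cte WHERE n < 5
)
SELECT * FROM cte;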
In this issue we hit the assert because we are adding additional fields to the
JOIN::all_fields list. This is done because HEAP tables can't index BIT fields,
so we need to use an additional hidden field for grouping, because later it
will be converted to a LONG field. The original field will remain of the BIT
type and will be returned. This happens when we convert DISTINCT to GROUP BY.
The solution is to take into account the number of such hidden fields that
would be added to the JOIN::all_fields list while calculating the size of the
ref_pointer_array.
The logic and the implementation scheme are similar to those of
MDEV-9197 Pushdown conditions into non-mergeable views/derived tables
An example of how the pushdown is made:
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y) from t2 group by x);
-->
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y)
from t2
where x>3
group by x
having max(y)>10);
The implementation scheme:
1. Search for the condition cond that depends only on the fields
from the left part of the IN subquery (left_part)
2. Find fields F_group in the select of the right part of the
IN subquery (right_part) that are used in the GROUP BY
3. Extract from cond the condition cond_where that depends only on the
fields from the left_part that stay at the same places in the left_part
(have the same indexes) as the F_group fields in the projection of the
right_part
4. Transform cond_where so it can be pushed into the WHERE clause of the
right_part, and delete cond_where from cond
5. Transform cond so it can be pushed into the HAVING clause of the right_part
The optimization is made in the
Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the
variable condition_pushdown_for_subquery.
New test file in_subq_cond_pushdown.test is created.
There are also some changes made for setup_jtbm_semi_joins().
Now it is decomposed into 2 procedures: setup_degenerate_jtbm_semi_joins(),
which is called before optimize_cond() for cond, and setup_jtbm_semi_joins(),
which is called after optimize_cond().
The new setup_jtbm_semi_joins() is made in such a way that the result of its
work is the same as if it was called before optimize_cond().
The code that is common for pushdown into materialized derived and into materialized
IN subqueries is factored out into pushdown_cond_for_derived(),
Item_in_subselect::pushdown_cond_for_in_subquery() and
st_select_lex::pushdown_cond_into_where_clause().
Make sure that SELECT_LEX_UNIT::derived, behaves as documented
(points to the "TABLE_LIST representing this union in the
embedding select"). For recursive CTE this was not necessarily
the case, it could've pointed to the TABLE_LIST inside the CTE,
not in the embedding select.
To fix:
* don't update unit->derived in mysql_derived_prepare(), pass derived
as an argument to st_select_lex_unit::prepare()
* prefer to set unit->derived in TABLE_LIST::init_derived()
to the TABLE_LIST in the embedding select, not to the recursive
reference. Fail if there are many TABLE_LISTs in the embedding
select with conflicting FOR SYSTEM_TIME clauses.
cleanup:
* remove redundant THD* argument from st_select_lex_unit::prepare()
1. Adding THD::convert_string(LEX_CSTRING *to,...) as a wrapper
for convert_string(LEX_STRING *to,...), as LEX_CSTRING
is now frequently used for conversion purposes.
This reduced duplicate code in TEXT_STRING_sys,
TEXT_STRING_literal, TEXT_STRING_filesystem grammar rules in *.yy
2. Adding yet another THD::convert_string() with an extra parameter
"bool simple_copy_is_possible". This further reduced the
repeated code in the mentioned grammar rules in *.yy
3. Deriving Lex_ident_cli_st from Lex_string_with_metadata_st,
as they have very similar functionality. Moving m_quote
from Lex_ident_cli_st to Lex_string_with_metadata_st,
as m_quote will be used later to optimize string literals anyway
(e.g. avoid redundant copying on the tokenizer stage).
Adjusting Lex_input_stream::get_text() accordingly.
4. Moving the remainders of the code in TEXT_STRING_sys, TEXT_STRING_literal,
TEXT_STRING_filesystem grammar rules as new methods in THD:
- make_text_string_sys()
- make_text_string_connection()
- make_text_string_filesystem()
and changing *.yy to use these new methods.
This reduced the amount of similar code in
sql_yacc.yy and sql_yacc_ora.yy.
5. Removing duplicate code in Lex_input_stream::body_utf8_append_ident():
by reusing THD::make_text_string_sys(). Thanks to #3 and #4.
6. Making THD members charset_is_system_charset,
charset_is_collation_connection, charset_is_character_set_filesystem
private, as they are not needed externally any more.
The code in the "sp_tail" rule in sql_yacc.yy always
used YYLIP->get_cpp_tok_start() as the start of the body,
and did not check for possible lookahead which happens
for keywords "FOR", "VALUES" and "WITH" for LALR(2)
resolution in Lex_input_stream::lex_token().
When a lookahead token was present,
get_tok_start_prev() should have been used instead
of get_cpp_tok_start() as the beginning of the SP body.
Change summary:
This patch hides the implementation of the lookahead
token completely inside Lex_input_stream.
The users of Lex_input_stream now just get tokens one by one,
transparently, and need not care about lookahead any more.
Now external users of Lex_input_stream
are not aware of the lookahead token at all.
Change details:
- Moving Lex_input_stream::has_lookahead() into the "private" section.
- Removing Lex_input_stream::get_tok_start_prev() and
Lex_input_stream::get_cpp_start_prev().
- Fixing the external code to call get_tok_start() and get_cpp_tok_start()
in all places where get_tok_start_prev() and get_cpp_start_prev()
were used.
- Adding a test for has_lookahead() right inside
get_tok_start() and get_cpp_tok_start().
If there is a lookahead token, these methods now
return the position of the previous token automatically:
const char *get_tok_start()
{
return has_lookahead() ? m_tok_start_prev : m_tok_start;
}
const char *get_cpp_tok_start()
{
return has_lookahead() ? m_cpp_tok_start_prev : m_cpp_tok_start;
}
- Fixing the internal code inside Lex_input_stream methods
to use m_tok_start and m_cpp_tok_start directly,
instead of calling get_tok_start() and get_cpp_tok_start(),
to make sure they access the *current* token position
(independently of the presence of a lookahead token).
Reasoning:
- Shorter and clearer code
- Better encapsulation
(a fair number of Lex_input_stream methods and members were
moved to the private section)
New methods:
int lex_token(union YYSTYPE *yylval, THD *thd);
bool consume_comment(int remaining_recursions_permitted);
int lex_one_token(union YYSTYPE *yylval, THD *thd);
int find_keyword(Lex_ident_cli_st *str, uint len, bool function);
LEX_CSTRING get_token(uint skip, uint length);
Additional changes:
- Removing Lex_input_stream::yylval.
In the original code it was just an alias
for the "yylval" passed to lex_one_token().
This coding style is bug prone and is hard to follow.
In the new code "yylval" (or its components) is passed to
the affected methods as a parameter.
- Moving the code in sql_lex.h up and down between "private" and "public"
sections (sorry if this made the diff somewhat harder to read)
The code passing positions in the query to constructors of
Rewritable_query_parameter descendants (e.g. Item_splocal)
was not reliable. It used various Lex_input_stream methods:
- get_tok_start()
- get_tok_start_prev()
- get_tok_end()
- get_ptr()
to find positions of the recently scanned tokens.
The challenge was mostly to choose between get_tok_start()
and get_tok_start_prev(), taking into account to the current
grammar (depending if lookahead takes place before
or after we read the positions in every particular rule).
But this approach did not work at all in combination
with token contractions, when MYSQLlex() translates
two tokens into one token ID, for example:
WITH ROLLUP -> WITH_ROLLUP_SYM
As a result, the tokenizer is already one more token ahead.
So in query fragment:
"GROUP BY d, spvar WITH ROLLUP"
get_tok_start() points to "ROLLUP".
get_tok_start_prev() points to "WITH".
As a result, it was "WITH" that was erroneously replaced
with NAME_CONST() instead of "spvar".
This patch modifies the code to do it in a different way.
Changes:
1. For keywords and identifiers, the tokenizer now
returns a LEX_CSTRING pointing directly to the query
fragment. So query positions are now just available using:
- $1.str - for the beginning of a token
- $1.str+$1.length - for the end of a token
2. Identifiers are not allocated on the THD memory root
in the tokenizer any more. Allocation is now done
on later stages, in methods like LEX::create_item_ident().
3. Two LEX_CSTRING based structures were added:
- Lex_ident_cli_st - used to store the "client side"
identifier representation, pointing to the
query fragment. Note, these identifiers
are encoded in @@character_set_client
and can have broken byte sequences.
- Lex_ident_sys_st - used to store the "server side"
identifier representation, pointing to the
THD allocated memory. This representation
guarantees that the identifier was checked
for being well-formed, and is encoded in utf8.
4. To distinguish between two identifier types
in the grammar, two Bison types were added:
<ident_cli> and <ident_sys>
5. All non-reserved keywords were marked as
being of the type <ident_cli>.
All reserved keywords are still of the type NONE.
6. All curly brackets in rules collecting
non-reserved keywords into non-terminal
symbols were removed, e.g.:
Was:
keyword_sp_data_type:
BIT_SYM {}
| BOOLEAN_SYM {}
Now:
keyword_sp_data_type:
BIT_SYM
| BOOLEAN_SYM
It is important NOT to have brackets here!
This is needed to make sure that the underlying
Lex_ident_cli_st structure correctly passes up to
the calling rule.
7. The code to scan identifiers and keywords
was moved from lex_one_token() into new
Lex_input_stream methods:
scan_ident_sysvar()
scan_ident_start()
scan_ident_middle()
scan_ident_delimited()
This was done to:
- get rid of an enormous amount of references to &yylval->lex_str
- and remove a lot of references like lip->xxx
8. The allocating functionality which puts identifiers on the
THD memory root now resides in methods of Lex_ident_sys_st,
and in THD::to_ident_sys_alloc().
get_quoted_token() was removed.
9. Cleanup: check_simple_select() was moved to LEX as a method.
10. Cleanup: Some more functionality was moved from *.yy
into new methods added to LEX:
make_item_colon_ident_ident()
make_item_func_call_generic()
create_item_qualified_asterisk()
vers_setup_conds() used to AND all conditions on row_start/row_end
columns and store it either in the WHERE clause or in the ON
clause for some table. In some cases this caused the ON clause
to have conditions for tables that aren't part of that ON's join.
Fixed to always put a table's condition in the ON clause of the
corresponding table.
Removed unnecessary ... `OR row_end IS NULL` clause, it's not needed
in the ON clause.
Simplified handling of PS and SP.
Main reason was to make it easier to print the above structures in
a debugger. An additional benefit is that I was able to use the same
defines for both structures, which simplifies some code.
Most of the code is just removing Alter_info:: and Alter_inplace_info::
from alter table flags.
The following renames were done:
HA_ALTER_FLAGS -> alter_table_operations
CHANGE_CREATE_OPTION -> ALTER_CHANGE_CREATE_OPTION
Alter_info::ADD_INDEX -> ALTER_ADD_INDEX
DROP_INDEX -> ALTER_DROP_INDEX
ADD_UNIQUE_INDEX -> ALTER_ADD_UNIQUE_INDEX
DROP_UNIQUE_INDEX -> ALTER_DROP_UNIQUE_INDEX
ADD_PK_INDEX -> ALTER_ADD_PK_INDEX
DROP_PK_INDEX -> ALTER_DROP_PK_INDEX
Alter_info::ALTER_ADD_COLUMN -> ALTER_PARSE_ADD_COLUMN
Alter_info::ALTER_DROP_COLUMN -> ALTER_PARSE_DROP_COLUMN
Alter_inplace_info::ADD_INDEX -> ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX
Alter_inplace_info::DROP_INDEX -> ALTER_DROP_NON_UNIQUE_NON_PRIM_INDEX
Other things:
- Added a typedef alter_table_operations for alter table flags
- DROP CHECK CONSTRAINT can now be done online
- Added checks for Aria tables in alter_table_online.test
- alter_table_flags now takes an ulonglong as argument.
- Don't support online operations if checksum option is used.
- sql_lex.cc doesn't add ALTER_ADD_INDEX if index is not created
The problem resided in this branch of the "option_value_no_option_type" rule:
| '@' '@' opt_var_ident_type internal_variable_name equal set_expr_or_default
Summary:
1. internal_variable_name initialized tmp.var to trg_new_row_fake_var (0x01).
2. The condition "if (tmp.var == NULL)" did not check
the special case with trg_new_row_fake_var,
so Lex->set_system_variable(&tmp, $3, $6) was
called with tmp.var pointing to trg_new_row_fake_var,
which created a sys_var instance pointing to 0x01 instead of
a real system variable.
3. Later, at the trigger invocation time, this method was called:
sys_var::do_deprecated_warning (this=0x1, thd=0x7ffe6c000a98)
Notice, "this" is equal to trg_new_row_fake_var (0x01)
Solution:
The old implementation with separate rules
internal_variable_name (in sql_yacc.yy and sql_yacc_ora.yy) and
internal_variable_name_directly_assignable (in sql_yacc_ora.yy only)
was too complex and hard to follow.
Rewriting the code in a more straightforward way.
1. Changing LEX::set_system_variable()
from:
bool set_system_variable(struct sys_var_with_base *, enum_var_type, Item *);
to:
bool set_system_variable(enum_var_type, sys_var *, const LEX_CSTRING *, Item *);
2. Adding new methods in LEX, which operate with variable names:
bool set_trigger_field(const LEX_CSTRING *, const LEX_CSTRING *, Item *);
bool set_system_variable(enum_var_type var_type, const LEX_CSTRING *name,
Item *val);
bool set_system_variable(THD *thd, enum_var_type var_type,
const LEX_CSTRING *name1,
const LEX_CSTRING *name2,
Item *val);
bool set_default_system_variable(enum_var_type var_type,
const LEX_CSTRING *name,
Item *val);
bool set_variable(const LEX_CSTRING *name, Item *item);
3. Changing the grammar to call the new methods directly
in option_value_no_option_type,
Removing rules internal_variable_name and
internal_variable_name_directly_assignable.
4. Removing "struct sys_var_with_base" and trg_new_row_fake_var.
Good side effect:
- The code in /sql reduced from 314 to 183 lines.
- MDEV-15615 Unexpected syntax error instead of "Unknown system variable" ...
was also fixed automatically
Backporting from bb-10.2-compatibility to bb-10.2-ext
Version: 2018-01-26
- CREATE PACKAGE [BODY] statements are now
entirely written to mysql.proc with type='PACKAGE' and type='PACKAGE BODY'.
- CREATE PACKAGE BODY now supports IF NOT EXISTS
- DROP PACKAGE BODY now supports IF EXISTS
- CREATE OR REPLACE PACKAGE [BODY] is now supported
- CREATE PACKAGE [BODY] now support the DEFINER clause:
CREATE DEFINER user@host PACKAGE pkg ... END;
CREATE DEFINER user@host PACKAGE BODY pkg ... END;
- CREATE PACKAGE [BODY] now supports SQL SECURITY and COMMENT clauses, e.g.:
CREATE PACKAGE p1 SQL SECURITY INVOKER COMMENT "comment" AS ... END;
- Package routines are now created from the package CREATE PACKAGE BODY
statement and don't produce individual records in mysql.proc.
- CREATE PACKAGE BODY now supports package-wide variables.
Package variables can be read and set inside package routines.
Package variables are stored in a separate sp_rcontext,
which is cached in THD on the first package routine call.
- CREATE PACKAGE BODY now supports the initialization section.
- All public routines (i.e. declared in CREATE PACKAGE)
must have implementations in CREATE PACKAGE BODY
- Only public package routines are available outside of the package
- {CREATE|DROP} PACKAGE [BODY] now respects CREATE ROUTINE and ALTER ROUTINE
privileges
- "GRANT EXECUTE ON PACKAGE BODY pkg" is now supported
- SHOW CREATE PACKAGE [BODY] is now supported
- SHOW PACKAGE [BODY] STATUS is now supported
- CREATE and DROP for PACKAGE [BODY] now works for non-current databases
- mysqldump now supports packages
- "SHOW {PROCEDURE|FUNCTION) CODE pkg.routine" now works for package routines
- "SHOW PACKAGE BODY CODE pkg" now works (the package initialization section)
- A new package body level MDL was added
- Recursive calls for package procedures are now possible
- Routine forward declarations in CREATE PACKAGE BODY are now supported.
- Package body variables now work as SP OUT parameters
- Package body variables now work as SELECT INTO targets
- Package body variables now support ROW, %ROWTYPE, %TYPE
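A minimal sketch of the feature set in sql_mode=ORACLE (all names made up):
SET sql_mode=ORACLE;
DELIMITER $$
CREATE OR REPLACE PACKAGE pkg AS
  PROCEDURE p1;                   -- public routines, implemented in the body
  FUNCTION f1 RETURN INT;
END;
$$
CREATE OR REPLACE PACKAGE BODY pkg AS
  v INT := 0;                     -- a package-wide variable
  PROCEDURE p1 AS BEGIN v := v + 1; END;
  FUNCTION f1 RETURN INT AS BEGIN RETURN v; END;
BEGIN
  v := 10;                        -- the initialization section
END;
$$
DELIMITER ;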
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. Change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider, sphinx and connect for now.
The counter for select numbering is now stored with the statement (before it
was global). So now it always has an accurate value which does not depend on
interruption of statement prepare by errors like the lack of a table in
a view definition.
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because the process list db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
TRASH was mapped to TRASH_FREE and was supposed to be used for memory
that should not be accessed anymore, while TRASH_ALLOC() is to be
used for uninitialized but to-be-used memory.
But sometimes TRASH() was used in the latter sense.
Remove the TRASH() macro; always use explicit TRASH_ALLOC() or TRASH_FREE().
Other changes done to get this to work:
- Added 'internal_tables' to the TABLE object to list which sequence tables
are needed to use the table.
- Mark any expression using DEFAULT() with LEX->default_used.
This is needed when deciding if we should open internal sequence
tables when a table is opened (we don't need to open sequence tables
if the main table is only used with SELECT).
- Create_and_open_temporary_table() can now also open all internal
sequence tables.
- Added option MYSQL_LOCK_USE_MALLOC to mysql_lock_tables()
to force memory allocation to be used with malloc instead of
memroot.
- Added flag to MYSQL_LOCK to remember if allocation was done with
malloc or memroot (makes code simpler and safer).
- init_one_table_for_prelocking() now takes an argument for which lock to
use, instead of whether it's a routine or something else.
- Renamed prelocking placeholders to make them more understandable as
they are now used in more code.
- Changed the test in check_lock_and_start_stmt() of whether the found table
has correct locks. The old test didn't work for tables that have lock
TL_WRITE_ALLOW_WRITE, which is what sequence tables are using.
- Added VCOL_NOT_VIRTUAL option to ensure that sequence functions can't
be used with virtual columns
- More sequence tests
This commit implements aggregate stored functions. The basic idea behind
the feature is:
* Implement a special instruction FETCH GROUP NEXT ROW that will pause
the execution of the stored function. When the instruction is reached,
execution of the initial query resumes "as if" the function returned.
This gives the server the opportunity to advance to the next row in the
result set.
* Stored aggregates behave like regular aggregate functions. The
implementation thus resides in the class Item_sum_sp. Because it is
an aggregate function, for each new row in the group, the
Item_sum_sp::add() method will be called. This is when execution resumes
and the function does another iteration to "add" one extra element to
the final result.
* When the end of group is reached, val_xxx() method will be called for
the item. This case is handled by another execute step for the stored
function, only with a special flag to force a call to the return
handler. See Item_sum_sp::execute() for details.
To allow these pause-and-resume semantics, we must preserve the function
context across executions. This is stored in Item_sp::sp_query_arena, only for
aggregate stored functions, and has no impact on regular functions.
We also require aggregate stored functions to include the "FETCH GROUP NEXT ROW"
instruction.
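A minimal sketch of such a function (names made up), with the mandatory
"FETCH GROUP NEXT ROW" instruction and a NOT FOUND handler that returns the
final value at the end of each group:
DELIMITER $$
CREATE AGGREGATE FUNCTION sum_sp(x INT) RETURNS INT
BEGIN
  DECLARE total INT DEFAULT 0;
  DECLARE CONTINUE HANDLER FOR NOT FOUND RETURN total;
  LOOP
    FETCH GROUP NEXT ROW;  -- pauses here until the next row of the group
    SET total = total + x;
  END LOOP;
END$$
DELIMITER ;
SELECT a, sum_sp(b) FROM t1 GROUP BY a;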
Signed-off-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>
Most "new" failures fixed in the following files:
- sql_select.cc
- item.cc
- item_func.cc
- opt_subselect.cc
Other things:
- Allocate udf_handler strings in mem_root
- Required changes in sql_string.h
- Add mem_root as argument to some new [] calls
- Mark udf_handler strings as thread specific
- Removed some comment blocks with code
Issue:
------
VALUES doesn't have a type() function and is considered an
Item_field.
Solution for 5.7:
-----------------
Add a new type() function for Item_values_insert.
On 8.0 and trunk it was fixed by Mithun's Bug#19601973.
Solution for 5.6:
-----------------
Additionally Bug#17458914 is backported.
This will address the problem of using VALUES() in
INSERT ... ON DUPLICATE KEY UPDATE. Create a field object
only if it is in the UPDATE clause, else return a NULL
item.
This will also address the problems mentioned in
Bug#14789787 and Bug#16756402.
Solution for 5.5:
-----------------
As mentioned above Bug#17458914 is backported.
Additionally Bug#14786324 is also backported.
When VALUES() is detected outside its meaningful place,
it should be treated as NULL and is thus replaced with a
Field_null object, with the same name as the original
field.
Fields with type NULL are generally not handled well inside
the server (e.g Innodb will not accept them and it is
impossible to create them in regular tables). So create a
new const NULL item instead.
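For illustration (table and column names made up), the only meaningful place
for VALUES() is the ON DUPLICATE KEY UPDATE clause; anywhere else it is
treated as NULL:
INSERT INTO t1 (a, b) VALUES (1, 10)
ON DUPLICATE KEY UPDATE b = VALUES(b);  -- refers to the value 10
SELECT VALUES(b) FROM t1;               -- VALUES() here evaluates to NULL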
As reported in MDEV-11969, "there's no way to ditch knowledge" about some
domain that is no longer updated on a server. Besides cluttering the output
in the DBA console as an annoyance, stale domains can prevent the slave
from connecting to the master, as MDEV-12012 witnesses.
Which domain is obsolete must be evaluated by the user (DBA) according
to whether the domain info is still relevant and whether the domain will
ever receive any update.
This patch introduces a method to discard obsolete gtid domains from
the server binlog state. The removal requires that no event group from such
a domain is present in existing binlog files. If there are any, the
containing logs must first be PURGEd in order for
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
to succeed. Otherwise the command returns an error.
The list of obsolete domains can be computed through
intersecting two sets - the earliest (first) binlog's Gtid_list
and the current value of @@global.gtid_binlog_state - and extracting
the domain id components from the intersection list items.
The new DELETE_DOMAIN_ID featured FLUSH continues to rotate binlog
omitting the deleted domains from the active binlog file's Gtid_list.
Notice though that when the command is ineffective - when none of the domains
requested for deletion exists in the binlog state - rotation does not occur.
Obsolete domain deletion is not harmful for connected slaves as long
as the *purge* of master-side binlog files is synchronized with
FLUSH-DELETE_DOMAIN_ID.
The slaves must have the last event from the purged files processed as usual,
in order not to end up later requesting a gtid from a file which
is already gone.
While the command is not replicated (as ordinary FLUSH BINARY LOGS is),
slaves, even though having extra domains, won't suffer from reconnection errors
thanks to the master-slave gtid connection protocol allowing the master
to be ignorant about a gtid domain.
Should such a slave be promoted to the master role at failover, it may run
the ex-master's
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
to clean its own binlog state.
NOTES.
suite/perfschema/r/start_server_low_digest.result
is re-recorded as a consequence of internal parser code changes.