This is a post-push fix addressing review requests and
problems with extra warnings.
Problem 1: The sub-statement where an unsafe warning was detected was
printed as part of the warning. This was ok for statements that
were unsafe due to, e.g., calls to UUID(), but did not make
sense for statements that were unsafe because there was more than
one autoincrement column (unsafeness in this case comes from the
combination of several sub-statements).
Fix 1: Instead of printing the sub-statement, print an explanation
of why the statement is unsafe.
Problem 2:
When a recursive construct (i.e., stored proceure, stored
function, trigger, view, prepared statement) contained several
sub-statements, and at least one of them was unsafe, there would be
one unsafeness warning per sub-statement - even for safe
sub-statements.
Fix 2:
Ensure that each type of warning is printed at most once, by
remembering throughout the execution of the statement which types
of warnings have been printed.
General overview:
The logic for switching to row format when binlog_format=MIXED had
numerous flaws. The underlying problem was the lack of a consistent
architecture.
General purpose of this changeset:
This changeset introduces an architecture for switching to row format
when binlog_format=MIXED. It enforces the architecture where it has
to. It leaves some bugs to be fixed later. It adds extensive tests to
verify that unsafe statements work as expected and that appropriate
errors are produced by problems with the selection of binlog format.
It was not practical to split this into smaller pieces of work.
Problem 1:
To determine the logging mode, the code has to take several parameters
into account (namely: (1) the value of binlog_format; (2) the
capabilities of the engines; (3) the type of the current statement:
normal, unsafe, or row injection). These parameters may conflict in
several ways, namely:
- binlog_format=STATEMENT for a row injection
- binlog_format=STATEMENT for an unsafe statement
- binlog_format=STATEMENT for an engine only supporting row logging
- binlog_format=ROW for an engine only supporting statement logging
- statement is unsafe and engine does not support row logging
- row injection in a table that does not support statement logging
- statement modifies one table that does not support row logging and
one that does not support statement logging
Several of these conflicts were not detected, or were detected with
an inappropriate error message. The problem of BUG#39934 was that no
appropriate error message was written for the case when an engine
only supporting row logging executed a row injection with
binlog_format=ROW. However, all above cases must be handled.
Fix 1:
Introduce new error codes (sql/share/errmsg.txt). Ensure that all
conditions are detected and handled in decide_logging_format()
Problem 2:
The binlog format shall be determined once per statement, in
decide_logging_format(). It shall not be changed before or after that.
Before decide_logging_format() is called, all information necessary to
determine the logging format must be available. This principle ensures
that all unsafe statements are handled in a consistent way.
However, this principle is not followed:
thd->set_current_stmt_binlog_row_based_if_mixed() is called in several
places, including from code executing UPDATE..LIMIT,
INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and
SET @@binlog_format. After Problem 1 was fixed, that caused
inconsistencies where these unsafe statements would not print the
appropriate warnings or errors for some of the conflicts.
Fix 2:
Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from
code executed after decide_logging_format(). Compensate by calling the
set_current_stmt_unsafe() at parse time. This way, all unsafe statements
are detected by decide_logging_format().
Problem 3:
INSERT DELAYED is not unsafe: it is logged in statement format even if
binlog_format=MIXED, and no warning is printed even if
binlog_format=STATEMENT. This is BUG#45825.
Fix 3:
Made INSERT DELAYED set itself to unsafe at parse time. This allows
decide_logging_format() to detect that a warning should be printed or
the binlog_format changed.
Problem 4:
LIMIT clause were not marked as unsafe when executed inside stored
functions/triggers/views/prepared statements. This is
BUG#45785.
Fix 4:
Make statements containing the LIMIT clause marked as unsafe at
parse time, instead of at execution time. This allows propagating
unsafe-ness to the view.
comment can't be read back
A change to the lexer in 5.1 caused slash-asterisk-bang-version
sections to be terminated early if there exists a slash-asterisk-
style comment inside it. Nesting comments is usually illegal,
but we rely on versioned comment blocks in mysqldump, and the
contents of those sections must be allowed to have comments.
The problem was that when encountering open-comment tokens and
consuming -or- passing through the contents, the "in_comment"
state at the end was clobbered with the not-in-a-comment value,
regardless of whether we were in a comment before this or not.
So, """/*!VER one /* two */ three */""" would lose its in-comment
state between "two" and "three". Save the echo and in-comment
state, and restore it at the end of the comment if we consume a
comment.
The problem is that a SELECT .. FOR UPDATE statement might open
a table and later wait for a impeding global read lock without
noticing whether it is holding a table that is being waited upon
the the flush phase of the process that took the global read
lock.
The same problem also affected the following statements:
LOCK TABLES .. WRITE
UPDATE .. SET (update and multi-table update)
TRUNCATE TABLE ..
LOAD DATA ..
The solution is to make the above statements wait for a impending
global read lock before opening the tables. If there is no
impending global read lock, the statement raises a temporary
protection against global read locks and progresses smoothly
towards completion.
Important notice: the patch does not try to address all possible
cases, only those which are common and can be fixed unintrusively
enough for 5.0.
An unnecessarily restrictive lock were taken on sub-SELECTs during DELETE.
During parsing, a global structure is reused for sub-SELECTs and the attribute
keeping track of lock options were not reset properly.
This patch introduces a new attribute to keep track on the syntactical lock
option elements found in a sub-SELECT and then sets the lock options accordingly.
Now the sub-SELECTs will try to acquire a READ lock if possible
instead of a WRITE lock as inherited from the outer DELETE statement.
- Remove bothersome warning messages. This change focuses on the warnings
that are covered by the ignore file: support-files/compiler_warnings.supp.
- Strings are guaranteed to be max uint in length
on non-partitioned table
Problem was that partitioning specific commands was accepted
for non partitioned tables and treated like
ANALYZE/CHECK/OPTIMIZE/REPAIR TABLE, after bug-20129 was fixed,
which changed the code path from mysql_alter_table to
mysql_admin_table.
Solution was to check if the table was partitioned before
trying to execute the admin command
``FLUSH TABLES WITH READ LOCK''
Concurrent execution of 1) multitable update with a
NATURAL/USING join and 2) a such query as "FLUSH TABLES
WITH READ LOCK" or "ALTER TABLE" of updating table led
to a server crash.
The mysql_multi_update_prepare() function call is optimized
to lock updating tables only, so it postpones locking to
the last, and if locking fails, it does cleanup of modified
syntax structures and repeats a query analysis. However,
that cleanup procedure was incomplete for NATURAL/USING join
syntax data: 1) some Field_item items pointed into freed
table structures, and 2) the TABLE_LIST::join_columns fields
was not reset.
Major change:
short-living Field *Natural_join_column::table_field has
been replaced with long-living Item*.
columns data types
The "SELECT @lastId, @lastId := Id FROM t" query returns
different result sets depending on the type of the Id column
(INT or BIGINT).
Note: this fix doesn't cover the case when a select query
references an user variable and stored function that
updates a value of that variable, in this case a result
is indeterminate.
The server uses incorrect assumption about a constantness of
an user variable value as a select list item:
The server caches a last query number where that variable
was changed and compares this number with a current query
number. If these numbers are different, the server guesses,
that the variable is not updating in the current query, so
a respective select list item is a constant. However, in some
common cases the server updates cached query number too late.
The server has been modified to memorize user variable
assignments during the parse phase to take them into account
on the next (query preparation) phase independently of the
order of user variable references/assignments in a select
item list.
This fix is for 5.0 only : back porting the 6.0 patch manually
The parser code in sql/sql_yacc.yy needs to be more robust to out of
memory conditions, so that when parsing a query fails due to OOM,
the thread gracefully returns an error.
Before this fix, a new/alloc returning NULL could:
- cause a crash, if dereferencing the NULL pointer,
- produce a corrupted parsed tree, containing NULL nodes,
- alter the semantic of a query, by silently dropping token values or nodes
With this fix:
- C++ constructors are *not* executed with a NULL "this" pointer
when operator new fails.
This is achieved by declaring "operator new" with a "throw ()" clause,
so that a failed new gracefully returns NULL on OOM conditions.
- calls to new/alloc are tested for a NULL result,
- The thread diagnostic area is set to an error status when OOM occurs.
This ensures that a request failing in the server properly returns an
ER_OUT_OF_RESOURCES error to the client.
- OOM conditions cause the parser to stop immediately (MYSQL_YYABORT).
This prevents causing further crashes when using a partially built parsed
tree in further rules in the parser.
No test scripts are provided, since automating OOM failures is not
instrumented in the server.
Tested under the debugger, to verify that an error in alloc_root cause the
thread to returns gracefully all the way to the client application, with
an ER_OUT_OF_RESOURCES error.
build)
The crash was caused by freeing the internal parser stack during the parser
execution.
This occured only for complex stored procedures, after reallocating the parser
stack using my_yyoverflow(), with the following C call stack:
- MYSQLparse()
- any rule calling sp_head::restore_lex()
- lex_end()
- x_free(lex->yacc_yyss), xfree(lex->yacc_yyvs)
The root cause is the implementation of stored procedures, which breaks the
assumption from 4.1 that there is only one LEX structure per parser call.
The solution is to separate the LEX structure into:
- attributes that represent a statement (the current LEX structure),
- attributes that relate to the syntax parser itself (Yacc_state),
so that parsing multiple statements in stored programs can create multiple
LEX structures while not changing the unique Yacc_state.
Now, Yacc_state and the existing Lex_input_stream are aggregated into
Parser_state, a structure that represent the complete state of the (Lexical +
Syntax) parser.
Mixing aggregate functions and non-grouping columns is not allowed in the
ONLY_FULL_GROUP_BY mode. However in some cases the error wasn't thrown because
of insufficient check.
In order to check more thoroughly the new algorithm employs a list of outer
fields used in a sum function and a SELECT_LEX::full_group_by_flag.
Each non-outer field checked to find out whether it's aggregated or not and
the current select is marked accordingly.
All outer fields that are used under an aggregate function are added to the
Item_sum::outer_fields list and later checked by the Item_sum::check_sum_func
function.
between 5.0 and 5.1.
The problem was that in the patch for Bug#11986 it was decided
to store original query in UTF8 encoding for the INFORMATION_SCHEMA.
This approach however turned out to be quite difficult to implement
properly. The main problem is to preserve the same IS-output after
dump/restore.
So, the fix is to rollback to the previous functionality, but also
to fix it to support multi-character-set-queries properly. The idea
is to generate INFORMATION_SCHEMA-query from the item-tree after
parsing view declaration. The IS-query should:
- be completely in UTF8;
- not contain character set introducers.
For more information, see WL4052.
The problem is that when a stored procedure is being parsed for
the first execution, the body is copied to a temporary buffer
which is disregarded sometime after the statement is parsed.
And during this parsing phase, the rule for CREATE VIEW was
holding a reference to the string being parsed for use during
the execution of the CREATE VIEW statement, leading to invalid
memory access later.
The solution is to allocate and copy the SELECT of a CREATE
VIEW statement using the thread memory root, which is set to
the permanent arena of the stored procedure.
When the server was out of memory it crashed because of invalid memory access.
This patch adds detection for failed memory allocations and make the server
output a proper error message.
partitioned table
Trying INSERT DELAYED on a partitioned table, that has not been
used right before, crashes the server. When a table is used for
select or update, it is kept open for some time. This period I
mean with "right before".
Information about partitioning of a table is stored in form of
a string in the .frm file. Parsing of this string requires a
correctly set up lexical analyzer (lex). The partitioning code
uses a new temporary instance of a lex. But it does still refer
to the previously active lex. The delayd insert thread does not
initialize its lex though...
Added initialization for thd->lex before open table in the delayed
thread and at all other places where it is necessary to call
lex_start() if all tables would be partitioned and need to parse
the .frm file.
The SET PASSWORD statement is non-transactional (no explicit transaction
boundaries) in nature and hence is forbidden inside stored functions and
triggers, but it weren't being effectively forbidden.
The implemented fix is to issue a implicit commit with every SET PASSWORD
statement, effectively prohibiting these statements in stored functions
and triggers.
Problem: creating a partitioned table during name resolution for the
partition function we search for column names in all parts of the
CREATE TABLE query. It is superfluous (and wrong) sometimes.
Fix: launch name resolution for the partition function against
the table we're creating.