Here is the scenario that causes the failure.(by Mats)
1. The to-be corrupt log event (let's call it X), is split into two
packets B and C on the network level (net_write_buff()). The parts
are X = (x',x''). The part x' ends up in packet B and part x''
ends up in packet C. Prior to the corrupt event X, the event Y has
been written successfully, but has been split into two packets as
well, which we call (y',y'').
2. The master sends packet A = (y'',x') to the slave, increases the
packet sequence number, the slave receives the packet, but fails
to reply before the master gets a timeout.
3. Since the master got a timeout, it reports failure, and aborts
sending the binary log by exiting mysql_binlog_send(). However, it
leaves the buffer intact, still holding y'' (but not x', since the
write_pos is not increased).
4. After exiting mysql_binlog_send(), the master does a
disconnection of the client thread, which involves sending an
error message e to the client (i.e., the slave).
5. In this case, net_write_buff() is used again, but this time the
old contents of the packet is used so that the new packet is
D = (y'',e). Note that this will use a new packet sequence number,
since the packet number was increased in step 2.
6. The slave receives the tail y'' of the Y log event, concatenates
this with x' (which it already received), and writes the event
(x',y'') it to the relay log since it hasn't noticed anything is
amiss.
7. It then tries to read more bytes, which is either e (if the length
given for X just happened to match the length given for Y, or just
plain garbage because the slave is out of sync with what is
actually sent.
8. After a while, the SQL thread tries to execute the event (x',y''),
which is very likely to be just nonsense.
The problem can be fixed by not resetting net->error after the call of
mysql_binlog_send, so the error message will not be sent and the connection
will be closed.
The problem is that some DDL statements (ALTER TABLE, CREATE
TRIGGER, FLUSH TABLES, ...) when under LOCK TABLES need to
momentarily drop the lock, reopen the table and grab the write
lock again (using reopen_tables). When grabbing the lock again,
reopen_tables doesn't pass a flag to mysql_lock_tables in
order to ignore the impending global read lock, which causes a
assertion because LOCK_open is being hold. Also dropping the
lock must not signal to any threads that the table has been
relinquished (related to the locking/flushing protocol).
The solution is to correct the way the table is reopenned
and the locks grabbed. When reopening the table and under
LOCK TABLES, the table version should be set to 0 so other
threads have to wait for the table. When grabbing the lock,
any other flush should be ignored because it's theoretically
a atomic operation. The chosen solution also fixes a potential
discrepancy between binlog and GRL (global read lock) because
table placeholders were being ignored, now a FLUSH TABLES WITH
READ LOCK will properly for table with open placeholders.
It's also important to mention that this patch doesn't fix
a potential deadlock if one uses two GRLs under LOCK TABLES
concurrently.
cause ROLLBACK of statement", part 1. Review fixes.
Do not send OK/EOF packets to the client until we reached the end of
the current statement.
This is a consolidation, to keep the functionality that is shared by all
SQL statements in one place in the server.
Currently this functionality includes:
- close_thread_tables()
- log_slow_statement().
After this patch and the subsequent patch for Bug#12713, it shall also include:
- ha_autocommit_or_rollback()
- net_end_statement()
- query_cache_end_of_result().
In future it may also include:
- mysql_reset_thd_for_next_command().
When read_only option was enabled, a user without SUPER privilege could
perform CREATE DATABASE and DROP DATABASE operations.
This patch adds a check to make sure this isn't possible. It also attempts to
simplify the logic used to determine if relevant tables are updated,
making it more human readable.
The problem was that THD::killed was reset after a command was
read from the socket, but before it was actually handled. That lead
to a race: if another KILL statement was issued for this connection
in the middle of reading from the socket and processing a command,
THD::killed state would be cleaned.
The fix is to move this cleanup into net_send_error() function.
A sample test case exists in binlog_killed.test:
- connection 1: start a new transaction on table t1;
- connection 2: send query to the server (w/o waiting for the
result) to update data in table t1 -- this query will be blocked
since there is unfinished transaction;
- connection 1: kill query in connection 2 and finish the transaction;
- connection 2: get result of the previous query -- it should be
the "query-killed" error.
This test however contains race condition, which can not be fixed
with the current protocol: there is no way to guarantee, that the
server will receive and start processing the query in connection 2
(which is intended to get blocked) before the KILL command (sent in
the connection 1) will arrive. In other words, there is no way to
ensure that the following sequence will not happen:
- connection 1: start a new transaction on table t1;
- connection 1: kill query in connection 2 and finish the transaction;
- connection 2: send query to the server (w/o waiting for the
result) to update data in table t1 -- this query will be blocked
since there is unfinished transaction;
- connection 2: get result of the previous query -- the query will
succeed.
So, there is no test case for this bug, since it's impossible
to write a reliable test case under the current circumstances.
Loading 4.1 into 5.0 or 5.1 failed silently because procs_priv table missing.
This caused the server to crash on any attempt to store new grants because
of uninitialized structures.
This patch breaks up the grant loading function into two phases to allow
for procs_priv table to fail with an warning instead of crashing the server.
FLUSH TABLES WITH READ LOCK fails to properly detect write locked
tables when running under low priority updates.
The problem is that when trying to aspire a global read lock, the
reload_acl_and_cache() function fails to properly check if the thread
has a low priority write lock, which later my cause a server crash or
deadlock.
The solution is to simple check if the thread has any type of the
possible exclusive write locks.
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. So we do now prohibit
non-temporary chidlren of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similar to the old open. However it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables open by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
Crash happens as a result of NO_EMBEDDED_ACCESS_CHECKS option
(which is default for embedded server).
check_table_access failed on using unintialized structure.
Better solutions here is to disable that code completely in this case.
Though the crash happens only in 6.0 i belive it's good to do it in 5.1
Between 5.0 and 5.1, the step of incrementing the global query id
changed, which broke how the profiler noticed when a new query had
started. That reset the state list and caused all but the last
five (or so) states to be thrown away.
Now, don't watch for query_id changes in the lower level.
Add a bogus state change at the end of profiling so that the last
real state change is timed.
Emit source reference for the start of the span of time instead of
the end of it.
If a stored function that contains a drop temporary table statement
is invoked by a create temporary table of the same name may cause
a server crash. The problem is that when dropping a table no check
is done to ensure that table is not being used by some outer query
(or outer statement), potentially leaving the outer query with a
reference to a stale (freed) table.
The solution is when dropping a temporary table, always check if
the table is being used by some outer statement as a temporary
table can be dropped inside stored procedures.
The check is performed by looking at the TABLE::query_id value for
temporary tables. To simplify this check and to solve a bug related
to handling of temporary tables in prelocked mode, this patch changes
the way in which this member is used to track the fact that table is
used/unused. Now we ensure that TABLE::query_id is zero for unused
temporary tables (which means that all temporary tables which were
used by a statement should be marked as free for reuse after it's
execution has been completed).
The mysql_change_user command fails to properly update the database pointer
when no database is selected, leading to "use after free" errors. The same
happens on the user privilege pointer in the thread security context.
The solution is to properly reset and update the database name. Also update
the user_priv pointer so that it doesn't point to freed memory.
check_user()/check_connection()/check_for_max_user_connections().
This is a pre-requisite patch for the fix for Bug#12713 "Error in a stored
function called from a SELECT doesn't cause ROLLBACK of statem"
Implement review comments.
The SET PASSWORD statement is non-transactional (no explicit transaction
boundaries) in nature and hence is forbidden inside stored functions and
triggers, but it weren't being effectively forbidden.
The implemented fix is to issue a implicit commit with every SET PASSWORD
statement, effectively prohibiting these statements in stored functions
and triggers.