With statement- or mixed-mode logging, "LOAD DATA INFILE" queries
are written to the binlog using special types of log events.
When mysqlbinlog reads such events, it re-creates the file in a
temporary directory with a generated filename and outputs a
"LOAD DATA INFILE" query where the filename is replaced by the
generated file. The temporary file is not deleted by mysqlbinlog
after termination.
To fix the problem, in mixed mode we go to row-based. In SBR, we
document it to remind user the tmpfile is left in a temporary
directory.
With statement- or mixed-mode logging, "LOAD DATA INFILE" queries
are written to the binlog using special types of log events.
When mysqlbinlog reads such events, it re-creates the file in a
temporary directory with a generated filename and outputs a
"LOAD DATA INFILE" query where the filename is replaced by the
generated file. The temporary file is not deleted by mysqlbinlog
after termination.
To fix the problem, in mixed mode we go to row-based. In SBR, we
document it to remind user the tmpfile is left in a temporary
directory.
mysql_client_binlog_statement
Problem: server may read from unassigned memory performing
"wrong" BINLOG queries.
Fix: never read from unassigned memory.
Some of the test cases reference to binlog position and
these position numbers are written into result explicitly.
It is difficult to maintain if log event format changes.
There are a couple of cases explicit position number appears,
we handle them in different ways
A. 'CHANGE MASTER ...' with MASTER_LOG_POS or/and RELAY_LOG_POS options
Use --replace_result to mask them.
B. 'SHOW BINLOG EVENT ...'
Replaced by show_binlog_events.inc or wait_for_binlog_event.inc.
show_binlog_events.inc file's function is enhanced by given
$binlog_file and $binlog_limit.
C. 'SHOW SLAVE STATUS', 'show_slave_status.inc' and 'show_slave_status2.inc'
For the test cases just care a few items in the result of 'SHOW SLAVE STATUS',
only the items related to each test case are showed.
'show_slave_status.inc' is rebuild, only the given items in $status_items
will be showed.
'check_slave_is_running.inc' and 'check_slave_no_error.inc'
and 'check_slave_param.inc' are auxiliary files helping
to show running status and error information easily.
------------------------------------------------------------
revno: 3094
revision-id: vasil.dimov@oracle.com-20100513074652-0cvlhgkesgbb2bfh
parent: vasil.dimov@oracle.com-20100512173700-byf8xntxjur1hqov
committer: Vasil Dimov <vasil.dimov@oracle.com>
branch nick: mysql-trunk-innodb
timestamp: Thu 2010-05-13 10:46:52 +0300
message:
Followup to Bug#51920, fix binlog.binlog_killed
This is a followup to the fix of
Bug#51920 Innodb connections in row lock wait ignore KILL until lock wait
timeout
in that fix (rb://279) the behavior was changed to honor when a trx is
interrupted during lock wait, but the returned error code was still
"lock wait timeout" when it should be "interrupted".
This change fixes the non-deterministically failing test binlog.binlog_killed,
that failed like this:
binlog.binlog_killed 'stmt' [ fail ]
Test ended at 2010-05-12 11:39:08
CURRENT_TEST: binlog.binlog_killed
mysqltest: At line 208: query 'reap' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 0...
Approved by: Sunny Bains (rb://344)
------------------------------------------------------------
This merge is non-trivial since it has to introduce the DB_INTERRUPTED
error code.
Also revert vasil.dimov@oracle.com-20100408165555-9rpjh24o0sa9ad5y
which adjusted the binlog.binlog_killed test to the new (wrong) behavior
This patch fixes two problems described as follows:
1 - If there is an on-going transaction and a temporary table is created or
dropped, any failed statement that follows the "create" or "drop commands"
triggers a rollback and by consequence the slave will go out sync because
the binary log will have a wrong sequence of events.
To fix the problem, we changed the expression that evaluates when the
cache should be flushed after either the rollback of a statment or
transaction.
2 - When a "CREATE TEMPORARY TABLE SELECT * FROM" was executed the
OPTION_KEEP_LOG was not set into the thd->options. For that reason, if
the transaction had updated only transactional engines and was rolled
back at the end (.e.g due to a deadlock) the changes were not written
to the binary log, including the creation of the temporary table.
To fix the problem, we have set the OPTION_KEEP_LOG into the thd->options
when a "CREATE TEMPORARY TABLE SELECT * FROM" is executed.
The test was used to fail because of
UPDATE t3,t4 SET t3.a=t4.a + bug27417(1);
did not prescribe the order of two row operations implied by the update.
Fixed with forcing the order with adding a where condition w/o
affecting the former bug fixes logics.
When mysqlbinlog was given the --database=X flag, it always printed
'ROLLBACK TO', but the corresponding 'SAVEPOINT' statement was not
printed. The replicated filter(replicated-do/ignore-db) and binlog
filter (binlog-do/ignore-db) has the same problem. They are solved
in this patch together.
After this patch, We always check whether the query is 'SAVEPOINT'
statement or not. Because this is a literal check, 'SAVEPOINT' and
'ROLLBACK TO' statements are also binlogged in uppercase with no
any comments.
The binlog before this patch can be handled correctly except one case
that any comments are in front of the keywords. for example:
/* bla bla */ SAVEPOINT a;
/* bla bla */ ROLLBACK TO a;
The test case added in previous patch missed a RESET MASTER on
test start up. Without it, showing binary log contents can
sometimes show spurious entries from previously executed tests,
ultimately causing test failure - result mismatch.
The test file was added in:
revid:luis.soares@sun.com-20100224190153-k0bpdx9abe88uoo2
This patch also moves the test case into binlog_innodb_row.test
file. This way we avoid having yet another test file,
binlog_row_innodb_truncate.test, whose only purpose is to host
one test case. This had been actually suggested during original
patch review, but somehow the binlog_innodb_row was missed when
searching for a file to host the test case.
+ failing statements
Implicit DROP event for temporary table is not getting
LOG_EVENT_THREAD_SPECIFIC_F flag, because, in the previous
executed statement in the same thread, which might even be a
failed statement, the thread_specific_used flag is set to
FALSE (in mysql_reset_thd_for_next_command) and not set to TRUE
before connection is shutdown. This means that implicit DROP
event will take the FALSE value from thread_specific_used and
will not set LOG_EVENT_THREAD_SPECIFIC_F in the event header. As
a consequence, mysqlbinlog will not print the pseudo_thread_id
from the DROP event, because one of the requirements for the
printout is that this flag is set to TRUE.
We fix this by setting thread_specific_used whenever we are
binlogging a DROP in close_temporary_tables, and resetting it to
its previous value afterward.
For temporary tables that are created with an engine that does
not provide the HTON_CAN_RECREATE, the truncate operation is
performed resorting to the optimized handler::ha_delete_all_rows
method. However, this means that the truncate will share
execution path, from mysql_delete, with truncate on regular
tables and other delete operations. As a consequence the truncate
operation, for the temporary table is logged, even if in row mode
because there is no distinction between this and the other delete
operations at binlogging time.
We fix this by checking if: (i) the binlog format, when the
truncate operation was issued, is ROW; (ii) if the operation is a
truncate; and (iii) if the table is a temporary table; before
writing to the binary log. If all three conditions are met, we
skip writing to the binlog. A side effect of this fix is that we
limit the scope of setting and resetting the
current_stmt_binlog_row_based. Now we just set and reset it
inside mysql_delete in the boundaries of the
handler::ha_write_row loop. This way we have access to
thd->current_stmt_binlog_row_based real value inside
mysql_delete.
In RBR, DDL statement will change binlog format to non row-based
format before it is binlogged, but the binlog format was not be
restored, and then manipulating a temporary table can not reset binlog
format to row-based format rightly. So that the manipulated statement
is binlogged with statement-based format.
To fix the problem, restore the state of binlog format after the DDL
statement is binlogged.
Problem: When RAND() is binlogged in statement mode, the seed is
binlogged too, so the replication slave generates the same
sequence of random numbers. This makes replication work in many
cases, but not in all cases: the order of rows is not guaranteed
for, e.g., UPDATE or INSERT...SELECT statements, so the row data
will be different if master and slave retrieve the rows in
different orders.
Fix: Mark RAND() as unsafe. It will generate a warning if
binlog_format=STATEMENT and switch to row-logging if
binlog_format=ROW.
For tables with metadata sizes ranging from 251 to 255 the size
of the event data (m_data_size) was being improperly calculated
in the Table_map_log_event constructor. This was due to the fact
that when writing the Table_map_log_event body (in
Table_map_log_event::write_data_body) a call to net_store_length
is made for packing the m_field_metadata_size. It happens that
net_store_length uses *one* byte for storing
m_field_metadata_size when it is smaller than 251 but *three*
bytes when it exceeds that value. BUG 42749 had already
pinpointed and fix this fact, but the fix was incomplete, as the
calculation in the Table_map_log_event constructor considers 255
instead of 251 as the threshold to increment m_data_size by
three. Thence, the window for having a mismatch between the
number of bytes written and the number of bytes accounted in the
event length (m_data_size) was left open for
m_field_metadata_size values between 251 and 255.
We fix this by changing the condition in the Table_map_log_event
constructor to match the one in the net_store_length, ie,
increment one byte if m_field_metadata_size < 251 and three if it
exceeds this value.
escaped field names
When in mixed or statement mode, the master logs LOAD DATA
queries by resorting to an Execute_load_query_log_event. This
event does not contain the original query, but a rewritten
version of it, which includes the table field names. However, the
rewrite does not escape the field names. If these names match a
reserved keyword, then the slave will stop with a syntax error
when executing the event.
We fix this by escaping the fields names as it happens already
for the table name.
This patch fixes three bugs as follows. First, aborting the server while purging
binary logs might generate orphan files due to how the purge operation was
implemented:
(purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs)
1 - register the files to be removed in a temporary buffer.
2 - update the log-bin.index.
3 - flush the log-bin.index.
4 - erase the files whose names where register in the temporary buffer
in step 1.
Thus a failure while executing step 4 would generate an orphan file. Second,
a similar issue might happen while creating a new binary as follows:
(create routine - sql/log.cc - MYSQL_BIN_LOG::open)
1 - open the new log-bin.
2 - update the log-bin.index.
Thus a failure while executing step 1 would generate an orphan file.
To fix these issues, we record the files to be purged or created before really
removing or adding them. So if a failure happens such records can be used to
automatically remove dangling files. The new steps might be outlined as follows:
(purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs)
1 - register the files to be removed in the log-bin.~rec~ placed in
the data directory.
2 - update the log-bin.index.
3 - flush the log-bin.index.
4 - delete the log-bin.~rec~.
(create routine - sql/log.cc - MYSQL_BIN_LOG::open)
1 - register the file to be created in the log-bin.~rec~
placed in the data directory.
2 - open the new log-bin.
3 - update the log-bin.index.
4 - delete the log-bin.~rec~.
(recovery routine - sql/log.cc - MYSQL_BIN_LOG::open_index_file)
1 - open the log-bin.index.
2 - open the log-bin.~rec~.
3 - for each file in log-bin.~rec~.
3.1 Check if the file is in the log-bin.index and if so ignore it.
3.2 Otherwise, delete it.
The third issue can be described as follows. The purge operation was allowing
to remove a file in use thus leading to the loss of data and possible
inconsistencies between the master and slave. Roughly, the routine was only
taking into account the dump threads and so if a slave was not connect the
file might be delete even though it was in use.
Problem: Some system functions that could return different values on
master and slave were not marked unsafe. In particular:
GET_LOCK
IS_FREE_LOCK
IS_USED_LOCK
MASTER_POS_WAIT
RELEASE_LOCK
SLEEP
SYSDATE
VERSION
Fix: Mark these functions unsafe.
file
Issuing 'FLUSH LOGS' does not close and reopen indexfile.
Instead a SEEK_SET is performed.
This patch makes index file to be closed and reopened whenever a
rotation happens (FLUSH LOGS is issued or binary log exceeds
maximum configured size).
The BINLOG statement was sharing too much code with the slave SQL thread, introduced with
the patch for Bug#32407. This caused statements to be logged with the wrong server_id, the
id stored inside the events of the BINLOG statement rather than the id of the running
server.
Fix by rearranging code a bit so that only relevant parts of the code are executed by
the BINLOG statement, and the server_id of the server executing the statements will
not be overrided by the server_id stored in the 'format description BINLOG statement'.
Let
- T be a transactional table and N non-transactional table.
- B be begin, C commit and R rollback.
- N be a statement that accesses and changes only N-tables.
- T be a statement that accesses and changes only T-tables.
In RBR, changes to N-tables that happen early in a transaction are not immediately flushed
upon committing a statement. This behavior may, however, break consistency in the presence
of concurrency since changes done to N-tables become immediately visible to other
connections. To fix this problem, we do the following:
. B N N T C would log - B N C B N C B T C.
. B N N T R would log - B N C B N C B T R.
Note that we are not preserving history from the master as we are introducing a commit that
never happened. However, this seems to be more acceptable than the possibility of breaking
consistency in the presence of concurrency.
Let
- T be a transactional table and N non-transactional table.
- B be begin, C commit and R rollback.
- M be a mixed statement, i.e. a statement that updates both T and N.
- M* be a mixed statement that fails while updating either T or N.
This patch restore the behavior presented in 5.1.37 for rows either produced in
the RBR or MIXED modes, when a M* statement that happened early in a transaction
had their changes written to the binary log outside the boundaries of the
transaction and wrapped in a BEGIN/ROLLBACK. This was done to keep the slave
consistent with with the master as the rollback would keep the changes on N and
undo them on T. In particular, we do what follows:
. B M* T C would log - B M* R B T C.
Note that, we are not preserving history from the master as we are introducing a
rollback that never happened. However, this seems to be more acceptable than
making the slave diverge. We do not fix the following case:
. B T M* C would log B T M* C.
The slave will diverge as the changes on T tables that originated from the M
statement are rolled back on the master but not on the slave. Unfortunately, we
cannot simply rollback the transaction as this would undo any uncommitted
changes on T tables.
SBR is not considered in this patch because a failing statement is written to
the binary along with the error code and a slave executes and then rolls back
the statement when it has an associated error code, thus undoing the effects
on T. In RBR and MBR, a full-fledged fix will be pushed after the WL 2687.
The 'BEGIN/COMMIT/ROLLBACK' log event could be filtered out if the
database is not selected by --database option of mysqlbinlog command.
This can result in problem if there are some statements in the
transaction are not filtered out.
To fix the problem, mysqlbinlog will output 'BEGIN/ROLLBACK/COMMIT'
in regardless of the database filtering rules.
"load data" statements were written to the binlog as a mix of the original statement
and bits recreated from parse-info. This relied on implementation details and broke
with IGNORE_SPACES and versioned comments.
We now completely resynthesize the query for LOAD DATA for binlog (which among other
things normalizes them somewhat with regard to case, spaces, etc.).
We have already parsed the query properly, so we make use of that rather
than mix-and-match string literals and parsed items.
This should make us safe with regard to versioned comments, even those
spanning multiple tokens. Also no longer affected by IGNORE_SPACES.
In RBR, 'DROP TEMPORARY TABLE IF EXISTS...' statement is binlogged when the table
does not exist.
In fact, 'DROP TEMPORARY TABLE ...' statement should never be binlogged in RBR
no matter if the table exists or not.
This patch addresses this by checking whether we are dropping a
temporary table or not, when building the custom drop statement.
binlog-db-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
This test case uses mysqlbinlog to dump the content of master-bin.000001,
but the content of master-bin.000001 is not that this test needs.
MTR runs a lot of test cases on one server, so when this test starts, the current binlog file
might not be master-bin.000001, or there are other events are written by tests before.
'RESET MASTER' command must be called at the begin, it ensures that binlog of this test
is wrote to master-bin.000001 correctly.
Three other tests have the same problem, They were fixed together.
mysqlbinlog-cp932
binlog_incident
binlog_tmp_table
binlog
Mixing transactional (T) and non-transactional (N) tables on behalf of a
transaction may lead to inconsistencies among masters and slaves in STATEMENT
mode. The problem stems from the fact that although modifications done to
non-transactional tables on behalf of a transaction become immediately visible
to other connections they do not immediately get to the binary log and therefore
consistency is broken. Although there may be issues in mixing T and M tables in
STATEMENT mode, there are safe combinations that clients find useful.
In this bug, we fix the following issue. Mixing N and T tables in multi-level
(e.g. a statement that fires a trigger) or multi-table table statements (e.g.
update t1, t2...) were not handled correctly. In such cases, it was not possible
to distinguish when a T table was updated if the sequence of changes was N and T.
In a nutshell, just the flag "modified_non_trans_table" was not enough to reflect
that both a N and T tables were changed. To circumvent this issue, we check if an
engine is registered in the handler's list and changed something which means that
a T table was modified.
Check WL 2687 for a full-fledged patch that will make the use of either the MIXED or
ROW modes completely safe.