binlog
Mixing transactional (T) and non-transactional (N) tables on behalf of a
transaction may lead to inconsistencies among masters and slaves in STATEMENT
mode. The problem stems from the fact that although modifications done to
non-transactional tables on behalf of a transaction become immediately visible
to other connections they do not immediately get to the binary log and therefore
consistency is broken. Although there may be issues in mixing T and M tables in
STATEMENT mode, there are safe combinations that clients find useful.
In this bug, we fix the following issue. Mixing N and T tables in multi-level
(e.g. a statement that fires a trigger) or multi-table table statements (e.g.
update t1, t2...) were not handled correctly. In such cases, it was not possible
to distinguish when a T table was updated if the sequence of changes was N and T.
In a nutshell, just the flag "modified_non_trans_table" was not enough to reflect
that both a N and T tables were changed. To circumvent this issue, we check if an
engine is registered in the handler's list and changed something which means that
a T table was modified.
Check WL 2687 for a full-fledged patch that will make the use of either the MIXED or
ROW modes completely safe.
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result:
Truncate statement is wrapped in BEGIN/COMMIT.
mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result:
Truncate statement is wrapped in BEGIN/COMMIT.
Large transactions and statements may corrupt the binary log if the size of the
cache, which is set by the max_binlog_cache_size, is not enough to store the
the changes.
In a nutshell, to fix the bug, we save the position of the next character in the
cache before starting processing a statement. If there is a problem, we simply
restore the position thus removing any effect of the statement from the cache.
Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
on non-transactional tables if they do not fit in the cache. In such cases, we
store an Incident_log_event in order to stop the slave and alert users that some
changes were not logged.
Precisely, for every non-transactional changes that do not fit into the cache,
we do the following:
a) the statement is *not* logged
b) an incident event is logged after committing/rolling back the transaction,
if any. Note that if a failure happens before writing the incident event to
the binary log, the slave will not stop and the master will not have reported
any error.
c) its respective statement gives an error
For transactional changes that do not fit into the cache, we do the following:
a) the statement is *not* logged
b) its respective statement gives an error
To work properly, this patch requires two additional things. Firstly, callers to
MYSQL_BIN_LOG::write and THD::binlog_query must handle any error returned and
take the appropriate actions such as undoing the effects of a statement. We
already changed some calls in the sql_insert.cc, sql_update.cc and sql_insert.cc
modules but the remaining calls spread all over the code should be handled in
BUG#37148. Secondly, statements must be either classified as DDL or DML because
DDLs that do not get into the cache must generate an incident event since they
cannot be rolled back.
conflicts:
Text conflict in client/mysqltest.cc
Text conflict in mysql-test/include/wait_until_connected_again.inc
Text conflict in mysql-test/lib/mtr_report.pm
Text conflict in mysql-test/mysql-test-run.pl
Text conflict in mysql-test/r/events_bugs.result
Text conflict in mysql-test/r/log_state.result
Text conflict in mysql-test/r/myisam_data_pointer_size_func.result
Text conflict in mysql-test/r/mysqlcheck.result
Text conflict in mysql-test/r/query_cache.result
Text conflict in mysql-test/r/status.result
Text conflict in mysql-test/suite/binlog/r/binlog_index.result
Text conflict in mysql-test/suite/binlog/r/binlog_innodb.result
Text conflict in mysql-test/suite/rpl/r/rpl_packet.result
Text conflict in mysql-test/suite/rpl/t/rpl_packet.test
Text conflict in mysql-test/t/disabled.def
Text conflict in mysql-test/t/events_bugs.test
Text conflict in mysql-test/t/log_state.test
Text conflict in mysql-test/t/myisam_data_pointer_size_func.test
Text conflict in mysql-test/t/mysqlcheck.test
Text conflict in mysql-test/t/query_cache.test
Text conflict in mysql-test/t/rpl_init_slave_func.test
Text conflict in mysql-test/t/status.test
The special TRUNCATE TABLE (DDL) transaction wasn't being properly
rolled back if a error occurred during row by row deletion. The
error can be caused by a foreign key restriction imposed by InnoDB
SE and would cause the server to erroneously issue a implicit
commit.
The solution is to rollback the transaction if a truncation via row
by row deletion fails, otherwise commit. All effects of a TRUNCATE
ABLE operation are rolled back if a row by row deletion fails.
mysql-test/include/commit.inc:
Truncate always starts a transaction and commits at the end.
The commit at the end increases the count by two, one is the
storage engine commit and the other is the binary log.
mysql-test/r/commit_1innodb.result:
Update test case results.
mysql-test/r/innodb_mysql.result:
Update test case results.
mysql-test/t/innodb_mysql.test:
Add test case for Bug#37016
sql/sql_delete.cc:
Move truncation using row by row deletion to its own function.
If row by row deletion fails, rollback the transaction.
Remove the meddling with disabling and enabling of autocommit
as TRUNCATE transaction is now explicitly ended (committed
or rolled back).
The test explicitly warned on existence of a bug in its 27th part.
The expected values of prepare and commit counters changed, corrected, by
fixes to bug#40221.
Notice, that binlog does not have to register for a statement with
the statement binlog-format because the statement rollback does not need
to do anything in that mode. It's not so with the ROW format which was
bug#40221 concern.
Fixed with correcting the expected values of the mentioned counters and
explained that with comments in the test.
mysql-test/include/commit.inc:
Removing `Sic' that warned on a bug (The one is bug#40221).
Correcting the expected values of prepare and commit counters due to fixes to bug#40221.
mysql-test/r/commit_1innodb.result:
results changed.
into pilot.mysql.com:/data/msvensson/mysql/mysql-5.1-mtr
BitKeeper/etc/ignore:
auto-union
BitKeeper/deleted/.del-rpl_row_charset.test:
Auto merged
CMakeLists.txt:
Auto merged
configure.in:
Auto merged
client/mysqltest.c:
Auto merged
mysql-test/extra/binlog_tests/blackhole.test:
Auto merged
mysql-test/include/commit.inc:
Auto merged
mysql-test/include/mix1.inc:
Auto merged
mysql-test/lib/mtr_report.pm:
Auto merged
mysql-test/r/commit_1innodb.result:
Auto merged
mysql-test/r/create.result:
Auto merged
mysql-test/r/ctype_big5.result:
Auto merged
mysql-test/r/drop.result:
Auto merged
mysql-test/r/group_by.result:
Auto merged
mysql-test/r/information_schema.result:
Auto merged
mysql-test/r/loaddata.result:
Auto merged
mysql-test/r/mysqlbinlog.result:
Auto merged
mysql-test/r/partition_error.result:
Auto merged
mysql-test/r/query_cache.result:
Auto merged
mysql-test/r/sp.result:
Auto merged
mysql-test/r/view.result:
Auto merged
mysql-test/r/warnings.result:
Auto merged
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result:
Auto merged
mysql-test/suite/binlog/r/binlog_stm_blackhole.result:
Auto merged
mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result:
Auto merged
mysql-test/suite/binlog/r/binlog_unsafe.result:
Auto merged
mysql-test/suite/binlog/t/binlog_unsafe.test:
Auto merged
mysql-test/suite/federated/federated.result:
Auto merged
mysql-test/suite/federated/federated.test:
Auto merged
mysql-test/suite/parts/r/partition_alter1_myisam.result:
Auto merged
mysql-test/suite/parts/r/partition_alter2_myisam.result:
Auto merged
mysql-test/suite/rpl/r/rpl_row_basic_11bugs.result:
Auto merged
mysql-test/suite/rpl/r/rpl_row_log.result:
Auto merged
mysql-test/suite/rpl/r/rpl_row_log_innodb.result:
Auto merged
mysql-test/suite/rpl/t/disabled.def:
Auto merged
mysql-test/suite/rpl/t/rpl_flushlog_loop.test:
Auto merged
mysql-test/suite/rpl_ndb/r/rpl_ndb_log.result:
Auto merged
mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction.test:
Auto merged
mysql-test/t/create.test:
Auto merged
mysql-test/t/csv.test:
Auto merged
mysql-test/t/disabled.def:
Auto merged
mysql-test/t/distinct.test:
Auto merged
mysql-test/t/drop.test:
Auto merged
mysql-test/t/group_by.test:
Auto merged
mysql-test/t/innodb.test:
Auto merged
mysql-test/t/loaddata.test:
Auto merged
mysql-test/t/partition_error.test:
Auto merged
mysql-test/t/query_cache.test:
Auto merged
mysql-test/t/sp.test:
Auto merged
mysql-test/t/view.test:
Auto merged
mysql-test/t/warnings.test:
Auto merged
sql/ha_ndbcluster.cc:
Auto merged
BitKeeper/deleted/.del-combinations:
Delete: mysql-test/suite/binlog/combinations
mysql-test/r/partition_not_windows.result:
Use remote
mysql-test/r/partition_symlink.result:
Use remote
mysql-test/r/symlink.result:
SCCS merged
mysql-test/suite/parts/inc/partition_basic.inc:
SCCS merged
mysql-test/suite/parts/inc/partition_check_drop.inc:
Use remote
mysql-test/suite/parts/inc/partition_layout_check1.inc:
Use remote
mysql-test/suite/parts/inc/partition_layout_check2.inc:
Use remote
mysql-test/suite/parts/r/partition_basic_innodb.result:
Use remote
mysql-test/suite/parts/r/partition_basic_myisam.result:
Use remote
mysql-test/suite/parts/r/partition_engine_myisam.result:
Use remote
mysql-test/suite/parts/t/partition_sessions.test:
SCCS merged
mysql-test/t/partition.test:
SCCS merged
mysql-test/t/partition_not_windows.test:
Use remote
mysql-test/t/partition_symlink.test:
Use remote
mysql-test/t/symlink.test:
Use remote
mysql-test/suite/binlog/r/binlog_multi_engine.result:
Manual merge, name of binlog file changed
mysql-test/suite/rpl/t/rpl_row_mysqlbinlog.test:
Manual merge
mysys/my_init.c:
Manual merge
tables is not zero any more. For row-based logging, there is an extra commit
for sending rows changed by the statement to the binary log.
mysql-test/include/commit.inc:
For row-based logging, an extra commit is done for each statement to commit
non-transactional changes to the binary log.
mysql-test/r/commit_1innodb.result:
Result change.
mysql-test/include/commit.inc:
Adjust path
Add missing drop tables
mysql-test/mysql-test-run.pl:
Remove duplicate printout
mysql-test/lib/mtr_report.pm:
Only print each test that has failed once
mysql-test/r/commit_1innodb.result:
Adjust path
Add missing drop table
mysql-test/suite/binlog/r/binlog_multi_engine.result:
Remove merge error - extra s
mysql-test/suite/binlog/r/binlog_unsafe.result:
Remove drop of non existing view
mysql-test/suite/binlog/t/binlog_unsafe.test:
Remove drop of non existing view
into dl145h.mysql.com:/data0/mkindahl/mysql-5.1-rpl
mysql-test/include/commit.inc:
Auto merged
mysql-test/lib/mtr_report.pl:
Auto merged
mysql-test/r/commit_1innodb.result:
Auto merged
mysql-test/r/variables.result:
Auto merged
mysql-test/suite/binlog/r/binlog_stm_ctype_ucs.result:
Auto merged
mysql-test/t/variables.test:
Auto merged
sql/handler.cc:
Auto merged
sql/item_func.cc:
Auto merged
sql/item_func.h:
Auto merged
sql/log.cc:
Auto merged
sql/rpl_rli.cc:
Auto merged
sql/set_var.cc:
Auto merged
sql/sql_yacc.yy:
Auto merged
Problem: in mixed and statement mode, a query that refers to a
system variable will use the slave's value when replayed on
slave. So if the value of a system variable is inserted into a
table, the slave will differ from the master.
Fix: mark statements that refer to a system variable as "unsafe",
meaning they will be replicated by row in mixed mode and produce a warning
in statement mode. There are some exceptions: some variables are actually
replicated. Those should *not* be marked as unsafe.
BUG#34732: mysqlbinlog does not print default values for auto_increment variables
Problem: mysqlbinlog does not print default values for some variables,
including auto_increment_increment and others. So if a client executing
the output of mysqlbinlog has different default values, replication will
be wrong.
Fix: Always print default values for all variables that are replicated.
I need to fix the two bugs at the same time, because the test cases would
fail if I only fixed one of them.
include/m_ctype.h:
Added definition of ILLEGAL_CHARSET_INFO_NUMBER. We just need a symbol
for a number that will never be used by any charset. ~0U should be safe
since charset numbers are sequential, starting from 0.
mysql-test/include/commit.inc:
Upated test to avoid making statements unsafe.
mysql-test/r/commit_1innodb.result:
Updated test needs updated result file.
mysql-test/r/mysqlbinlog.result:
Updated result file.
mysql-test/r/mysqlbinlog2.result:
Updated result file.
mysql-test/r/user_var-binlog.result:
Updated result file.
mysql-test/suite/binlog/r/binlog_base64_flag.result:
Updated result file.
mysql-test/suite/binlog/r/binlog_stm_ctype_ucs.result:
Updated result file.
mysql-test/suite/binlog/r/binlog_unsafe.result:
Modified test file needs modified result file.
mysql-test/suite/binlog/t/binlog_base64_flag.test:
Need to filter out pseudo_thread_id from result since it is
nondeterministic.
mysql-test/suite/binlog/t/binlog_unsafe.test:
Add tests that using variables is unsafe. The 'CREATE VIEW' tests didn't
make sense, so I removed them. SHOW WARNINGS is not necessary either,
because we get warnings for each statement in the result file.
mysql-test/suite/rpl/r/rpl_row_mysqlbinlog.result:
Updated result file.
mysql-test/suite/rpl/r/rpl_skip_error.result:
Updated result file.
mysql-test/suite/rpl/t/rpl_skip_error.test:
The test used @@server_id, which is not safe to replicate, so it would
have given a warning. The way it used @@server_id was hackish (issue a
query on master that removes rows only on master), so I replaced it by a
more robust way to do the same thing (connect to slave and insert the
rows only there).
Also clarified what the test case does.
mysql-test/t/mysqlbinlog2.test:
Use --short-form instead of manually filtering out nondeterministic stuff
from mysqlbinlog (because we added the nondeterministic @@pseudo_thread_id
to the output).
sql/item_func.cc:
Added method of Item_func_get_system_var that indicates whether the given
system variable will be written to the binlog or not.
sql/item_func.h:
Added method of Item_func_get_system_var that indicates whether the given
system variable will be written to the binlog or not.
sql/log_event.cc:
- auto_increment_offset was not written to the binlog if
auto_increment_increment=1
- mysqlbinlog did not output default values for some variables
(BUG#34732). In st_print_event_info, we remember for each variable whether
it has been printed or not. This is achieved in different ways for
different variables:
- For auto_increment_*, lc_time_names, charset_database_number,
we set the default values in st_print_event_info to something
illegal, so that it will look like they have changed the first time
they are seen.
- For charset, sql_mode, pseudo_thread_id, we add a flag to
st_print_event_info which indicates whether the variable has been
printed.
- Since pseudo_thread_id is now printed more often, and its value is
not guaranteed to be constant across different runs of the same
test script, I replaced it by a constant if --short-form is used.
- Moved st_print_event_info constructor from log_event.h to log_event.cc,
since it now depends on ILLEGAL_CHARSET_NUMBER, which is defined in
m_ctype.h, which is better to include from a .cc file than from a header
file.
sql/log_event.h:
Added fields to st_print_event_info that indicate whether some of the
variables have been written or not. Since the initialization of
charset_database_number now depends on ILLEGAL_CHARSET_INFO_NUMBER, which
is defined in a header file, which we'd better not include from this
header file -- I moved the constructor from here to log_event.cc.
sql/set_var.cc:
System variables now have a flag binlog_status, which indicates if they
are written to the binlog. If nothing is specified, all variables are
marked as not written to the binlog (NOT_IN_BINLOG) when created. In this
file, the variables that are written to the binlog are marked with
SESSION_VARIABLE_IN_BINLOG.
sql/set_var.h:
Added flag binlog_status to class sys_var. Added a getter and a
constructor parameter that sets it.
Since I had to change sys_var_thd_enum constructor anyways, I simplified
it to use default values of arguments instead of three copies of the
constructor.
sql/sql_yacc.yy:
Mark statements that refer to a system variable as "unsafe",
meaning they will be replicated by row in mixed mode. Added comment to
explain strange piece of code just above.
mysql-test/include/diff_tables.inc:
New auxiliary test file that tests whether two tables (possibly one on
master and one on slave) differ.
mysql-test/suite/rpl/r/rpl_variables.result:
New test case needs new result file.
mysql-test/suite/rpl/r/rpl_variables_stm.result:
New test file needs new result file.
mysql-test/suite/rpl/t/rpl_variables.test:
Test that INSERT of @@variables is replicated correctly (by switching to
row-based mode).
mysql-test/suite/rpl/t/rpl_variables_stm.test:
Test that replication of @@variables which are replicated explicitly works
as expected in statement mode (without giving warnings).
a SELECT doesn't cause ROLLBACK of statem".
The idea of the fix is to ensure that we always commit the current
statement at the end of dispatch_command(). In order to not issue
redundant disc syncs, an optimization of the two-phase commit
protocol is implemented to bypass the two phase commit if
the transaction is read-only.
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result:
Update test results.
mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result:
Update test results.
mysql-test/suite/rpl_ndb/t/disabled.def:
Disable the tests, for which this changeset reveals a bug:
the injector thread does not always add 'statement commit' to the
rows injected in circular replication set up.
To be investigated separately.
sql/ha_ndbcluster_binlog.cc:
Add close_thread_tables() to run_query: this ensures
that all tables are closed and there is no pending statement transaction.
sql/handler.cc:
Implement optimisation of read-only transactions.
If a transaction consists only of DML statements that do not change
data, we do not perform a two-phase commit for it
(run one phase commit only).
sql/handler.h:
Implement optimisation of read-only transactions.
If a transaction consists only of DML statements that do not change
data, we do not perform a two-phase commit for it
(run one phase commit only).
sql/log.cc:
Mark the binlog transaction read-write whenever it's started.
We never read from binlog, so it's safe and least intrusive to add
this mark up here.
sql/log_event.cc:
Update to the new layout of thd->transaction.
sql/rpl_injector.cc:
Always commit statement transaction before committing the global one.
sql/sp.cc:
Ad comments.
sql/sp_head.cc:
Add comments.
sql/sql_base.cc:
Commit transaction at the end of the statement. Always.
sql/sql_class.cc:
Update thd_ha_data to return the right pointer in the new layout.
Fix select_dumpvar::send_data to properly return operation status.
A test case from commit.inc would lead to an assertion failure in the
diagnostics area (double assignment). Not test otherwise by the test suite.
sql/sql_class.h:
Implement a new layout of storage engine transaction info in which
it is easy to access all members related to the handlerton only
based on ht->slot.
sql/sql_cursor.cc:
Update to the new layout of thd->transaction.
sql/sql_delete.cc:
Remove wrong and now redundant calls to ha_autocommit_or_rollback.
The transaction is committed in one place, at the end of the statement.
Remove calls to mysql_unlock_tables, since some engines count locks
and commit statement transaction in unlock_tables(), which essentially
equates mysql_unlock_tables to ha_autocommit_or_rollback.
Previously it was necessary to unlock tables soon because we wanted
to avoid sending of 'ok' packet to the client under locked tables.
This is no longer necessary, since OK packet is also sent from one place
at the end of transaction.
sql/sql_do.cc:
Add DO always clears the error, we must rollback the current
statement before this happens. Otherwise the statement will be committed,
and not rolled back in the end.
sql/sql_insert.cc:
Remove wrong and now redundant calls to ha_autocommit_or_rollback.
The transaction is committed in one place, at the end of the statement.
Remove calls to mysql_unlock_tables, since some engines count locks
and commit statement transaction in unlock_tables(), which essentially
equates mysql_unlock_tables to ha_autocommit_or_rollback.
Previously it was necessary to unlock tables soon because we wanted
to avoid sending of 'ok' packet to the client under locked tables.
This is no longer necessary, since OK packet is also sent from one place
at the end of transaction.
sql/sql_load.cc:
Remove wrong and now redundant calls to ha_autocommit_or_rollback.
The transaction is committed in one place, at the end of the statement.
Remove calls to mysql_unlock_tables, since some engines count locks
and commit statement transaction in unlock_tables(), which essentially
equates mysql_unlock_tables to ha_autocommit_or_rollback.
Previously it was necessary to unlock tables soon because we wanted
to avoid sending of 'ok' packet to the client under locked tables.
This is no longer necessary, since OK packet is also sent from one place
at the end of transaction.
sql/sql_parse.cc:
Implement optimisation of read-only transactions: bypass 2-phase
commit for them.
Always commit statement transaction before commiting the global one.
Fix an unrelated crash in check_table_access, when called from
information_schema.
sql/sql_partition.cc:
Partitions commit at the end of a DDL operation.
Make sure that send_ok() is done only if the commit has succeeded.
sql/sql_table.cc:
Use ha_autocommit_or_rollback and end_active_trans everywhere.
Add end_trans to mysql_admin_table, so that it leaves no pending
transaction.
sql/sql_udf.cc:
Remvove a redundant call to close_thread_tables()
sql/sql_update.cc:
Remove wrong and now redundant calls to ha_autocommit_or_rollback.
The transaction is committed in one place, at the end of the statement.
Remove calls to mysql_unlock_tables, since some engines count locks
and commit statement transaction in unlock_tables(), which essentially
equates mysql_unlock_tables to ha_autocommit_or_rollback.
Previously it was necessary to unlock tables soon because we wanted
to avoid sending of 'ok' packet to the client under locked tables.
This is no longer necessary, since OK packet is also sent from one place
at the end of transaction.
mysql-test/include/commit.inc:
New BitKeeper file ``mysql-test/include/commit.inc''
mysql-test/r/commit_1innodb.result:
New BitKeeper file ``mysql-test/r/commit_1innodb.result''
mysql-test/t/commit_1innodb.test:
New BitKeeper file ``mysql-test/t/commit_1innodb.test''