Implement binlog_optimize_thread_scheduling option to allow benchmarking the
effect of running commit_ordered() for multiple transactions all in one
thread.
Before this fix, all the performance schema instrumentation for both the binary log
and the relay log would use the following instruments:
- wait/io/file/sql/binlog
- wait/io/file/sql/binlog_index
- wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
This instrumentation is too general and can be more specific.
With this fix, the binlog instrumentation is identical,
and the relay log instrumentation is changed to:
- wait/io/file/sql/relaylog
- wait/io/file/sql/relaylog_index
- wait/synch/mutex/sql/MYSQL_RELAY_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
With this change, the performance instrumentation for the binary log and the relay log,
which share the same structure but have different uses, is more detailed.
This is especially important for hosts in the middle of a replication chain,
that are both masters (binlog) and slaves (relaylog).
Before this fix, all the performance schema instrumentation for both the binary log
and the relay log would use the following instruments:
- wait/io/file/sql/binlog
- wait/io/file/sql/binlog_index
- wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
This instrumentation is too general and can be more specific.
With this fix, the binlog instrumentation is identical,
and the relay log instrumentation is changed to:
- wait/io/file/sql/relaylog
- wait/io/file/sql/relaylog_index
- wait/synch/mutex/sql/MYSQL_RELAY_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
With this change, the performance instrumentation for the binary log and the relay log,
which share the same structure but have different uses, is more detailed.
This is especially important for hosts in the middle of a replication chain,
that are both masters (binlog) and slaves (relaylog).
- MWL#47, allowing to annotate row-based binlog events with the SQL test of
the originating query (eg. in mysqlbinlog output).
- row_based_replication_without_primary_key.patch, providing more intelligent
selection of index to use on slave when applying row-based binlog events
for tables with no primary key.
- Make mysqlbinlog omit redundant `use` around BEGIN/SAVEPOINT/COMMIT/
ROLLBACK in 5.0 binlogs.
This patch adds options to annotate the binlog (and the mysqlbinlog
output) with the original SQL query for queries that are logged
using row-based replication.
Manual merge from mysql-5.1-bugteam into mysql-5.5-bugteam.
Conflicts
=========
Text conflict in sql/log.cc
Text conflict in sql/log.h
Text conflict in sql/slave.cc
Text conflict in sql/sql_parse.cc
Text conflict in sql/sql_priv.h
Manual merge from mysql-5.1-bugteam into mysql-5.5-bugteam.
Conflicts
=========
Text conflict in sql/log.cc
Text conflict in sql/log.h
Text conflict in sql/slave.cc
Text conflict in sql/sql_parse.cc
Text conflict in sql/sql_priv.h
when generating new name.
If find_uniq_filename returns an error, then this error is not
being propagated upwards, and execution does not report error to
the user (although a entry in the error log is generated).
Additionally, some more errors were ignored in new_file_impl:
- when writing the rotate event
- when reopening the index and binary log file
This patch addresses this by propagating the error up in the
execution stack. Furthermore, when rotation of the binary log
fails, an incident event is written, because there may be a
chance that some changes for a given statement, were not properly
logged. For example, in SBR, LOAD DATA INFILE statement requires
more than one event to be logged, should rotation fail while
logging part of the LOAD DATA events, then the logged data would
become inconsistent with the data in the storage engine.
mysql-test/include/restart_mysqld.inc:
Refactored restart_mysqld so that it is not hardcoded for
mysqld.1, but rather for the current server.
mysql-test/suite/binlog/t/binlog_index.test:
The error on open of index and binary log on new_file_impl
is now caught. Thence the user will get an error message.
We need to accomodate this change in the test case for the
failing FLUSH LOGS.
mysql-test/suite/rpl/t/rpl_binlog_errors-master.opt:
Sets max_binlog_size to 4096.
mysql-test/suite/rpl/t/rpl_binlog_errors.test:
Added some test cases for asserting that the error is found
and reported.
sql/handler.cc:
Catching error now returned by unlog (in ha_commit_trans) and
returning it.
sql/log.cc:
Propagating errors from new_file_impl upwards. The errors that
new_file_impl catches now are:
- error on generate_new_name
- error on writing the rotate event
- error when opening the index or the binary log file.
sql/log.h:
Changing declaration of:
- rotate_and_purge
- new_file
- new_file_without_locking
- new_file_impl
- unlog
They now return int instead of void.
sql/mysql_priv.h:
Change signature of reload_acl_and_cache so that write_to_binlog
is an int instead of bool.
sql/mysqld.cc:
Redeclaring not_used var as int instead of bool.
sql/rpl_injector.cc:
Changes to catch the return from rotate_and_purge.
sql/slave.cc:
Changes to catch the return values for new_file and rotate_relay_log.
sql/slave.h:
Changes to rotate_relay_log declaration (now returns int
instead of void).
sql/sql_load.cc:
In SBR, some logging of LOAD DATA events goes through
IO_CACHE_CALLBACK invocation at mf_iocache.c:_my_b_get. The
IO_CACHE implementation is ignoring the return value for from
these callbacks (pre_read and post_read), so we need to find out
at the end of the execution if the error is set or not in THD.
sql/sql_parse.cc:
Catching the rotate_relay_log and rotate_and_purge return values.
Semantic change in reload_acl_and_cache so that we report errors
in binlog interactions through the write_to_binlog output parameter.
If there was any failure while rotating the binary log, we should
then report the error to the client when handling SQLCOMM_FLUSH.
when generating new name.
If find_uniq_filename returns an error, then this error is not
being propagated upwards, and execution does not report error to
the user (although a entry in the error log is generated).
Additionally, some more errors were ignored in new_file_impl:
- when writing the rotate event
- when reopening the index and binary log file
This patch addresses this by propagating the error up in the
execution stack. Furthermore, when rotation of the binary log
fails, an incident event is written, because there may be a
chance that some changes for a given statement, were not properly
logged. For example, in SBR, LOAD DATA INFILE statement requires
more than one event to be logged, should rotation fail while
logging part of the LOAD DATA events, then the logged data would
become inconsistent with the data in the storage engine.
Before this fix, file io for the binary log file was not accounted properly,
and showed no io at all.
This bug was due to the following issues:
1) file io for the binlog was instrumented:
- sometime as "wait/io/file/sql/binlog"
- sometime as "wait/io/file/sql/MYSQL_LOG"
leading to inconsistent event_names.
2) the binlog file itself was using an IO_CACHE,
but the IO_CACHE implementation in mysys/mf_iocache.c was
not instrumented to make performance schema calls to record file io.
3) The "wait/io/file/sql/MYSQL_LOG" instrumentation was used
for several log files, such as:
- the binary log
- the slow log
- the query log
which caused file io in these different log files to be accounted
against the same instrument.
The instrumentation needs to have a finer grain and report io
in different event_names, because each file really serves a different purpose.
With this fix:
- the IO_CACHE implementation is now instrumented
- the "wait/io/file/sql/MYSQL_LOG" instrument has been removed
- binlog io is now always instrumented with "wait/io/file/sql/binlog"
- the slow log is instrumented with a new name, "wait/io/file/sql/slow_log"
- the query log is instrumented with a new name, "wait/io/file/sql/query_log"
Before this fix, file io for the binary log file was not accounted properly,
and showed no io at all.
This bug was due to the following issues:
1) file io for the binlog was instrumented:
- sometime as "wait/io/file/sql/binlog"
- sometime as "wait/io/file/sql/MYSQL_LOG"
leading to inconsistent event_names.
2) the binlog file itself was using an IO_CACHE,
but the IO_CACHE implementation in mysys/mf_iocache.c was
not instrumented to make performance schema calls to record file io.
3) The "wait/io/file/sql/MYSQL_LOG" instrumentation was used
for several log files, such as:
- the binary log
- the slow log
- the query log
which caused file io in these different log files to be accounted
against the same instrument.
The instrumentation needs to have a finer grain and report io
in different event_names, because each file really serves a different purpose.
With this fix:
- the IO_CACHE implementation is now instrumented
- the "wait/io/file/sql/MYSQL_LOG" instrument has been removed
- binlog io is now always instrumented with "wait/io/file/sql/binlog"
- the slow log is instrumented with a new name, "wait/io/file/sql/slow_log"
- the query log is instrumented with a new name, "wait/io/file/sql/query_log"
Make the binlog handlerton participate in START TRANSACTION WITH CONSISTENT
SNAPSHOT, recording the binlog position corresponding to the snapshot taken
in other MVCC storage engines.
Expose this consistent binlog position as the new status variables
binlog_trx_file and binlog_trx_position. This enables to get a fully
non-locking snapshot of the database (including binlog position for
slave provisioning), avoiding the need for FLUSH TABLES WITH READ LOCK.
Modify mysqldump to detect if the server supports this new feature, and
if so, avoid FLUSH TABLES WITH READ LOCK for --single-transaction
--master-data snapshot backups.
After the WL#2687, the binlog_cache_size and max_binlog_cache_size affect both the
stmt-cache and the trx-cache. This means that the resource used is twice the amount
expected/defined by the user.
The binlog_cache_use is incremented when the stmt-cache or the trx-cache is used
and binlog_cache_disk_use is incremented when the disk space from the stmt-cache or the
trx-cache is used. This behavior does not allow to distinguish which cache may be harming
performance due to the extra disk accesses and needs to have its in-memory cache
increased.
To fix the problem, we introduced two new options and status variables related to the
stmt-cache:
Options:
. binlog_stmt_cache_size
. max_binlog_stmt_cache_size
Status Variables:
. binlog_stmt_cache_use
. binlog_stmt_cache_disk_use
So there are
. binlog_cache_size that defines the size of the transactional cache for
updates to transactional engines for the binary log.
. binlog_stmt_cache_size that defines the size of the statement cache for
updates to non-transactional engines for the binary log.
. max_binlog_cache_size that sets the total size of the transactional
cache.
. max_binlog_stmt_cache_size that sets the total size of the statement
cache.
. binlog_cache_use that identifies the number of transactions that used the
temporary transactional binary log cache.
. binlog_cache_disk_use that identifies the number of transactions that used
the temporary transactional binary log cache but that exceeded the value of
binlog_cache_size.
. binlog_stmt_cache_use that identifies the number of statements that used the
temporary non-transactional binary log cache.
. binlog_stmt_cache_disk_use that identifies the number of statements that used
the temporary non-transactional binary log cache but that exceeded the value of
binlog_stmt_cache_size.
include/my_sys.h:
Updated message on disk_writes' usage.
mysql-test/extra/binlog_tests/binlog_cache_stat.test:
Updated the test case and added code to check the new status variables
binlog_stmt_cache_use and binlog_stmt_cache_disk_use.
mysql-test/extra/rpl_tests/rpl_binlog_max_cache_size.test:
Updated the test case to use the new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
mysql-test/r/mysqld--help-notwin.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_mixed_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_row_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_stm_cache_stat.result:
Updated the result file.
mysql-test/suite/rpl/r/rpl_mixed_binlog_max_cache_size.result:
Update the result file.
mysql-test/suite/rpl/r/rpl_row_binlog_max_cache_size.result:
Update the result file.
mysql-test/suite/rpl/r/rpl_stm_binlog_max_cache_size.result:
Updated the result file.
mysql-test/suite/sys_vars/inc/binlog_stmt_cache_size_basic.inc:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/r/binlog_stmt_cache_size_basic_32.result:
Updated the result file.
mysql-test/suite/sys_vars/r/binlog_stmt_cache_size_basic_64.result:
Updated the result file.
mysql-test/suite/sys_vars/r/max_binlog_stmt_cache_size_basic.result:
Updated the result file.
mysql-test/suite/sys_vars/t/binlog_stmt_cache_size_basic_32.test:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/t/binlog_stmt_cache_size_basic_64.test:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/t/max_binlog_cache_size_func-master.opt:
Removed because there is no test case max_binlog_cache_size_func.
mysql-test/suite/sys_vars/t/max_binlog_stmt_cache_size_basic.test:
Added a test case to check the system variable max_binlog_stmt_cache_size.
sql/log.cc:
There two main changes in here:
. Changed the set_write_error() as an error message is set according
to the type of the cache.
. Created the function set_binlog_cache_info where references to the
appropriate status and system variables are set and the server can
smoothly compute statistics and set the maximum size for each cache.
sql/log.h:
Changed the signature of the function in order to identify the error message
to be printed out as there is a different error code for each type of cache.
sql/mysqld.cc:
Added new status variables binlog_stmt_cache_use and binlog_stmt_cache_disk_use.
sql/mysqld.h:
Added new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
sql/share/errmsg-utf8.txt:
Added new error message related to the statement cache.
sql/sys_vars.cc:
Added new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
After the WL#2687, the binlog_cache_size and max_binlog_cache_size affect both the
stmt-cache and the trx-cache. This means that the resource used is twice the amount
expected/defined by the user.
The binlog_cache_use is incremented when the stmt-cache or the trx-cache is used
and binlog_cache_disk_use is incremented when the disk space from the stmt-cache or the
trx-cache is used. This behavior does not allow to distinguish which cache may be harming
performance due to the extra disk accesses and needs to have its in-memory cache
increased.
To fix the problem, we introduced two new options and status variables related to the
stmt-cache:
Options:
. binlog_stmt_cache_size
. max_binlog_stmt_cache_size
Status Variables:
. binlog_stmt_cache_use
. binlog_stmt_cache_disk_use
So there are
. binlog_cache_size that defines the size of the transactional cache for
updates to transactional engines for the binary log.
. binlog_stmt_cache_size that defines the size of the statement cache for
updates to non-transactional engines for the binary log.
. max_binlog_cache_size that sets the total size of the transactional
cache.
. max_binlog_stmt_cache_size that sets the total size of the statement
cache.
. binlog_cache_use that identifies the number of transactions that used the
temporary transactional binary log cache.
. binlog_cache_disk_use that identifies the number of transactions that used
the temporary transactional binary log cache but that exceeded the value of
binlog_cache_size.
. binlog_stmt_cache_use that identifies the number of statements that used the
temporary non-transactional binary log cache.
. binlog_stmt_cache_disk_use that identifies the number of statements that used
the temporary non-transactional binary log cache but that exceeded the value of
binlog_stmt_cache_size.
Remove the extra class hierarchy with classes TC_LOG_queued, TC_LOG_unordered,
and TC_LOG_group_commit, folding the code into the TC_LOG_MMAP and
TC_LOG_BINLOG classes. In particular TC_LOG_BINLOG is greatly simplified by
this, unifying the code path for transactional and non-transactional
commit.
Remove unnecessary locking of LOCK_log in MYSQL_BIN_LOG::write() (backport
of same fix from mysql-5.5).
Make TC_LOG_MMAP (and TC_LOG_DUMMY) derive directly from TC_LOG, avoiding the
inheritance hierarchy TC_LOG_queued->TC_LOG_unordered.
Put the wakeup facility for commit_ordered() calls into the THD class.
Some renaming to get better names.
In sql/log.c, member function wait_for_update_bin_log, a condition is entered with
THD::enter_cond but is not exited. This might leave dangling references to the
mutex/condition in the per-thread information area.
To fix the problem, we call exit_cond to properly remove references to the mutex,
LOCK_log.
In sql/log.c, member function wait_for_update_bin_log, a condition is entered with
THD::enter_cond but is not exited. This might leave dangling references to the
mutex/condition in the per-thread information area.
To fix the problem, we call exit_cond to properly remove references to the mutex,
LOCK_log.
A CREATE...SELECT that fails is written to the binary log if a non-transactional
statement is updated. If the logging format is ROW, the CREATE statement and the
changes are written to the binary log as distinct events and by consequence the
created table is not rolled back in the slave.
In this patch, we opted to let the slave goes out of sync by not writting to the
binary log the CREATE statement. We do this by simply reseting the binary log's
cache.
mysql-test/suite/rpl/r/rpl_drop.result:
Added a test case.
mysql-test/suite/rpl/t/rpl_drop.test:
Added a test case.
sql/log.cc:
Introduced a function to clean up the cache.
sql/log.h:
Introduced a function to clean up the cache.
sql/sql_insert.cc:
Cleaned up the binary log cache if a CREATE...SELECT fails.
A CREATE...SELECT that fails is written to the binary log if a non-transactional
statement is updated. If the logging format is ROW, the CREATE statement and the
changes are written to the binary log as distinct events and by consequence the
created table is not rolled back in the slave.
In this patch, we opted to let the slave goes out of sync by not writting to the
binary log the CREATE statement. We do this by simply reseting the binary log's
cache.
BUG#54872 MBR: replication failure caused by using tmp table inside transaction
Changed criteria to classify a statement as unsafe in order to reduce the
number of spurious warnings. So a statement is classified as unsafe when
there is on-going transaction at any point of the execution if:
1. The mixed statement is about to update a transactional table and
a non-transactional table.
2. The mixed statement is about to update a temporary transactional
table and a non-transactional table.
3. The mixed statement is about to update a transactional table and
read from a non-transactional table.
4. The mixed statement is about to update a temporary transactional
table and read from a non-transactional table.
5. The mixed statement is about to update a non-transactional table
and read from a transactional table when the isolation level is
lower than repeatable read.
After updating a transactional table if:
6. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
7. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
8. The mixed statement is about to update a non-transactionala table
and read from a temporary non-transactional table.
9. The mixed statement is about to update a temporary non-transactional
table and update a non-transactional table.
10. The mixed statement is about to update a temporary non-transactional
table and read from a non-transactional table.
11. A statement is about to update a non-transactional table and the
option variables.binlog_direct_non_trans_update is OFF.
The reason for this is that locks acquired may not protected a concurrent
transaction of interfering in the current execution and by consequence in
the result. So the patch reduced the number of spurious unsafe warnings.
Besides we fixed a regression caused by BUG#51894, which makes temporary
tables to go into the trx-cache if there is an on-going transaction. In
MIXED mode, the patch for BUG#51894 ignores that the trx-cache may have
updates to temporary non-transactional tables that must be written to the
binary log while rolling back the transaction.
So we fix this problem by writing the content of the trx-cache to the
binary log while rolling back a transaction if a non-transactional
temporary table was updated and the binary logging format is MIXED.
BUG#54872 MBR: replication failure caused by using tmp table inside transaction
Changed criteria to classify a statement as unsafe in order to reduce the
number of spurious warnings. So a statement is classified as unsafe when
there is on-going transaction at any point of the execution if:
1. The mixed statement is about to update a transactional table and
a non-transactional table.
2. The mixed statement is about to update a temporary transactional
table and a non-transactional table.
3. The mixed statement is about to update a transactional table and
read from a non-transactional table.
4. The mixed statement is about to update a temporary transactional
table and read from a non-transactional table.
5. The mixed statement is about to update a non-transactional table
and read from a transactional table when the isolation level is
lower than repeatable read.
After updating a transactional table if:
6. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
7. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
8. The mixed statement is about to update a non-transactionala table
and read from a temporary non-transactional table.
9. The mixed statement is about to update a temporary non-transactional
table and update a non-transactional table.
10. The mixed statement is about to update a temporary non-transactional
table and read from a non-transactional table.
11. A statement is about to update a non-transactional table and the
option variables.binlog_direct_non_trans_update is OFF.
The reason for this is that locks acquired may not protected a concurrent
transaction of interfering in the current execution and by consequence in
the result. So the patch reduced the number of spurious unsafe warnings.
Besides we fixed a regression caused by BUG#51894, which makes temporary
tables to go into the trx-cache if there is an on-going transaction. In
MIXED mode, the patch for BUG#51894 ignores that the trx-cache may have
updates to temporary non-transactional tables that must be written to the
binary log while rolling back the transaction.
So we fix this problem by writing the content of the trx-cache to the
binary log while rolling back a transaction if a non-transactional
temporary table was updated and the binary logging format is MIXED.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h