win x86 debug_max
The windows MTR run exhibited a different test execution
ordering (due to the fact that in these platforms MTR is invoked
with --parallel > 1). This uncovered a bug in the aforementioned
test case, which is triggered by the following conditions:
1. server is not restarted between two different tests;
2. the test before binlog.binlog_row_failure_mixing_engines
issues flush logs;
3. binlog.binlog_row_failure_mixing_engines uses binlog
positions to limit the output of show_binlog_events;
4. binlog.binlog_row_failure_mixing_engines does not state which
binlog file to use, thence it uses a wrong binlog file with
the correct position.
There are two possible fixes: 1. make sure that the test start
from a clean slate - binlog wise; 2. in addition to the position,
also state the binary log file before sourcing
show_binlog_events.inc .
We go for fix#1, ie, deploy a RESET MASTER before the test is
actually started.
Open issues:
- A better fix for #57688; Igor is working on this
- Test failure in index_merge_innodb.test ; Igor promised to look at this
- Some Innodb tests fails (need to merge with latest xtradb) ; Kristian promised to look at this.
- Failing tests: innodb_plugin.innodb_bug56143 innodb_plugin.innodb_bug56632 innodb_plugin.innodb_bug56680 innodb_plugin.innodb_bug57255
- Werror is disabled; Should be enabled after merge with xtradb.
After the WL#2687, the binlog_cache_size and max_binlog_cache_size affect both the
stmt-cache and the trx-cache. This means that the resource used is twice the amount
expected/defined by the user.
The binlog_cache_use is incremented when the stmt-cache or the trx-cache is used
and binlog_cache_disk_use is incremented when the disk space from the stmt-cache or the
trx-cache is used. This behavior does not allow to distinguish which cache may be harming
performance due to the extra disk accesses and needs to have its in-memory cache
increased.
To fix the problem, we introduced two new options and status variables related to the
stmt-cache:
Options:
. binlog_stmt_cache_size
. max_binlog_stmt_cache_size
Status Variables:
. binlog_stmt_cache_use
. binlog_stmt_cache_disk_use
So there are
. binlog_cache_size that defines the size of the transactional cache for
updates to transactional engines for the binary log.
. binlog_stmt_cache_size that defines the size of the statement cache for
updates to non-transactional engines for the binary log.
. max_binlog_cache_size that sets the total size of the transactional
cache.
. max_binlog_stmt_cache_size that sets the total size of the statement
cache.
. binlog_cache_use that identifies the number of transactions that used the
temporary transactional binary log cache.
. binlog_cache_disk_use that identifies the number of transactions that used
the temporary transactional binary log cache but that exceeded the value of
binlog_cache_size.
. binlog_stmt_cache_use that identifies the number of statements that used the
temporary non-transactional binary log cache.
. binlog_stmt_cache_disk_use that identifies the number of statements that used
the temporary non-transactional binary log cache but that exceeded the value of
binlog_stmt_cache_size.
include/my_sys.h:
Updated message on disk_writes' usage.
mysql-test/extra/binlog_tests/binlog_cache_stat.test:
Updated the test case and added code to check the new status variables
binlog_stmt_cache_use and binlog_stmt_cache_disk_use.
mysql-test/extra/rpl_tests/rpl_binlog_max_cache_size.test:
Updated the test case to use the new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
mysql-test/r/mysqld--help-notwin.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_mixed_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_row_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_stm_cache_stat.result:
Updated the result file.
mysql-test/suite/rpl/r/rpl_mixed_binlog_max_cache_size.result:
Update the result file.
mysql-test/suite/rpl/r/rpl_row_binlog_max_cache_size.result:
Update the result file.
mysql-test/suite/rpl/r/rpl_stm_binlog_max_cache_size.result:
Updated the result file.
mysql-test/suite/sys_vars/inc/binlog_stmt_cache_size_basic.inc:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/r/binlog_stmt_cache_size_basic_32.result:
Updated the result file.
mysql-test/suite/sys_vars/r/binlog_stmt_cache_size_basic_64.result:
Updated the result file.
mysql-test/suite/sys_vars/r/max_binlog_stmt_cache_size_basic.result:
Updated the result file.
mysql-test/suite/sys_vars/t/binlog_stmt_cache_size_basic_32.test:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/t/binlog_stmt_cache_size_basic_64.test:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/t/max_binlog_cache_size_func-master.opt:
Removed because there is no test case max_binlog_cache_size_func.
mysql-test/suite/sys_vars/t/max_binlog_stmt_cache_size_basic.test:
Added a test case to check the system variable max_binlog_stmt_cache_size.
sql/log.cc:
There two main changes in here:
. Changed the set_write_error() as an error message is set according
to the type of the cache.
. Created the function set_binlog_cache_info where references to the
appropriate status and system variables are set and the server can
smoothly compute statistics and set the maximum size for each cache.
sql/log.h:
Changed the signature of the function in order to identify the error message
to be printed out as there is a different error code for each type of cache.
sql/mysqld.cc:
Added new status variables binlog_stmt_cache_use and binlog_stmt_cache_disk_use.
sql/mysqld.h:
Added new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
sql/share/errmsg-utf8.txt:
Added new error message related to the statement cache.
sql/sys_vars.cc:
Added new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
After the WL#2687, the binlog_cache_size and max_binlog_cache_size affect both the
stmt-cache and the trx-cache. This means that the resource used is twice the amount
expected/defined by the user.
The binlog_cache_use is incremented when the stmt-cache or the trx-cache is used
and binlog_cache_disk_use is incremented when the disk space from the stmt-cache or the
trx-cache is used. This behavior does not allow to distinguish which cache may be harming
performance due to the extra disk accesses and needs to have its in-memory cache
increased.
To fix the problem, we introduced two new options and status variables related to the
stmt-cache:
Options:
. binlog_stmt_cache_size
. max_binlog_stmt_cache_size
Status Variables:
. binlog_stmt_cache_use
. binlog_stmt_cache_disk_use
So there are
. binlog_cache_size that defines the size of the transactional cache for
updates to transactional engines for the binary log.
. binlog_stmt_cache_size that defines the size of the statement cache for
updates to non-transactional engines for the binary log.
. max_binlog_cache_size that sets the total size of the transactional
cache.
. max_binlog_stmt_cache_size that sets the total size of the statement
cache.
. binlog_cache_use that identifies the number of transactions that used the
temporary transactional binary log cache.
. binlog_cache_disk_use that identifies the number of transactions that used
the temporary transactional binary log cache but that exceeded the value of
binlog_cache_size.
. binlog_stmt_cache_use that identifies the number of statements that used the
temporary non-transactional binary log cache.
. binlog_stmt_cache_disk_use that identifies the number of statements that used
the temporary non-transactional binary log cache but that exceeded the value of
binlog_stmt_cache_size.
The binlog_cache_use is incremented twice when changes to a transactional table
are committed, i.e. TC_LOG_BINLOG::log_xid calls is called. The problem happens
because log_xid calls both binlog_flush_stmt_cache and binlog_flush_trx_cache
without checking if such caches are empty thus unintentionally increasing the
binlog_cache_use value twice.
Although not explicitly mentioned in the bug, the binlog_cache_disk_use presents
the same problem.
The binlog_cache_use and binlog_cache_disk_use are status variables that are
incremented when transactional (i.e. trx-cache) or non-transactional (i.e.
stmt-cache) changes are committed. They are used to compute the ratio between
the use of disk and memory while gathering updates for a transaction.
The problem reported is fixed by avoiding incrementing the binlog_cache_use
and binlog_cache_disk_use if a cache is empty. We also have decided to increment
both variables when a cache is truncated as the cache is used although its
content is discarded and is not written to the binary log.
In this patch, we take the opportunity to re-organize the code around the
function binlog_flush_trx_cache and binlog_flush_stmt_cache.
mysql-test/extra/binlog_tests/binlog_cache_stat.test:
Updated the test case with comments and additional tests to check both
transactional and non-transactional changes and what happens when a
is transaction either committed or aborted.
mysql-test/suite/binlog/r/binlog_innodb.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_mixed_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_row_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_stm_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/t/binlog_mixed_cache_stat.test:
Updated the test case with a new source file.
mysql-test/suite/binlog/t/binlog_row_cache_stat.test:
Updated the test case with a new source file.
mysql-test/suite/binlog/t/binlog_stm_cache_stat.test:
Updated the test case with a new source file.
sql/log.cc:
There are four changes in here:
. Computed statistics on binlog_cache_use and binlog_cache_disk_use while
resting the cache.
. Refactored the code so binlog_flush_cache handles both the trx-cache and
stmt-cache.
. There are functions that encapsulate the calls to binlog_flush_cache and
make easier to read the code.
. binlog_commit_flush_stmt_cache is now taking into account the result:
success or error.
sql/log_event.h:
Guaranteed Xid_log_event is always classified as a transactional event.
The binlog_cache_use is incremented twice when changes to a transactional table
are committed, i.e. TC_LOG_BINLOG::log_xid calls is called. The problem happens
because log_xid calls both binlog_flush_stmt_cache and binlog_flush_trx_cache
without checking if such caches are empty thus unintentionally increasing the
binlog_cache_use value twice.
Although not explicitly mentioned in the bug, the binlog_cache_disk_use presents
the same problem.
The binlog_cache_use and binlog_cache_disk_use are status variables that are
incremented when transactional (i.e. trx-cache) or non-transactional (i.e.
stmt-cache) changes are committed. They are used to compute the ratio between
the use of disk and memory while gathering updates for a transaction.
The problem reported is fixed by avoiding incrementing the binlog_cache_use
and binlog_cache_disk_use if a cache is empty. We also have decided to increment
both variables when a cache is truncated as the cache is used although its
content is discarded and is not written to the binary log.
In this patch, we take the opportunity to re-organize the code around the
function binlog_flush_trx_cache and binlog_flush_stmt_cache.
replication aborts
When recieving a 'SLAVE STOP' command, slave SQL thread will roll back the
transaction and stop immidiately if there is only transactional table updated,
even through 'CREATE|DROP TEMPOARY TABLE' statement are in it. But These
statements can never be rolled back. Because the temporary tables to the user
session mapping remain until 'RESET SLAVE', Therefore it will abort SQL thread
with an error that the table already exists or doesn't exist, when it restarts
and executes the whole transaction again.
After this patch, SQL thread always waits till the transaction ends and then stops,
if 'CREATE|DROP TEMPOARY TABLE' statement are in it.
mysql-test/extra/rpl_tests/rpl_stop_slave.test:
Auxiliary file which is used to test this bug.
mysql-test/suite/rpl/t/rpl_stop_slave.test:
Test case for this bug.
sql/slave.cc:
Checking if OPTION_KEEP_LOG is set. If it is set, SQL thread should wait
until the transaction ends.
sql/sql_parse.cc:
Add a debug point for testing this bug.
replication aborts
When recieving a 'SLAVE STOP' command, slave SQL thread will roll back the
transaction and stop immidiately if there is only transactional table updated,
even through 'CREATE|DROP TEMPOARY TABLE' statement are in it. But These
statements can never be rolled back. Because the temporary tables to the user
session mapping remain until 'RESET SLAVE', Therefore it will abort SQL thread
with an error that the table already exists or doesn't exist, when it restarts
and executes the whole transaction again.
After this patch, SQL thread always waits till the transaction ends and then stops,
if 'CREATE|DROP TEMPOARY TABLE' statement are in it.
The root of the problem is that to interrupt a slave SQL thread
wait, the STOP SLAVE implementation uses thd->awake(THD::NOT_KILLED).
This appears as a spurious wakeup (e.g. from a sleep on a
condition variable) to the code that the slave SQL thread is
executing at the time of the STOP. If the code is not written
to be spurious-wakeup safe, unexpected behavior can occur. For
the reported case, this problem led to an infinite loop around
the interruptible_wait() function in item_func.cc (SLEEP()
function implementation). The loop was not being properly
restarted and, consequently, would not come to an end. Since the
SLEEP function sleeps on a timed event in order to be killable
and to perform periodic checks until the requested time has
elapsed, the spurious wake up was causing the requested sleep
time to be reset every two seconds.
The solution is to calculate the requested absolute time only
once and to ensure that the thread only sleeps until this
time is elapsed. In case of a spurious wake up, the sleep is
restarted using the previously calculated absolute time. This
restores the behavior present in previous releases. If a slave
thread is executing a SLEEP function, a STOP SLAVE statement
will wait until the time requested in the sleep function
has elapsed.
mysql-test/extra/rpl_tests/rpl_start_stop_slave.test:
Add test case for Bug#56096.
mysql-test/suite/rpl/r/rpl_stm_start_stop_slave.result:
Add test case result for Bug#56096.
sql/item_func.cc:
Reorganize interruptible_wait into a class so that the absolute
time can be preserved across calls to the wait function. This
allows the sleep to be properly restarted in the presence of
spurious wake ups, including those generated by a STOP SLAVE.
The root of the problem is that to interrupt a slave SQL thread
wait, the STOP SLAVE implementation uses thd->awake(THD::NOT_KILLED).
This appears as a spurious wakeup (e.g. from a sleep on a
condition variable) to the code that the slave SQL thread is
executing at the time of the STOP. If the code is not written
to be spurious-wakeup safe, unexpected behavior can occur. For
the reported case, this problem led to an infinite loop around
the interruptible_wait() function in item_func.cc (SLEEP()
function implementation). The loop was not being properly
restarted and, consequently, would not come to an end. Since the
SLEEP function sleeps on a timed event in order to be killable
and to perform periodic checks until the requested time has
elapsed, the spurious wake up was causing the requested sleep
time to be reset every two seconds.
The solution is to calculate the requested absolute time only
once and to ensure that the thread only sleeps until this
time is elapsed. In case of a spurious wake up, the sleep is
restarted using the previously calculated absolute time. This
restores the behavior present in previous releases. If a slave
thread is executing a SLEEP function, a STOP SLAVE statement
will wait until the time requested in the sleep function
has elapsed.
heavy load".
rpl_row_sp003.test has sporadically failed when run on machine
under heavy load or on slow hardware.
This patch fixes races in the test which were causing these
failures and also removes unnecessary 100 second wait from it.
heavy load".
rpl_row_sp003.test has sporadically failed when run on machine
under heavy load or on slow hardware.
This patch fixes races in the test which were causing these
failures and also removes unnecessary 100 second wait from it.