Problem: After START SLAVE, the Slave_IO_Status column of
SHOW SLAVE STATUS goes from No to Yes asynchronously. That
caused sporadic failures on pushbuild in rpl_stm_until since
the test contains SHOW SLAVE STATUS right after START SLAVE.
Fix: Wait until Slave_IO_Status becomes Yes after each
START SLAVE.
Problem: master binlog has 'create table t1'. Master binlog
was removed before slave could replicate it. In test's cleanup
code, master did 'drop table t1', which caused slave sql
thread to stop with an error since slave sql thread did not
know about t1.
Fix: t1 is just an auxiliary construction, only needed on
master. Hence, we turn off binlogging before t1 is created,
drop t1 as soon as we don't need it anymore, and then turn
on binlogging again.
Problem was that ha_partition had HA_FILE_BASED flag set
(since it uses a .par file), but after open it uses the first partitions
flags, which results in different case handling for create and for
open.
Solution was to change the underlying partition name so it was consistent.
(Only happens when lower_case_table_names = 2, i.e. Mac OS X and storage
engines without HA_FILE_BASED, like InnoDB and Memory.)
(Recommit after adding rename of check_lowercase_names to
get_canonical_filename, and moved it from handler.h to mysql_priv.h)
NOTE: if a mixed case name for a partitioned table was created when
lower_case_table_name = 2 it should be renamed or dropped before using
the updated version (See bug#37402 for more info)
Problem 1: tests often fail in pushbuild with a timeout when waiting
for the slave to start/stop/receive error.
Fix 1: Updated the wait_for_slave_* macros in the following way:
- The timeout is increased by a factor ten
- Refactored the macros so that wait_for_slave_param does the work for
the other macros.
Problem 2: Tests are often incorrectly written, lacking a
source include/wait_for_slave_to_[start|stop].inc.
Fix 2: Improved the chance to get it right by adding
include/start_slave.inc and include/stop_slave.inc, and updated tests
to use these.
Problem 3: The the built-in test language command
wait_for_slave_to_stop is a misnomer (does not wait for the slave io
thread) and does not give as much debug info in case of failure as
the otherwise equivalent macro
source include/wait_for_slave_sql_to_stop.inc
Fix 3: Replaced all calls to the built-in command by a call to the
macro.
Problem 4: Some, but not all, of the wait_for_slave_* macros had an
implicit connection slave. This made some tests confusing to read,
and made it more difficult to use the macro in circular replication
scenarios, where the connection named master needs to wait.
Fix 4: Removed the implicit connection slave from all
wait_for_slave_* macros, and updated tests to use an explicit
connection slave where necessary.
Problem 5: The macros wait_slave_status.inc and wait_show_pattern.inc
were unused. Moreover, using them is difficult and error-prone.
Fix 5: remove these macros.
Problem 6: log_bin_trust_function_creators_basic failed when running
tests because it assumed @@global.log_bin_trust_function_creators=1,
and some tests modified this variable without resetting it to its
original value.
Fix 6: All tests that use this variable have been updated so that
they reset the value at end of test.
Bug #37940 rpl_dual_pos_advance fails sporadically on pushbuild,fail wait_for_slave_to_stop
Bug #37941 rpl_flushlog_loop fails sporadically on pushbuild
Several tests fail when waiting for the slave to stop in what
appears to be timeouts caused by a timeout value set to low.
This causes false failures when the servers are loaded.
In order to try to avoid false negatives, we increase the
timeout 10 times and also print some more information in the
event that the slave fails to stop when expected to.
We add a printout of the current processes running to be able
to see if any process have been executing for an unexpectedly
long time, and also print the binlog events at the position
indicated by SHOW SLAVE STATUS.
Problem: rpl_ndb_transaction fails because it assumes nothing
is written to the binlog at a certain point. However, ndb may
binlog updates in ndb system tables at a nondeterministic
time point after an ndb table update has been committed.
Fix: break the test into two. rpl_ndb_transaction still does
the ndb updates needed by the first half of the test. The new
test case rpl_bug26395 includes the part that assumes nothing
more will be written to the binlog.
The problem was that when comparing tables for a possible
fast alter table, the comparison was being performed using
the parsed information and not the final definition.
The solution is to use the possible final table layout to
compare if a fast alter is possible or not.
Select of the test could not perform deterministically, because the table remains to be
updatable by the running event handler.
Fixed with changing verification to use a logical values instead of comparison
with a pre-recorded results.