In mtr.check_warnings, `text` was declares as type text, which is
64K, and when the server log grows larger than this, it would be
truncated, and then check_warnings was actually checking the
error messages of a previous test and complain warnings.
This patch fixed the problem by change the type of `text` to
mediumtext, which is 16M.
Problem 1: The test waits for an error in the slave sql thread,
then resolves the error and issues 'start slave'. However, there
is a gap between when the error is reported and the slave sql
thread stops. If this gap was long, the slave would still be
running when 'start slave' happened, so 'start slave' would fail
and cause a test failure.
Fix 1: Made wait_for_slave_sql_error wait for the slave to stop
instead of wait for error in the IO thread. After stopping, the
error code is verified. If the error code is wrong, debug info
is printed. To print debug info, the debug printing code in
wait_for_slave_param.inc was moved out to a new file,
show_rpl_debug_info.inc.
Problem 2: rpl_stm_mystery22 is a horrible name, the comments in
the file didn't explain anything useful, the test was generally
hard to follow, and the test was essentially duplicated between
rpl_stm_mystery22 and rpl_row_mystery22.
Fix 2: The test is about conflicts in the slave SQL thread,
hence I renamed the tests to rpl_{stm,row}_conflicts. Refactored
the test so that the work is done in
extra/rpl_tests/rpl_conflicts.inc, and
rpl.rpl_{row,stm}_conflicts merely sets some variables and then
sourced extra/rpl_tests/rpl_conflicts.inc.
The tests have been rewritten and comments added.
Problem 3: When calling wait_for_slave_sql_error.inc, you always
want to verify that the sql thread stops because of the expected
error and not because of some other error. Currently,
wait_for_slave_sql_error.inc allows the caller to omit the error
code, in which case all error codes are accepted.
Fix 3: Made wait_for_slave_sql_error.inc fail if no error code
is given. Updated rpl_filter_tables_not_exist accordingly.
Problem 4: rpl_filter_tables_not_exist had a typo, the dollar
sign was missing in a 'let' statement.
Fix 4: Added dollar sign.
Problem 5: When replicating from other servers than the one named
'master', the wait_for_slave_* macros were unable to print debug
info on the master.
Fix 5: Replace parameter $slave_keep_connection by
$master_connection.
Problem: when mtr tries to create a directory, and the target
exists but is a file instead of directory, it tries several times
to create the directory again before it fails.
Fix: make it check if the target exists and is a non-directory.
If server has not been initialized as a slave (by CHANGE MASTER), then
SHOW SLAVE STATUS will return an empty set, and caused the waiting for
Slave_IO_running or Slave_SQL_running to 'No' fail.
This patch fixed the problem by return immediately if slave is not
initialized in include/wait_for_slave_*_to_stop.inc.
mysqltest command 'shutdown_server' is supposed to shutdown the server
and wait for it to be gone, and kill it when timeout. But because the
arguments passed to my_kill were in the wrong order, 'shutdown_server'
does not wait nor kill the server at all. So after 'shutdown_server',
the server is still running, and the server may still accepting
connections.
Bug#38540 rpl_server_id2 uses show slave status unnecessarily
Slave did not perform any event recorded into the relay log from some
different master when it was started with --replicate-same-server-id.
The reason appeared to be a consequence of BUG#38734 which stopped the
sql thread at its startup time.
The real fixes for the current bug are in the patch for BUG#38734.
This changeset carries only a regression test for the bugs. Bug#38540
gets fixed too by means of eliminating an extra show slave status.
Adding --loose-skip-falcon option to the mysqld options provided by MTR (v2) during mysqld bootstrap in order to avoid plugin (in this case Falcon) initialization of static variables etc. Options --loose-skip-innodb and --loose-skip-ndbcluster were already included.
This will fix Bug#41014 (falcon_bug_39708 fails in pushbuild in 6.0-rpl: "succeeded - should have failed")
in the case of MTR v2 (which currently is available in -rpl branches only).
MTR v1 (e.g. in main 6.0 branch) does not have this problem.
It would be more ideal to remove the --loose-skip-* options and provide a single option disabling all plugin initialization instead, or have bootstrap do this by default. Server modifications are (most likely) needed to be able to do that.
The test reacted on the way how mtr orders arguments for the server
that are gathered from different source. It appeared that the opt-file
options were parsed before those that supplied to mtr via its command
line. In effect, the opt-file preferences got overriden by the command
line and some tests, like no-threads, were caught by surprise: a test
expects an option value that had been "hardcoded" into its opt-file
but gets another one.
This server options ordering problem exists on in the new rpl trees
mtr. In option of the author of this patch, the opt-file shall be
considered as having the highest preference weight. The opt-file is
merely a part of the header of a test, namely a part that can not be
technically deployed along the test file.
It's unnatural for the test writer to provide both the opt file value
and a guard that guarantees the value will be set on in the run time.
It's logical to provide either one: the option and its value or the
guard.
Fixed with relocating parse of the opt file to be the last among
sources of the sever's options.
A side effect: fixing a small problem of resetting the suite options
at time the opt file starts parsing.
A side effect: main.log_bin_trust_function_creators_func is disabled to
be re-enabled with the fixes for bug#41003 will be merged from the main trees.
exact number of error. The patch does following:
1) Add new parameter $slave_sql_errno for wait_for_slave_sql_error.inc
2) Add waiting error 1062 (Duplicate PK) for slave SQL thread in test case.
where timeout can happen:
1. Added waiting start/stop slave to make sure that slave works properly.
2. Added cleanup for slave.
3. Updated related result files.
The problem here is that embedded server starts handle_thread manager
thread on mysql_library_init() does not stop it on mysql_library_end().
At shutdown, my_thread_global_end() waits for thread count to become 0,
but since we did not stop the thread it will give up after 5 seconds.
Solution is to move shutdown for handle_manager thread from kill_server()
(mysqld specific) to clean_up() that is used by both embedded and mysqld.
This patch also contains some refactorings - to avoid duplicate code,
start_handle_manager() and stop_handle_manager() functions are introduced.
Unused variables are eliminated. handle_manager does not rely on global
variable abort_loop anymore to stop (abort_loop is not set for embedded).
Note: Specifically on Windows and when using DBUG version of libmysqld,
the complete solution requires removing obsolete code my_thread_init()
from my_thread_var(). This has a side effect that a DBUG statement
after my_thread_end() can cause thread counter to be incremented, and
embedded will hang for some seconds. Or worse, my_thread_init() will
crash if critical sections have been deleted by the global cleanup
routine that runs in a different thread.
This patch also fixes and revert prior changes for Bug#38293
"Libmysqld crash in mysql_library_init if language file missing".
Root cause of the crash observed in Bug#38293 was bug in my_thread_init()
described above