Some galera tests start 6 galera nodes. Each galera node requires
three ports: 6*3 = 18. Another 6 ports are needed for the 6 mariadbd
servers, for a total of 24. Since the number of ports is rounded up
to a multiple of 10 everywhere in mtr, we take 30 as the default
value for the port group size parameter.
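The arithmetic behind the default can be checked with a small sketch. The function and constant names below are hypothetical and are not taken from mtr; only the numbers come from the text above.

```python
# Hypothetical names; only the numbers come from the commit message.
GALERA_NODES = 6
PORTS_PER_GALERA_NODE = 3
MARIADBD_SERVERS = 6
ROUNDING = 10  # mtr rounds port counts up to a multiple of 10


def port_group_size(nodes=GALERA_NODES,
                    ports_per_node=PORTS_PER_GALERA_NODE,
                    servers=MARIADBD_SERVERS,
                    rounding=ROUNDING):
    """Return the number of ports needed, rounded up to `rounding`."""
    needed = nodes * ports_per_node + servers  # 18 + 6 = 24
    # Integer ceiling division, then scale back up.
    return -(-needed // rounding) * rounding


if __name__ == "__main__":
    print(port_group_size())
```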
- Suppress the "Difficult to find free blocks" warning
  globally to avoid failures in many different test cases.
- Demote the error message in validate_first_page() to a note,
  so that the first page can be recovered from the doublewrite
  buffer; an error is thrown only if the page is not found in the
  doublewrite buffer.
The new mtr option --skip-not-found shows tests that were not found
as skipped
main.a [ skipped ] not found
(but only if the test was specified with the suite name)
instead of erroring out early with
mysql-test-run: *** ERROR: Could not find 'a' in 'main' suite
This is useful in buildbot, on builders that generate the list
of tests dynamically.
The problem is in the manager/worker communication when a worker sends
WARNINGS and then TESTRESULT. If the manager has not yet read the
WARNINGS response, both responses end up in the same buffer; can_read()
then indicates available data only once, so we must read all the data
from the socket at once. Otherwise the TESTRESULT response is lost and
the manager waits for it forever.
The fix reads the socket in a loop instead of reading a single line.
But if there is only one response in the buffer, the second read
blocks until new data arrives. That can be overcome with blocking(0),
which puts the handle into non-blocking mode: if there is no data,
the second read just returns undef.
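The idea can be illustrated outside Perl with a minimal Python sketch. It is not the MTR code: drain_lines() and demo() are hypothetical names, and setblocking(False) stands in for Perl's blocking(0). Two responses are queued before the reader wakes up once, and the loop drains both.

```python
import select
import socket


def drain_lines(sock):
    """Read everything currently buffered on `sock` without blocking."""
    sock.setblocking(False)  # counterpart of blocking(0) in Perl
    data = b""
    while True:
        try:
            chunk = sock.recv(4096)
        except BlockingIOError:
            break  # nothing more buffered right now
        if not chunk:
            break  # peer closed the connection
        data += chunk
    return data.decode().splitlines()


def demo():
    manager, worker = socket.socketpair()
    # The worker sends two responses before the manager reads anything.
    worker.sendall(b"WARNINGS\n")
    worker.sendall(b"TESTRESULT\n")
    # Readiness is reported once for both responses.
    readable, _, _ = select.select([manager], [], [], 1.0)
    lines = drain_lines(manager) if readable else []
    manager.close()
    worker.close()
    return lines


if __name__ == "__main__":
    print(demo())
```

Reading a single line after the readiness notification would return only WARNINGS here and leave TESTRESULT unread, which is the lost-response situation described above.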
The problem is that non-blocking mode is not supported by all Perl
flavors on Windows. Strawberry and ActiveState do not support it;
Cygwin and MSYS2 do. There is an ioctl() hack that was known to "work",
but it does not do what is expected (it does not return data when
there is data). So on Windows we disable the fix unless running under
Cygwin.
Cygwin is more Unix-oriented. It does not treat \n as \r\n in regexps
(fixed by using \R), and it supplies Unix-style paths (fixed by
mixed_path()). It also does some cleanup on paths when running an exe,
so the path in the exe's output will differ (as with $exe_mysqld, for
which comparing basename() is enough).
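As a rough illustration of the path handling, here is a Python sketch of a "mixed" conversion (drive letter plus forward slashes, the form printed by cygpath -m) and of a basename-only comparison. This is not MTR's mixed_path(); to_mixed_path() and same_exe() are hypothetical helpers, and only the /cygdrive/<letter> prefix is handled.

```python
import re


def to_mixed_path(path):
    """Convert a Cygwin or backslash path to drive-letter + '/' form.

    Assumption: only the /cygdrive/<letter>/... prefix is translated;
    other Cygwin mount points are left as they are.
    """
    m = re.match(r"^/cygdrive/([a-zA-Z])(/.*)?$", path)
    if m:
        return f"{m.group(1)}:{m.group(2) or '/'}"
    return path.replace("\\", "/")


def same_exe(path_a, path_b):
    """Compare two exe paths by basename only, ignoring path style."""
    def base(p):
        return to_mixed_path(p).rsplit("/", 1)[-1]
    return base(path_a) == base(path_b)


if __name__ == "__main__":
    print(to_mixed_path("/cygdrive/c/cygwin64/bin/sh.exe"))
    print(same_exe(r"c:\build\sql\mysqld.exe",
                   "/cygdrive/c/build/sql/mysqld.exe"))
```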
Cygwin installation
1. Install the latest perl version (base package only) and
patchutils from cygwin-setup;
2. Add c:\cygwin64\bin to the system path before any other perl
flavors;
3. There is a path-style conflict (see below), so you must replace
c:\cygwin64\bin\sh.exe with the wrapper. Run MTR with
--cygwin-subshell-fix=do for that. Make sure you are running Cygwin
perl for the option to work;
4. Restart buildbot via net stop buildbot; net start buildbot
Path-style conflict of Cygwin-ish Perl
Some exe paths are passed to mysqltest, which executes them by a
native call. This requires native-style paths (\-style). These exe
paths are also executed by Perl itself: by MTR, which is not so
critical, but also by tests' --perl blocks, which are impossible to
change. If Perl detects shell expansion or uses a pipe command, it
passes the exe path to /bin/sh, which is a Cygwin-compiled bash that
cannot work with \-style paths (at least not in -c processing). Thus
we require \-style in some parts of MTR execution and /-style in
others.
The examples of tests which cover these different parts are:
main.mysqlbinlog_row_compressed \
main.sp_trans_log
It would be great to force Perl to use something other than /bin/sh,
but unfortunately /bin/sh is compiled into the binary. So the only
solution left is to overwrite /bin/sh with a wrapper script that
passes the command to cmd.exe instead of bash.
See "Path-style conflict" in "MDEV-30836 MTR Cygwin fix" for an
explanation.
To install the subshell fix use --cygwin-subshell-fix=do
To uninstall it use --cygwin-subshell-fix=remove
This works only in a Cygwin environment. As long as the perl on PATH
is from Cygwin, you are in a Cygwin environment. Check it with
perl --version
This is perl 5, version 36, subversion 1 (v5.36.1) built for
x86_64-cygwin-threads-multi
run_test_server() is actually the manager main loop. We move this
function into the Manager package and split it into run() and
parse_protocol(). The latter is needed for the fix. Moving it into a
separate package makes it possible to share some variables that were
local to run_test_server().
Functions from the main package are now prefixed with main:: (this
should be reorganized somehow later, or they should be auto-imported).
This ensures that no mtr test can change install.db after its initial
creation, as changing it while another thread is copying it will lead
to failures in at least InnoDB and Aria recovery.
Fixed spider/bugfix.mdev_30370, which wrongly used install.db.
Adds a new parameter $restart_bindir for restart_mysqld.inc.
Example:
let $restart_bindir= /home/midenok/src/mariadb/10.3b/build;
--source include/restart_mysqld.inc
It is good practice to restore the original server before
check_mysqld is run at the end of the test:
let $restart_bindir=;
--source include/restart_mysqld.inc
Passing $opt_parallel as $childs is wrong: a child can be killed
before it connects, and then $childs is never decremented for it.
Another problem (and the cause of this bug): a child can be killed
without ever closing the server socket. This can happen, for example,
after an unmaskable KILL signal. In that case the socket is closed by
reaping the child, but that never happens inside the socket-reading
loop in run_test_server().
The proper design is a waitless reap of children inside the socket
loop; when there are no more children, we finish the socket loop.
Since there is a Windows variation where we do not control the
children via waitpid(), all the clients must close the socket
normally, and only this can finish the socket loop. For the Unix
variation we treat that case as all children having closed the socket
but not all having died yet, and for that we do a final blocking
waitpid() (as was done before the patch as well).
To be more complete, we now handle 3 end-of-game scenarios on Unix:
1. all children closed the socket, all children died: everything is
handled by the socket loop;
2. all children closed the socket, not all died yet: we wait for the
live children to die after exiting the socket loop;
3. not all children closed the socket, all children died: everything
is handled by the socket loop.
On Windows there is only one end-of-game scenario:
All children close the socket.
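The Unix scenarios can be sketched in Python as follows. This is an illustration under assumptions, not the MTR implementation: manager_loop() and demo() are hypothetical names, and Popen.poll() plays the role of a waitless waitpid(). The demo reproduces scenario 3, where a child is killed while its socket is still open.

```python
import select
import socket
import subprocess
import sys


def manager_loop(sockets, children, poll_interval=0.1):
    """Serve sockets while reaping children without blocking.

    Returns the reason the loop ended.
    """
    sockets = list(sockets)
    children = list(children)
    while True:
        # Waitless reap: poll() returns immediately for live children.
        children = [c for c in children if c.poll() is None]
        if not children:
            return "all children died"    # scenarios 1 and 3
        if not sockets:
            return "all sockets closed"   # scenario 2: caller waits
        readable, _, _ = select.select(sockets, [], [], poll_interval)
        for s in readable:
            if not s.recv(4096):          # EOF: the client closed
                s.close()
                sockets.remove(s)


def demo():
    manager_end, worker_end = socket.socketpair()
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    child.kill()  # the child dies without closing its socket
    # worker_end stays open, so EOF never arrives on manager_end.
    reason = manager_loop([manager_end], [child])
    manager_end.close()
    worker_end.close()
    return reason


if __name__ == "__main__":
    print(demo())
```

Without the reap inside the loop, the demo would wait forever for an EOF that never comes, which matches the hang described above.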
66832e3a introduced a change that prints core dumps in a very
detailed format. That is far from user-friendly, but it serves as an
aid for debugging hard-to-reproduce bugs.
The proper way to implement this:
1. it must be controlled by a command-line option and an environment
variable;
2. detailed traces must be the default for buildbots only; for user
invocations normal stack traces should be printed.
Options for control are: MTR_PRINT_CORE and --print-core that accept
the following values:
  no             Don't print core
  short          Print stack trace of failed thread
  medium         Print stack traces of all threads
  detailed       Print all stack traces with debug context
  custom:<code>  Use debugger commands <code> to print stack trace
Default setting is: short (see env_or_default() call in pre_setup())
For the environment variable, wrong values are silently ignored (it
falls back to the default setting, see env_or_default()).
The command-line option --print-core (or -C) overrides the
environment variable. Its default value is 'short' if not specified
explicitly (same env_or_default() call in pre_setup()). Explicit
values are checked for validity.
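The precedence rules above can be sketched in Python. This is not the Perl env_or_default() from mtr: resolve_print_core() and is_valid() are hypothetical, and accepting any value that starts with "custom:" is an assumption.

```python
VALID_VALUES = ("no", "short", "medium", "detailed")


def is_valid(value):
    """Assumption: anything starting with 'custom:' is accepted."""
    return value in VALID_VALUES or value.startswith("custom:")


def resolve_print_core(cli_value, environ, default="short"):
    """Pick the effective print-core setting.

    cli_value: value given to --print-core, or None if not given.
    environ:   mapping that may contain MTR_PRINT_CORE.
    """
    if cli_value is not None:
        # Explicit command-line values are checked for validity.
        if not is_valid(cli_value):
            raise ValueError(f"invalid --print-core value: {cli_value}")
        return cli_value
    env_value = environ.get("MTR_PRINT_CORE")
    if env_value is not None and is_valid(env_value):
        return env_value
    # Wrong or missing environment values fall back silently.
    return default


if __name__ == "__main__":
    print(resolve_print_core(None, {"MTR_PRINT_CORE": "bogus"}))
    print(resolve_print_core("medium", {"MTR_PRINT_CORE": "detailed"}))
```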
The --print-method option specifies which debugger is used to print
cores. On Windows there is only one choice: cdb. On Unix the values
are: gdb, dbx, lldb, auto. The default value is: auto
With 'auto' we try all possible debuggers until one succeeds.
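A minimal Python sketch of the 'auto' behaviour: try each debugger in turn and stop at the first success. The function name, the try_debugger callback, and the order of candidates are assumptions made for illustration; the actual order used by mtr is not stated here.

```python
def print_core_auto(debuggers, try_debugger):
    """Return the first debugger for which try_debugger() succeeds.

    try_debugger is a caller-supplied callable (hypothetical) that
    returns True when the debugger produced a stack trace.
    """
    for name in debuggers:
        if try_debugger(name):
            return name
    return None  # no debugger could print the core


if __name__ == "__main__":
    # Pretend only lldb is installed.
    installed = {"lldb"}
    print(print_core_auto(["gdb", "dbx", "lldb"],
                          lambda name: name in installed))
```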
Change `exit;` to `exit(1);` in the function `usage()`.
When mtr is run with a wrong option, `usage()` is called with the wrong option as its argument. Because the function called `exit` in its first if statement, the exit status in this case was 0.