MTR is stuck for about 20 seconds checking for free ports.
The reason is that perl's connect() takes 1 second on windows
if port is not opened.
This patch fixes the mtr_ping_port implementation on Windows
to use Net::Ping for the port checking with small (0.1sec) timeout.
This patch also removes pointless second call to check_ports_free()
in case of auto build thread.
perl
The problem here was the method how MTR gets its unique thread ids.
Prior to this patch, the method to do it was to maintain a global
table of pid,mtr_unique_id) pairs. The table was backed by a text
file. The table was cleaned up one in a while and dead processes leaking
unique_ids were determined with with kill(0) or with scripting tasklist
on Windows.
This method is flawed specifically on native Windows Perl. fork() is
implemented with starting a new thread, give it a syntetic negative PID
(threadID*(-1)), until this thread creates a new process with exec()
However, neither tasklist nor any other native Windows tool can cope with
negative perl PIDs. This lead to incorrect determination of dead process
and reusing already used mtr_unique_id.
The patch introduces alternative portable method of solving unique-id
problem. When a process needs a unique id in range [min...max], it just
starts to open files named min, min+1,...max in a loop . After file is
opened, we do non-blocking flock(). When flock() succeeds, process has
allocated the ID. When process dies, file is unlocked . Checks for zombies
are not necessary.
Since the change would create a co-existence problems with older version
of MTR, because of different way to calculate IDs, the default ID range
is changed from 250-299 to 300-349.
Another fix that was necessary enable --parallel option was to serialize
spawn() calls on Windows. specifically, IO redirects needed to be protected.
This patch also fixes hanging CRTL-C (as described in Bug #38629) for the
"new" MTR. The fix was already in 6.0 and is now downported.
- Some tests need to modify the server(s) so much that a total restart of all servers are
necessary after test. Make it possible for a test to signal it want mtr.pl to restart
all servers.
When the thread executing a DDL was killed after finished its
execution but before writing the binlog event, the error code in
the binlog event could be set wrongly to ER_SERVER_SHUTDOWN or
ER_QUERY_INTERRUPTED.
This patch fixed the problem by ignoring the kill status when
constructing the event for DDL statements.
This patch also included the following changes in order to
provide the test case.
1) modified mysqltest to support variable for connection command
2) modified mysql-test-run.pl, add new variable MYSQL_SLAVE to
run mysql client against the slave mysqld.
testcase checks are made.
MTR spawns mysqltest to run check-testcase test before and after each testcase
it runs. It can also run check-warnings using mysqltest. Since it happened on PB
that these checks hanged, this patch provides additional feedback to help
investigating such failures:
- mysqltest is modified to give feedback about main steps in execution of a
testcase if run in verbose mode (including connection to the server),
- MTR is modified to run mysqltest in verbose mode when doing check-testcase or
check-warnings. The diagnostic output from mysqltest is preserved so that it is
saved upon test failure.
- Since we are only using the auto cleanup in one place of mtr.pl today, disable the
autocleanup and write our own END handler that clean up the tmpdir only when the process
that created it exits.
- Make a rough filtering of the servers error log and write
all suspicious warnings to $error_log.warnings
The .warnings file is then examined more carefully by check_warnings.test
- This will speed things up, doing all of this in a server running
under valgrind takes far too long time.
- Add a "skip-ssl=1" to [mysqltest] section so that
mysqltest will not run with ssl turned on by default
but stil be able to turn it on when requested
- This avoids that check_warnings and check_testcase
connects to the server woth SSL turned on