This patch makes two changes:
* Remove unnecessary `set SESSION wsrep_sync_wait=0`.
Disabling `wsrep_sync_wait` caused the test to fail occasionally,
due to subsequent `SELECT`s observing stale values.
* Remove redundant `--source include/galera_wait_ready.inc`,
`galera_wait_ready.inc` is already included in the above
`wait_until_connected_again.inc`
The test tends to fail if many parallel instances of it are executed:
```
mysqltest: At line 23: query 'ALTER TABLE t1 ADD PRIMARY KEY (f1)' failed:
1317: Query execution was interrupted
```
The `ALTER` fails because it is BF aborted due to an earlier `INSERT SELECT`
that is being applied:
```
INSERT INTO t1 (f1) SELECT ...
--connection node_2
SET GLOBAL wsrep_desync = TRUE;
SET SESSION wsrep_on = FALSE;
ALTER TABLE t1 ADD PRIMARY KEY (f1);
SET SESSION wsrep_on = TRUE;
SET GLOBAL wsrep_desync = FALSE;
```
And because the `ALTER` is executed with `wsrep_on = OFF`, it does not
run in total order isolation.
To avoid the problem it must be ensured that the `ALTER` only after the
large `INSERT SELECT` is done. To do so it is sufficient to issue
`SELECT COUNT(*) FROM t1;` from `node_2` before turning off wsrep.
The `SELECT` will trigger `wsrep_sync_wait` and proceed only after the
`INSERT SELECT` from node_1 is done.
The following changes are committed:
* `RESET MASTER` at the end of the test. This was necessary to allow the test
to run on repeated runs.
* `--source include/galera_wait_ready.inc` after setting `gmcast.isolate=0` to
get back to a primary component.
* Fix for assertion in `Protocol::end_statement()`. The assertion is due to
the fact that function `do_command()` calls `thd->protocol->end_statement()`,
without setting an error, when it is detected that galera is not ready yet.
Following line somehow disappeared in a past merge:
```
my_message(ER_UNKNOWN_COM_ERROR,
"WSREP has not yet prepared node for application use", MYF(0));
```
Test galera checks that a `GRA_x_x.log` file is created whenever
wsrep applier fails to apply some replication event. The file
contains the corresponding binlog event that failed to apply.
The test creates a new file by concatenating a pre-recorded
file containing the binlog header (see `std-data/binlog-header.log`)
and the `GRA_x_x.log` file. The test then checks that the resulting
file, containing the binlog header and the event that failed to
apply, is correctly read by `mysqlbinlog` program.
The test fails in MariaDB because the GRA_x_x.log file created
by MariaDB already contains the binlog header (see MDEV-7867).
This patch fixes/simplifies test `galera.galera_gra_log` so that
it doesn't concatenate `std-data/binlog-header.log` with the
`GRA_x_x.log` file. File `std-data/binlog-header.log` is deleted
altoghether, because not used by any other test.
kill_galera.inc can no longer rely on wait_until_connected_again.inc.
This is because wait_until_connected_again now tries to make sure
that the server it is connected eventually transition to ready
state. Whereas some tests may need to kill galera while the server
is in a non-primary view.
This patch makes two changes:
* It replaces `--sleep 1` with appropriate wait_conditions, and
removes another `--sleep 1` which is no longer necessary, after
MDEV-14144 has been fixed.
* It moves the `RESET MASTER` from node_1 at the very end, when it
it sure that the other nodes have applied the the final cleanup
`DROP TABLE t1,t2.`
This solves the problem with the `DROP TABLE` not being replicated
to the asynchronous slaves, which would make the test fail with:
`Timeout in wait_condition.inc for SELECT COUNT(*) = 0 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 't1';`
Test galera_suspend_slave checks that after suspending a galera
node, like this:
```
--perl
my $pid_filename = $ENV{'NODE_2_PIDFILE'};
my $mysqld_pid = `cat $pid_filename`;
chomp($mysqld_pid);
system("kill -SIGSTOP $mysqld_pid");
exit(0);
EOF
```
the remaining one node cluster is no longer able to successfully
INSERT new rows:
```
--error ER_UNKNOWN_COM_ERROR,ER_LOCK_WAIT_TIMEOUT,ER_LOCK_DEADLOCK,ER_ERROR_DURING_COMMIT
INSERT INTO t1 VALUES (1);
```
On rare occasions when the system is overloaded, it appears that
suspending the process with ```system("kill -SIGSTOP $mysqld_pid")```
takes some time, enough for the subsequent INSERT to succeed. In which
case the test fails because it rightly expects the INSERT to fail.
To fix the problem, the patch makes sure that the cluster has shrinked
to one node, before trying to INSERT the new value.
galera SST tests have a debug part, but we don't want to limit them
to fulltest2 builder. So, add support for test files that
have a debug part:
* add maybe_debug.inc and maybe_debug.combinations
* 'debug' combination is run when debug is available
* 'release' combination is run otherwise
* test wraps debug parts in if($with_debug) { ... }
* and creates ,debug.rdiff for debug results
* make galera.galera_sst_xtrabackup* not big
* auto-select between socat and nc, whatever available
* auto-skip xtrabackup tests if no xtrabackup or neither socat nor nc
fix galera.galera_sst_mysqldump test to work:
* must connect to 127.0.0.1, where mysqld is listening
* disable wsrep_sync_wait in wsrep_sst_mysqldump, otherwise
sst can deadlock
* allow 127.0.0.1 for bind_address and wsrep_sst_receive_address.
(it's useful in tests, or when two nodes are on the same box,
or when nodes are on different boxes, but the connection is
tunelled, or whatever. Don't judge user's setup). MDEV-14070
* don't wait for client connections to die when doing
mysqldump sst. they'll die in a due time, and if needed mysql
will wait on locks until they do. MDEV-14069
Also don't mark it big, to make sure it's sufficiently tested
It doesn't make sense to allow selects from I_S but disallow selects
that don't use any tables at all, because any (disallowed) select that
doesn't use tables can be made allowed by adding
"FROM I_S.COLLATIONS LIMIT 1" to the end.
And it break mysql-test rather badly, even check-testcase.test
fails on its first `SELECT '$tmp' = 'No such row'`
This reverts 9a89614857, c5dd2abf4c, and 33028f7c4b:
Refs: MW-245 - changed logic so that in non primary node it is possible to do SET + SHOW + SELECT from information and pfs schema, when dirty reads are not enabled - however, non table selects are not allowed (e.g. SELECT 1)
Refs MW-245 - logic was wrong in detecting if queries are allowed in non primary node. it allowed select with no table list to execute even if dirty reads was not specified
Refs: MW-245 - Adjust tests to account for the new behavior.
Two changes were made to the test:
1) Suppress warning "Refusing exit for the last slave thread."
This warning was already suppressed, but on the wrong node.
2) The test occasionally fails because it expects that the
number of applier threads changes immediately after
changing the value of ```variable wsrep_slave_threads```.
Which is not true. This patch turns snippets like this:
```
SET GLOBAL wsrep_slave_threads = x;
SELECT COUNT(*) = x FROM INFORMATION_SCHEMA.PROCESSLIST WHERE USER = 'system user';
```
Into proper wait_conditions:
```
SET GLOBAL wsrep_slave_threads = x;
let $wait_condition = SELECT COUNT(*) = x FROM ...;
--source include/wait_condition.inc
```