Commit graph

112 commits

Author SHA1 Message Date
Marko Mäkelä
d95361107c Merge 10.5 into 10.6 2021-09-24 14:38:52 +03:00
Marko Mäkelä
7e2b42324c Merge 10.4 into 10.5 2021-09-24 08:42:23 +03:00
Jan Lindström
913efaa328 MDEV-26566 : galera.galera_var_cluster_address MTR failed: InnoDB: Assertion failure in file row0ins.cc line 3206
The actual problem was that we tried to calculate persistent statistics
for wsrep_schema tables, in this case wsrep_streaming_log. These tables
should not have persistent statistics. Therefore, the tables should be
created with the STATS_PERSISTENT=0 table option. During a rolling
upgrade the tables already exist, so we need to alter them to carry the
STATS_PERSISTENT=0 table option.
2021-09-23 12:59:39 +03:00
Marko Mäkelä
15139964d5 Merge 10.5 into 10.6 2021-09-11 17:55:27 +03:00
Vicențiu Ciorbaru
7c33ecb665 Merge remote-tracking branch 'upstream/10.4' into 10.5 2021-09-10 17:16:18 +03:00
Jan Lindström
987903b38e MDEV-26503 : galera_3nodes.galera_wsrep_schema MTR failed: mysql_shutdown failed
Add wait conditions and clean up.
2021-09-07 13:43:51 +03:00
Rucha Deodhar
4e19539c14 MDEV-22189: Change error messages inside code to have mariadb instead of mysql

Fix: Changed error messages, rerecorded results and changed other relevant
files.
2021-05-24 11:38:13 +05:30
Jan Lindström
0238e68464 MDEV-25591 : Test case cleanups
galera_var_wsrep_on_off : Add wait conditions to make sure DDL is
replicated before continuing.

wsrep.[variables|variables_debug] : Remove unnecessary parts
and add a check for the correct number of variables, or skip.

galera_ssl_reload: Add version check and SSL checks.
2021-05-05 09:32:06 +03:00
Jan Lindström
e0d61cb41c Merge remote-tracking branch 10.4 into 10.5 2021-05-04 12:12:15 +03:00
Jan Lindström
473e85e931 MDEV-25591 : Test case cleanups
galera_var_wsrep_on_off : Add wait conditions to make sure DDL is
replicated before continuing.

wsrep.[variables|variables_debug] : Remove unnecessary parts
and add a check for the correct number of variables, or skip.

galera_ssl_reload: Add version check and SSL checks.
2021-05-04 11:34:06 +03:00
mkaruza
c409dd42ef MDEV-22131 allow transition from unencrypted to TLS cluster communication without cluster downtime
Cluster communication should be possible even when:

1. Node 2 is TCP
2. Node 1/3 is dynamic with SSL enabled

During the test we shut down Node 2 and enable SSL on it. It should
reconnect to the cluster successfully.
2021-04-29 08:09:20 +03:00
Marko Mäkelä
4930f9c94b Merge 10.5 into 10.6 2021-04-21 11:45:00 +03:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
mkaruza
c3b016efde MDEV-22668: "Flush SSL" command doesn't reload wsrep cert
Trigger `socket.ssl_reload` when FLUSH SSL is issued. To trigger reloading
of the certificate, key and CA, the files need to be physically changed.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-04-15 08:50:01 +03:00
sjaakola
a1e70388c4 MDEV-24966 Galera multi-master regression
After the merge of MDEV-24915, the 10.6 branch has regressions in handling
concurrent write load against two or more cluster nodes. These regressions may
surface as cluster hangs, node crashes or data inconsistency. In some test
scenarios, the only visible symptom could be that BF victim aborting happens
only through innodb lock wait timeout expiration. This would result only in poor
performance (by default a 50 sec hang for each BF conflict), and could be somewhat
difficult to diagnose.

This pull request has the following fixes to handle concurrent write load from
multiple nodes:

In lock_wait_wsrep_kill(), the victim trx was expected to be only in
TRX_STATE_ACTIVE state. With the delayed BF conflict handling, it can happen
that the victim has advanced into the pre-commit state. This was fixed by
choosing victims in both TRX_STATE_ACTIVE and TRX_STATE_PREPARED states.

The victim transaction may be in several different states at the time the
lock conflict is detected, and due to the delayed BF aborting practice in
MDEV-24915, the victim may advance further before the actual BF aborting takes
place. The BF aborting in MDEV-24915 did not wake the victim if it was waiting
for some lock other than the one that was blocking the high priority thread.
This anomaly caused the innodb lock wait timeout expiration delays and the poor
performance symptom. To fix this, lock_wait_wsrep_kill() now checks whether the
victim is in lock waiting state, and uses lock_cancel_waiting_and_release()
to cancel this lock wait.

wsrep_bf_abort() checks if the victim has an active transaction (in wsrep-lib),
and starts a new transaction if there was no active transaction before.
Due to late BF aborting, the victim may have e.g. failed in certification
and is already aborting or has aborted at this stage. This has caused
problems in testing, where the BF aborter tries to BF abort itself.
The fix in wsrep_bf_abort() now skips the BF abort if the victim is aborting
or has aborted. The victim may not have started a transaction yet in wsrep
context, but it may have acquired MDL locks (due to DDL execution), and this
has caused a BF conflict. Such a case does not require aborting in wsrep or
replication provider state.

BF aborting could cause a BF-BF conflict scenario if the victim was already
aborted and changed to a replayer, which has high priority as well. This BF-BF
conflict scenario is now avoided in lock_wait_wsrep(), where we now check
whether the blocking lock holder is also high priority and is ordered before
the caller; in this situation the caller should wait for the lock.

The natural innodb deadlock resolving algorithm could pick a BF thread as
the deadlock victim. This is fixed by giving max weight to BF threads in
Deadlock::report().

MDEV-24341 has changed execution paths in do_command() and this affects BF
aborted victim execution. This PR fixes one assert in do_command():
 DBUG_ASSERT(!thd->async_state.pending_ops())
which fired if the thd was BF aborted earlier. This assert is now changed
to allow pending_ops() if the thd was BF aborted before.

With these fixes, a long-term, highly conflicting write load could be run
against a two-node cluster. If binlogging is configured, log_slave_updates
should also be set.
2021-04-13 14:58:54 +03:00
Marko Mäkelä
7b48da4d7e Merge 10.4 into 10.5 2021-04-08 07:47:49 +03:00
Jan Lindström
5b71e0424c MDEV-21402 : sql_safe_updates breaks Galera 4
Added handling for sql_safe_updates, i.e. we disable it while
we do wsrep_schema operations.
2021-04-06 15:33:13 +03:00
mkaruza
f8488370d6 MDEV-24956: ALTER TABLE not replicated with Galera in MariaDB 10.5.9
`WSREP_CLIENT` was used as the condition for starting ALTER/OPTIMIZE/REPAIR TOI.
With this condition, affected DDL statements received through async replication
are not replicated. Fixed by removing this condition.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-04-05 09:30:29 +03:00
Jan Lindström
e2bcf68279 MDEV-24010 : galera_3nodes.GCF-354 MTR fails : WSREP has not yet prepared node for application use
Correct test.
2021-01-27 10:55:10 +02:00
Marko Mäkelä
6a1e655cb0 Merge 10.4 into 10.5 2020-12-02 18:29:49 +02:00
Daniele Sciascia
60035bd2f1 Make test galera_parallel_apply_3nodes deterministic
Test galera_parallel_apply_3nodes started to fail occasionally.
The test assumes that one round of autocommit retry is sufficient to
avoid a deadlock error when two conflicting UPDATE statements
run concurrently.
This assumption no longer holds after the galera library changed
last_committed() to return the seqno of the last transaction that left
the apply monitor, rather than the commit monitor. So it is possible that
after a BF abort, a command is re-executed before its BF abortee has
left the apply monitor, causing another retry or a deadlock error.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-11-19 12:42:54 +02:00
Daniele Sciascia
694926a4f7 Fix suppression in MTR test galera_3nodes.inconsistency_shutdown 2020-11-17 13:00:44 +02:00
Marko Mäkelä
533a13af06 Merge 10.3 into 10.4 2020-11-03 14:49:17 +02:00
Oleksandr Byelkin
8e1e2856f2 Merge branch '10.4' into 10.5 2020-11-01 14:26:15 +01:00
Oleksandr Byelkin
80c951ce28 Merge branch '10.3' into 10.4 2020-10-31 21:06:49 +01:00
Jan Lindström
5482d62760 Fix sporadic test failure on galera_parallel_apply_3nodes.
Test itself is not deterministic.
2020-10-30 09:19:29 +02:00
Marko Mäkelä
882ce206db Merge 10.4 into 10.5 2020-09-23 11:32:43 +03:00
Jan Lindström
a0e2a293bc Fix try. 2020-09-21 13:59:13 +03:00
Marko Mäkelä
3a423088ac Merge 10.3 into 10.4 2020-09-21 12:29:00 +03:00
Jan Lindström
69d536a22d MDEV-23751 : galera_3nodes test failures on ipv6 sst tests
Fix assertion text; it was too tight for some systems.

This is a backport from 10.4 for Galera 3.
2020-09-18 14:09:24 +03:00
Jan Lindström
29847a3736 MDEV-23751 : galera_3nodes test failures on ipv6 sst tests
Fix assertion text; it was too tight for some systems.
2020-09-18 07:33:37 +03:00
Jan Lindström
f381e019b6 MDEV-23574 : galera_3nodes.galera_ipv6_mariabackup_section MTR failed: Could not open '../galera/include/have_mariabackup.inc'
Test case and configuration cleanup.
2020-09-17 12:55:06 +03:00
Jan Lindström
e3e657373a MDEV-21769 : galera_3nodes.galera_safe_to_bootstrap fails
Add wait_condition to wait for the correct cluster configuration.
2020-09-17 08:25:07 +03:00
Jan Lindström
fd5cbbb91e MDEV-23591 : galera_3nodes.GCF-354 MTR failed: 1047: WSREP has not yet prepared node for application use
Stabilize test.
2020-09-14 11:48:36 +03:00
Jan Lindström
bc2dbdb601 MDEV-23587 : galera_3nodes.galera_var_dirty_reads2 MTR failed: 1047: WSREP has not yet prepared node for application use
Add wait_condition to make sure the insert is replicated and the server
is back in the ready state after isolation.
2020-09-11 14:49:42 +03:00
Marko Mäkelä
5ff7e68c7e Merge 10.4 into 10.5 2020-09-04 18:44:44 +03:00
Marko Mäkelä
c9cf6b13f6 Merge 10.3 into 10.4 2020-09-03 15:53:38 +03:00
Jan Lindström
97d830565f MDEV-23574 : galera_3nodes.galera_ipv6_mariabackup_section MTR failed: Could not open '../galera/include/have_mariabackup.inc'
Fix the include and add force_restart to stabilize.
2020-08-27 14:50:21 +03:00
Jan Lindström
b3e43eeec7 Remove xtrabackup and innobackupex test cases. 2020-08-27 14:30:12 +03:00
Marko Mäkelä
d5d8756de3 Merge 10.4 into 10.5 2020-08-20 12:52:44 +03:00
Daniele Sciascia
8a6a084578 Re-record MTR tests galera_3nodes.galera_join_with_cc_{A|B|C} 2020-08-19 10:34:25 +03:00
Marko Mäkelä
1c58748196 Merge 10.4 into 10.5 2020-08-10 21:38:55 +03:00
Jan Lindström
14a5f73cda Add wait conditions for cluster size. 2020-08-04 14:15:06 +03:00
Marko Mäkelä
1813d92d0c Merge 10.4 into 10.5 2020-07-02 09:41:44 +03:00
Marko Mäkelä
f347b3e0e6 Merge 10.3 into 10.4 2020-07-02 07:39:33 +03:00
Marko Mäkelä
1df1a63924 Merge 10.2 into 10.3 2020-07-02 06:17:51 +03:00
Marko Mäkelä
ea2bc974dc Merge 10.1 into 10.2 2020-07-01 12:03:55 +03:00
Alexey Yurchenko
8e58eeba78 MTR tests to test Galera fix for node joining over several configuration changes.

This requires Galera commit 065e484144c5999709ae8fd19844da72bb785073
2020-06-24 08:10:57 +03:00
Marko Mäkelä
5203bc10f1 Merge 10.4 into 10.5 2020-03-21 11:37:10 +02:00
seppo
d529389358 MDEV-21979 Galera test sporadic failure on galera_3nodes.galera_pc_weight (#1473)
Force nodes 2 and 3 to wait for wsrep_ready to turn 'ON' before querying wsrep status variables.
This guarantees that status reads don't come too early on these nodes.
2020-03-20 15:38:37 +02:00