Commit graph

335 commits

Author SHA1 Message Date
Monty
9bf479b0cf Update galera to work with independent sub transactions 2020-05-23 12:29:10 +03:00
Monty
b1fabf6cc9 Performance improvements to test if WSREP is active 2020-05-23 12:29:10 +03:00
Sergey Vojtovich
91734431ba Move all thread cache specific code to a new class
Part of
MDEV-18353 - Shutdown may miss to wait for connection thread
2020-05-06 13:50:35 +04:00
Eugene Kosov
89ff4176c1 MDEV-22437 make THR_THD* variable thread_local
Now all access goes through _current_thd() and set_current_thd()
functions.

Some functions like THD::store_globals() can not fail now.
2020-05-05 18:13:31 +03:00
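As a rough illustration of the thread_local approach described in 89ff4176c1, the sketch below keeps one pointer per thread behind two accessors. The accessor names follow the commit message; the `THD` struct and the variable name `current_thd_ptr` are hypothetical stand-ins, not the server's definitions.

```cpp
#include <cassert>

// Hypothetical stand-in for the server's THD class.
struct THD { int id; };

// One pointer per thread. The name is an assumption for this sketch.
static thread_local THD *current_thd_ptr = nullptr;

// All access goes through these two functions, as the commit describes.
THD *_current_thd() { return current_thd_ptr; }
void set_current_thd(THD *thd) { current_thd_ptr = thd; }
```

Because a thread_local variable needs no run-time key creation, setting it is a plain assignment with no failure path, which fits the commit's remark that some functions can no longer fail.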
Marko Mäkelä
fbe2712705 Merge 10.4 into 10.5
The functional changes of commit 5836191c8f
(MDEV-21168) are omitted due to MDEV-742 having addressed the issue.
2020-04-25 21:57:52 +03:00
Marko Mäkelä
2c39f69d34 MDEV-22203: WSREP_ON is unnecessarily expensive WITH_WSREP=OFF
If the server is compiled WITH_WSREP=OFF, we should avoid evaluating
conditions on a global variable that is constant.

WSREP_ON_: Renamed from WSREP_ON. Defined only WITH_WSREP=ON.

WSREP_ON: Defined as unlikely(WSREP_ON_).

wsrep_on(): Defined as WSREP_ON && wsrep_service->wsrep_on_func().

The reason why we have wsrep_on() at all is that the macro WSREP(thd)
depends on the definition of THD, and that is intentionally an opaque
data type for InnoDB. So, we cannot avoid invoking wsrep_on(), but
we can evaluate the less expensive condition WSREP_ON before calling
the function.
2020-04-24 15:25:39 +03:00
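The layering described in 2c39f69d34 can be sketched roughly as follows. This is an illustration under assumptions, not the server's code: `WITH_WSREP` is modelled as a preprocessor switch, `UNLIKELY` stands in for the server's `unlikely()`, the constant `false` for the OFF case is a guess, and `wsrep_on_func()` is a stub with a call counter instead of a call through `wsrep_service`.

```cpp
#include <cassert>

#define WITH_WSREP 1  // assumption: build option modelled as a macro

#if defined(__GNUC__)
# define UNLIKELY(x) __builtin_expect(!!(x), 0)  // stand-in for unlikely()
#else
# define UNLIKELY(x) (x)
#endif

#if WITH_WSREP
bool WSREP_ON_ = false;               // cheap global flag, defined only when ON
# define WSREP_ON UNLIKELY(WSREP_ON_)
#else
# define WSREP_ON false               // assumption: constant when compiled OFF
#endif

// Stub for the more expensive check; counts how often it is reached.
static int wsrep_on_func_calls = 0;
bool wsrep_on_func() { ++wsrep_on_func_calls; return true; }

// The cheap condition short-circuits before the function call.
bool wsrep_on() { return WSREP_ON && wsrep_on_func(); }
```

With `WSREP_ON_` false, `wsrep_on()` returns without ever invoking `wsrep_on_func()`, which is the saving the commit message is after.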
Jan Lindström
93475aff8d MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate
Replaced the WSREP_ON macro with a single global variable WSREP_ON
that is updated at server startup and by the wsrep_on and
wsrep_provider update functions.
2020-04-24 13:12:46 +03:00
Teemu Ollakka
c79051e587 MDEV-22271 Excessive stack memory usage due to WSREP_LOG
- Made WSREP_LOG a function and moved the body out of header.
- Reduced the stack allocated buffer size and implemented
  reprint into dynamically allocated buffer if stack buffer is not
  large enough to hold the message.
2020-04-17 10:46:09 +03:00
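The technique named in c79051e587 (a small stack buffer with a retry into a dynamically allocated buffer) might look like the sketch below. The function name `format_message` and the 128-byte size are assumptions for illustration; the real WSREP_LOG signature and buffer size are not given in the commit message.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: format into a small stack buffer first and
// reprint into a heap buffer only when the message does not fit.
std::string format_message(const char *fmt, ...) {
  char stack_buf[128];  // assumed size, kept deliberately small
  va_list args;
  va_start(args, fmt);
  va_list args_copy;    // a va_list cannot be reused after vsnprintf
  va_copy(args_copy, args);
  int needed = vsnprintf(stack_buf, sizeof stack_buf, fmt, args);
  va_end(args);

  std::string result;
  if (needed < 0) {
    result = "(formatting error)";
  } else if (static_cast<size_t>(needed) < sizeof stack_buf) {
    result.assign(stack_buf, static_cast<size_t>(needed));
  } else {
    // Message was truncated: allocate exactly what is needed and reprint.
    std::vector<char> heap_buf(static_cast<size_t>(needed) + 1);
    vsnprintf(heap_buf.data(), heap_buf.size(), fmt, args_copy);
    result.assign(heap_buf.data(), static_cast<size_t>(needed));
  }
  va_end(args_copy);
  return result;
}
```

Short messages never touch the heap, while long ones cost a second formatting pass instead of a permanently large stack frame in every caller.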
Jan Lindström
c7ab676192 MDEV-22075 : Server crashes in wsrep_should_replicate_ddl_iterate upon CREATE VIEW
Fixed incorrect pointer reference when table is not available.
2020-04-08 18:09:28 +03:00
mkaruza
edc3899d97 MDEV-22051: Protocol::end_statement(): Assertion `0' failed on Galera node upon DDL attempt with conflicting lock
If FTWRL is issued, DDL statements should report error back to user before
TOI is started.
2020-04-08 16:42:18 +03:00
Sergei Golubchik
70e7b5095d perfschema sp instrumentation related changes 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Jan Lindström
e6a50e41da MDEV-20051: Add new mode to wsrep_OSU_method in which Galera checks storage engine of the affected table
Introduced a new wsrep_strict_ddl configuration variable with which
Galera checks the storage engine of the affected table. If the table is not
InnoDB (the only storage engine currently fully supporting Galera
replication), the DDL statement will return the error code:

ER_GALERA_REPLICATION_NOT_SUPPORTED
       eng "DDL-statement is forbidden as table storage engine does not support Galera replication"

However, when wsrep_replicate_myisam=ON we allow DDL statements on
MyISAM tables. If the affected table uses an allowed storage engine, Galera
will run normal TOI.

For now, this new setting should be set globally on all
nodes in a cluster. When it is set, the following DDL statements
accessing tables that do not support Galera replication are refused:

* CREATE TABLE (e.g. CREATE TABLE t1(a int) engine=Aria)
* ALTER TABLE
* TRUNCATE TABLE
* CREATE VIEW
* CREATE TRIGGER
* CREATE INDEX
* DROP INDEX
* RENAME TABLE
* DROP TABLE

Statements on PROCEDURE, EVENT, FUNCTION are allowed as affected
tables are known only at execution. Furthermore, USER, ROLE, SERVER,
DATABASE statements are also allowed as they do not really have an
affected table.
2020-02-11 15:17:50 +02:00
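The engine check described in e6a50e41da reduces to a small decision rule, sketched below. The struct and function names are hypothetical, and engines are compared as plain strings only for illustration; the server's actual check is not shown in the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical view of the two settings named in the commit message.
struct StrictDdlConfig {
  bool wsrep_strict_ddl;
  bool wsrep_replicate_myisam;
};

// Returns true when a DDL statement on a table with the given engine
// would be allowed to proceed to normal TOI under the stated rules.
bool ddl_allowed(const StrictDdlConfig &cfg, const std::string &engine) {
  if (!cfg.wsrep_strict_ddl) return true;   // strict mode off: no check
  if (engine == "InnoDB") return true;      // fully supported engine
  if (engine == "MyISAM" && cfg.wsrep_replicate_myisam) return true;
  return false;  // would yield ER_GALERA_REPLICATION_NOT_SUPPORTED
}
```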
Alexander Barkov
83e75b39b3 MDEV-21702 Add a data type for privileges 2020-02-11 08:10:26 +04:00
mkaruza
74f7620636 MDEV-21598 Galera test galera.galera_sst_mysqldump does not take wsrep-new-cluster into account
Variable `wsrep_new_cluster` should be set to false only after `wsrep_init_startup`.
The problem was that, when mysqldump is used as the SST method, it was reset before that point,
so the option wsrep-new-cluster didn't have any effect.
2020-01-30 14:53:42 +02:00
mkaruza
41bc736871 Galera GTID support
Support for Galera GTID consistency throughout the cluster. All nodes in the cluster
should have the same GTID for replicated events originating from the cluster.
Cluster-originating commands need to contain a sequential WSREP GTID seqno.
Manual setting of gtid_seq_no=X is ignored.

In a master-slave scenario where the master is a non-Galera node, the replicated GTID
is preserved on all nodes.

To have this - domain_id, server_id and seqnos should be same on all nodes.
Node which bootstraps the cluster, to achieve this, sends domain_id and
server_id to other nodes and this combination is used to write GTID for events
that are replicated inside cluster.

Cluster nodes that execute non-replicated events will have different GTIDs
for them than for replicated ones; the difference is visible in the domain part of the GTID.

With wsrep_gtid_domain_id you can set domain_id for WSREP cluster.

Functions WSREP_LAST_WRITTEN_GTID, WSREP_LAST_SEEN_GTID and
WSREP_SYNC_WAIT_UPTO_GTID now work with the "native" GTID format.

Fixed Galera tests to reflect these changes.

Add variable to manually update WSREP GTID seqno in cluster

Add variable to manipulate and change WSREP GTID seqno. The next command
originating from the cluster on the same thread will have the set seqno, and
the cluster should change its internal counter to that value.
Behavior is the same as using @@gtid_seq_no for a non-WSREP transaction.
2020-01-29 15:06:06 +02:00
Marko Mäkelä
a983b24407 Merge 10.4 into 10.5 2020-01-28 14:17:09 +02:00
Jan Lindström
8a931e4d16 MDEV-17571 : Make systemd timeout behavior more compatible with long Galera SSTs
This is 10.4 version.

The idea is to create a monitor thread for both donor and joiner that
periodically extends the systemd timeout, if needed, while SST is being
processed. In 10.4 the actual SST is executed by running an SST script
and exchanging messages over a pipe using blocking fgets. This fix
starts the monitoring thread before the SST script is started and
stops it when SST has completed.
2020-01-22 16:55:59 +02:00
Marko Mäkelä
8cc15c036d Merge 10.4 into 10.5 2019-12-27 21:17:16 +02:00
Jan Lindström
088de81d96 MDEV-21335 : Galera test failure on suite wsrep
Problem was that wsrep_on was OFF.

This is 10.4 version.
2019-12-18 08:22:07 +02:00
Marko Mäkelä
3c7718150d Merge 10.4 into 10.5 2019-12-17 14:46:57 +02:00
Teemu Ollakka
67e063eb94 Update wsrep-lib. (#1426)
This commit updates the wsrep-lib. The changes are a cleanup in
client_state TOI processing and stub methods for future extensions.
2019-12-16 07:50:15 +02:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Oleksandr Byelkin
008ee867a4 Merge branch '10.2' into 10.3 2019-12-04 17:46:28 +01:00
Daniele Sciascia
aab6cefe8d MDEV-20848 Fixes for MTR test galera_sr.GCF-1060 (#1421)
This patch contains two fixes:

* wsrep_handle_mdl_conflict(): handle the case where SR transaction
  is in aborting state. Previously, a BF-BF conflict was reported, and
  the process would abort.
* wsrep_thd_bf_abort(): do not restore thread vars after calling
  wsrep_bf_abort(). Thread vars are already restored in wsrep-lib if
  necessary. This also removes the assumption that the caller of
  wsrep_thd_bf_abort() is the given bf_thd, which is not the case.

Also in this patch:

* Remove unnecessary check for active victim transaction in
  wsrep_thd_bf_abort(): the exact same check is performed later in
  wsrep_bf_abort().
* Make wsrep_thd_bf_abort() and wsrep_log_thd() const-correct.
* Change signature of wsrep_abort_thd() to take THD pointers instead
  of void pointers.
2019-12-04 09:21:14 +02:00
Oleksandr Byelkin
f8b5e147da Merge branch '10.1' into 10.2 2019-12-03 14:45:06 +01:00
Marko Mäkelä
5a00792c69 Merge 10.4 into 10.5 2019-11-29 11:25:40 +02:00
Tony Reix
ad5b7b157b MDEV-19510 Issue with: sql/wsrep_mysqld.cc : ip_len
Patch `36a2a185fe18` introduced `wsrep_server_incoming_address()` in `10.4`.
Since AIX `/usr/include/netinet/ip.h` header defines `ip_len` as `ip_ff.ip_flen`
and `size_t const ip_len` is preprocessed as `size_t const ip_ff.ip_vhltl.ip_x.ip_xlen`,
to prevent the define from overwriting code in MariaDB,
rename the variable name to `ip_len_mdb`.

This patch is done by Tony Reix <tony.reix@atos.net>.
This patch was submitted under MCA.
Closes #1307
2019-11-29 07:01:39 +01:00
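The kind of collision fixed in ad5b7b157b can be reproduced in miniature as below. The `ip_stub` struct and the `#define` merely imitate a system header that claims a short name as a macro; they are not the AIX definitions, and `address_length` is a hypothetical function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Imitation of a system header: a struct field exposed through a macro.
struct ip_stub { struct { unsigned short ip_flen; } ip_ff; };
#define ip_len ip_ff.ip_flen

// Declaring `size_t const ip_len = ...;` here would expand to
// `size_t const ip_ff.ip_flen = ...;` and fail to compile.
// Renaming the local variable sidesteps the macro entirely.
size_t address_length(const char *addr) {
  size_t const ip_len_mdb = std::strlen(addr);
  return ip_len_mdb;
}
```

Renaming the variable, as the commit does, is less intrusive than an `#undef`, because code that relies on the header's macro keeps working.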
seppo
5c68343db7 MDEV-18497 CTAS async replication from mariadb master crashes galera nodes (#1410)
This PR contains a mtr test for reproducing a failure with replicating create table as select statement (CTAS) through asynchronous mariadb replication to mariadb galera cluster.
The problem happens when CTAS replication contains both create table statement followed by row events for populating the table. In such situation, the galera node operating as mariadb replication slave, will first replicate only the create table part into the cluster, and then perform another replication containing both the create table and row events. This will lead all other nodes to fail for duplicate table create attempt, and crash due to this failure.

PR contains also a fix, which identifies the situation when CTAS has been replicated, and makes further scan in async replication stream to see if there are following row events. The slave node will replicate either single TOI in case the CTAS table is empty, or if CTAS table contains rows, then single bundled write set with create table and row events is replicated to galera cluster.

This fix should keep the master server's GTIDs for CTAS replication in sync with GTIDs in the Galera cluster.
2019-11-18 15:18:00 +02:00
Oleksandr Byelkin
3ad37ed0eb Merge 10.4 into 10.5 2019-11-07 08:52:30 +01:00
Marko Mäkelä
ec40980ddd Merge 10.3 into 10.4 2019-11-01 15:23:18 +02:00
Marko Mäkelä
6801f80afa MDEV-19457 sys_vars.wsrep_provider_basic failed
wsrep_init_provider_status_variables(): Always set wsrep_inited
to ensure that the memory will be freed.

The initial patch was provided by Julius Goryavsky.
2019-11-01 15:19:13 +02:00
Oleksandr Byelkin
55b2281a5d Merge branch '10.2' into 10.3 2019-10-31 10:58:06 +01:00
Jan Lindström
36a9694378 MDEV-18562 [ERROR] InnoDB: WSREP: referenced FK check fail: Lock wait index
Lock wait can happen on secondary index when doing FK checks for wsrep.
We should just return error to upper layer and applier will retry
operation when needed.
2019-10-30 10:14:56 +02:00
Michael Widenius
716d396bb3 Remove \n from DBUG_PRINT statements 2019-10-21 18:41:58 +03:00
Marko Mäkelä
627027a674 Merge 10.4 into 10.5 2019-10-04 10:56:47 +03:00
seppo
c42c4233cb MDEV-20225 BF aborting SP execution (#1394)
* MDEV-20225 BF aborting SP execution

When stored procedure execution was chosen as victim for a BF abort, the old implementation called for rollback immediately
when execution was inside an SP instruction. Technically this happened in the wsrep_after_statement() call, which identified the
need for a rollback.
The problem was that MariaDB does not accept rollback (nor commit) inside sub statement, there are several asserts about it,
checking for THD::in_sub_stmt.

This patch contains a fix, which skips calling wsrep_after_statement() for SP execution that is marked as BF must abort. Instead,
we return an error code to the upper level, where rollback will eventually happen, outside of SP execution.
The patch also appends the affected trigger table (dropped or created) to the populated key set for the write set,
which prevents parallel applying of other transactions working on the same table.

* MDEV-20225 BF aborting SP execution, second patch

First PR missed 4 commits, which are now squashed in this patch:
- Added galera_sp_bf_abort test.
  A MTR test case which will reproduce BF-BF conflict if all keys
  corresponding to affected tables are not assigned for DROP TRIGGER.
- Fixed incorrect use of sync points in MDEV-20225
- Added condition for SQLCOM_DROP_TRIGGER in wsrep_can_run_in_toi()
  to make it replicate.

* MDEV-20225 BF aborting SP execution, third patch

The galera_trigger.test caused a situation where SP invocation caused a trigger
to fire, and the trigger executed as a sub-statement SP and was BF aborted by the applier.
Because wsrep_after_statement() was called at the sub-statement level, it ended up
executing rollback and asserted there.
This fix catches sub-statement level SP execution and avoids calling wsrep_after_statement().
2019-10-01 10:41:33 +03:00
Marko Mäkelä
d28686ada6 Merge 10.4 into 10.5 2019-09-12 16:36:46 +03:00
Marko Mäkelä
60c04be659 Merge 10.3 into 10.4 2019-09-12 12:16:40 +03:00
Marko Mäkelä
ff5ecfd335 Correct the merge 0f83c8878d
Re-enable some Galera tests that should have been enabled.

Add client_ed25519.so to debian/libmariadb3.install;
merge e47a143fc0 correctly.

Remove a duplicated #include from wsrep_mysqld.cc.
2019-09-10 10:04:04 +03:00
Marko Mäkelä
780d2bb8a7 Merge 10.4 into 10.5 2019-09-06 14:25:20 +03:00
Teemu Ollakka
9487e0b259 MDEV-19826 10.4 seems to crash with "pool-of-threads" (#1370)
MariaDB 10.4 was crashing when thread-handling was set to
pool-of-threads and wsrep was enabled.

There were two apparent reasons for the crash:
- Connection handling in threadpool_common.cc was missing calls to
  control wsrep client state.
- Thread specific storage which contains thread variables (THR_KEY_mysys)
  was not handled appropriately by wsrep patch when pool-of-threads
  was configured.

This patch addresses the above issues in the following way:
- Wsrep client state open/close was moved in thd_prepare_connection() and
  end_connection() to have common handling for one-thread-per-connection
  and pool-of-threads.
- Thread local storage handling in wsrep patch was reworked by introducing
  set of wsrep_xxx_threadvars() calls which replace calls to
  THD store_globals()/reset_globals() and deal with thread handling
  specifics internally.

Wsrep-lib was updated to version which relaxes internal concurrency
related sanity checks.

Rollback code from wsrep_rollback_process() was extracted to separate calls
for better readability.

Post rollback thread was removed as it was completely unused.
2019-08-30 08:42:24 +03:00
Alexey Yurchenko
41fa564c88 MDEV-17048 Inconsistency voting support (#1373)
* Collect and pass apply error data to provider
* Rollback failed transaction and continue operation if provider returns
  SUCCESS
* MTR tests for inconsistency voting
2019-08-28 09:19:24 +03:00
Monty
1fbaf8b6a8 Decrease stack space usage of mysql_execute_command()
The extensive usage of stack space by mysql_execute_command(), especially
with ASan (AddressSanitizer), caused the test
rpl.rpl_row_sp011 to fail because it ran out of stack. In this
test case mysql_execute_command is called recursively for each
function call.

Changes done:
- Changed a few functions that used big local variables to be marked
  __attribute__ ((noinline))
- Moved sub parts that used big local variables to external functions.
- Changed wsrep_commit_empty() from inline to normal function as this used
  more than 1K of stack space and because there is no reason for this
  rarely used function to be inline.

End result (with gcc 7.4.1 on Intel Xeon):

Starting point for stack space usage:

gcc -O:                                  7800
gcc with -fsanitize=address -O (ASan) : 27240

After this patch:

gcc -O:                                  1160
gcc -O0 (debug build)                    1584
gcc with -fsanitize=address -O (ASan):   4424
gcc with -fsanitize=address -O2 (ASan):  3874

A 6x improvement and will allow us to run all mtr tests with ASan.
2019-08-23 22:06:30 +02:00
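The first two changes listed in 1fbaf8b6a8 follow a common pattern: move the code that needs a large local buffer into a separate function and keep the compiler from inlining it back. The sketch below is only an illustration; `describe_rare_case` and `execute` are hypothetical names, and the 4096-byte buffer is an arbitrary size.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

#if defined(__GNUC__)
# define NOINLINE __attribute__((noinline))
#else
# define NOINLINE
#endif

// The big buffer lives in this helper's frame, which exists only
// while the rarely taken path actually runs.
NOINLINE static size_t describe_rare_case(int code) {
  char big_buf[4096];
  int n = snprintf(big_buf, sizeof big_buf, "rare case %d", code);
  return n > 0 ? static_cast<size_t>(n) : 0;
}

// Recursive caller: each level now carries only a small frame,
// because the large local is no longer part of it.
size_t execute(int depth, int code) {
  if (depth > 0) return execute(depth - 1, code);
  return code < 0 ? describe_rare_case(code) : 0;
}
```

Without the attribute an optimizing compiler may inline the helper and merge its buffer into the caller's frame, which would bring back the per-recursion cost.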
Jan Lindström
7b4de10477 MDEV-20378: Galera uses uninitialized memory
Problem was that the wsrep thread argument was deleted in the wrong
place. Furthermore, the scan method incorrectly used unsafe c_ptr().
Finally, fixed wsrep thread initialization to correctly set
up thread_id and pass the correct argument to functions, and
fixed a signedness problem causing compiler errors.
2019-08-20 10:32:04 +03:00
Marko Mäkelä
67ddb6507d Merge 10.4 into 10.5 2019-08-16 14:35:32 +03:00
Marko Mäkelä
1d15a28e52 Merge 10.3 into 10.4 2019-08-14 18:06:51 +03:00
Marko Mäkelä
65d48b4a7b Merge 10.2 to 10.3 2019-08-13 19:28:51 +03:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00