If the server is compiled WITH_WSREP=OFF, we should avoid evaluating
conditions on a global variable that is constant.
WSREP_ON_: Renamed from WSREP_ON. Defined only WITH_WSREP=ON.
WSREP_ON: Defined as unlikely(WSREP_ON_).
wsrep_on(): Defined as WSREP_ON && wsrep_service->wsrep_on_func().
The reason why we have wsrep_on() at all is that the macro WSREP(thd)
depends on the definition of THD, and that is intentionally an opaque
data type for InnoDB. So, we cannot avoid invoking wsrep_on(), but
we can evaluate the less expensive condition WSREP_ON before calling
the function.
- Made WSREP_LOG a function and moved the body out of header.
- Reduced the stack allocated buffer size and implemented
reprint into dynamically allocated buffer if stack buffer is not
large enough to hold the message.
Introduced a new wsrep_strict_ddl configuration variable in which
Galera checks storage engine of the effected table. If table is not
InnoDB (only storage engine currently fully supporting Galera
replication) DDL-statement will return error code:
ER_GALERA_REPLICATION_NOT_SUPPORTED
eng "DDL-statement is forbidden as table storage engine does not support Galera replication"
However, when wsrep_replicate_myisam=ON we allow DDL-statements to
MyISAM tables. If effected table is allowed storage engine Galera
will run normal TOI.
This new setting should be for now set globally on all
nodes in a cluster. When this setting is set following DDL-clauses
accessing tables not supporting Galera replication are refused:
* CREATE TABLE (e.g. CREATE TABLE t1(a int) engine=Aria
* ALTER TABLE
* TRUNCATE TABLE
* CREATE VIEW
* CREATE TRIGGER
* CREATE INDEX
* DROP INDEX
* RENAME TABLE
* DROP TABLE
Statements on PROCEDURE, EVENT, FUNCTION are allowed as effected
tables are known only at execution. Furthermore, USER, ROLE, SERVER,
DATABASE statements are also allowed as they do not really have
effected table.
Variable `wsrep_new_cluster` should be set to false after `wsrep_init_startup`.
Problem was that this was done before when mysqldump is used as SST method so option
wsrep-new-cluster didn't have any effect.
Support for galera GTID consistency thru cluster. All nodes in cluster
should have same GTID for replicated events which are originating from cluster.
Cluster originating commands need to contain sequential WSREP GTID seqno
Ignore manual setting of gtid_seq_no=X.
In master-slave scenario where master is non galera node replicated GTID is
replicated and is preserved in all nodes.
To have this - domain_id, server_id and seqnos should be same on all nodes.
Node which bootstraps the cluster, to achieve this, sends domain_id and
server_id to other nodes and this combination is used to write GTID for events
that are replicated inside cluster.
Cluster nodes that are executing non replicated events are going to have different
GTID than replicated ones, difference will be visible in domain part of gtid.
With wsrep_gtid_domain_id you can set domain_id for WSREP cluster.
Functions WSREP_LAST_WRITTEN_GTID, WSREP_LAST_SEEN_GTID and
WSREP_SYNC_WAIT_UPTO_GTID now works with "native" GTID format.
Fixed galera tests to reflect this chances.
Add variable to manually update WSREP GTID seqno in cluster
Add variable to manipulate and change WSREP GTID seqno. Next command
originating from cluster and on same thread will have set seqno and
cluster should change their internal counter to it's value.
Behavior is same as using @@gtid_seq_no for non WSREP transaction.
This is 10.4 version.
Idea is to create monitor thread for both donor and joiner that will
periodically if needed extend systemd timeout while SST is being
processed. In 10.4 actual SST is executed by running SST script
and exchanging messages on pipe using blocking fgets. This fix
starts monitoring thread before SST script is started and
we stop monitoring thread when SST has been completed.
This patch contains two fixes:
* wsrep_handle_mdl_conflict(): handle the case where SR transaction
is in aborting state. Previously, a BF-BF conflict was reported, and
the process would abort.
* wsrep_thd_bf_abort(): do not restore thread vars after calling
wsrep_bf_abort(). Thread vars are already restored in wsrep-lib if
necessary. This also removes the assumption that the caller of
wsrep_thd_bf_abort() is the given bf_thd, which is not the case.
Also in this patch:
* Remove unnecessary check for active victim transaction in
wsrep_thd_bf_abort(): the exact same check is performed later in
wsrep_bf_abort().
* Make wsrep_thd_bf_abort() and wsrep_log_thd() const-correct.
* Change signature of wsrep_abort_thd() to take THD pointers instead
of void pointers.
Patch `36a2a185fe18` introduced `wsrep_server_incoming_address()` in `10.4`.
Since AIX `/usr/include/netinet/ip.h` header defines `ip_len` as `ip_ff.ip_flen`
and `size_t const ip_len` is preprocessed as `size_t const ip_ff.ip_vhltl.ip_x.ip_xlen`,
to prevent the define from overwriting code in MariaDB,
rename the variable name to `ip_len_mdb`.
This patch is done by Tony Reix <tony.reix@atos.net>.
This patch was submitted under MCA.
Closes#1307
This PR contains a mtr test for reproducing a failure with replicating create table as select statement (CTAS) through asynchronous mariadb replication to mariadb galera cluster.
The problem happens when CTAS replication contains both create table statement followed by row events for populating the table. In such situation, the galera node operating as mariadb replication slave, will first replicate only the create table part into the cluster, and then perform another replication containing both the create table and row events. This will lead all other nodes to fail for duplicate table create attempt, and crash due to this failure.
PR contains also a fix, which identifies the situation when CTAS has been replicated, and makes further scan in async replication stream to see if there are following row events. The slave node will replicate either single TOI in case the CTAS table is empty, or if CTAS table contains rows, then single bundled write set with create table and row events is replicated to galera cluster.
This fix should keep master server's GTID's for CTAS replication in sync with GTID's in galera cluster.
wsrep_init_provider_status_variables(): Always set wsrep_inited
to ensure that the memory will be freed.
The initial patch was provided by Julius Goryavsky.
Lock wait can happen on secondary index when doing FK checks for wsrep.
We should just return error to upper layer and applier will retry
operation when needed.
* MDEV-20225 BF aborting SP execution
When stored procedure execution was chosen as victim for a BF abort, the old implemnetationn called for rollback immediately
when execution was inside SP isntruction. Technically this happened in wsrep_after_statement() call, which identified the
need for a rollback.
The problem was that MariaDB does not accept rollback (nor commit) inside sub statement, there are several asserts about it,
checking for THD::in_sub_stmt.
This patch contains a fix, which skips calling wsrep_after_statement() for SP execution, which is marked as BF must abort. Instead,
we return error code to upper level, where rollback will eventually happen, ouside of SP execution.
Also, appending the affected trigger table (dropped or created) in the populated key set for the write set,
which prevents parallel applying of other transactions working on the same table.
* MDEV-20225 BF aborting SP execution, second patch
First PR missed 4 commits, which are now squashed in this patch:
- Added galera_sp_bf_abort test.
A MTR test case which will reproduce BF-BF conflict if all keys
corresponding to affected tables are not assigned for DROP TRIGGER.
- Fixed incorrect use of sync pointsin MDEV-20225
- Added condition for SQLCOM_DROP_TRIGGER in wsrep_can_run_in_toi()
to make it replicate.
* MDEV-20225 BF aborting SP execution, third patch
The galera_trigger.test caused a situation, where SP invocation caused a trigger
to fire, and the trigger executed as sub statement SP, and was BF aborted by applier.
because of wsrep_after_statement() was called for the sub-statement level, it ended up
in exeuting rollback and asserted there.
Thus fix will catch sub-statement level SP execution, and avoids calling wsrep_after_statement()
Re-enable some Galera tests that should have been enabled.
Add client_ed25519.so to debian/libmariadb3.install;
merge e47a143fc0 correctly.
Remove a duplicated #include from wsrep_mysqld.cc.
MariaDB 10.4 was crashing when thread-handling was set to
pool-of-threads and wsrep was enabled.
There were two apparent reasons for the crash:
- Connection handling in threadpool_common.cc was missing calls to
control wsrep client state.
- Thread specific storage which contains thread variables (THR_KEY_mysys)
was not handled appropriately by wsrep patch when pool-of-threads
was configured.
This patch addresses the above issues in the following way:
- Wsrep client state open/close was moved in thd_prepare_connection() and
end_connection() to have common handling for one-thread-per-connection
and pool-of-threads.
- Thread local storage handling in wsrep patch was reworked by introducing
set of wsrep_xxx_threadvars() calls which replace calls to
THD store_globals()/reset_globals() and deal with thread handling
specifics internally.
Wsrep-lib was updated to version which relaxes internal concurrency
related sanity checks.
Rollback code from wsrep_rollback_process() was extracted to separate calls
for better readability.
Post rollback thread was removed as it was completely unused.
* Collect and pass apply error data to provider
* Rollback failed transaction and continue operation if provider returns
SUCCESS
* MTR tests for inconsistency voting
The extensive usage of stack space, especially when used with ASan
(AdressSanitizer) of mysql_execute_command caused the test
rpl.rpl_row_sp011 to fail because it did run out of stack. In this
test case mysql_execute_command is called recursively for each
function all.
Changes done:
- Changed a few functions that used big local variables to be marked
__attribute__ ((noinline))
- Moved sub parts that used big local variables to external functions.
- Changed wsrep_commit_empty() from inline to normal function as this used
more than 1K of stack space and because there is no reason for this
rarely used function to be inline.
End result (with gcc 7.4.1 on Intel Xeon):
Starting point for stack space usage:
gcc -O: 7800
gcc with -fsanitize=address -O (ASan) : 27240
After this patch:
gcc -O: 1160
gcc -O0 (debug build) 1584
gcc with -fsanitize=address -O (ASan): 4424
gcc with -fsanitize=address -O2 (ASan): 3874
A 6x improvement and will allow us to run all mtr tests with ASan.
Problem was that wsrep thread argument was deleted on wrong
place. Furthermore, scan method incorrectly used unsafe c_ptr().
Finally, fixed wsrep thread initialization to correctly set
up thread_id and pass correct argument to functions and
fix signess problem causing compiler errors.