370 commits

Author SHA1 Message Date
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Sergei Golubchik
2696538723 updating @@wsrep_cluster_address deadlocks
wsrep_cluster_address_update() causes LOCK_wsrep_slave_threads
to be locked under LOCK_wsrep_cluster_config, while normally
the order should be the opposite.

Fix: don't protect @@wsrep_cluster_address value with the
LOCK_wsrep_cluster_config, LOCK_global_system_variables is enough.

Only protect wsrep reinitialization with the LOCK_wsrep_cluster_config.
And make it use a local copy of the global @@wsrep_cluster_address.

Also, introduce a helper function that checks whether
wsrep_cluster_address is set and also asserts that it can be safely
read by the caller.
2021-02-14 23:18:42 +01:00
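The locking change described above can be illustrated with a minimal C++ sketch. It is not the server code: the mutex and variable names are hypothetical stand-ins for LOCK_global_system_variables, LOCK_wsrep_cluster_config and @@wsrep_cluster_address. The point is that the value is guarded by one lock only, and reinitialization works on a local copy, so the two locks are never held at the same time.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-ins for the server's mutexes and global variable.
std::mutex sketch_lock_global_sysvars;   // guards sketch_cluster_address
std::mutex sketch_lock_cluster_config;   // guards reinitialization state
std::string sketch_cluster_address;
std::string sketch_active_address;
int sketch_reinit_count = 0;

// The value itself is protected by the system-variables lock only.
void sketch_set_cluster_address(const std::string &value)
{
  std::lock_guard<std::mutex> guard(sketch_lock_global_sysvars);
  sketch_cluster_address = value;
}

// Take a private copy; the lock is released when the function returns.
std::string sketch_copy_cluster_address()
{
  std::lock_guard<std::mutex> guard(sketch_lock_global_sysvars);
  return sketch_cluster_address;
}

// Reinitialize from the local copy while holding only the config lock,
// so no lock-order inversion between the two mutexes is possible.
void sketch_reinit_from_address()
{
  const std::string local = sketch_copy_cluster_address();
  std::lock_guard<std::mutex> guard(sketch_lock_cluster_config);
  sketch_active_address = local;
  ++sketch_reinit_count;
}
```

A caller would first call sketch_set_cluster_address() and then sketch_reinit_from_address(); neither function ever acquires one mutex while holding the other.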
Sergei Golubchik
259a1902a0 cleanup: THD::abort_current_cond_wait()
* reuse the loop in THD::abort_current_cond_wait, don't duplicate it
* find_thread_by_id should return whatever it has found, it's the
  caller's task not to kill COM_DAEMON (if the caller's a killer)

and other minor changes
2021-02-12 18:05:34 +01:00
Marko Mäkelä
8de233af81 Merge 10.4 into 10.5 2021-01-11 16:29:51 +02:00
Jan Lindström
49b8774951 MDEV-24546 : AddressSanitizer: initialization-order-fiasco on address ... in Sys_var_integer from __static_initialization_and_destruction_0, possibly inside global var wsrep_gtid_server
The Galera parameter wsrep_gtid_domain_id was defined using a class in which
the actual parameter was not the first member. Fixed by using a normal
variable and assigning its value to the class member.
2021-01-09 09:03:39 +02:00
Alexey Yurchenko
033f8d13ce Update wsrep-lib (new logger interface)
Ensure consistent use of logging macros in wsrep-related code

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-01-07 17:41:21 +02:00
Oleksandr Byelkin
02e7bff882 Merge commit '10.4' into 10.5 2021-01-06 10:53:00 +01:00
mkaruza
b79b3ff655 MDEV-23468: inline_mysql_socket_send: Assertion `mysql_socket.fd != -1' failed on shutdown
Closing remaining threads in `wsrep_close_client_connections` should also
check `thd_is_connection_alive` for the thd before closing the connection. The assertion
fires when a thread is already shutting down but has not yet been removed from the thread
list.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-12-21 11:24:24 +02:00
Marko Mäkelä
6a1e655cb0 Merge 10.4 into 10.5 2020-12-02 18:29:49 +02:00
Marko Mäkelä
24ec8eaf66 MDEV-15532 after-merge fixes from Monty
The Galera tests were massively failing with debug assertions.
2020-12-02 16:16:29 +02:00
Marko Mäkelä
d7a5824899 Merge 10.4 into 10.5 2020-11-13 21:54:21 +02:00
sjaakola
2fbcddbeaf MDEV-24119 MDL BF-BF Conflict caused by TRUNCATE TABLE
A follow-up to the original fix for MDEV-21577, which did not handle
temporary tables well.

OPTIMIZE and REPAIR TABLE statements can take a list of tables as argument,
and some of the tables may be temporary. Proper handling of temporary tables
is to skip them and continue working on the real tables. The bad version skipped all tables
if a single temporary table was in the argument list. As a result, FK parent
tables were not scanned for the remaining real tables. Some mtr tests using OPTIMIZE or REPAIR
on temporary tables regressed because of this, e.g. galera.galera_optimize_analyze_multi

The fix in this PR opens temporary and real tables first, and the table handling loop skips
temporary tables; FK parent scanning is done only for real tables.

The test has a new scenario for OPTIMIZE and REPAIR issued for two tables, one of which is a temporary table.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-11-11 17:46:50 +02:00
sjaakola
4d6c661144 MDEV-21577 MDL BF-BF conflict
Some DDL statements appear to acquire MDL locks for a table referenced by
foreign key constraint from the actual affected table of the DDL statement.
OPTIMIZE, REPAIR and ALTER TABLE belong to this class of DDL statements.

Earlier MariaDB versions did not take this into consideration and appended
only the affected table to the certification key list in the write set.
Because of the missing certification information, it could happen that e.g.
OPTIMIZE TABLE for an FK child table was allowed to apply in parallel
with DML operating on the foreign key parent table, and this could lead to
unhandled MDL lock conflicts between two high priority appliers (BF).

The fix in this patch changes the TOI replication for OPTIMIZE, REPAIR and
ALTER TABLE statements so that, before the respective DDL statement is
executed, there is a foreign key parent search round. This FK parent search
consists of the following steps:
* open and lock the affected table (with permissive shared locks)
* iterate over foreign key constraints and collect an array of FK parent
  table names
* close all tables open for the THD and release MDL locks
* do the actual TOI replication with the affected table and FK parent
  table names as key values

The patch also contains a new mtr test for verifying that the above-mentioned
DDL statements replicate without problems when operating on an FK child table.
The mtr test scenario #1 can be used to check whether some other DDL
(besides OPTIMIZE, REPAIR and ALTER) could cause similar excessive FK
parent table locking.

Reviewed-by: Aleksey Midenkov <aleksey.midenkov@mariadb.com>
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-11-03 19:40:06 +02:00
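The FK parent search steps listed in the MDEV-21577 message can be sketched as follows. This is an illustrative C++ model only: SketchTable, SketchForeignKey and sketch_collect_toi_keys are hypothetical names, and opening, locking and closing tables is left out. It shows just the outcome of the search, namely a key list containing the affected table plus each FK parent table once.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of a table and its foreign keys.
struct SketchForeignKey
{
  std::string parent_table;
};

struct SketchTable
{
  std::string name;
  std::vector<SketchForeignKey> foreign_keys;
};

// Build the key list for TOI replication: the affected table first,
// then every distinct FK parent table it references.
std::vector<std::string> sketch_collect_toi_keys(const SketchTable &affected)
{
  std::vector<std::string> keys;
  std::set<std::string> seen;

  auto add_key = [&](const std::string &table_name) {
    if (seen.insert(table_name).second)   // true only for a new name
      keys.push_back(table_name);
  };

  add_key(affected.name);
  for (const SketchForeignKey &fk : affected.foreign_keys)
    add_key(fk.parent_table);
  return keys;
}
```

With both the child and its parents in the key list, certification can detect a conflict between the DDL and DML on a parent table instead of letting both apply in parallel.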
Jan Lindström
c5517cd864 MDEV-23638 : DROP TRIGGER in Galera Cluster not replicating
Drop trigger handling was missing from wsrep_can_run_in_toi
in 10.5 for some reason.
2020-09-08 11:14:37 +03:00
Marko Mäkelä
5ff7e68c7e Merge 10.4 into 10.5 2020-09-04 18:44:44 +03:00
Marko Mäkelä
c9cf6b13f6 Merge 10.3 into 10.4 2020-09-03 15:53:38 +03:00
Jan Lindström
33ae1616e0 MDEV-21578 : CREATE OR REPLACE TRIGGER in Galera cluster not replicating
In 10.3 OR REPLACE trigger option is part of create_info.
2020-09-03 14:10:42 +03:00
Marko Mäkelä
c3752cef3c Merge 10.2 into 10.3 2020-09-03 09:26:54 +03:00
Jan Lindström
c710c450e3 MDEV-21578 : CREATE OR REPLACE TRIGGER in Galera cluster not replicating
While building the TOI buffer, the OR REPLACE option was not added to the
replicated string.
2020-08-28 16:40:12 +03:00
Marko Mäkelä
d5d8756de3 Merge 10.4 into 10.5 2020-08-20 12:52:44 +03:00
Daniele Sciascia
09dd06f14a MDEV-22443 wsrep::runtime_error on START TRANSACTION
This happens with global wsrep_on disabled and local wsrep_on enabled.
The fix consists in avoiding sync wait when global wsrep_on is
disabled.
2020-08-19 13:12:00 +03:00
Monty
61c15ebe32 Remove String::lex_string() and String::lex_cstring()
- Better to use 'String *' directly.
- Added String::get_value(LEX_STRING*) for the few cases where we want to
  convert a String to LEX_CSTRING.

Other things:
- Use StringBuffer for some functions to avoid mallocs
2020-07-23 10:54:32 +03:00
Eugene Kosov
d712956526 MDEV-19749 MDL scalability regression after backup locks
use ilist instead of I_P_List because it's generally
slightly faster on inserting, removing and iterating
2020-06-23 23:34:42 +03:00
Marko Mäkelä
b3e395a13e Merge 10.2 into 10.3 2020-06-06 18:50:25 +03:00
Julius Goryavsky
5f55f69e4a Merge 10.1 into 10.2 2020-06-05 18:32:37 +02:00
sjaakola
8ec0e9111a MDEV-22763 backporting MDEV-20225 fix into 10.1
Backported the support for aborting and replaying stored procedures and the fix for trigger
key assignments from the 10.4 version.
Also backported two mtr tests: wsrep_sp_bf_abort and MDEV-20225
2020-06-03 15:34:44 +02:00
Monty
9bf479b0cf Update galera to work with independent sub transactions 2020-05-23 12:29:10 +03:00
Monty
b1fabf6cc9 Performance improvements to test if WSREP is active 2020-05-23 12:29:10 +03:00
Sergey Vojtovich
91734431ba Move all thread cache specific code to a new class
Part of
MDEV-18353 - Shutdown may miss to wait for connection thread
2020-05-06 13:50:35 +04:00
Eugene Kosov
89ff4176c1 MDEV-22437 make THR_THD* variable thread_local
Now all access goes through _current_thd() and set_current_thd()
functions.

Some functions like THD::store_globals() can not fail now.
2020-05-05 18:13:31 +03:00
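A minimal sketch of the idea behind MDEV-22437, assuming a plain C++11 thread_local pointer behind a getter and a setter. The names below are hypothetical and only mirror the _current_thd() and set_current_thd() functions mentioned in the message; the helper that starts a thread exists purely to show that each thread has its own slot.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for the server's THD class.
struct SketchThd
{
  int id;
};

// One pointer per thread; no key creation or lookup that could fail.
static thread_local SketchThd *sketch_thr_thd = nullptr;

SketchThd *sketch_current_thd() { return sketch_thr_thd; }
void sketch_set_current_thd(SketchThd *thd) { sketch_thr_thd = thd; }

// Illustration only: report what a freshly started thread sees.
SketchThd *sketch_current_thd_in_new_thread()
{
  SketchThd *seen = nullptr;
  std::thread worker([&seen] { seen = sketch_current_thd(); });
  worker.join();
  return seen;
}
```

Because assigning a thread_local variable cannot fail, a setter built this way has no error path to report, which is consistent with the note that functions like THD::store_globals() can not fail now.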
Marko Mäkelä
fbe2712705 Merge 10.4 into 10.5
The functional changes of commit 5836191c8f
(MDEV-21168) are omitted due to MDEV-742 having addressed the issue.
2020-04-25 21:57:52 +03:00
Marko Mäkelä
2c39f69d34 MDEV-22203: WSREP_ON is unnecessarily expensive WITH_WSREP=OFF
If the server is compiled WITH_WSREP=OFF, we should avoid evaluating
conditions on a global variable that is constant.

WSREP_ON_: Renamed from WSREP_ON. Defined only WITH_WSREP=ON.

WSREP_ON: Defined as unlikely(WSREP_ON_).

wsrep_on(): Defined as WSREP_ON && wsrep_service->wsrep_on_func().

The reason why we have wsrep_on() at all is that the macro WSREP(thd)
depends on the definition of THD, and that is intentionally an opaque
data type for InnoDB. So, we cannot avoid invoking wsrep_on(), but
we can evaluate the less expensive condition WSREP_ON before calling
the function.
2020-04-24 15:25:39 +03:00
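The layering described in the message above can be sketched roughly like this. All names carry a sketch prefix and are hypothetical; the build switch SKETCH_WITHOUT_WSREP and the constant false used when it is defined are assumptions made for illustration, not the server's definitions. The sketch shows the cheap flag test short-circuiting the more expensive service call.

```cpp
#include <cassert>

// Branch hint, assuming a GCC/Clang style builtin is available.
#if defined(__GNUC__) || defined(__clang__)
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define sketch_unlikely(x) (x)
#endif

bool sketch_wsrep_on_flag = false;   // stand-in for the global WSREP_ON_
int sketch_service_calls = 0;        // counts the expensive calls

// Stand-in for wsrep_service->wsrep_on_func().
bool sketch_service_wsrep_on()
{
  ++sketch_service_calls;
  return true;
}

#ifdef SKETCH_WITHOUT_WSREP
// Assumption: without wsrep support the condition is a compile-time constant.
#define SKETCH_WSREP_ON false
#else
#define SKETCH_WSREP_ON sketch_unlikely(sketch_wsrep_on_flag)
#endif

// The cheap condition is evaluated first; the function call only
// happens when the flag is set.
inline bool sketch_wsrep_on()
{
  return SKETCH_WSREP_ON && sketch_service_wsrep_on();
}
```

While the flag is false, sketch_wsrep_on() returns without ever calling the service function, which is the saving the commit message describes.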
Jan Lindström
93475aff8d MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate
Replaced the WSREP_ON macro with a single global variable WSREP_ON
that is then updated at server startup and in the wsrep_on and
wsrep_provider update functions.
2020-04-24 13:12:46 +03:00
Marko Mäkelä
e52a36d37b Merge 10.1 into 10.2 2020-04-22 13:50:53 +03:00
Marko Mäkelä
7198c6ab2d MDEV-22271 Excessive stack memory usage due to WSREP_LOG
Several tests that involve stored procedures fail on 10.4 kvm-asan
(clang 10) due to stack overrun. The main contributor to this stack
overrun is mysql_execute_command(), which is invoked recursively
during stored procedure execution.

Rebuilding with cmake -DWITH_WSREP=OFF shrunk the stack frame size
of mysql_execute_command() by more than 10 kilobytes in a
WITH_ASAN=ON, CMAKE_BUILD_TYPE=Debug build. The culprit
turned out to be the macro WSREP_LOG, which is allocating a
separate 1KiB buffer for every occurrence.

We replace the macro with a function, so that the stack will be
allocated only when the function is actually invoked. In this way,
no stack space will be wasted by default (when WSREP and Galera
are disabled).

This backports commit b6c5657ef2
from MariaDB 10.3.1.

Without ASAN, compilers can be smarter and optimize the stack usage.
The original commit message mentions that 1KiB was saved on GCC 5.4,
and 4KiB on Mac OS X Lion, which presumably uses a clang-based compiler.
2020-04-17 10:54:56 +03:00
Teemu Ollakka
c79051e587 MDEV-22271 Excessive stack memory usage due to WSREP_LOG
- Made WSREP_LOG a function and moved the body out of header.
- Reduced the stack allocated buffer size and implemented
  reprint into dynamically allocated buffer if stack buffer is not
  large enough to hold the message.
2020-04-17 10:46:09 +03:00
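The approach in the two MDEV-22271 messages (a function instead of a macro, a small stack buffer, and a retry into a dynamically allocated buffer) can be sketched in C++ as below. The function name and the 128-byte buffer size are assumptions for illustration; the actual WSREP_LOG implementation may differ.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Format a message. The stack buffer lives only while this function runs,
// so callers do not pay for it in their own stack frames.
std::string sketch_log_format(const char *fmt, ...)
{
  char small[128];                       // assumed size, not the server's

  va_list args;
  va_start(args, fmt);
  va_list retry;
  va_copy(retry, args);                  // the first vsnprintf consumes args
  const int needed = std::vsnprintf(small, sizeof small, fmt, args);
  va_end(args);

  std::string out;
  if (needed < 0)
  {
    out = "(format error)";
  }
  else if (static_cast<std::size_t>(needed) < sizeof small)
  {
    out.assign(small, static_cast<std::size_t>(needed));
  }
  else
  {
    // Message did not fit: print again into a heap buffer of the right size.
    std::vector<char> big(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(big.data(), big.size(), fmt, retry);
    out.assign(big.data(), static_cast<std::size_t>(needed));
  }
  va_end(retry);
  return out;
}
```

Short messages never touch the heap, and long ones cost a single allocation sized from the first vsnprintf return value.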
Jan Lindström
c7ab676192 MDEV-22075 : Server crashes in wsrep_should_replicate_ddl_iterate upon CREATE VIEW
Fixed incorrect pointer reference when table is not available.
2020-04-08 18:09:28 +03:00
mkaruza
edc3899d97 MDEV-22051: Protocol::end_statement(): Assertion `0' failed on Galera node upon DDL attempt with conflicting lock
If FTWRL is issued, DDL statements should report error back to user before
TOI is started.
2020-04-08 16:42:18 +03:00
Marko Mäkelä
1a9b6c4c7f Merge 10.2 into 10.3 2020-03-30 11:12:56 +03:00
seppo
5918b17004 MDEV-21473 conflicts with async slave BF aborting (#1475)
If an async slave thread (slave SQL handler) becomes a BF victim, it may occasionally happen that the rollbacker thread is used to carry out the rollback instead of the async slave thread.
This can happen if the async slave thread has flagged the "idle" state when the BF thread tries to figure out how to kill the victim.
The issue could be tested by using a Galera cluster as a slave for an external master and issuing a high load of conflicting writes through async replication and directly against the Galera cluster nodes.
However, a deterministic mtr test for the "conflict window" has not yet been worked on.

The fix in this patch makes sure that the async slave thread state is never set to IDLE. This prevents the rollbacker thread from intervening.
The wsrep_query_state change was refactored into a dedicated function so that the idle state change is controlled in one place.
2020-03-24 11:01:42 +02:00
Sergei Golubchik
70e7b5095d perfschema sp instrumentation related changes 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Jan Lindström
e6a50e41da MDEV-20051: Add new mode to wsrep_OSU_method in which Galera checks storage engine of the affected table
Introduced a new wsrep_strict_ddl configuration variable with which
Galera checks the storage engine of the affected table. If the table is not
InnoDB (the only storage engine currently fully supporting Galera
replication), the DDL statement will return the error code:

ER_GALERA_REPLICATION_NOT_SUPPORTED
       eng "DDL-statement is forbidden as table storage engine does not support Galera replication"

However, when wsrep_replicate_myisam=ON, DDL statements on
MyISAM tables are allowed. If the affected table uses an allowed storage engine, Galera
will run normal TOI.

For now, this new setting should be set globally on all
nodes in a cluster. When it is set, the following DDL statements
accessing tables that do not support Galera replication are refused:

* CREATE TABLE (e.g. CREATE TABLE t1(a int) engine=Aria)
* ALTER TABLE
* TRUNCATE TABLE
* CREATE VIEW
* CREATE TRIGGER
* CREATE INDEX
* DROP INDEX
* RENAME TABLE
* DROP TABLE

Statements on PROCEDURE, EVENT, FUNCTION are allowed as the affected
tables are known only at execution. Furthermore, USER, ROLE, SERVER,
DATABASE statements are also allowed as they do not really have
an affected table.
2020-02-11 15:17:50 +02:00
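The decision rule described for wsrep_strict_ddl can be modelled with a small C++ sketch. The struct and function names are hypothetical, and engines are compared as plain strings purely for illustration; the server's check works on its own storage engine types.

```cpp
#include <cassert>
#include <string>

// Hypothetical snapshot of the two settings named in the message.
struct SketchDdlPolicy
{
  bool strict_ddl;         // models wsrep_strict_ddl
  bool replicate_myisam;   // models wsrep_replicate_myisam
};

// Returns true if a DDL statement on a table with the given engine may
// proceed to TOI. A false result corresponds to the statement being
// refused with ER_GALERA_REPLICATION_NOT_SUPPORTED.
bool sketch_ddl_allowed(const SketchDdlPolicy &policy,
                        const std::string &engine)
{
  if (!policy.strict_ddl)
    return true;                       // strict mode off: no engine check
  if (engine == "InnoDB")
    return true;                       // fully supported engine
  if (engine == "MyISAM" && policy.replicate_myisam)
    return true;                       // explicitly enabled exception
  return false;
}
```

For example, with strict mode on and wsrep_replicate_myisam off, a CREATE TABLE with engine=Aria would be refused under this model, while the same statement with InnoDB would run as normal TOI.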
Alexander Barkov
83e75b39b3 MDEV-21702 Add a data type for privileges 2020-02-11 08:10:26 +04:00
mkaruza
74f7620636 MDEV-21598 Galera test galera.galera_sst_mysqldump does not take wsrep-new-cluster into account
Variable `wsrep_new_cluster` should be set to false after `wsrep_init_startup`.
The problem was that this was done earlier when mysqldump is used as the SST method, so the option
wsrep-new-cluster didn't have any effect.
2020-01-30 14:53:42 +02:00
mkaruza
41bc736871 Galera GTID support
Support for Galera GTID consistency throughout the cluster. All nodes in the cluster
should have the same GTID for replicated events which originate from the cluster.
Cluster-originating commands need to contain a sequential WSREP GTID seqno.
Ignore manual setting of gtid_seq_no=X.

In a master-slave scenario where the master is a non-Galera node, the replicated GTID is
preserved on all nodes.

To have this, domain_id, server_id and seqnos should be the same on all nodes.
To achieve this, the node which bootstraps the cluster sends domain_id and
server_id to the other nodes, and this combination is used to write the GTID for events
that are replicated inside the cluster.

Cluster nodes that execute non-replicated events are going to have a different
GTID than replicated ones; the difference will be visible in the domain part of the GTID.

With wsrep_gtid_domain_id you can set domain_id for WSREP cluster.

Functions WSREP_LAST_WRITTEN_GTID, WSREP_LAST_SEEN_GTID and
WSREP_SYNC_WAIT_UPTO_GTID now work with the "native" GTID format.

Fixed galera tests to reflect these changes.

Add variable to manually update WSREP GTID seqno in cluster

Add variable to manipulate and change WSREP GTID seqno. The next command
originating from the cluster and on the same thread will have the set seqno, and
the cluster should change its internal counter to its value.
Behavior is the same as using @@gtid_seq_no for a non-WSREP transaction.
2020-01-29 15:06:06 +02:00
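As a rough illustration of the GTID scheme described above, the following C++ sketch models a MariaDB GTID as the domain-server-seqno triple and shows how replicated and local events on one node end up in different domains. The class and its members are hypothetical and ignore everything the server does around certification and binlogging.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// A GTID in MariaDB's "native" textual form is domain-server-seqno.
struct SketchGtid
{
  std::uint32_t domain_id;
  std::uint32_t server_id;
  std::uint64_t seq_no;
};

std::string sketch_gtid_to_string(const SketchGtid &gtid)
{
  return std::to_string(gtid.domain_id) + "-" +
         std::to_string(gtid.server_id) + "-" +
         std::to_string(gtid.seq_no);
}

// Hypothetical per-node source: cluster-wide events use the domain and
// server id shared by the bootstrapping node, local ones use the node's own.
class SketchGtidSource
{
public:
  SketchGtidSource(std::uint32_t cluster_domain, std::uint32_t cluster_server,
                   std::uint32_t local_domain, std::uint32_t local_server)
    : cluster_domain_(cluster_domain), cluster_server_(cluster_server),
      local_domain_(local_domain), local_server_(local_server)
  {}

  SketchGtid next_replicated()
  {
    return SketchGtid{cluster_domain_, cluster_server_, ++cluster_seq_};
  }

  SketchGtid next_local()
  {
    return SketchGtid{local_domain_, local_server_, ++local_seq_};
  }

private:
  std::uint32_t cluster_domain_, cluster_server_;
  std::uint32_t local_domain_, local_server_;
  std::uint64_t cluster_seq_ = 0;
  std::uint64_t local_seq_ = 0;
};
```

In this model the first replicated event on a node built with cluster domain 9 and server id 1 prints as 9-1-1, while a non-replicated event on the same node carries the node's own domain.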
Marko Mäkelä
a983b24407 Merge 10.4 into 10.5 2020-01-28 14:17:09 +02:00
Oleksandr Byelkin
ceda5f724f Merge branch '10.2' into 10.3 2020-01-24 14:16:20 +01:00
Oleksandr Byelkin
f2ccfcaca1 Merge branch '10.1' into 10.2 2020-01-24 13:46:49 +01:00
Jan Lindström
8a931e4d16 MDEV-17571 : Make systemd timeout behavior more compatible with long Galera SSTs
This is the 10.4 version.

The idea is to create a monitor thread for both donor and joiner that will
periodically, if needed, extend the systemd timeout while the SST is being
processed. In 10.4 the actual SST is executed by running an SST script
and exchanging messages on a pipe using blocking fgets. This fix
starts the monitoring thread before the SST script is started and
stops it when the SST has completed.
2020-01-22 16:55:59 +02:00
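The monitor thread idea can be sketched in portable C++ as below. This is an assumption-laden illustration, not the server code: SketchSstMonitor is a hypothetical class, and the actual timeout extension is abstracted into a callback. In a real systemd service that callback might send an EXTEND_TIMEOUT_USEC notification through sd_notify(); whether the commit does exactly that is not stated in the message.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Calls extend() once per interval until stopped. Start it before the
// blocking SST work begins and stop it (or destroy it) when the SST is done.
class SketchSstMonitor
{
public:
  SketchSstMonitor(std::chrono::milliseconds interval,
                   std::function<void()> extend)
    : interval_(interval), extend_(std::move(extend)),
      worker_([this] { run(); })
  {}

  ~SketchSstMonitor() { stop(); }

  SketchSstMonitor(const SketchSstMonitor &) = delete;
  SketchSstMonitor &operator=(const SketchSstMonitor &) = delete;

  void stop()
  {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      done_ = true;
    }
    wake_.notify_all();
    if (worker_.joinable())
      worker_.join();
  }

private:
  void run()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    // wait_for returns false when the interval elapsed and done_ is still
    // false; that is the moment to extend the timeout.
    while (!wake_.wait_for(lock, interval_, [this] { return done_; }))
      extend_();
  }

  std::mutex mutex_;
  std::condition_variable wake_;
  bool done_ = false;
  std::chrono::milliseconds interval_;
  std::function<void()> extend_;
  std::thread worker_;   // declared last so it starts after the other members
};
```

The blocking reads from the SST script can then stay as they are on the main thread, since the periodic extension happens independently of them.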