Merged from mysql-wsrep-bugs following:
GCF-1058 MTR test galera.MW-86 fails on repeated runs
Wait for the sync point sync.wsrep_apply_cb to be reached before
executing the test and clearing the debug flag sync.wsrep_apply_cb.
The race scenario:
Intended behavior:
node2: set sync.wsrep_apply_cb in order to start waiting in the background INSERT
node1: INSERT start
node2 (background): INSERT start
node1: INSERT end
node2: send signal to background INSERT: "stop waiting and continue executing"
node2: clear sync.wsrep_apply_cb as no longer needed
node2 (background): consume the signal
node2 (background): INSERT end
node2: DROP TABLE
node2: check no pending signals are left - ok
What happens occasionally (unexpected):
node2: set sync.wsrep_apply_cb in order to start waiting in the background INSERT
node1: INSERT start
node2 (background): INSERT start
node1: INSERT end
// The background INSERT still has _not_ reached the place where it starts
// waiting for the signal:
// DBUG_EXECUTE_IF("sync.wsrep_apply_cb", "now wait_for...");
node2: send signal to background INSERT: "stop waiting and continue executing"
node2: clear sync.wsrep_apply_cb as no longer needed
// The background INSERT reaches DBUG_EXECUTE_IF("sync.wsrep_apply_cb", ...)
// but sync.wsrep_apply_cb has already been cleared and the "wait" code is not
// executed. The signal remains unconsumed.
node2 (background): INSERT end
node2: DROP TABLE
node2: check no pending signals are left - failure, signal.wsrep_apply_cb is
pending (not consumed)
Remove MW-360 test case as it is not intended for MariaDB (uses
MySQL GTID).
Applier does not reset thd->wsrep_apply_toi if applier thread decides
to exit by setting 'exit= true'. The problem is that galera side may
decide not to kill the applier thread: for instance if we try to
SET wsrep_slave_threads = 0; then galera refuses to kill the last
applier thread. If this happens we are left with a thd which has not
been reset to the initial state.
This patch ensures that the thd is reset regardless of the applier
thread exiting or not.
... causes MariaDB to crash
On error, the wsrep replication buffer (binlog) is dumped to a file
to aid investigations. In order to also include the binlog header,
FDLE object is also needed. This object is only available for wsrep-
threads.
Fix: Instantiate an FDLE object for non-wsrep threads.
This changes variable wsrep_max_ws_size so that its value
is linked to the value of provider option repl.max_ws_size.
That is, changing the value of variable wsrep_max_ws_size
will change the value of provider option repl.max_ws_size,
and viceversa.
The writeset size limit is always enforced in the provider,
regardless of which option is used.
- Removes useless call to wsrep_xid_init() in wsrep_apply_events().
Transaction's xid is already initialized at that point.
- Adds call to wsrep_set_SE_checkpoint() for committing TOI events
in the applier side.
- Includes test case that reproduced the issue.
1. factored XID-related functions to a separate wsrep_xid.cc unit.
2. refactored them to take refrences instead of pointers where appropriate
3. implemented wsrep_get/set_SE_position to take wsrep_uuid_t and wsrep_seqno_t instead of XID
4. call wsrep_set_SE_position() in wsrep_sst_received() to reinitialize SE checkpoint after SST was received, avoid assert() in setting code by first checking current position.
Annotate_rows event needs to be preserved until the last Rows event has
been applied because after it has been applied thd->query points to the
query stored inside this event.