After two concurrent FTWRL/UNLOCK TABLES, the node stays in paused state
and the following CREATE TABLE fails with
ER_UNKNOWN_COM_ERROR (1047): Aborting TOI: Replication paused on
node for FTWRL/BACKUP STAGE.
The cause is the use of global `wsrep_locked_seqno` to determine
if the node should be resumed on UNLOCK TABLES. In some executions
the `wsrep_locked_seqno` is cleared by the first UNLOCK TABLES
after the second FTWRL gets past `make_global_read_lock_block_commit()`.
As a fix, use `thd->wsrep_desynced_backup_stage` to determine
if the thread should resume the node on UNLOCK TABLES.
Add MTR test galera.galera_ftwrl_concurrent to reproduce the
race. The test contains also cases for BACKUP STAGE which
uses similar mechanism for desyncing and pausing the node.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Revert the old work-around for buggy fdatasync() on Linux ext3. This bug was
fixed in Linux > 10 years ago back to kernel version at least 3.0.
Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
MENT-328 wrongly assumed that the backup failed because of warnings from
mariabackup about not found files. This is normal (and the error message
should be deleted).
randgen failed because mariabackup didn't retry BACKUP STAGE BLOCK DDL
if it failed with a deadlock.
To simplify things, I implemented the retry loop in the server as
this particular deadlock should be quickly resolved.
make BACKUP STAGE behave as FTWRL, desyncing and pausing the node
to prevent BF threads (appliers) from interfering with blocking stages.
This is needed because BF threads don't respect BACKUP MDL locks.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
MDEV-20945: BACKUP UNLOCK + FTWRL assertion failure | SIGSEGV in I_P_List
from MDL_context::release_lock on INSERT w/ BACKUP LOCK (on optimized
builds) | Assertion `ticket->m_duration == MDL_EXPLICIT' failed
BACKUP LOCK behavior is modified so it won't be used wrong:
- BACKUP LOCK should commit any active transactions.
- BACKUP LOCK should not be allowed in stored procedures.
- When BACKUP LOCK is active, don't allow any DDL's for that connection.
- FTWRL is forbidden on the same connection while BACKUP LOCK is active.
Reviewed-by: monty@mariadb.com
It was mistakenly used by tdc_start_shutdown() to make sure TABLE_SHARE
gets evicted from table definition cache when it becomes unused. However
same effect is achieved by resetting tdc_size and tc_size.
Part of MDEV-17882 - Cleanup refresh version
The problem was two fault:
- flush_tables() wrongly gave errors when failing to open read only tables
- backup_block_ddl() didn't properly ignores errors from flush_tables()
The test case for this will be pushed in 10.5 as the test involves
S3 tables.
Part of MDEV-5336 Implement LOCK FOR BACKUP
- Changed check of Global_only_lock to also include BACKUP lock.
- We store latest MDL_BACKUP_DDL lock in thd->mdl_backup_ticket to be able
to downgrade lock during copy_data_between_tables()