mariadb/scripts/wsrep_sst_backup.sh

92 lines
2.4 KiB
Bash
Raw Permalink Normal View History

MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
#!/usr/bin/env bash
set -ue
2024-09-15 04:27:23 +02:00
# Copyright (C) 2017-2024 MariaDB
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
# Copyright (C) 2010-2014 Codership Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; version 2 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; see the file COPYING. If not, write to the
# Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston
# MA 02110-1335 USA.
2024-09-15 04:27:23 +02:00
# This is a reference script for backup recovery state snapshot transfer.
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
. $(dirname "$0")/wsrep_sst_common
MAGIC_FILE="$DATA/backup_sst_complete"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
wait_previous_sst
[ -f "$MAGIC_FILE" ] && rm -f "$MAGIC_FILE"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
if [ "$WSREP_SST_OPT_ROLE" = 'donor' ]
then
RC=0
if [ $WSREP_SST_OPT_BYPASS -eq 0 ]; then
FLUSHED="$DATA/tables_flushed"
ERROR="$DATA/sst_error"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
[ -f "$FLUSHED" ] && rm -f "$FLUSHED"
[ -f "$ERROR" ] && rm -f "$ERROR"
echo 'flush tables'
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
# Wait for :
# (a) Tables to be flushed, AND
# (b) Cluster state ID & wsrep_gtid_domain_id to be written to the file, OR
# (c) ERROR file, in case flush tables operation failed.
while [ ! -r "$FLUSHED" ] || \
! grep -q -F ':' -- "$FLUSHED"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
do
# Check whether ERROR file exists.
if [ -f "$ERROR" ]; then
# Flush tables operation failed.
rm "$ERROR"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
exit 255
fi
sleep 0.2
done
STATE=$(cat "$FLUSHED")
rm "$FLUSHED"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
else # BYPASS
wsrep_log_info "Bypassing state dump."
# Store donor's wsrep GTID (state ID) and wsrep_gtid_domain_id
# (separated by a space).
STATE="$WSREP_SST_OPT_GTID $WSREP_SST_OPT_GTID_DOMAIN_ID"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
fi
echo 'continue' # now server can resume updating data
echo "$STATE" > "$MAGIC_FILE"
echo "done $STATE"
else # joiner
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
wsrep_log_error "Unsupported role: '$WSREP_SST_OPT_ROLE'"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
exit 22 # EINVAL
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
fi
wsrep_log_info "$WSREP_METHOD $WSREP_TRANSFER_TYPE completed on $WSREP_SST_OPT_ROLE"
MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes This commit adds a mtr test for reproducing a test scenario where despite of innodb_disallow_writes blocking, writes to file system can still happen. The test launches a garbd node, which triggers one of the cluster node to switch to SST donor state. In this state, all disk activity should be halted, and e.g. innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb data directory when the node enters the donor state, and records another md5sum when the node leaves the donor state. If there is no IO activity in data directory, these hashes should be equal. For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be stopped when entering the donor state. The test uses this new dbug sync point, to control when to record the md5sums. New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor node to call this script, and to enter in donor state. The backup script could be later extended as general purpose backup method for the cluster. This commit fixes also one race condition happening in wsrep_sst_rsync, like this: * wsrep_rsync_sst script requests for flush tables, and then waits in a loop until mariadbd has created file tables_flushed, as confirmation that FLUSH TABLES has completed * mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL, and after this it creates the tables_flushed file * note that SST script will now continue to startup rsync sending * mariadbd's SST donor thread now calls for sst_disallow_writes(), so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point This race condition is fixed in this commit, by performing all disk IO blocking before creating the tables_flushed file. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-11 09:27:36 +01:00
exit 0