This commit restores defaults and functionality regarding binlogs
to the way it was prior to MDEV-27524. The mariabackup utility no
longer saves binlogs files as part of a backup without the --galera-info
option. However, since we use --galera-info during SST, the behavior
of mariabackup changes and, in combination with GTIDs support enabled,
mariabackup transfers one (most recent) binlog file obtained after
FLUSH BINARY LOGS. In other cases, binlogs are not transferred during
SST in mariabackup mode. As for SST in the rsync mode, it works the
same way as before MDEV-27524 - by default it transfers one last
binlog file.
The --sst-max-binlogs option for mariabackup and the sst_max_binlogs
parameter in the [sst] / server sections are no longer supported for
SST via mariabackup.
This commit contains workaround for a bug known as 'Red Hat issue 1870279'
(connection reset by peer issue in socat versions 1.7.3.3 to 1.7.4.0) which
further causes crashes during SST using mariabackup (when openssl is used).
Also fixed broken logic of automatic generation of the Diffie-Hellman parameters
for socat version less than 1.7.3 (which defaults to 512-bit values instead of
2048-bit ones).
This commit sends a flag indicating the presence of the "--bypass"
option from the donor node to the joiner nodes during rsync IST,
because without such a flag it is impossible to distinguish IST
from the SST on the joiner nodes (in IST/SST scripts, because the
"--bypass" option is still not passed to scripts from server code).
Specifically, this fixes an issue with binary logs disappearing
after IST (via rsync). There are also changes to diagnostic messages
here that will make it easier to diagnose script-related problems
in the future when debugging and when checking the logs. This commit
also adds more robust signal handlers - to handle exceptions during
script execution. These handlers won't mask some crashes and it
also unifies exit codes between different scripts. These changes
have already been helpful to debugging "bypass" flag handling.
This commit fixes an issue with IST handling in
version 10.9 which is a regression after MDEV-26971
and related to trying to get a non-existent "total"
tag on the IST branch (this tag is only defined in
SST mode).
This commit fixes a crash reported as MDEV-28377 and a number
of other crashes in automated tests with mtr that are related
to broken .cnf files in galera and galera_3nodes suites, which
happened when automatically migrating MDEV-26171 from 10.3 to
subsequent higher versions.
This commit fixes problems with parsing ipv6 addresses given via
the wsrep_sst_receive_address and wsrep_node_address options.
Also, this commit removes extra lines in the configuration files
in the mtr test suites for Galera related to these parameters.
Currenly SST script for mariabackup stops on any failure while archiving
logs, e.g. when unable to create directory, insufficient permissions, gzip
failure, etc. However, in case of such problems, the script should issue
a warning and continue without archiving, but not exit with a fatal error.
This commit adds this fix to the SST script for mariabackup.
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.
During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:
sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.
sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.
The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.
Reviewed by: Jan Lindström
The reason for this fix was that when I tried to run mysql_upgrade
at home to update an old 10.5 installation, mysql_upgrade failed
with warnings about mariadb.sys user not existing.
If the server was started with --skip-grants, there would be no warnings
from mysql_upgrade, but in some cases running mysql_upgrade again could
produce new warnings.
The reason for the warnings was that any access of the mysql.user view
will produce a warning if the mariadb.sys user does not exists.
Fixed with the following changes:
- Disable warnings about mariadb.sys user not existing
- Don't overwrite old mariadb.sys entries in tables_priv and global_priv
- Ensure that tables_priv has an entry for mariadb.sys if the user exists.
This fixes an issue that tables_priv would not be updated if there
was a failure directly after global_priv was updated.
This commit contains a fix to use modern syntax for selecting
character classes in the tr utility options.
Also one of the tests for SST via rsync (galera_sst_rysnc2) is made
more reliable (to avoid rare failures during automatic testing).
from mysql.plugin table
Fix: Since mysql_upgrade runs commands from mysql_system_tables.fix,
added sql commands to check for semisync plugins in
INFORMATION_SCHEMA.PLUGINS and if they aren't there then delete them
from mysql.plugin.
This commit adds a mtr test for reproducing a test scenario where despite of
innodb_disallow_writes blocking, writes to file system can still happen.
The test launches a garbd node, which triggers one of the cluster node to switch to
SST donor state. In this state, all disk activity should be halted, and e.g.
innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb
data directory when the node enters the donor state, and records another md5sum
when the node leaves the donor state. If there is no IO activity in data directory, these
hashes should be equal.
For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be
stopped when entering the donor state. The test uses this new dbug sync point,
to control when to record the md5sums.
New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor
node to call this script, and to enter in donor state.
The backup script could be later extended as general purpose backup method for the cluster.
This commit fixes also one race condition happening in wsrep_sst_rsync, like this:
* wsrep_rsync_sst script requests for flush tables,
and then waits in a loop until mariadbd has created file tables_flushed,
as confirmation that FLUSH TABLES has completed
* mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL,
and after this it creates the tables_flushed file
* note that SST script will now continue to startup rsync sending
* mariadbd's SST donor thread now calls for sst_disallow_writes(),
so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point
This race condition is fixed in this commit, by performing all disk IO blocking before
creating the tables_flushed file.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>