This commit contains a large set of further bug fixes and
improvements to SST scripts for Galera, continuing the work
that was started in MDEV-24962 to make SST scripts work smoothly
in different network configurations (especially using ipv6) and
with different environment settings:
1) The ipv6 addresses were incorrectly handled in the SST script
for rsync (incorrect address substitution for establishing a
connection, incorrect address substitution for bind, and so on);
2) Checking the locality of the ip-address in SST scripts did not
support ipv6 addresses (such as "[::1]"), which were falsely
identified as non-local ip, which further did not allow running
two SSTs on different local addresses on the same machine.
On the other hand, this bug masked some other errors (related
to handling ipv6 addresses);
3) The code for checking the locality of the ip address was different
in the SST scripts for rsync and for mysqldump, with individual
flaws. This code is now made common and moved to wsrep_sst_common;
4) Waiting for the start of the transport channel (socat, nc, rsync,
stunnel) in the wait_for_listen() and check_pid_and_port() functions
did not process ipv6 addresses correctly in all cases (not for all
branches);
5) Waiting for the start of the transport channel (socat, nc, rsync,
stunnel) in the wait_for_listen() and check_pid_and_port() functions
for some code branches could give a false positive result due to
the textual match of prefixes in the port number and/or PID of
the process;
6) Waiting for the start of the transport channel (socat, nc, rsync,
stunnel) was supported through different utilities in SST scripts
for mariabackup and for rsync, and with various minor flaws in
the code. Now the code is still different in these scripts, but
it supports a common set of utilities (lsof, ss, sockstat) and
is synchronized across patterns that used to check the output
of these utilities;
7) In SST via mariabackup, the signal about readiness to receive data
is sometimes sent too early - immediately after listen(), and not
after accept() (which are called by socat or netcat utility).
8) Checking availability of the some options of some utilities was
done using the grep pattern, which easily gives false positives;
9) Common name (CN) for local addresses, if not explicitly specified,
is now always replaced to "localhost" to avoid the need to generate
many separate certificates for local addresses of one machine and
not to depend on which the local address is currently used in test
(ipv4 or ipv6, etc.);
10) In tests galera_sst_mariabackup_encrypt_with_key_server and
galera_sst_rsync_encrypt_with_key_server the correct certificate
is selected to avoid commonname (CN) mismatch problems;
11) Further refactoring to protect against spaces in file names.
12) Further general refactoring to eliminate bash-specific constructs
or to improve code readability;
13) The code for setting options for the nc (netcat) utility was
different in different scripts for SST - now it is made identical.
14) Fixed long-time broken encryption via xbcrypt in combination with
mariabackup and added support for key-based encryption via openssl
utility, which is now enabled by default for encrypt=1 mode (this
default mode can be changed using a new configuration file option
"encypt-format=openssl|xbcrypt", which can be placed in the [mysqld],
[sst] or in the [xtrabackup] section) - this change will allow us
to use and to test the encypt=1 encryption without installing
non-standard third-party utilities.
and configuration.
1. Pass joiner's authentication information to donor together with address
in State Transfer Request. This allows joiner to authenticate donor on
connection. Previously joiner would accept data from anywhere.
2. Deprecate custom SSL configuration variables tca, tcert and tkey in favor
of more familiar ssl-ca, ssl-cert and ssl-key. For backward compatibility
tca, tcert and tkey are still supported.
3. Allow falling back to server-wide SSL configuration in [mysqld] if no SSL
configuration is found in [sst] section of the config file.
4. Introduce ssl-mode variable in [sst] section that takes standard values
and has following effects:
- old-style SSL configuration present in [sst]: no effect
otherwise:
- ssl-mode=DISABLED or absent: retains old, backward compatible behavior
and ignores any other SSL configuration
- ssl-mode=VERIFY*: verify joiner's certificate and CN on donor,
verify donor's secret on joiner
(passed to donor via State Transfer Request)
BACKWARD INCOMPATIBLE BEHAVIOR
- anything else enables new SSL configuration convetions but does not
require verification
ssl-mode should be set to VERIFY only in a fully upgraded cluster.
Examples:
[mysqld]
ssl-cert=/path/to/cert
ssl-key=/path/to/key
ssl-ca=/path/to/ca
[sst]
-- server-wide SSL configuration is ignored, SST does not use SSL
[mysqld]
ssl-cert=/path/to/cert
ssl-key=/path/to/key
ssl-ca=/path/to/ca
[sst]
ssl-mode=REQUIRED
-- use server-wide SSL configuration for SST but don't attempt to
verify the peer identity
[sst]
ssl-cert=/path/to/cert
ssl-key=/path/to/key
ssl-ca=/path/to/ca
ssl-mode=VERIFY_CA
-- use SST-specific SSL configuration for SST and require verification
on both sides
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
1. Fix eval command line to correctly pass stunnel option to rsync on donor.
2. Deprecate `tkey`, `tcert` and `tca` options in [sst] section in favor of
conventional `ssl-key`, `ssl-cert` and `ssl-ca`, but keep their precedence
for backward compatibility.
3. Default to require SSL encryption if at least SSL key and cert files are
specified in configuration, either in [sst] or [mysqld] sections.
4. Enable `verify*` option for stunnel on donor only if
a. CA file is specified somewhere in the configuration
b. it is explicitly requested in [sst] section by either specifying
ssl-mode or CA file there. In this case if ssl-mode is not explicitly
given, it defaults to VERIFY_CA.
ssl-mode maps to stunnel options as follows:
VERIFY_CA -> verifyChain = yes
VERIFY_IDENTITY -> verifyPeer = yes
Example to require donor to verify joiner identity:
```
[mysqld]
ssl-cert=/path/to/cert
ssl-key=/path/to/key
ssl-ca=/path/to/ca
[sst]
ssl-mode=VERIFY_IDENTITY
```
5. If SSL verification is requested, joiner verifies donor by checking the
secret passed to donor via SST request.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
After switching to the new mariabackup interface (instead of
the outdated innobackupex interface, which is supported for
compatibility), we need to explicitly pass a path to the datadir
directory as a parameter, since in the new interface the value
of this option is not automatically set in such a way that it
always matches the SST/IST logic. This commit adds passing this
option as an explicit parameter to mariabackup. This commit also
removed unnecessary options that are not used and not supported
by mariabackup.
Also, numerous flaws in the common wsrep_sst_common script have
been fixed:
1) There are many bash-specific constructs in the script that
may not be supported by other interpreters, which can lead
to the most unexpected errors during SST, because failures
in the interpretation of bash-specific constructs lead to
incorrect parsing of arguments;
2) There is parse_cnf() function which is often called by other
scripts for the "mysqld" or "--mysqld" group, but it does not
take into account the default group suffix, which leads to
reading values only from the default group, which then leads
to errors due to reading the default values instead of the
values for a specific group;
3) Some options such as --user, --innodb-data-home-dir or --datadir
are not removed from the --mysqld-args list, although they are
processed inside scripts (and passing of these options funther
may cause problems for mariabackup);
4) If an argument that the script understands is present in
the --mysqld-args list twice, then this causes SST to fail,
instead of reading the most recent value;
5) The "--host" parameter is technically still supported among
the arguments of the SST scripts, but in reality scripts do not
work with it as expected, especially if it has an IPv6 address;
6) If the port number is absent in the --address parameter value,
but the port number is explicitly passed through the --port
argument, then the scripts for mariabackup and xtrabackup-v2
fail;
7) If a new address interface is used (with the --address parameter),
then automatic default port substitution is not performed, although
it is supported for the legacy --host/--port interface.
8) If there are spaces in the parameter values after --mysqld_args,
then their further transfer does not occur correctly, which
causes mariabackup to fail during SST - the space splits
the argument in such a way that it breaks the parsing of the
following parameters;
9) If most of the parameters that are names or paths to the files
or directories contain spaces, then SST scripts fail in an
unpredictable way due to incorrect variable substitutions;
10) If the --log-bin option is passed among the arguments of myqlds
(--mysqld-args) without a parameter, and the --binlog option
is not specified, then the script cannot substitute the default
name for binlog and cannot construct binlog name using the
--log-basename argument (which is against server specifications);
11) Tail slashes are not removed from the directory names, which,
upon further substitution, leads to the appearance of a double
slash in the file paths;
12) The explicit --binlog parameter (which is now always transmitted
from the server side) and the "hidden" --log-bin parameter in the
list of arguments after --mysqld-args are perceived as two different
parameters in different parts of the scripts, and if they are do not
match for some reason, this will lead to failures during SST;
Also, all new changes from the 10.6 branch have been migrated here,
including the latest pull requests for authentication (only the part
that concerns SST scripts).
It also fixes dozens of other bugs in all SST scripts.
Problem:
========
180511 11:07:58 [ERROR] Slave I/O: Unexpected master's heartbeat data:
heartbeat is not compatible with local info;the event's data: log_file_name
mysql-bin.000009 log_pos 1054262041, Error_code: 1623
Analysis:
=========
In replication setup when master server doesn't have any events to send to
slave server it sends an 'Heartbeat_log_event'. This event carries the
current binary log filename and offset details. The offset values is stored
within 4 bytes of event header. When the size of binary log is higher than
UINT32_MAX the log_pos values will not fit in 4 bytes memory. It overflows
and hence slave stops with an error.
Fix:
===
Since we cannot extend the common_header of Log_event class, a greater than
4GB value of Log_event::log_pos is made to be transported with a HeartBeat
event's sub-header. Log_event::log_pos in such case is set to zero to
indicate that the 8 byte sub-header is allocated in the event.
In case of cross version replication following behaviour is expected
OLD - Server without fix
NEW - Server with fix
OLD<->NEW : works bidirectionally as long as the binlog offset is
(normally) within 4GB.
When log_pos > UINT32_MAX
OLD->NEW : The 'log_pos' is bound to overflow and NEW slave may report
an invalid event/incompatible heart beat event error.
NEW->OLD : Since patched server sets log_pos=0 on overflow, OLD slave will
report invalid event error.
InnoDB tries to fetch the deleted doc ids for discarded
tablespace. In i_s_fts_deleted_generic_fill(), InnoDB needs
to check whether the table is discarded or not before fetching
deleted doc ids.
fil_ibd_load(): Remove a message that is basically saying that
everything works as expected. The other "Ignoring data file" message
about the presence of an extraneous file will be retained
(and expected by the test innodb.log_file_name).
The problem is that sharing default expression among set instruction
leads to attempt access result field of function created in
other instruction runtime MEM_ROOT and already freed
(a bit different then MySQL problem).
Fix is the same as in MySQL (but no optimisation for constant), turn
DECLARE a, b, c type DEFAULT expr;
to
DECLARE a type DEFAULT expr, b type DEFAULT a, c type DEFAULT a;
InnoDB startup hangs if a DDL transaction needs to be
rolled back and a recovered transaction on statistics
tables exists. In that case, InnoDB should rollback
the transaction which holds locks on innodb_table_stats
or innodb_index_stats during trx_rollback_or_clean_recovered().
InnoDB fails to fetch the index type when innodb dictionary
doesn't match with frm. InnoDB should return corrupted if it
can't find the index in ha_innobase::index_type().
table->move_fields wasn't undone in case of error.
1. move_fields is unconditionally undone even when error is occurred
2. cherry-pick an assertion in `ptr_in_record`, which is already in 10.5
The assertion is improved: storage engines like myisam always have to store
at least one field, so the assertion does not cover tables with no stored
columns.
So we are having a race condition of three of threads, resulting in a
deadlock backoff in purge, which is unexpected.
More precisely, the following happens:
T1: NOCOPY ALTER TABLE begins, and eventually it holds MDL_SHARED_NO_WRITE
lock;
T2: FLUSH TABLES begins. it sets share->tdc->flushed = true
T3: purge on a record with virtual column begins. it is going to open a
table. MDL_SHARED_READ lock is acquired therefore.
Since share->tdc->flushed is set, it waits for a TDC purge end.
T1: is going to elevate MDL LOCK to exclusive and therefore has to set
other waiters to back off.
T3: receives VICTIM status, reports a DEADLOCK, sets OT_BACKOFF_AND_RETRY
to Open_table_context::m_action
My fix is to allow opening table in purge while flushing. It is already
done the same way in other maintainance facilities like REPAIR TABLE.
Another way would be making an actual backoff, but Open_table_context
does not allow to distinguish it from other failure types, which still
seem to be unexpected. Making this would require hacking into
Open_table_context interface for no benefit, in comparison to passing
MYSQL_OPEN_IGNORE_FLUSH during table open.
innodb_debug_sync was introduced in commit
b393e2cb0c and reverted in
commit fc58c17216 due to memory leak reported
by valgrind, see MDEV-21336.
The leak is now fixed by adding `rw_lock_free(&slot->debug_sync_lock)`
after background thread working loop is finished, and the patch is
reapplied, with respect to c++98 fixes by Marko.
The missing DEBUG_SYNC for MDEV-18546 in row0vers.cc is also reapplied.
Item_func_history (is_history()) is a bool function that checks if the
row is the history row by checking row_end->is_max(). The argument to
this function must be row_end system field.
Added the above function to conjunction with SYSTEM_TIME_BEFORE
versioning condition.
row_merge_is_index_usable(): Allow access to any SEQUENCE, even if it was
created after the read view. SQL sequences are no-rollback tables with no
history at all.
This is a backport of
commit fd9ca2a742 (MDEV-23295) and
commit 9a156e1a23 (MDEV-23345) to 10.3.
An instant ADD/DROP/reorder column could create a dummy table
object with the wrong ROW_FORMAT when innodb_default_row_format
was changed between CREATE TABLE and ALTER TABLE.
prepare_inplace_alter_table_dict(): If we had promised that
ALGORITHM=INPLACE is supported, we must preserve the ROW_FORMAT.
The rest of the changes are related to adding
Alter_inplace_info::inplace_supported to cache the return value of
handler::check_if_supported_inplace_alter().
Back port upstream fix
commit 1800b015a1d487330f7b15f2020b887be348a66b
Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com>
Date: Fri Sep 8 20:29:22 2017 +0530
Bug#26027024 SLAVE_COMPRESSED_PROTOCOL DOESN'T WORK WITH
SEMI-SYNC REPLICATION IN MYSQL-5.7
Analysis: In mysql-5.6, dump thread (the thread that is created
on Master after Slave requested for a binlog dump) is also used
to receive acknowledgements from the Slave and act on them accordingly.
For performance reasons, a special thread called Ack Receiver thread
is added in mysql-5.7 Semi synchronous replication plugin.
This thread does not have special handling to receive acknowledgements
if Slave has enabled compression in the protocol. Hence Master is
unable to handle any slave if Slave_compressed_protocol is enabled
on it.
Fix: Enable compress flag on the communication channels if the Slave
has Slave_compressed_protocol ON.
row_sel_sec_rec_is_for_clust_rec(): If the field in the
clustered index record stored off page, always fetch it,
also when the secondary index field has been built on the
entire column. This was broken ever since the InnoDB Plugin
for MySQL Server 5.1 introduced ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPRESSED for InnoDB tables. That code was first
introduced in this tree in
commit 3945d5e554.
For the original ROW_FORMAT=REDUNDANT and the MySQL 5.0.3
ROW_FORMAT=COMPRESSED, there was no problem, because for
those tables we always stored at least a 768-byte prefix of
each column in the clustered index record.
row_sel_sec_rec_is_for_blob(): Allow prefix_len==0 for matching
the full column.
Before FRM is written walk vcol expressions through
check_table_name_processor() and check if field items match (db,
table_name) qualifier.
We cannot do this in check_vcol_func_processor() as there is already
no table name qualifiers in expressions of written and loaded FRM.
Buffer overflow in ib_push_warning() fixed by using vsnprintf().
InnoDB parser was obsoleted by MDEV-16417.
Thanks to Nikita Malyavin for review and suggestion.
There was race between a committing transaction and the following in binlog
order FLUSH LOGS that could create a 2nd Binlog checkpoint (BCP) event
in the new file *before* the first logged-in-old-binlog transaction gets committed in
Innodb. That would cause the transaction loss at recovery, should
the server stop right after the BCP.
The race is tackled by enforcing the necessary set of mutexes to be acquired
by FLUSH-LOGS handler in the correct order (of the group commit leader
pattern).
Note, there remain two cases where a similar race is still possible:
- the above race as it is when the server is run with ("unlikely")
non-default `--binlog-optimize-thread-scheduling=0` (MDEV-24530), and
- at unlikely event of bin-logging of Incident event (MDEV-24531) that
also triggers binlog rotation,
in both cases though with lesser chances after the current fixes.
Between btr_pcur_store_position() and btr_pcur_restore_position()
it is possible that purge empties a table and enlarges
index->n_core_fields and index->n_core_null_bytes.
Therefore, we must cache index->n_core_fields in
btr_pcur_t::old_n_core_fields so that btr_pcur_t::old_rec can be
parsed correctly.
Unfortunately, this is a huge change, because we will replace
"bool leaf" parameters with "ulint n_core"
(passing index->n_core_fields, or 0 for non-leaf pages).
For special cases where we know that index->is_instant() cannot hold,
we may also pass index->n_fields.
This commit removes the mtr test galera_sst_mariabackup_encrypt_with_key
from the list of disabled tests because the problem with it has already
been fixed.
Problem:
========
InnoDB fails to clean the index stub if it fails to add the
virtual index which contains new virtual column. But it clears
the newly virtual column from index in clear_added_indexes()
during inplace_alter_table. On commit, InnoDB evicts and
reload the table. In case of rollback, it doesn't happen.
InnoDB clears the ABORTED index while opening the table
or doing the DDL. In the mean time, InnoDB can access
the dropped virtual index columns while creating prebuilt
or rollback of concurrent DML.
Solution:
==========
(1) InnoDB should maintain newly added virtual column while
rollbacking the newly added virtual index.
(2) InnoDB must not defer the index removal
if the alter table is executed with LOCK=EXCLUSIVE.
(3) For LOCK=SHARED, InnoDB should check whether the table
has any other transaction lock other than alter transaction
before deferring the index stub.
Replaced has_new_v_col with dict_add_vcol_info in dict_index_t to
indicate whether the index has any new virtual column.
dict_index_t::has_new_v_col(): Returns whether the index has
newly added virtual column, it doesn't say which columns are
newly added virtual column
ha_innobase_inplace_ctx::is_new_vcol(): Return whether the
given column is added as a part of the current alter.
ha_innobase_inplace_ctx::clean_new_vcol_index(): Copy the newly
added virtual column to new_vcol_info in dict_index_t. Replace
the column in the index fields with virtual column stored
in new_vcol_info.
dict_index_t::assign_new_v_col(): Store the number of virtual
column added in index as a part of alter table.
dict_index_t::get_n_new_vcol(): Get the number of newly added
virtual column
dict_index_t::assign_drop_v_col(): Allocate the memory for
adding new virtual column in new_vcol_info.
dict_index_t::add_drop_v_col(): Add the newly added virtual
column in new_vcol_info.
dict_table_t::has_lock_for_other_trx(): Whether the table has
any other transaction lock than given transaction.
row_merge_drop_indexes(): Add parameter alter_trx and check
whether the table has any other lock than alter transaction.
The auto-increment variables may change intermittently during
the execution of some tests from the Galera mtr suite, causing
these tests to fail. This patch creates conditions in which
unpredictable changes to these variables are not possible
during the execution of those tests in which this problem
is noticed or their values are restored before the end of
testing.
server failure in different, confusing ways
InnoDB fails to free the buffer pool instance mutex and zip mutex
If the allocation of buffer pool instance chunk fails. So it leads
to freeing of buffer pool before freeing the mutexes and
leads to double freeing of memory while freeing the mutex
during shutdown.