The problem was that SHOW SLAVE STATUS was exceuted before the slave IO
thread had time to create a new relay log
Fixed by writing a command on the master and syncing the slave data.
This ensures that the slave creates a new relay log.
Starting with 10.5, InnoDB crash recovery tests seem to time out
more easily under Valgrind, which emulates multiple threads by
interleaving them in a single operating system thread.
These tests will still be covered by
AddressSanitizer and MemorySanitizer.
There are a few different cases to consider
Logging of CREATE TABLE and CREATE TABLE ... LIKE
- If REPLACE is used and there was an existing table, DDL log the drop of
the table.
- If discovery of table is to be done
- DDL LOG create table
else
- DDL log create table (with engine type)
- create the table
- If table was created
- Log entry to binary log with xid
- Mark DDL log completed
Crash recovery:
- If query was in binary log do nothing and exit
- If discoverted table
- Delete the .frm file
-else
- Drop created table and frm file
- If table was dropped, write a DROP TABLE statement in binary log
CREATE TABLE ... SELECT required a little more work as when one is using
statement logging the query is written to the binary log before commit is
done.
This was fixed by adding a DROP TABLE to the binary log during crash
recovery if the ddl log entry was not closed. In this case the binary log
will contain:
CREATE TABLE xxx ... SELECT ....
DROP TABLE xxx;
Other things:
- Added debug_crash_here() functionality to Aria to be able to test
crash in create table between the creation of the .MAI and the .MAD files.
Logging logic:
- Log tables, just before they are dropped, to the ddl log
- After the last table for the statement is dropped, log an xid for the
whole ddl log event
In case of crash:
- Remove first any active DROP TABLE events from the ddl log that matches
xids found in binary log (this mean the drop was successful and was
propery logged).
- Loop over all active DROP TABLE events
- Ensure that the table is completely dropped
- Write a DROP TABLE entry to the binary log with the dropped tables.
Other things:
- Added code to ha_drop_table() to be able to tell the difference if
a get_new_handler() failed because of out-of-memory or because the
handler refused/was not able to create a a handler. This was needed
to get sequences to work as sequences needs a share object to be passed
to get_new_handler()
- TC_LOG_BINLOG::recover() was changed to always collect Xid's from the
binary log and always call ddl_log_close_binlogged_events(). This was
needed to be able to collect DROP TABLE events with embedded Xid's
(used by ddl log).
- Added a new variable "$grep_script" to binlog filter to be able to find
only rows that matches a regexp.
- Had to adjust some test that changed because drop statements are a bit
larger in the binary log than before (as we have to store the xid)
Other things:
- MDEV-25588 Atomic DDL: Binlog query event written upon recovery is corrupt
fixed (in the original commit).
Merge 'replication_applier_status_by_coordinator' table.
This table captures SQL_THREAD status in case of both single threaded and
multi threaded slave configuration. When multi_source replication is enabled
this table will display each source specific SQL_THREAD status.
Added new columns for:
- LAST_SEEN_TRANSACTION
- LAST_TRANS_RETRY_COUNT
Merge 'replication_connection_configuration' table.
Replaced following column:
- AUTO_POSITION with USING_GTID
Added new columns for:
- IGNORE_SERVER_IDS
- DO_DOMAIN_IDS
- IGNORE_SERVER_IDS
Removed following columns as they are not part of mariadb replication
connection configuration:
- NETWORK_INTERFACE
- TLS_VERSION
@sql/mysqld.cc
Changed "master-retry-count" default value to 100000.
Shutdown of mtr tests may be too impatient, esp on CI environment where
10 seconds of `arg` of `shutdown_server arg` may not be enough for the clean
shutdown to complete.
This is fixed to remove explicit non-zero timeout argument to
`shutdown_server` from all mtr tests. mysqltest computes 60 seconds default
value for the timeout for the argless `shutdown_server` command.
This policy is additionally ensured with a compile time assert.
Fix:
===
Add "REPLICA" as an alias for "SLAVE". All commands which use "SLAVE" keyword
can be used with new alias "REPLICA".
List of commands:
On Master:
=========
SHOW REPLICA HOSTS <--> SHOW SLAVE HOSTS
Privilege "SLAVE" <--> "REPLICA"
On Slave:
=========
START SLAVE <--> START REPLICA
START ALL SLAVES <--> START ALL REPLICAS
START SLAVE UNTIL <--> START REPLICA UNTIL
STOP SLAVE <--> STOP REPLICA
STOP ALL SLAVES <--> STOP ALL REPLICAS
RESET SLAVE <--> RESET REPLICA
RESET SLAVE ALL <--> RESET REPLICA ALL
SLAVE_POS <--> REPLICA_POS
Description:
============
To change 'CONSERVATIVE' @@global.slave_parallel_mode default to 'OPTIMISTIC'
in 10.5.
@sql/sys_vars.cc
Changed default parallel_mode to 'OPTIMISTIC'
@sql/rpl_filter.cc
Changed default parallel_mode to 'OPTIMISTIC'
@sql/mysqld.cc
Removed the initialization of 'SLAVE_PARALLEL_CONSERVATIVE' to
'opt_slave_parallel_mode' variable.
@mysql-test/suite/rpl/t/rpl_parallel_mdev6589.test
@mysql-test/suite/rpl/t/rpl_mdev6386.test
Added 'mtr' suppression to ignore 'ER_PRIOR_COMMIT_FAILED'. In case of
'OPTIMISTIC' mode if a transaction gets killed during "wait_for_prior_commit"
it results in above error "1964". Hence suppression needs to be added for this
error.
@mysql-test/suite/rpl/t/rpl_parallel_conflicts.test
Test has a 'slave.opt' which explicitly sets slave_parallel_mode to
'conservative'. When the test ends this mode conflicts with new default mode.
Hence check test case reports an error. The 'slave.opt' is removed and options
are set and reset within test.
@mysql-test/suite/multi_source/info_logs.result
@mysql-test/suite/multi_source/reset_slave.result
@mysql-test/suite/multi_source/simple.result
Result content mismatch in "show slave status" output. This is expected as new
slave_parallel_mode='OPTIMISTIC'.
@mysql-test/include/check-testcase.test
Updated default 'slave_parallel_mode' to 'optimistic'.
Refactored rpl_parallel.test into following test cases.
Test case 1: @mysql-test/suite/rpl/t/rpl_parallel_domain.test
Test case 2: @mysql-test/suite/rpl/t/rpl_parallel_domain_slave_single_grp.test
Test case 3: @mysql-test/suite/rpl/t/rpl_parallel_single_grpcmt.test
Test case 4: @mysql-test/suite/rpl/t/rpl_parallel_stop_slave.test
Test case 5: @mysql-test/suite/rpl/t/rpl_parallel_slave_bgc_kill.test
Test case 6: @mysql-test/suite/rpl/t/rpl_parallel_gco_wait_kill.test
Test case 7: @mysql-test/suite/rpl/t/rpl_parallel_free_deferred_event.test
Test case 8: @mysql-test/suite/rpl/t/rpl_parallel_missed_error_handling.test
Test case 9: @mysql-test/suite/rpl/t/rpl_parallel_innodb_lock_conflict.test
Test case 10: @mysql-test/suite/rpl/t/rpl_parallel_gtid_slave_pos_update_fail.test
Test case 11: @mysql-test/suite/rpl/t/rpl_parallel_wrong_exec_master_pos.test
Test case 12: @mysql-test/suite/rpl/t/rpl_parallel_partial_binlog_trans.test
Test case 13: @mysql-test/suite/rpl/t/rpl_parallel_ignore_error_on_rotate.test
Test case 14: @mysql-test/suite/rpl/t/rpl_parallel_wrong_binlog_order.test
Test case 15: @mysql-test/suite/rpl/t/rpl_parallel_incorrect_relay_pos.test
Test case 16: @mysql-test/suite/rpl/t/rpl_parallel_retry_deadlock.test
Test case 17: @mysql-test/suite/rpl/t/rpl_parallel_deadlock_corrupt_binlog.test
Test case 18: @mysql-test/suite/rpl/t/rpl_parallel_mode.test
Test case 19: @mysql-test/suite/rpl/t/rpl_parallel_analyze_table_hang.test
Test case 20: @mysql-test/suite/rpl/t/rpl_parallel_record_gtid_wakeup.test
Test case 21: @mysql-test/suite/rpl/t/rpl_parallel_stop_on_con_kill.test
Test case 22: @mysql-test/suite/rpl/t/rpl_parallel_rollback_assert.test
Make all system tables in mysql directory of type
engine=Aria
Privilege tables are using transactional=1
Statistical tables are using transactional=0, to allow them
to be quickly updated with low overhead.
Help tables are also using transactional=0 as these are only
updated at init time.
Other changes:
- Aria store engine is now a required engine
- Update comment for Aria tables to reflect their new usage
- Fixed that _ma_reset_trn_for_table() removes unlocked table
from transaction table list. This was needed to allow one
to lock and unlock system tables separately from other
tables, for example when reading a procedure from mysql.proc
- Don't give a warning when using transactional=1 for engines
that is using transactions. This is both logical and also
to avoid warnings/errors when doing an alter of a privilege
table to InnoDB.
- Don't abort on warnings from ALTER TABLE for changes that
would be accepted by CREATE TABLE.
- New created Aria transactional tables are marked as not movable
(as they include create_rename_lsn).
- bootstrap.test was changed to kill orignal server, as one
can't anymore have two servers started at same time on same
data directory and data files.
- Disable maria.small_blocksize as one can't anymore change
aria block size after system tables are created.
- Speed up creation of help tables by using lock tables.
- wsrep_sst_resync now also copies Aria redo logs.
In this commit we are adding three more status variable to SHOW SLAVE
STATUS. Slave_DDL_Events and Slave_Non_Transactional_Events.
Slave_DDL_Groups:- This status variable counts the occurrence of DDL
statements
Slave_Non_Transactional_Groups:- This variable count the occurrence
of non-transnational event group.
Slave_Transactional_Groups:- This variable count the occurrence
of transnational event group.
Patch Credit:- Kristian Nielsen
Problem:- In the case of multisource replication/named slave
when we run "FLUSH LOGS" , it does not flush logs.
Solution:- A new function Master_info_index->flush_all_relay_logs()
is created which will rotate relay logs for all named slave.
This will be called in reload_acl_and_cache function when
connection_name.length == 0
Intermediate commit.
Update multi_source.gtid_ignore_duplicates test to avoid a sporadic
failure following MDEV-12179 functionality.
The test case manually messes with the mysql.gtid_slave_pos table.
Make sure to clean up that at the end of the test, and suppress the
messages from the server about these changes.
Intermediate commit.
This commit implements that record_gtid() selects a gtid_slave_posXXX table
with a storage engine already in use by current transaction, if any.
The default table mysql.gtid_slave_pos is used if no match can be found on
storage engine, or for GTID position updates with no specific storage
engine.
Table discovery of mysql.gtid_slave_pos* happens on initial GTID state load
as well as on every START SLAVE. Some effort is made to make this possible
without additional locking. New tables are added using lock-free atomics.
Removing tables requires stopping all slaves first. A warning is given in
the error log when a table is removed but a non-stopped slave still has a
reference to it.
If multiple mysql.gtid_slave_posXXX tables with same storage engine exist,
one is chosen arbitrarily to be used, with a warning in the error log. GTID
data from all tables is still read, but only one among redundant tables with
same storage engine will be updated.
- created binlog_encryption test suite and added it to the default list
- moved some tests from rpl, binlog and multisource suites to extra
so that they could be re-used in different suites
- made minor changes in include files
Merge feature into 10.2 from feature branch.
Delayed replication adds an option
CHANGE MASTER TO master_delay=<seconds>
Replication will then delay applying events with that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.
Feature is ported from MySQL source tree.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
1. use include/show_binlog_events.inc instead of SHOW BINLOG EVENTS
2. use include/show_relaylog_eventc.inc too
3. in all other places where a number might appear in the result
file, include binlog_start_pos.inc, calculate the position
like pos=`select $binlog_start_pos + 100`; and use
replace_result $pos <pos>
The --gtid-ignore-duplicates option was not working correctly with row-based
replication. When a row event was completed, but before committing, there
was a small window where another multi-source SQL thread could wrongly try
to re-execute the same transaction, without properly ignoring the duplicate
GTID. This would lead to duplicate key error or out-of-order GTID error or
similar.
Thanks to Matt Neth for reporting this and giving an easy way to reproduce
the issue.