Problem:- In the case of multisource replication/named slave
when we run "FLUSH LOGS" , it does not flush logs.
Solution:- A new function Master_info_index->flush_all_relay_logs()
is created which will rotate relay logs for all named slave.
This will be called in reload_acl_and_cache function when
connection_name.length == 0
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation while some slave operations will also need LOCK_active_mi
which causes deadlocks.
Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()
Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error condtions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
of stop_slave() to not have to use LOCK_active_mi inside
terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
replication. Problem was that waiting for pause_for_ftwrl was not done
when deleting rpt->current_owner after a force_abort.
This has no functional changes, but it helps avoid merge problems from 10.0
to 10.1. In 10.0, code that checks for parallel replication uses
opt_slave_parallel_threads > 0, but this check needs to be
mi->using_parallel() in 10.1. By using the same check in 10.0 (with
unchanged semantics), merge problems to 10.1 are avoided.
- Added testing if connection is killed to shortcut reading of connection data
This will allow us later in 10.2 to do a cleaner shutdown of slaves (less errors in the log)
- Add new status variables: Slaves_connected, Slaves_running and Slave_connections.
- Use MYSQL_SLAVE_NOT_RUN instead of 0 with slave_running.
- Don't print obvious extra warnings to the error log when slave is shut down normally.
Delay spawning parallel replication worker threads until a slave SQL
thread is running, and de-spawn them when the last SQL thread stops.
This is especially useful to avoid needless threads on a master in a
setup where same my.cnf is used on masters and slaves.
Adjust the configuration options, as discussed on the
maria-developers@ mailing list.
The option to hint a transaction to not be replicated in parallel is
now called @@skip_parallel_replication, consistent with
@@skip_replication.
And the --slave-parallel-mode is now simplified to have just one of
the following values:
none
minimal
conservative
optimistic
aggressive
This reflects successively harder efforts to find opportunities to run
things in parallel on the slave. It allows to extend the server with
more automatic heuristics in the future without having to introduce a
new configuration option for each and every one.
Implement a new mode for parallel replication. In this mode, all transactions
are optimistically attempted applied in parallel. In case of conflicts, the
offending transaction is rolled back and retried later non-parallel.
This is an early-release patch to facilitate testing, more changes to user
interface / options will be expected. The new mode is not enabled by default.
If the slave gets a reconnect in the middle of a GTID event group, normally
it will re-fetch that event group, skipping the first part that was already
queued for the SQL thread.
However, if the master crashed while writing the event group, the group is
incomplete. This patch detects this case and makes sure that the
transaction is rolled back and nothing is skipped from any following
event groups.
Similarly, a network proxy might cause the reconnect to end up on a
different master server. Detect this by noticing a different server_id,
and similarly in this case roll back the partially received group.
The previous patch for this bug was unfortunately completely wrong.
The purpose of cached_charset is to remember which character set we
have installed currently in the THD, so that in the common case where
charset does not change between queries, we do not need to update it
in the THD. Thus, it is important that the cached_charset field is
tightly coupled to the THD for which it handles caching.
Thus the right place to put cached_charset seems to be in the THD.
This patch introduces a field THD:system_thread_info where such info
in general can be placed without further inflating the THD with unused
data for other threads (THD is already far too big as it is). It then
moves the cached_charset into this slot for the SQL driver thread and
for the parallel replication worker threads.
The THD::rpl_filter field is also moved inside system_thread_info, to
keep the size of THD unchanged. Moving further fields in to reduce the
size of THD is a separate task, filed as MDEV-6164.
Make IO thread check for end of event group, so that upon
disconnect at the end of an event group it can report the last
read GTID as expected. Also inject a fake Rotate event at
reconnect when skipping part of an initial event group, to
give SQL thread the correct Read_Master_Log_Pos.
Reported by Pavel Ivanov.
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
Fix problems related to reconnect. When we need to reconnect (ie. explict
stop/start of just the IO thread by user, or automatic reconnect due to
loosing network connection with the master), it is a bit complex to correctly
resume at the right point without causing duplicate or missing events in the
relay log. The previous code had multiple problems in this regard.
With this patch, the problem is solved as follows. The IO thread keeps track
(in memory) of which GTID was last queued to the relay log. If it needs to
reconnect, it resumes at that GTID position. It also counts number of events
received within the last, possibly partial, event group, and skips the same
number of events after a reconnect, so that events already enqueued before the
reconnect are not duplicated.
(There is no need to keep any persistent state; whenever we restart slave
threads after both of them being stopped (such as after server restart), we
erase the relay logs and start over from the last GTID applied by SQL thread.
But while the SQL thread is running, this patch is needed to get correct relay
log).
Change of user interface to be more logical and more in line with expectations
to work similar to old-style replication.
User can now explicitly choose in CHANGE MASTER whether binlog position is
taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos=
slave_pos) when slave connects to master.
@@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can
be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and
@@gtid_current_pos (a combination of the two, most recent GTID within each
domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match.
This fixes MDEV-4474.
Users can set different repplication filter rules for each replication connection, in my.cnf or command line.
But the rules set online will not record in master.info, it means if users restart MySQL, these rules will lose.
So if users wantn't their replication filter rules lose, they should write the rules in my.cnf.
Users can set rules by 2 ways:
1. Online SET command, "SET connection_name.replication_filter_settings = rules;".
2. In my.cnf, "connection_name.replication_filter_settings = rules".
If no connection_name in my.cnf, this rule will apply for ALL replication connection.
If no connetion_name in SET statement, this rull will apply for default_connection_name.
Fix MDEV-4275 - I/O thread restart duplicates events in the relay log.
The first time we connect to master after CHANGE MASTER or restart, we connect
from the GTID position. But then subsequent reconnects or IO thread restarts
reconnect with the old-style file/offset binlog pos from where it left off at
last disconnect. This is necessary to avoid duplicate events in the relay
logs, as there is nothing that synchronises the SQL thread update of GTID
state (multiple threads in case of multi-source) with IO thread reconnects.
Test cases.
Some small cleanups and fixes.
Give error for wrong parameters to CHANGE MASTER
Extend MASTER_PASSWORD and MASTER_HOST lengths
mysql-test/suite/rpl/r/rpl_password_boundaries.result:
Test length of MASTER_PASSWORD, MASTER_HOST and MASTER_USER
mysql-test/suite/rpl/r/rpl_semi_sync.result:
Use different password than user name for better test coverage
mysql-test/suite/rpl/t/rpl_password_boundaries.test:
Test length of MASTER_PASSWORD, MASTER_HOST and MASTER_USER
mysql-test/suite/rpl/t/rpl_semi_sync.test:
Use different password than user name for better test coverage
sql/rpl_mi.h:
Extend MASTER_PASSWORD and MASTER_HOST lengths
sql/sql_repl.cc:
Give error for wrong parameters to CHANGE MASTER
sql/sql_repl.h:
Extend MASTER_PASSWORD and MASTER_HOST lengths
This allows us to avoid calculating variables (including those involving mutex) that doesn't match the given
wildcard in SHOW STATUS LIKE '...'
Removed all references to active_mi that could cause problems for multi-source replication.
Added START|STOP ALL SLAVES
Added SHOW ALL SLAVES STATUS
include/mysql/plugin.h:
Added SHOW_SIMPLE_FUNC
include/mysql/plugin_audit.h.pp:
Updated .pp file
include/mysql/plugin_auth.h.pp:
Updated .pp file
include/mysql/plugin_ftparser.h.pp:
Updated .pp file
mysql-test/suite/multi_source/info_logs.result:
New columns in SHOW ALL SLAVES STATUS
mysql-test/suite/multi_source/info_logs.test:
Test new syntax
mysql-test/suite/multi_source/simple.result:
New columns in SHOW ALL SLAVES STATUS
mysql-test/suite/multi_source/simple.test:
test new syntax
mysql-test/suite/multi_source/syntax.result:
Updated result
mysql-test/suite/multi_source/syntax.test:
Test new syntax
mysql-test/suite/rpl/r/rpl_skip_replication.result:
Updated result
plugin/semisync/semisync_master_plugin.cc:
SHOW_FUNC -> SHOW_SIMPLE_FUNC
sql/item_create.cc:
Simplify code
sql/lex.h:
Added SLAVES keyword
sql/log.cc:
Constant -> define
sql/log_event.cc:
Added comment
sql/mysqld.cc:
SHOW_FUNC -> SHOW_SIMPLE_FUNC
Made slave_retried_trans, slave_received_heartbeats and heartbeat_period multi-source safe
Clear variable denied_connections and slave_retried_transactions on startup
sql/mysqld.h:
Added slave_retried_transactions
sql/rpl_mi.cc:
create_signed_file_name -> create_logfile_name_with_suffix
Added start_all_slaves() and stop_all_slaves()
sql/rpl_mi.h:
Added prototypes
sql/rpl_rli.cc:
create_signed_file_name -> create_logfile_name_with_suffix
added executed_entries
sql/rpl_rli.h:
Added executed_entries
sql/share/errmsg-utf8.txt:
More and better error messages
sql/slave.cc:
Added more fields to SHOW ALL SLAVES STATUS
Added slave_retried_transactions
Changed constants -> defines
sql/sql_class.h:
Added comment
sql/sql_insert.cc:
active_mi.rli -> thd->rli_slave
sql/sql_lex.h:
Added SQLCOM_SLAVE_ALL_START & SQLCOM_SLAVE_ALL_STOP
sql/sql_load.cc:
active_mi.rli -> thd->rli_slave
sql/sql_parse.cc:
Added START|STOP ALL SLAVES
sql/sql_prepare.cc:
Added SQLCOM_SLAVE_ALL_START & SQLCOM_SLAVE_ALL_STOP
sql/sql_reload.cc:
Made REFRESH RELAY LOG multi-source safe
sql/sql_repl.cc:
create_signed_file_name -> create_logfile_name_with_suffix
Don't send my_ok() from start_slave() or stop_slave() so that we can call it for all master connections
sql/sql_show.cc:
Compare wild cards early for all variables
sql/sql_yacc.yy:
Added START|STOP ALL SLAVES
Added SHOW ALL SLAVES STATUS
sql/sys_vars.cc:
Made replicate_events_marked_for_skip,slave_net_timeout and rpl_filter multi-source safe.
sql/sys_vars.h:
Simplify Sys_var_rpl_filter
Changed names of multi-source log files so that original suffixes are kept.
include/my_sys.h:
Added fn_ext2(), which returns pointer to last '.' in file name
mysql-test/extra/rpl_tests/rpl_max_relay_size.test:
Updated test
mysql-test/suite/multi_source/info_logs-master.opt:
Test with strange file names
mysql-test/suite/multi_source/info_logs.result:
Updated results
mysql-test/suite/multi_source/info_logs.test:
Changed to test with complex names to be able to verify the filename generator code
mysql-test/suite/multi_source/relaylog_events.result:
Updated results
mysql-test/suite/multi_source/reset_slave.result:
Updated results
mysql-test/suite/multi_source/skip_counter.result:
Updated results
mysql-test/suite/multi_source/skip_counter.test:
Added testing of max_relay_log_size
mysql-test/suite/rpl/r/rpl_row_max_relay_size.result:
Updated results
mysql-test/suite/rpl/r/rpl_stm_max_relay_size.result:
Updated results
mysql-test/suite/sys_vars/r/max_relay_log_size_basic.result:
Updated results
mysql-test/suite/sys_vars/t/max_relay_log_size_basic.test:
Updated results
mysys/mf_fn_ext.c:
Added fn_ext2(), which returns pointer to last '.' in file name
sql/log.cc:
Removed some wrong casts
sql/log.h:
Updated comment to reflect new code
sql/log_event.cc:
Updated DBUG_PRINT
sql/mysqld.cc:
Added that max_relay_log_size copies it's values from max_binlog_size
sql/mysqld.h:
Removed max_relay_log_size
sql/rpl_mi.cc:
Changed names of multi-source log files so that original suffixes are kept.
sql/rpl_mi.h:
Updated prototype
sql/rpl_rli.cc:
Updated comment to reflect new code
Made max_relay_log_size depending on master connection.
sql/rpl_rli.h:
Made max_relay_log_size depending on master connection.
sql/set_var.h:
Made option global so that one can check and change min & max values (sorry Sergei)
sql/sql_class.h:
Made max_relay_log_size depending on master connection.
sql/sql_repl.cc:
Updated calls to create_signed_file_name()
sql/sys_vars.cc:
Made max_relay_log_size depending on master connection.
Made old code more reusable
sql/sys_vars.h:
Changed Sys_var_multi_source_uint to ulong to be able to handle max_relay_log_size
Made old code more reusable
Documentation of the feature can be found at: http://kb.askmonty.org/en/multi-source-replication/
This code is based on code from Taobao, developed by Plinux
BUILD/SETUP.sh:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
client/mysqltest.cc:
Added support for error names starting with 'W'
Added connection_name support to --sync_with_master
cmake/maintainer.cmake:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
mysql-test/r/mysqltest.result:
Updated results
mysql-test/r/parser.result:
Updated results
mysql-test/suite/multi_source/my.cnf:
Setup of multi-master tests
mysql-test/suite/multi_source/simple.result:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/simple.test:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/syntax.result:
Test of multi-source syntax
mysql-test/suite/multi_source/syntax.test:
Test of multi-source syntax
mysql-test/suite/rpl/r/rpl_rotate_logs.result:
Updated results because of new error messages
mysql-test/t/parser.test:
Updated test as master_pos_wait() now takes more arguments than before
sql/event_scheduler.cc:
No reason to initialize slave_thread (it's guaranteed to be zero here)
sql/item_create.cc:
Added connection_name argument to master_pos_wait()
Simplified code
sql/item_func.cc:
Added connection_name argument to master_pos_wait()
sql/item_func.h:
Added connection_name argument to master_pos_wait()
sql/log.cc:
Added tag "Master 'connection_name'" to slave errors that has a connection name.
sql/mysqld.cc:
Added variable mysqld_server_initialized so that other functions can test if server is fully initialized.
Free all slave data in one place (fewer ifdef's)
Removed not needed call to close_active_mi()
Initialize slaves() later in startup to ensure that everthing is really initialized when slaves start.
Made status variable slave_running multi-source safe
sql/mysqld.h:
Added mysqld_server_initialized
sql/rpl_mi.cc:
Store connection name and cmp_connection_name (only used for show full slave status) in Master_info
Added code for Master_info_index, which handles storage of multi-master information
Don't write the empty "" connection_name to multi-master.info file. This is handled by the original code.
sql/rpl_mi.h:
Added connection_name and Master_info_index
sql/rpl_rli.cc:
Added connection_name to relay log files.
sql/rpl_rli.h:
Fixed type of slave_skip_counter as we now access it directly in sys_vars.cc, so it must be uint
sql/share/errmsg-utf8.txt:
Added new error messages needed for multi-source
Added multi-source name to error ER_MASTER_INFO and WARN_NO_MASTER_INFO
sql/slave.cc:
Moved things a bit around to make it easier to handle error conditions.
Create a global master_info_index and add the "" connection to it
Ensure that new Master_info doesn't fail.
Don't call terminate_slave_threads(active_mi..) on end_slave() as this is now done automaticly when deleting master_info_index.
Delete not needed function close_active_mi(). One can achive same thing by calling end_slave().
Added support for SHOW FULL SLAVE STATUS (show status for all master connections with connection_name as first column)
sql/slave.h:
Added new prototypes
sql/sql_base.cc:
More DBUG_PRINT
sql/sql_class.cc:
Reset thd->connection_name and thd-->default_master_connection
sql/sql_class.h:
Added thd->connection_name and thd-->default_master_connection
Added slave_skip_count to variables to make changing the @@sql_slave_skip_count variable thread safe
sql/sql_const.h:
Added MAX_CONNECTION_NAME
sql/sql_lex.cc:
Reset 'lex->verbose' (to simplify some sql_yacc.yy code)
sql/sql_lex.h:
Added connection_name
sql/sql_parse.cc:
Added support for connection_name to all SLAVE commands.
- Instead of using active_mi, we now get the current Master_info from master_info_index.
- Create new replication threads with CHANGE MASTER
- Added support for show_all_master_info()
sql/sql_reload.cc:
Made reset/full slave use master_info_index->get_master_info() instead of active_mi.
If one uses 'RESET SLAVE "connection_name" all' the connection is removed from master_info_index.
sql/sql_repl.cc:
sql_slave_skip_counter is moved to thd->variables to make it thread safe and fix some bugs with it
Add connection name to relay log files.
Added connection name to errors.
Added some logging for multi-master if log_warnings > 1
stop_slave():
- Don't check if thd is set. It's guaranteed to always be set.
change_master():
- Check for duplicate connection names in change_master()
- Check for wrong arguments first in file (to simplify error handling)
- Register new connections in master_info_index
sql/sql_yacc.yy:
Added optional connection_name to a all relevant master/slave commands
sql/strfunc.cc:
my_global.h shoud always be included first.
sql/sys_vars.cc:
Added variable default_master_connection
Made variable sql_slave_skip_counter multi-source safe
sql/sys_vars.h:
Added Sys_var_session_lexstring (needed for default_master_connection)
Added Sys_var_multi_source_uint (needed for sql_slave_skip_counter).
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5
Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
mysql-test/suite/rpl/r/rpl_start_stop_slave.result:
The result file associated with the test added.
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
A test to check the new functionality.
sql/rpl_mi.cc:
The constructor using the new mutex and condition variables for the master_info.
sql/rpl_mi.h:
The condition variable and mutex have been added for the master_info.
sql/rpl_rli.cc:
The constructor using the new mutex and condition variables for the realy_log_info.
sql/rpl_rli.h:
The condition variable and mutex have been added for the relay_log_info.
sql/slave.cc:
Use a timed wait on a condition variable to implement a interruptible sleep.
The wait is registered with the THD object so that the thread will be woken
up if killed.