Commit graph

610 commits

Author SHA1 Message Date
Kristian Nielsen
2e82a8233c MDEV-7785: errorneous -> erroneous spelling mistake 2015-03-16 10:54:47 +01:00
Kristian Nielsen
3ef0b9b235 Merge MDEV-6589 and MDEV-6403 into 10.0. 2015-03-04 13:36:54 +01:00
Kristian Nielsen
78c74dbe30 MDEV-6403: Temporary tables lost at STOP SLAVE in GTID mode if master has not rotated binlog since restart
The binlog contains specially marked format description events to mark
when a master restart happened (which could have caused temporary
tables to be silently dropped). Such events also cause slave to close
temporary tables.

However, there was a bug that if after this, slave re-connects to the
master in GTID mode, the master can send an old format description
event again. If temporary tables are closed when such event is seen
for the second time, it might drop temporary tables created after that
event, and cause replication failure.

With this patch, the restart flag of the format description event is
cleared by the master when it is sent to the slave in a subsequent
connection, to avoid the erroneous temp table close.
2015-03-04 13:36:29 +01:00
Sergei Golubchik
b739103f12 MDEV-7591 master crashed when slave specified a future position with semi-repl plugin
cherry-pick the upstream fix

commit d4ba10184cd7bde9c31c610e664ecd0c93605c46
Author: Sujatha Sivakumar <sujatha.sivakumar@oracle.com>
Date:   Wed Jul 2 11:34:11 2014 +0530

    Bug#17453826:ASSERTION ERROR WHEN SETTING FUTURE BINLOG
    FILE/POS WITH SEMISYNC

    Problem:
    ========
    When DMLs are in progress on the master stopping a slave and
    setting ahead binlog name/pos will cause an assert on the
    master.
    ...
2015-02-22 12:54:52 +01:00
Michael Widenius
70823e1d91 MDEV-5120 Test suite test maria-no-logging fails
The failure was caused by a bug in an include file on Debian that causes 'struct stat'
to have different sizes depending on the environment.

This patch ensures that we always include my_global.h or my_config.h before any other files.

Other things:
- Removed #include <my_global.h> in some include files; better to always do this at the top level to have as few
  "always-include-this-file-first" files as possible.
- Removed usage of some include files that were already included by my_global.h or by other files.


client/mysql_plugin.c:
  Use my_global.h first
client/mysqlslap.c:
  Remove duplicated include files
extra/comp_err.c:
  Remove duplicated include files
include/m_string.h:
  Remove duplicated include files
include/maria.h:
  Remove duplicated include files
libmysqld/emb_qcache.cc:
  Use my_global.h first
plugin/semisync/semisync.h:
  Use my_pthread.h first
sql/datadict.cc:
  Use my_global.h first
sql/debug_sync.cc:
  Use my_global.h first
sql/derror.cc:
  Use my_global.h first
sql/des_key_file.cc:
  Use my_global.h first
sql/discover.cc:
  Use my_global.h first
sql/event_data_objects.cc:
  Use my_global.h first
sql/event_db_repository.cc:
  Use my_global.h first
sql/event_parse_data.cc:
  Use my_global.h first
sql/event_queue.cc:
  Use my_global.h first
sql/event_scheduler.cc:
  Use my_global.h first
sql/events.cc:
  Use my_global.h first
sql/field.cc:
  Use my_global.h first
  Remove duplicated include files
sql/field_conv.cc:
  Use my_global.h first
sql/filesort.cc:
  Use my_global.h first
  Remove duplicated include files
sql/gstream.cc:
  Use my_global.h first
sql/ha_ndbcluster.cc:
  Use my_global.h first
sql/ha_ndbcluster_binlog.cc:
  Use my_global.h first
sql/ha_ndbcluster_cond.cc:
  Use my_global.h first
sql/ha_partition.cc:
  Use my_global.h first
sql/handler.cc:
  Use my_global.h first
sql/hash_filo.cc:
  Use my_global.h first
sql/hostname.cc:
  Use my_global.h first
sql/init.cc:
  Use my_global.h first
sql/item.cc:
  Use my_global.h first
sql/item_buff.cc:
  Use my_global.h first
sql/item_cmpfunc.cc:
  Use my_global.h first
sql/item_create.cc:
  Use my_global.h first
sql/item_geofunc.cc:
  Use my_global.h first
sql/item_inetfunc.cc:
  Use my_global.h first
sql/item_row.cc:
  Use my_global.h first
sql/item_strfunc.cc:
  Use my_global.h first
sql/item_subselect.cc:
  Use my_global.h first
sql/item_sum.cc:
  Use my_global.h first
sql/item_timefunc.cc:
  Use my_global.h first
sql/item_xmlfunc.cc:
  Use my_global.h first
sql/key.cc:
  Use my_global.h first
sql/lock.cc:
  Use my_global.h first
sql/log.cc:
  Use my_global.h first
sql/log_event.cc:
  Use my_global.h first
sql/log_event_old.cc:
  Use my_global.h first
sql/mf_iocache.cc:
  Use my_global.h first
sql/mysql_install_db.cc:
  Remove duplicated include files
sql/mysqld.cc:
  Remove duplicated include files
sql/net_serv.cc:
  Remove duplicated include files
sql/opt_range.cc:
  Use my_global.h first
sql/opt_subselect.cc:
  Use my_global.h first
sql/opt_sum.cc:
  Use my_global.h first
sql/parse_file.cc:
  Use my_global.h first
sql/partition_info.cc:
  Use my_global.h first
sql/procedure.cc:
  Use my_global.h first
sql/protocol.cc:
  Use my_global.h first
sql/records.cc:
  Use my_global.h first
sql/records.h:
  Don't include my_global.h
  Better to do this at the upper level
sql/repl_failsafe.cc:
  Use my_global.h first
sql/rpl_filter.cc:
  Use my_global.h first
sql/rpl_gtid.cc:
  Use my_global.h first
sql/rpl_handler.cc:
  Use my_global.h first
sql/rpl_injector.cc:
  Use my_global.h first
sql/rpl_record.cc:
  Use my_global.h first
sql/rpl_record_old.cc:
  Use my_global.h first
sql/rpl_reporting.cc:
  Use my_global.h first
sql/rpl_rli.cc:
  Use my_global.h first
sql/rpl_tblmap.cc:
  Use my_global.h first
sql/rpl_utility.cc:
  Use my_global.h first
sql/set_var.cc:
  Added comment
sql/slave.cc:
  Use my_global.h first
sql/sp.cc:
  Use my_global.h first
sql/sp_cache.cc:
  Use my_global.h first
sql/sp_head.cc:
  Use my_global.h first
sql/sp_pcontext.cc:
  Use my_global.h first
sql/sp_rcontext.cc:
  Use my_global.h first
sql/spatial.cc:
  Use my_global.h first
sql/sql_acl.cc:
  Use my_global.h first
sql/sql_admin.cc:
  Use my_global.h first
sql/sql_analyse.cc:
  Use my_global.h first
sql/sql_audit.cc:
  Use my_global.h first
sql/sql_base.cc:
  Use my_global.h first
sql/sql_binlog.cc:
  Use my_global.h first
sql/sql_bootstrap.cc:
  Use my_global.h first
sql/sql_cache.cc:
  Use my_global.h first
sql/sql_class.cc:
  Use my_global.h first
sql/sql_client.cc:
  Use my_global.h first
sql/sql_connect.cc:
  Use my_global.h first
sql/sql_crypt.cc:
  Use my_global.h first
sql/sql_cursor.cc:
  Use my_global.h first
sql/sql_db.cc:
  Use my_global.h first
sql/sql_delete.cc:
  Use my_global.h first
sql/sql_derived.cc:
  Use my_global.h first
sql/sql_do.cc:
  Use my_global.h first
sql/sql_error.cc:
  Use my_global.h first
sql/sql_explain.cc:
  Use my_global.h first
sql/sql_expression_cache.cc:
  Use my_global.h first
sql/sql_handler.cc:
  Use my_global.h first
sql/sql_help.cc:
  Use my_global.h first
sql/sql_insert.cc:
  Use my_global.h first
sql/sql_lex.cc:
  Use my_global.h first
sql/sql_load.cc:
  Use my_global.h first
sql/sql_locale.cc:
  Use my_global.h first
sql/sql_manager.cc:
  Use my_global.h first
sql/sql_parse.cc:
  Use my_global.h first
sql/sql_partition.cc:
  Use my_global.h first
sql/sql_plugin.cc:
  Added comment
sql/sql_prepare.cc:
  Use my_global.h first
sql/sql_priv.h:
  Added error if we use this before including my_global.h
  This check is here because so many files include sql_priv.h first.
sql/sql_profile.cc:
  Use my_global.h first
sql/sql_reload.cc:
  Use my_global.h first
sql/sql_rename.cc:
  Use my_global.h first
sql/sql_repl.cc:
  Use my_global.h first
sql/sql_select.cc:
  Use my_global.h first
sql/sql_servers.cc:
  Use my_global.h first
sql/sql_show.cc:
  Added comment
sql/sql_signal.cc:
  Use my_global.h first
sql/sql_statistics.cc:
  Use my_global.h first
sql/sql_table.cc:
  Use my_global.h first
sql/sql_tablespace.cc:
  Use my_global.h first
sql/sql_test.cc:
  Use my_global.h first
sql/sql_time.cc:
  Use my_global.h first
sql/sql_trigger.cc:
  Use my_global.h first
sql/sql_udf.cc:
  Use my_global.h first
sql/sql_union.cc:
  Use my_global.h first
sql/sql_update.cc:
  Use my_global.h first
sql/sql_view.cc:
  Use my_global.h first
sql/sys_vars.cc:
  Added comment
sql/table.cc:
  Use my_global.h first
sql/thr_malloc.cc:
  Use my_global.h first
sql/transaction.cc:
  Use my_global.h first
sql/uniques.cc:
  Use my_global.h first
sql/unireg.cc:
  Use my_global.h first
sql/unireg.h:
  Removed inclusion of my_global.h
storage/archive/ha_archive.cc:
  Added comment
storage/blackhole/ha_blackhole.cc:
  Use my_global.h first
storage/csv/ha_tina.cc:
  Use my_global.h first
storage/csv/transparent_file.cc:
  Use my_global.h first
storage/federated/ha_federated.cc:
  Use my_global.h first
storage/federatedx/federatedx_io.cc:
  Use my_global.h first
storage/federatedx/federatedx_io_mysql.cc:
  Use my_global.h first
storage/federatedx/federatedx_io_null.cc:
  Use my_global.h first
storage/federatedx/federatedx_txn.cc:
  Use my_global.h first
storage/heap/ha_heap.cc:
  Use my_global.h first
storage/innobase/handler/handler0alter.cc:
  Use my_global.h first
storage/maria/ha_maria.cc:
  Use my_global.h first
storage/maria/unittest/ma_maria_log_cleanup.c:
  Remove duplicated include files
storage/maria/unittest/test_file.c:
  Added comment
storage/myisam/ha_myisam.cc:
  Move sql_plugin.h first as this includes my_global.h
storage/myisammrg/ha_myisammrg.cc:
  Use my_global.h first
storage/oqgraph/oqgraph_thunk.cc:
  Use my_config.h and my_global.h first
  One could not include my_global.h before oqgraph_thunk.h (don't know why)
storage/spider/ha_spider.cc:
  Use my_global.h first
storage/spider/hs_client/config.cpp:
  Use my_global.h first
storage/spider/hs_client/escape.cpp:
  Use my_global.h first
storage/spider/hs_client/fatal.cpp:
  Use my_global.h first
storage/spider/hs_client/hstcpcli.cpp:
  Use my_global.h first
storage/spider/hs_client/socket.cpp:
  Use my_global.h first
storage/spider/hs_client/string_util.cpp:
  Use my_global.h first
storage/spider/spd_conn.cc:
  Use my_global.h first
storage/spider/spd_copy_tables.cc:
  Use my_global.h first
storage/spider/spd_db_conn.cc:
  Use my_global.h first
storage/spider/spd_db_handlersocket.cc:
  Use my_global.h first
storage/spider/spd_db_mysql.cc:
  Use my_global.h first
storage/spider/spd_db_oracle.cc:
  Use my_global.h first
storage/spider/spd_direct_sql.cc:
  Use my_global.h first
storage/spider/spd_i_s.cc:
  Use my_global.h first
storage/spider/spd_malloc.cc:
  Use my_global.h first
storage/spider/spd_param.cc:
  Use my_global.h first
storage/spider/spd_ping_table.cc:
  Use my_global.h first
storage/spider/spd_sys_table.cc:
  Use my_global.h first
storage/spider/spd_table.cc:
  Use my_global.h first
storage/spider/spd_trx.cc:
  Use my_global.h first
storage/xtradb/handler/handler0alter.cc:
  Use my_global.h first
storage/xtradb/handler/i_s.cc:
  Use my_global.h first
2014-09-30 20:31:14 +03:00
Sergei Golubchik
3da761912a MDEV-6616 Server crashes in my_hash_first if shutdown is performed when FLUSH LOGS is running
master_info_index becomes zero during shutdown.
check that it's valid (under a mutex) before dereferencing.
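
The guard described above can be sketched roughly as follows. This is only an illustration, not the server code: `MasterInfoIndex`, `index_mutex`, `global_index` and `lookup_connection` are hypothetical names, and the real structure holds far more than a name-to-id map.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the server's master info index.
struct MasterInfoIndex {
  std::unordered_map<std::string, int> connections;
};

std::mutex index_mutex;                   // hypothetical guard mutex
MasterInfoIndex *global_index = nullptr;  // reset to null during shutdown

// Looks up a connection by name. Returns false if the index has already
// been torn down (shutdown in progress) or the name is unknown.
bool lookup_connection(const std::string &name, int *out) {
  assert(out != nullptr);
  std::lock_guard<std::mutex> guard(index_mutex);
  if (global_index == nullptr) return false;  // validity check under the mutex
  auto it = global_index->connections.find(name);
  if (it == global_index->connections.end()) return false;
  *out = it->second;
  return true;
}
```

The point of the sketch is that the null check and the dereference happen under the same lock, so a concurrent shutdown cannot invalidate the pointer in between.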
2014-09-06 08:33:56 +02:00
Sergei Golubchik
6fb17a0601 5.5.39 merge 2014-08-07 18:06:56 +02:00
Sergei Golubchik
1c6ad62a26 mysql-5.5.39 merge
~40% bugfixes(*) applied
~40% bugfixes reverted (incorrect or we're not buggy)
~20% bugfixes applied, despite us being not buggy
(*) only changes in the server code, e.g. not cmakefiles
2014-08-02 21:26:16 +02:00
Kristian Nielsen
9150a0c7cb MDEV-4937: sql_slave_skip_counter does not work with GTID
The sql_slave_skip_counter is important to be able to recover replication from
certain errors. Often, an appropriate solution is to set
sql_slave_skip_counter to skip over a problem event. But setting
sql_slave_skip_counter produced an error in GTID mode, with a suggestion to
instead set @@gtid_slave_pos to point past the problem event. This however is
not always possible; for example, in case of an INCIDENT event, that event
does not have any GTID to assign to @@gtid_slave_pos.

With this patch, sql_slave_skip_counter now works in GTID mode the same way as
in non-GTID mode. When set, that many initial events are skipped when the SQL
thread starts, plus as many extra events as are needed to completely skip any
partially skipped event group. The GTID position is updated to point past the
skipped event(s).
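
The skipping rule above can be sketched as follows. This is a simplified model under stated assumptions: the `Event` struct with an `ends_group` flag and the function `events_to_skip` are hypothetical, and the real server identifies group boundaries from the event types themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified event: only records whether it ends a group.
struct Event {
  bool ends_group;
};

// Returns the total number of events skipped: the first `skip_counter`
// events, plus whatever is needed to finish a partially skipped group.
std::size_t events_to_skip(const std::vector<Event> &events,
                           std::size_t skip_counter) {
  std::size_t skipped = 0;
  // Skip the requested number of initial events.
  while (skipped < events.size() && skipped < skip_counter) ++skipped;
  // If the last skipped event did not end its group, keep skipping.
  while (skipped > 0 && skipped < events.size() &&
         !events[skipped - 1].ends_group) {
    ++skipped;
  }
  return skipped;
}
```

With two groups of three events each, a counter of 1 skips the whole first group, and a counter of 4 skips both groups.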
2014-06-25 15:24:11 +02:00
Venkatesh Duggirala
2870bd7423 Bug#17283409 4-WAY DEADLOCK: ZOMBIES, PURGING BINLOGS,
SHOW PROCESSLIST, SHOW BINLOGS

Problem:  A deadlock was occurring when 4 threads were
involved in acquiring locks in the following way
Thread 1: Dump thread (Slave is reconnecting, so on
              Master, a new dump thread is trying to kill
              zombie dump threads. It acquired thread's
              LOCK_thd_data and it is about to acquire
              mysys_var->current_mutex, which is LOCK_log)
Thread 2: Application thread is executing show binlogs and
               acquired LOCK_log and it is about to acquire
               LOCK_index.
Thread 3: Application thread is executing Purge binary logs
               and acquired LOCK_index and it is about to
               acquire LOCK_thread_count.
Thread 4: Application thread is executing show processlist
               and acquired LOCK_thread_count and it is
               about to acquire zombie dump thread's
               LOCK_thd_data.
Deadlock Cycle:
     Thread 1 -> Thread 2 -> Thread 3-> Thread 4 ->Thread 1

The same above deadlock was observed even when thread 4 is
executing 'SELECT * FROM information_schema.processlist' command and
acquired LOCK_thread_count and it is about to acquire zombie
dump thread's LOCK_thd_data.

Analysis:
There are four locks involved in the deadlock.  LOCK_log,
LOCK_thread_count, LOCK_index and LOCK_thd_data.
LOCK_log, LOCK_thread_count, LOCK_index are global mutexes
whereas LOCK_thd_data is local to a thread.
We can divide these four locks in two groups.
Group 1 consists of LOCK_log and LOCK_index and the order
should be LOCK_log followed by LOCK_index.
Group 2 consists of other two mutexes
LOCK_thread_count, LOCK_thd_data and the order should
be LOCK_thread_count followed by LOCK_thd_data.
Unfortunately, there is no specific predefined lock order defined
to follow in the MySQL system when it comes to locks across these
two groups. In the above problematic example,
there is no problem in the way we are acquiring the locks
if you see each thread individually.
But if you combine all 4 threads, they end up in a deadlock.

Fix: 
Since everything seems to be fine in the way threads are taking locks,
in this patch we are changing the duration of the locks in Thread 4
to break the deadlock. i.e., before the patch, Thread 4
('show processlist' command) mysqld_list_processes()
function acquires LOCK_thread_count for the complete duration
of the function and it also acquires/releases
each thread's LOCK_thd_data.

LOCK_thread_count is used to protect addition and
deletion of threads in global threads list. While show
process list is looping through all the existing threads,
it will be a problem if a thread is exited but there is no problem
if a new thread is added to the system. Hence a new mutex is
introduced "LOCK_thd_remove" which will protect deletion
of a thread from global threads list. All threads which are
getting exited should acquire LOCK_thd_remove
followed by LOCK_thread_count. (It should take LOCK_thread_count
also because other places of the code still think that thread exit
is protected with LOCK_thread_count. In this fix, we are changing
only the 'show process list' query logic.)
(Eg: unlink_thd logic will be protected with
LOCK_thd_remove).

Logic of mysqld_list_processes (or fill_schema_processlist)
will now be protected with 'LOCK_thd_remove' instead of
'LOCK_thread_count'.

Now the new locking order after this patch is:
LOCK_thd_remove -> LOCK_thd_data -> LOCK_log ->
LOCK_index -> LOCK_thread_count
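
As a rough illustration of the ordering above, the following sketch checks whether a thread's sequence of acquisitions respects it. The helper names `lock_rank` and `respects_lock_order` are hypothetical; the server does not contain such a checker as far as this message states, and the mutex names are used here only as strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Rank of each mutex in the order given above; lower ranks are taken first.
// Returns -1 for a name that is not part of the order.
int lock_rank(const std::string &name) {
  static const std::vector<std::string> order = {
      "LOCK_thd_remove", "LOCK_thd_data", "LOCK_log", "LOCK_index",
      "LOCK_thread_count"};
  for (std::size_t i = 0; i < order.size(); ++i) {
    if (order[i] == name) return static_cast<int>(i);
  }
  return -1;
}

// True if the locks are acquired in strictly increasing rank.
bool respects_lock_order(const std::vector<std::string> &acquisitions) {
  int prev = -1;
  for (const std::string &name : acquisitions) {
    int rank = lock_rank(name);
    if (rank < 0 || rank <= prev) return false;
    prev = rank;
  }
  return true;
}
```

Under this order, Threads 1 to 3 from the problem description pass, the old Thread 4 (LOCK_thread_count followed by LOCK_thd_data) fails, and the new Thread 4 (LOCK_thd_remove followed by LOCK_thd_data) passes.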
2014-05-08 18:13:01 +05:30
unknown
2c2478b822 MDEV-5804: If same GTID is received on multiple master connections in multi-source replication, the event is double-executed causing corruption or replication failure
Before, the arrival of same GTID twice in multi-source replication
would cause double-apply or, in GTID strict mode, an error.

Keep the behaviour, but add an option --gtid-ignore-duplicates which
allows duplicates to be handled correctly, ignoring all but the first.
This relies on the user ensuring correct configuration so that
sequence numbers are strictly increasing within each replication
domain; then duplicates can be detected simply by comparing the
sequence numbers against what is already applied.
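
The sequence-number comparison described above could look roughly like this. It is a minimal sketch, assuming strictly increasing sequence numbers within each domain as the message requires; the `Gtid` struct and the `DuplicateFilter` class are hypothetical and ignore the coordination between connections described in the next paragraph.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// A GTID written as domain-server-sequence, e.g. 0-1-10.
struct Gtid {
  std::uint32_t domain_id;
  std::uint32_t server_id;
  std::uint64_t seq_no;
};

// Hypothetical filter: remembers the highest applied sequence number
// per replication domain and rejects anything not beyond it.
class DuplicateFilter {
 public:
  // Returns true if the GTID is new and should be applied.
  bool should_apply(const Gtid &gtid) {
    auto it = highest_applied_.find(gtid.domain_id);
    if (it != highest_applied_.end() && gtid.seq_no <= it->second) {
      return false;  // already applied via some master connection
    }
    highest_applied_[gtid.domain_id] = gtid.seq_no;
    return true;
  }

 private:
  std::unordered_map<std::uint32_t, std::uint64_t> highest_applied_;
};
```

The check only compares sequence numbers within a domain, which is why it depends on the user keeping them strictly increasing.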

Only one master connection (but possibly multiple parallel worker
threads within that connection) is allowed to apply events within
one replication domain at a time; any other connection that
receives a GTID in the same domain either discards it (if it is
already applied) or waits for the other connection to not have
any events to apply.

Intermediate patch, as proof-of-concept for testing. The main limitation
is that currently it is only implemented for parallel replication,
@@slave_parallel_threads > 0.
2014-03-09 10:27:38 +01:00
Sergei Golubchik
0dc23679c8 10.0-base merge 2014-02-26 15:28:07 +01:00
Sergei Golubchik
0b9a0a3517 5.5 merge 2014-02-25 16:04:35 +01:00
Sergei Golubchik
84651126c0 MySQL-5.5.36 merge
(without few incorrect bugfixes and with 1250 files where only a copyright year was changed)
2014-02-17 11:00:51 +01:00
unknown
dd93ec5633 Merge MariaDB 10.0-base to 10.0. 2014-02-10 15:12:17 +01:00
unknown
4e6606acad MDEV-4984: Implement MASTER_GTID_WAIT() and @@LAST_GTID.
MASTER_GTID_WAIT() is similar to MASTER_POS_WAIT(), but works with a
GTID position rather than an old-style filename/offset.

@@LAST_GTID gives the GTID assigned to the last transaction written
into the binlog.

Together, the two can be used by applications to obtain the GTID of
an update on the master, and then do a MASTER_GTID_WAIT() for that
position on any read slave where it is important to get results that
are caught up with the master at least to the point of the update.

MASTER_GTID_WAIT() is implemented in a way
that tries to minimise the performance impact on the SQL threads,
even in the presence of many waiters on single GTID positions (as
from @@LAST_GTID).
2014-02-07 19:15:28 +01:00
Michael Widenius
0a20d762af Fix for MDEV-4117 @@global.relay_log_purge not per-master, conflicts between different masters in multisource replication
The fix is to not change @relay_log_purge as part of the CHANGE MASTER.
(There is no logical reason why this is done in the current source)

mysql-test/suite/rpl/r/rpl_slave_status.result:
  Ensure that CHANGE MASTER doesn't change relay_log_purge
mysql-test/suite/rpl/t/rpl_slave_status.test:
  Ensure that CHANGE MASTER doesn't change relay_log_purge
sql/sql_repl.cc:
  Don't change relay_log_purge in CHANGE MASTER
2014-01-14 19:00:38 +01:00
Sergei Golubchik
d28d3ba40d 10.0-base merge 2013-12-16 13:02:21 +01:00
unknown
170e9e593d MDEV-5306: Missing locking around rpl_global_gtid_binlog_state
There were some places where insufficient locking between
parallel threads could cause invalid memory accesses and
possibly other grief.

This patch adds the missing locking, and moves the locking
into the struct rpl_binlog_state methods to make it easier
to see that proper locking is in place everywhere.
2013-11-18 15:22:50 +01:00
Venkatesh Duggirala
e0efc2c39a Bug#17641586 INCORRECTLY PRINTED BINLOG DUMP INFORMATION
Problem:
When log_warnings is greater than 1, master prints binlog
dump thread information in mysqld.1.err file.
The information contains slave server id, binlog file and
binlog position. The slave server id is uint32 and the print
format was wrongly specified (%d instead of %u).
Hence a server id which is more than 2 billion is getting
printed with a negative value.
Eg: Start binlog_dump to slave_server(-1340259414),
pos(mysql-bin.001663, 325187493)

Fix: Changed the uint32 format to %u.
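
The effect can be reproduced with a small sketch. The function `format_server_id` is hypothetical, and the server id 2954707882 is not stated in the message: it is derived from the logged value by adding 2^32 to -1340259414. The signed branch copies the bits into an `int32_t` to mimic what `%d` showed on typical two's-complement platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Formats a server id either with %u (correct) or the way %d displayed it.
std::string format_server_id(std::uint32_t server_id, bool use_unsigned) {
  char buf[32];
  if (use_unsigned) {
    std::snprintf(buf, sizeof buf, "%u", static_cast<unsigned>(server_id));
  } else {
    // Reinterpret the same 32 bits as a signed value, as %d effectively did.
    std::int32_t as_signed;
    std::memcpy(&as_signed, &server_id, sizeof as_signed);
    std::snprintf(buf, sizeof buf, "%d", static_cast<int>(as_signed));
  }
  return buf;
}
```

Any id above 2147483647 shows up negative in the signed branch, which matches the "more than 2 billion" condition in the problem statement.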
2013-11-12 22:09:10 +05:30
Sergei Golubchik
9af177042e 10.0-base merge.
Partitioning/InnoDB changes are *not* merged (they'll come from 5.6)
TokuDB does not compile (not updated to 10.0 SE API)
2013-09-21 10:14:42 +02:00
unknown
ada15c7a0f Fix various places where code would work incorrectly if the common_header_len of events is different on master and slave
Patch developed with the help of Pavel Ivanov.

Also fix an uninitialised variable in queue_event().
2013-09-04 12:22:09 +02:00
unknown
f9c2b402f4 MDEV-26: Global transaction ID.
Implement @@gtid_binlog_state. This is the internal state of the binlog
(most recent GTID logged for every domain_id and server_id). This allows
to save the state before RESET MASTER and restore it afterwards.
2013-08-23 14:02:13 +02:00
unknown
f0deff867a MDEV-4820: Empty master does not give error for slave GTID position that does not exist in the binlog
The main bug here was the following situation:

Suppose we set up a completely new master2 as an extra multi-master to an
existing slave that already has a different master1 for domain_id=0. When the
slave tries to connect to master2, master2 will not have anything that slave
requests in domain_id=0, but that is fine as master2 is supposedly meant to
serve eg. domain_id=1. (This is MDEV-4485).

But suppose that master2 then actually starts sending events from
domain_id=0. In this case, the fix for MDEV-4485 was incomplete, and the code
would fail to give the error that the position requested by the slave in
domain_id=0 was missing from the binlogs of master2. This could lead to lost
events or completely wrong replication.

The patch for this bug fixes this issue.

In addition, it cleans up the code a bit, getting rid of the fake_gtid_hash in
the code. And the error message when slave and master have diverged due to
alternate future is clarified, as requested in the bug description.
2013-08-16 15:10:25 +02:00
Sergei Golubchik
b7b5f6f1ab 10.0-monty merge
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
2013-07-21 16:39:19 +02:00
Sergei Golubchik
5f6380adde 10.0-base merge 2013-07-18 16:46:57 +02:00
Sergei Golubchik
97e640b9ae 5.5 merge 2013-07-17 21:24:29 +02:00
Sergei Golubchik
005c7e5421 mysql-5.5.32 merge 2013-07-16 19:09:54 +02:00
unknown
2f6a2494a5 MDEV-4708: GTID strict mode doesn't work on a database with purged binlogs
When a new master is provisioned that does not have any old binlogs,
the @@gtid_slave_pos is used to know where in the GTID history the
provisioning happened. A slave is allowed to connect at the point of
this value of @@gtid_slave_pos, even if that GTID is not in the
binlogs on the new master.

The code to handle this case when the binlog on the newly provisioned
master is completely empty was just wrong (couple of typos). Clearly it
had never been tested ... :-/
2013-07-10 12:01:52 +02:00
unknown
1e43277838 MDEV-4708: GTID strict mode doesn't work on a database with purged binlogs
When a new master is provisioned that does not have any old binlogs,
the @@gtid_slave_pos is used to know where in the GTID history the
provisioning happened. A slave is allowed to connect at the point of
this value of @@gtid_slave_pos, even if that GTID is not in the
binlogs on the new master.

But --gtid-strict-mode did not correctly handle this case. When strict
mode was enabled, an attempt to connect at the position would cause an
error about holes in the binlog, which is not correct.

This patch adds a hash of GTIDs that need to be treated specially by
GTID strict mode to deal correctly with this case.
2013-07-10 11:45:15 +02:00
Michael Widenius
5f1f2fc0e4 Applied all changes from Igor and Sanja 2013-06-15 18:32:08 +03:00
unknown
b5fcf33d24 MDEV-4490: Old-style master position points at the last GTID event after slave restart
Now whenever we reach the GTID point requested from the slave (when using GTID
position to connect), we send a fake Gtid_list event. This event is used by
the slave to know the current old-style position for MASTER_POS_WAIT(), and
later the similar binlog position for MASTER_GTID_WAIT().

Without this fake event, if the slave is already fully up-to-date with the
master, there may be no events sent at the given position for an indeterminate
time.
2013-06-07 14:39:00 +02:00
unknown
7b6ab5638a MDEV-4483: CHANGE MASTER TO master_use_gtid=xxx loses old-style coordinates.
There was some old code that cleared the position in CHANGE MASTER,
it was forgotten to be removed.

In addition, add code that saves/restores the old-style position
when we nuke the old relay logs as part of GTID slave start.
Normally we will not use these, but it could be useful in case
the GTID connect fails and user wants to go back to the old-style
coordinates.
2013-06-07 08:43:21 +02:00
Sergei Golubchik
72ba95873a 10.0-base merge
(without InnoDB - all InnoDB changes were ignored)
2013-06-06 21:32:29 +02:00
Sergei Golubchik
4749d40c63 5.5 merge 2013-06-06 17:51:28 +02:00
unknown
5cb486d159 MDEV-26: Global transaction ID.
Fix problems related to reconnect. When we need to reconnect (i.e. explicit
stop/start of just the IO thread by user, or automatic reconnect due to
losing network connection with the master), it is a bit complex to correctly
resume at the right point without causing duplicate or missing events in the
relay log. The previous code had multiple problems in this regard.

With this patch, the problem is solved as follows. The IO thread keeps track
(in memory) of which GTID was last queued to the relay log. If it needs to
reconnect, it resumes at that GTID position. It also counts number of events
received within the last, possibly partial, event group, and skips the same
number of events after a reconnect, so that events already enqueued before the
reconnect are not duplicated.
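
The bookkeeping described above can be sketched as follows. This is a simplified model, not the actual IO thread: the `Event` struct and `IoThread` class are hypothetical, and it assumes that after a reconnect the master resends the last event group starting with its GTID event.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified event: a GTID event starts a new event group.
struct Event {
  bool is_gtid;
  std::string payload;
};

// Hypothetical in-memory state of the IO thread.
class IoThread {
 public:
  // Called for every event received from the master.
  // Returns true if the event was appended to the relay log.
  bool receive(const Event &ev) {
    if (to_skip_ > 0) {  // this event was already queued before the reconnect
      --to_skip_;
      return false;
    }
    if (ev.is_gtid) {
      last_gtid_ = ev.payload;  // resume point for a later reconnect
      events_in_group_ = 0;
    }
    ++events_in_group_;
    relay_log_.push_back(ev.payload);
    return true;
  }

  // Called on reconnect. Returns the GTID position to resume at and
  // arranges for the already-queued part of the group to be skipped.
  std::string reconnect() {
    to_skip_ = events_in_group_;
    return last_gtid_;
  }

  const std::vector<std::string> &relay_log() const { return relay_log_; }

 private:
  std::string last_gtid_;
  unsigned events_in_group_ = 0;
  unsigned to_skip_ = 0;
  std::vector<std::string> relay_log_;
};
```

If the connection drops after the GTID event and one query event have been queued, the two resent events are dropped and only the remainder of the group reaches the relay log.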

(There is no need to keep any persistent state; whenever we restart slave
threads after both of them being stopped (such as after server restart), we
erase the relay logs and start over from the last GTID applied by SQL thread.
But while the SQL thread is running, this patch is needed to get correct relay
log).
2013-06-05 14:32:47 +02:00
unknown
385780f571 MDEV-4485: Master did not allow slave to connect from the very start (empty GTID pos) if GTIDs from other multi_source master was present
The idea in the code was to protect the user that tries to connect a slave
to a master with completely different domains than what was intended. If
none of the domains in the start position are present at all in the master
binlog, we gave an error.

However, this is a stupid idea. Because when a slave connects to a master
to start replication from the very start of binlogs - such as when setting
up new master->slave servers from scratch - there will be just this
situation, the requested slave position is empty for all the domains in the
master's binlog.

So the code that gives this error is wrong, and the solution is simply to
remove it.
2013-05-29 11:41:25 +02:00
unknown
a0fd7382bc Merge 10.0-base -> 10.0 2013-05-28 15:39:56 +02:00
unknown
ee2b7db3f8 MDEV-4478: Implement GTID "strict mode"
When @@GLOBAL.gtid_strict_mode=1, then certain operations result
in error that would otherwise result in out-of-order binlog files
between servers.

GTID sequence numbers are now allocated independently per domain;
this results in less/no holes in GTID sequences, increasing the
likelihood that diverging binlogs will be caught by the slave when
GTID strict mode is enabled.
2013-05-28 13:28:31 +02:00
unknown
1cd6eb5f94 MDEV-26: Global transaction ID.
Change of user interface to be more logical and more in line with expectations
to work similar to old-style replication.

User can now explicitly choose in CHANGE MASTER whether binlog position is
taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos=
slave_pos) when slave connects to master.

@@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can
be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and
@@gtid_current_pos (a combination of the two, most recent GTID within each
domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match.

This fixes MDEV-4474.
2013-05-22 17:36:48 +02:00
unknown
d795bc9ff8 Fix race condition in binlog dump thread during server shutdown.
A check for THD::killed was missing after THD::enter_cond(). This could
cause the binlog dump thread to miss the kill signal during server shutdown
and hang until it was force-closed.

Also fix a race in a test case that occasionally fails in Buildbot.
2013-05-16 12:41:11 +02:00
unknown
9fae993024 MDEV-26: Global transaction ID.
Implement START SLAVE UNTIL master_gtid_pos = "<GTID position>".

Add test cases, including a test showing how to use this to promote
a new master among a set of slaves.
2013-05-15 19:52:21 +02:00
Sergei Golubchik
b381cf843c mysql-5.5.31 merge 2013-05-07 13:05:09 +02:00
unknown
d0d05dae07 Merge 10.0-base -> 10.0 2013-05-03 12:10:16 +02:00
unknown
5aa0d185ca MDEV-4473: mysql_binlog_send() starts sending events from wrong GTID position in some master failover scenarios
Suppose binlog file X has in its Gtid_list_event: 0-1-3,0-2-5, and suppose the
slave requests to start replicating after 0-1-3.

In this case the bug was that master would start sending events from the start
of X. This is wrong, because 0-2-4 and 0-2-5 are contained in X-1, and are
needed by the slave. So these events were lost.

On the other hand, if the slave requested 0-2-5, then it _is_ correct to start
sending from the beginning of binlog file X, because 0-2-5 is the last GTID
logged in earlier binlogs. The difference is that 0-2-5 is the last of the
GTIDs in the Gtid_list_event. The problem was that the code did not check that
the matched GTID was the last one in the list.

Fixed by checking if the gtid requested by slave that matches a gtid in the
Gtid_list_event is the last event for that domain in the list. If not, go back
to a prior binlog to ensure all needed events are sent to slave.
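
The check can be sketched like this. It is an illustration only: the `Gtid` struct and `can_start_at_file` are hypothetical names, and the sketch covers just the "is the matched GTID the last one for its domain" test, not the search through earlier binlog files.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A GTID written as domain-server-sequence, e.g. 0-2-5.
struct Gtid {
  std::uint32_t domain_id;
  std::uint32_t server_id;
  std::uint64_t seq_no;
};

// Returns true if sending may start at the beginning of the binlog file
// whose Gtid_list is `list`: the requested GTID must match the LAST entry
// for its domain. Otherwise an earlier binlog still holds needed events.
bool can_start_at_file(const std::vector<Gtid> &list, const Gtid &requested) {
  bool last_entry_matches = false;
  for (const Gtid &entry : list) {
    if (entry.domain_id != requested.domain_id) continue;
    // Each later entry for the domain overrides the result of earlier ones.
    last_entry_matches = entry.server_id == requested.server_id &&
                         entry.seq_no == requested.seq_no;
  }
  return last_entry_matches;
}
```

For the list 0-1-3,0-2-5 from the example, a request for 0-1-3 is rejected for this file, while a request for 0-2-5 is accepted.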

mysql-test/include/show_events.inc:
  Backport --let $binlog_file=LAST, used by MDEV-4473 test case.
2013-05-03 11:27:29 +02:00
Michael Widenius
8cdb118a0a Fixed: MDEV-4352; LOAD DATA was not multi-source safe
- Calls to cleanup_load_tmpdir() could delete temporary files for another master connection
- Concurrent LOAD DATA commands from two master connections could use the same file name

Other bug fixes:
- Enlarge buffer for connection names with 'special characters' one can't store in filenames

Optimization:
- Don't do 'lower case' of connection names. We can use cmp_connection_name, where we already have the connection name in lower case.


mysql-test/suite/multi_source/load_data.result:
  Test case for MDEV-4352
mysql-test/suite/multi_source/load_data.test:
  Test case for MDEV-4352
sql/log_event.cc:
  Fixed: MDEV-4352
  - Calls to cleanup_load_tmpdir() could delete temporary files for another master connection
  - Concurrent LOAD DATA commands from two master connections could use the same file name
  
  The fix was to add the connection name (if one exists) to all slave temporary files used by LOAD DATA
sql/rpl_mi.cc:
  Enlarge buffer for connection names with 'special characters' one can't store in filenames
  Use mi->cmp_connection_name for connection file names.
sql/rpl_rli.cc:
  Use mi->cmp_connection_name for connection file names.
sql/slave.cc:
  Removed not needed empty line
sql/sql_const.h:
  Added MAX_FILENAME_MBWIDTH to be able to calculate buffer length for connection_names stored in file names
sql/sql_repl.cc:
  Use mi->cmp_connection_name for connection file names.
2013-05-03 01:50:42 +03:00
unknown
56d485e2b5 Merge 10.0-base -> 10.0 2013-04-29 12:03:54 +02:00
unknown
59830e1ab8 MDEV-4446: Incorrect handling of binlog checksum when searching for GTID start position in binlog
When the slave connects, the master skips binlog event groups
until it reaches the position requested by the slave. To
identify event groups, it needs to detect COMMIT events. But
this detection did not correctly handle binlog checksums, so
could incorrectly skip extra groups due to not detecting the
end of an event group.
2013-04-29 10:57:48 +02:00
unknown
203264ddc9 Fix unsigned/signed conversion bug in event type during mysql_binlog_send().
Since event types can be >=128 and are read from a (possibly signed) char
pointer, we need to cast to unsigned char before extending to int, or we will
get an incorrect negative number. This was done in the main code path already,
but there is a rare case where we check for new events first without a lock
and then again with the lock. If the second check succeeds because a new event
turns up at just the right time, then we took a code path that was missing the
correct unsigned char cast, leading to incorrect handling of events for old
slave servers and possibly other grief.
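
The conversion problem can be shown in isolation. In this sketch the buffer is declared as `signed char` to model a platform where plain `char` is signed; the function names and the event type value 160 are chosen for illustration and are not taken from the server source.

```cpp
#include <cassert>

// Buggy: widening a signed byte sign-extends it, so any event type
// of 128 or more becomes a negative int.
int event_type_buggy(const signed char *buf) { return buf[0]; }

// Fixed: cast to unsigned char first, then widen to int.
int event_type_fixed(const signed char *buf) {
  return static_cast<unsigned char>(buf[0]);
}
```

A byte with bit pattern 0xA0 reads as a negative number through the first function and as 160 through the second.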

(This was found from a sporadic failure in Buildbot of test case
rpl_mariadb_slave_capability).
2013-04-25 13:16:35 +02:00
unknown
6b97512b21 Add missing check for thd->killed in mysql_binlog_send().
The slave dump thread running on the master only checked thd->killed whenever
it reached the end of a binlog file, not between events. This could
unnecessarily delay server shutdown.
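
A minimal sketch of checking the kill flag between events, rather than only at the end of a file, might look like this. The function `send_events`, the `std::atomic<bool>` flag and the `send` callback are hypothetical stand-ins for the dump thread loop, `thd->killed` and the network write.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Sends events one by one and checks the kill flag before each event.
// Returns the number of events actually sent.
std::size_t send_events(const std::vector<int> &events,
                        const std::atomic<bool> &killed,
                        const std::function<void(int)> &send) {
  std::size_t sent = 0;
  for (int ev : events) {
    if (killed.load()) break;  // stop promptly, e.g. during server shutdown
    send(ev);
    ++sent;
  }
  return sent;
}
```

With the check inside the loop, a kill request takes effect after at most one more event instead of after the rest of the file.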

This was found by code inspection while tracking down some occasional "forcing
close of thread..." errors in Buildbot. Hopefully this will fix the failures,
but the fix is correct in any case.

Also increase the wait during server shutdown, 2 seconds is a bit tight in
case of heavy I/O stall, and it seems better to delay shutdown a bit than
force-kill threads unnecessarily.

Also fix some races in test cases that restart the mysqld server. The .expect
file should be changed with --append_file; using --remove_file + --write_file
creates a short window where mysqld can error out due to .expect file missing.
2013-04-24 13:05:40 +02:00