77612 commits

Author SHA1 Message Date
unknown
5633dd8227 MDEV-4506: parallel replication.
Add a simple test case.
Fix bugs found.
2013-09-16 14:33:49 +02:00
unknown
d107bdaa01 MDEV-4506, parallel replication.
Some after-review fixes.
2013-09-13 15:09:57 +02:00
unknown
13fddb32de MDEV-4506: Parallel replication: Intermediate commit.
Move annotate-event stuff from Relay_log_info to rpl_group_info,
to make it thread safe.
2013-07-12 14:52:05 +02:00
unknown
47f8e0ef6e MDEV-4506: Parallel replication: Intermediate commit
Remove Relay_log_info::group_info. (It is not thread safe).
2013-07-12 14:42:48 +02:00
unknown
ba4b937af2 MDEV-4506: Parallel replication: Intermediate commit
Move the deferred event stuff from Relay_log_info to rpl_group_info
to make it thread safe for parallel replication.
2013-07-12 14:36:20 +02:00
unknown
6d5f237e09 MDEV-4506: Parallel replication: Intermediate commit.
Fix a number of failures in the test suite.
2013-07-09 13:15:53 +02:00
unknown
a99356fbe7 MDEV-4506: Parallel replication: intermediate commit.
Fix a bunch of issues found with locking, ordering, and non-thread-safe stuff
in Relay_log_info.

Now able to do a simple benchmark, showing 4.5 times speedup for applying a
binlog with 10000 REPLACE statements.
2013-07-08 16:47:07 +02:00
unknown
e654be3865 MDEV-4506: Parallel replication: Intermediate commit.
Implement options --binlog-commit-wait-count and
--binlog-commit-wait-usec.

These options permit the DBA to deliberately increase latency
of an individual commit to get more transactions in each
binlog group commit. This increases the opportunity for
parallel replication on the slave, and can also decrease I/O
load on the master.

The options also make it easier to test the parallel
replication with mysql-test-run.
2013-07-05 00:26:15 +02:00
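The effect of the two options in the commit above can be illustrated with a simplified model. This is a sketch only, not the server's implementation: the function `group_commits` and its parameters are hypothetical, and it assumes a group closes once it holds `wait_count` transactions or once a transaction arrives more than `wait_usec` microseconds after the group's first transaction.

```python
from typing import List


def group_commits(arrivals: List[int], wait_count: int,
                  wait_usec: int) -> List[List[int]]:
    """Partition sorted commit arrival times (in microseconds) into groups.

    ASSUMPTION (illustrative model only): a group is closed when it already
    holds wait_count transactions, or when the next transaction arrives more
    than wait_usec after the first transaction of the group.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    for t in arrivals:
        group_full = len(current) >= wait_count
        timed_out = bool(current) and t > current[0] + wait_usec
        if current and (group_full or timed_out):
            groups.append(current)
            current = []
        current.append(t)
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # With a 100 usec wait, the three early commits share one group.
    print(group_commits([0, 10, 20, 500, 510], wait_count=3, wait_usec=100))
    # With no wait at all, every commit ends up in its own group.
    print(group_commits([0, 10, 20], wait_count=3, wait_usec=0))
```

In this model a larger wait yields fewer, bigger groups, which is the trade-off the commit message describes: higher latency per commit in exchange for more transactions per group commit.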
unknown
b5a496a777 MDEV-4506: Parallel replication: Intermediate commit.
Fix some bugs around waiting for worker threads to end during SQL slave stop.

Free Log_event after parallel execution (still needs to be made thread-safe by
using rpl_group_info rather than rli).
2013-07-04 13:17:01 +02:00
unknown
a1cfd47346 MDEV-4506: Parallel replication: Intermediate commit.
Wait for all worker threads to finish when stopping the SQL thread.
(Only a basic wait; this still needs to be fixed to include timeout
logic as in sql_slave_killed()).
2013-07-04 09:20:56 +02:00
unknown
592e464a02 MDEV-4506: Parallel replication. Intermediate commit.
Pass down rpl_group_info * to remove one instance of non-threadsafe
use of rli->group_info.
2013-07-03 19:03:21 +02:00
unknown
31a5edb5c2 MDEV-4506: Parallel replication. Intermediate commit.
Hook in the wait-for-prior-commit logic (not really tested yet).
Clean up some resource maintenance around rpl_group_info (may still be some
smaller issues there though).
Add a ToDo list at the top of rpl_parallel.cc
2013-07-03 13:46:33 +02:00
unknown
1b3dc66e31 MDEV-4506: Parallel replication: Intermediate commit.
First step of splitting out part of Relay_log_info, so that different
event groups being applied in parallel can each use their own copy.
2013-06-28 15:19:30 +02:00
unknown
7e5dc4f074 MDEV-4506: Parallel replication. Intermediate commit.
Implement facility for the commit in one thread to wait for the commit of
another to complete first. The wait is done in a way that does not prevent
a waiter and a waitee from group committing together with a single fsync()
in both binlog and InnoDB. The wait is done efficiently with respect to
locking.

The patch was originally made to support TaoBao parallel replication with
in-order commit; now it will be adapted to also be used for parallel
replication of group-committed transactions.

A waiter THD registers itself with a prior waitee THD. The waiter will then
complete its commit at the earliest in the same group commit of the waitee
(when using binlog). The wait can also be done explicitly by the waitee.
2013-06-26 12:10:35 +02:00
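The registration step described in the commit above (a waiter THD registers itself with a prior waitee THD) can be sketched with plain threads. This is a toy model under stated assumptions: the class `Thd`, its methods, and `commit_in_order` are hypothetical names, and the sketch shows only the ordering guarantee, not group commit or the single fsync().

```python
import threading
from typing import List, Optional


class Thd:
    """Hypothetical stand-in for a session that commits one transaction."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.wait_for: Optional["Thd"] = None
        self.committed = threading.Event()

    def register_wait_for_prior_commit(self, waitee: "Thd") -> None:
        # The waiter remembers which earlier transaction must commit first.
        self.wait_for = waitee

    def commit(self, log: List[str], lock: threading.Lock) -> None:
        if self.wait_for is not None:
            self.wait_for.committed.wait()  # block until the waitee commits
        with lock:
            log.append(self.name)
        self.committed.set()  # wake up any waiter registered with us


def commit_in_order(names: List[str]) -> List[str]:
    """Commit one transaction per thread, each waiting for the previous one."""
    thds = [Thd(n) for n in names]
    for prev, cur in zip(thds, thds[1:]):
        cur.register_wait_for_prior_commit(prev)
    log: List[str] = []
    lock = threading.Lock()
    # Start the threads in reverse to show that the order comes from the
    # waits, not from thread start order.
    threads = [threading.Thread(target=t.commit, args=(log, lock))
               for t in reversed(thds)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    print(commit_in_order(["T1", "T2", "T3"]))
```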
unknown
535de71728 MDEV-4506: Parallel replication: intermediate commit.
Fix typo in worker thread free list management.
Simple parallel INSERT from worker threads runs now.
2013-06-25 15:48:01 +02:00
unknown
6d1e55f518 MDEV-4506: Parallel replication: Intermediate commit.
A few fixes following tests. Now can apply one INSERT event in
a separate worker thread.
2013-06-25 09:30:19 +02:00
unknown
26a9fbc416 MDEV-4506: Parallel replication of group-committed transactions: Intermediate commit
First very rough sketch. We spawn and retire a pool of slave threads.
Test main.alias works, most likely not much else does.
2013-06-24 10:50:25 +02:00
unknown
6a0a4f00a1 Forgotten .result file update. 2013-06-08 12:36:21 +02:00
unknown
b5fcf33d24 MDEV-4490: Old-style master position points at the last GTID event after slave restart
Now whenever we reach the GTID point requested from the slave (when using GTID
position to connect), we send a fake Gtid_list event. This event is used by
the slave to know the current old-style position for MASTER_POS_WAIT(), and
later the similar binlog position for MASTER_GTID_WAIT().

Without this fake event, if the slave is already fully up-to-date with the
master, there may be no events sent at the given position for an indeterminate
time.
2013-06-07 14:39:00 +02:00
unknown
03aa4876e1 MDEV-4486: Allow to start old-style replication even if mysql.rpl_slave_state is unavailable
If the mysql.gtid_slave_pos table is not available, we can neither load nor update
the current GTID position persistently. This can happen e.g. after an upgrade,
before mysql_upgrade_db is run, or if the table is InnoDB and the server is
restarted without the InnoDB storage engine enabled.

Before, replication always failed to start if the table was unavailable. With
this patch, we try to continue with old-style replication, after suitable
complaints in the error log. In strict mode, or if slave is configured to use
GTID, slave still refuses to start.
2013-06-07 10:58:34 +02:00
unknown
dbe2c5060e MDEV-4591: Setting gtid* values from inside a transaction might cause unexpected results
Now we give an error on attempts to set @@SESSION.gtid_domain_id or
@@SESSION.gtid_seq_no when a transaction is active.
2013-06-07 09:31:11 +02:00
unknown
7b6ab5638a MDEV-4483: CHANGE MASTER TO master_use_gtid=xxx loses old-style coordinates.
Some old code that cleared the position in CHANGE MASTER
had not been removed.

In addition, add code that saves/restores the old-style position
when we nuke the old relay logs as part of GTID slave start.
Normally we will not use these, but it could be useful in case
the GTID connect fails and user wants to go back to the old-style
coordinates.
2013-06-07 08:43:21 +02:00
Sergei Golubchik
4749d40c63 5.5 merge 2013-06-06 17:51:28 +02:00
Vladislav Vaintroub
1ff1cb10fc fix compile error 2013-06-06 17:38:07 +02:00
Michael Widenius
5cf5a9a1e8 Fixed timing failure in myisam-metadata.test
mysql-test/include/wait_show_condition.inc:
  Print failing statement if timeout
mysql-test/r/myisam-metadata.result:
  Updated DBUG_SYNC
mysql-test/t/myisam-metadata.test:
  Updated DBUG_SYNC.
  Removed wait_show_condition, as this is not needed when we use DBUG_SYNC
  This should fix timing issues with the test
mysys/thr_mutex.c:
  Added comments
sql/sql_acl.cc:
  atoi -> atoll()  (Safety)
storage/myisam/ha_myisam.cc:
  Send signal before mi_repair_by_sort.
2013-06-06 15:51:36 +03:00
unknown
64e53a0f81 Fix two small problems in previous push. 2013-06-05 15:32:44 +02:00
unknown
5cb486d159 MDEV-26: Global transaction ID.
Fix problems related to reconnect. When we need to reconnect (i.e. explicit
stop/start of just the IO thread by user, or automatic reconnect due to
losing network connection with the master), it is a bit complex to correctly
resume at the right point without causing duplicate or missing events in the
relay log. The previous code had multiple problems in this regard.

With this patch, the problem is solved as follows. The IO thread keeps track
(in memory) of which GTID was last queued to the relay log. If it needs to
reconnect, it resumes at that GTID position. It also counts number of events
received within the last, possibly partial, event group, and skips the same
number of events after a reconnect, so that events already enqueued before the
reconnect are not duplicated.

(There is no need to keep any persistent state; whenever we restart slave
threads after both of them have been stopped (such as after server restart), we
erase the relay logs and start over from the last GTID applied by SQL thread.
But while the SQL thread is running, this patch is needed to get a correct relay
log).
2013-06-05 14:32:47 +02:00
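The reconnect handling described in the commit above can be modelled roughly as follows. This is an illustrative Python sketch, not the server's code: `IoThreadState`, `master_stream`, `replicate_with_drop`, and the tuple event representation are all assumptions made for the example. It also assumes that after a reconnect the master resends events starting at the first event of the group identified by the requested GTID.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical event model: (gtid, payload); a GTID is just a string here.
Event = Tuple[str, str]


@dataclass
class IoThreadState:
    """In-memory state kept by the modelled IO thread."""
    last_gtid: Optional[str] = None  # GTID of the last event group queued
    events_in_group: int = 0         # events queued so far in that group
    to_skip: int = 0                 # events to drop after a reconnect

    def queue(self, relay_log: List[Event], event: Event) -> None:
        if self.to_skip > 0:
            # Already enqueued before the reconnect: do not duplicate it.
            self.to_skip -= 1
            return
        gtid, _ = event
        if gtid != self.last_gtid:
            self.last_gtid = gtid
            self.events_in_group = 0
        self.events_in_group += 1
        relay_log.append(event)

    def reconnect(self) -> Optional[str]:
        """Return the GTID to resume at and arm the skip counter."""
        self.to_skip = self.events_in_group
        return self.last_gtid


def master_stream(binlog: List[Event], from_gtid: Optional[str]) -> List[Event]:
    """ASSUMPTION: the master resends from the first event of from_gtid."""
    if from_gtid is None:
        return binlog
    start = next(i for i, (g, _) in enumerate(binlog) if g == from_gtid)
    return binlog[start:]


def replicate_with_drop(binlog: List[Event], drop_after: int) -> List[Event]:
    """Receive drop_after events, lose the connection, reconnect, finish."""
    state, relay_log = IoThreadState(), []
    for event in master_stream(binlog, None)[:drop_after]:
        state.queue(relay_log, event)
    for event in master_stream(binlog, state.reconnect()):
        state.queue(relay_log, event)
    return relay_log


if __name__ == "__main__":
    binlog = [("0-1-1", "BEGIN"), ("0-1-1", "INSERT a"), ("0-1-1", "COMMIT"),
              ("0-1-2", "BEGIN"), ("0-1-2", "INSERT b"), ("0-1-2", "COMMIT")]
    # Dropping the connection in the middle of the second group still
    # produces a relay log identical to the master's binlog.
    print(replicate_with_drop(binlog, 5) == binlog)
```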
unknown
7ad47ab0e0 MDEV-4605: Failing to load GTID slave position from rpl.gtid_slave_pos
There were several cases where the slave GTID position was not loaded
correctly before being used. This caused various failures such as
corrupting the position at slave start and empty values of
@@gtid_slave_pos and @@gtid_current_pos.

Fixed by adding more checks for loaded position, and by always loading
the position at server startup.
2013-06-03 07:41:38 +02:00
Vladislav Vaintroub
33ef993773 Fix a compile warning on NetBSD 2013-06-01 21:33:26 +02:00
Vladislav Vaintroub
689c1b44a9 MDEV-4607 : libreadline-related compilation problems on NetBSD.
Problem:
libreadline.so was already present on the machine, but the cmake check NEW_READLINE_INTERFACE was unsuccessful, so the bundled library had to be used instead of the system library.
The cause was that the HAVE_HIST_ENTRY cmake variable was cached with an incorrect value (1 on NetBSD).

The fix is to change HAVE_HIST_ENTRY to 0 with CACHE FORCE after switching to bundled readline.
2013-06-01 21:30:33 +02:00
unknown
22b60fa95c MDEV-4520: Assertion `0' fails in Query_cache::end_of_result on concurrent drop event and event executio
Fix for embedded library, where thd->net.vio is not set, which effectively switched off QC in the embedded server with the previous patch.
2013-05-30 08:23:49 +03:00
unknown
6feadb1082 MDEV-4485: Incorrect error handling in record_gtid().
Fix the error handling when access to the table mysql.gtid_slave_pos
fails for whatever reason. Add some test cases.
2013-05-29 14:23:40 +02:00
unknown
385780f571 MDEV-4485: Master did not allow slave to connect from the very start (empty GTID pos) if GTIDs from other multi_source master was present
The idea in the code was to protect the user that tries to connect a slave
to a master with completely different domains than what was intended. If
none of the domains in the start position are present at all in the master
binlog, we gave an error.

However, this is a stupid idea, because when a slave connects to a master
to start replication from the very start of binlogs - such as when setting
up new master->slave servers from scratch - there will be just this
situation: the requested slave position is empty for all the domains in the
master's binlog.

So the code that gives this error is wrong, and the solution is simply to
remove it.
2013-05-29 11:41:25 +02:00
Sergei Golubchik
1db0c42e53 followup for revision 3751 "centos5 gcc 4.1 asm bug"
remove the workaround from cmake/os/FreeBSD.cmake
2013-05-28 21:25:59 +02:00
unknown
3061ca2be5 Fix type-typo which caused windows build failure. 2013-05-28 16:35:05 +02:00
unknown
ee2b7db3f8 MDEV-4478: Implement GTID "strict mode"
When @@GLOBAL.gtid_strict_mode=1, then certain operations result
in error that would otherwise result in out-of-order binlog files
between servers.

GTID sequence numbers are now allocated independently per domain;
this results in fewer/no holes in GTID sequences, increasing the
likelihood that diverging binlogs will be caught by the slave when
GTID strict mode is enabled.
2013-05-28 13:28:31 +02:00
unknown
f5319394e3 MDEV-4475 follow-up patch: Add forgotten initialisation of the padding for empty Gtid_List event 2013-05-25 06:32:00 +02:00
unknown
416aed25ed MDEV-4475: Replication from MariaDB 10.0 to 5.5 does not work
The problem was the Gtid_list event which is logged to the binlog in
10.0 and is not understood by the 5.5 server.

This event is supposed to be replaced with a dummy event for 5.5
servers. But the very first event logged in the very first binlog
has an empty list of GTIDs, which makes the event too short to be
replaceable with an empty event.

The fix is to pad the empty Gtid_list event to be big enough to
be replaceable by a dummy event.
2013-05-24 22:21:08 +02:00
unknown
b9ce8572d9 MDEV-4520: Assertion `0' fails in Query_cache::end_of_result on concurrent drop event and event execution
If there is no net.vio then the query cache can't get data via net_real_write(), so it is better simply not to try to cache such a query.
2013-05-23 17:05:31 +03:00
unknown
1cd6eb5f94 MDEV-26: Global transaction ID.
Change the user interface to be more logical and more in line with the expectation
that it works similarly to old-style replication.

User can now explicitly choose in CHANGE MASTER whether binlog position is
taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos=
slave_pos) when slave connects to master.

@@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can
be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and
@@gtid_current_pos (a combination of the two, most recent GTID within each
domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match.

This fixes MDEV-4474.
2013-05-22 17:36:48 +02:00
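The relationship between the three variables in the commit above can be sketched as a per-domain merge. This is a hedged illustration: the commit message only says that @@gtid_current_pos holds the most recent GTID within each domain, so the rule used here (the higher sequence number wins) is an assumption, and `current_pos` and the tuple layout are hypothetical.

```python
from typing import Dict, Tuple

# A GTID is modelled as (domain_id, server_id, seq_no); a position maps
# each domain_id to the last GTID recorded for that domain.
Gtid = Tuple[int, int, int]
Position = Dict[int, Gtid]


def current_pos(slave_pos: Position, binlog_pos: Position) -> Position:
    """Combine two positions, keeping one GTID per domain.

    ASSUMPTION: "most recent" is approximated by the higher seq_no; the
    server's actual rule is not stated in the commit message.
    """
    combined = dict(slave_pos)
    for domain, gtid in binlog_pos.items():
        if domain not in combined or gtid[2] > combined[domain][2]:
            combined[domain] = gtid
    return combined


if __name__ == "__main__":
    slave = {0: (0, 1, 10), 1: (1, 2, 5)}
    binlog = {0: (0, 3, 12), 2: (2, 3, 1)}
    print(current_pos(slave, binlog))
```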
Vladislav Vaintroub
7ba2ff93ac MDEV-4548 - compile sphinx.so/dll and include into packages
replaced snippets_udf.cc with the latest version (2.0.8 from sphinxsource.com), fixed trivial errors on Windows.
It will be compiled and installed into plugins directory now.
2013-05-22 16:44:44 +02:00
Vladislav Vaintroub
ef1e767ae3 MDEV-4553 - Fixes for compilation under NetBSD. 2013-05-27 16:35:42 +02:00
Sergei Golubchik
9bc4c4183d MDEV-4516 SELECT from I_S.QUERY_CACHE_INFO produces ER_UNKNOWN_ERROR when query cache size is 0
if qc->try_lock() fails, it's not an error
2013-05-24 14:33:04 +02:00
Sergei Golubchik
cb246b20d6 fix for compiled-in FederatedX 2013-05-21 18:56:35 +02:00
Sergei Golubchik
d6315e29c8 MDEV-388 Creating a federated table with a non-existing server returns a random error code
(part 2)
2013-05-21 13:03:37 +02:00
Sergei Golubchik
ec043aced0 5.3 merge 2013-05-21 09:43:34 +02:00
Sergei Golubchik
fce7fc43ba fixes for buildbot 2013-05-21 09:42:10 +02:00
Sergei Golubchik
62ab6982a4 MDEV-388 Creating a federated table with a non-existing server returns a random error code 2013-05-20 23:58:44 +02:00
Sergei Golubchik
7e4150d7cd increase MAX_HA (number of simultaneously installed storage engines) to 64 2013-05-20 13:41:03 +02:00
Sergei Golubchik
d7a6c801ac 5.3 merge.
change maria.distinct to use a function that doesn't require ssl-enabled  builds
2013-05-20 12:36:30 +02:00