Commit graph

440 commits

Author SHA1 Message Date
Sergei Golubchik
bb8e99fdc3 Merge branch 'bb-10.2-ext' into 10.3 2017-08-26 00:34:43 +02:00
Michael Widenius
4aaa38d26e Enusure that my_global.h is included first
- Added sql/mariadb.h file that should be included first by files in sql
  directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
  that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
2017-08-24 01:05:44 +02:00
Sergei Golubchik
cb1e76e4de Merge branch '10.1' into 10.2 2017-08-17 11:38:34 +02:00
Sergei Golubchik
8e8d42ddf0 Merge branch '10.0' into 10.1 2017-08-08 10:18:43 +02:00
Vicențiu Ciorbaru
af40426fcd Fix purge_relay_logs post merge
slave_skip_counter must not be reset to 0 during purge_relay_logs. See
MDEV-4937 as to when this change happened.
2017-07-27 16:22:37 +03:00
Sergei Golubchik
7134afa22e MYSQL_BIN_LOG::open/close must be under LOCK_log 2017-07-27 12:43:45 +02:00
Vicențiu Ciorbaru
786ad0a158 Merge remote-tracking branch 'origin/5.5' into 10.0 2017-07-25 00:41:54 +03:00
Sergei Golubchik
9a5fe1f4ea Merge remote-tracking branch 'mysql/5.5' into 5.5 2017-07-18 14:59:10 +02:00
Marko Mäkelä
57fea53615 Merge bb-10.2-ext into 10.3 2017-07-07 12:39:43 +03:00
Sergei Golubchik
f6633bf058 Merge branch '10.1' into 10.2 2017-07-05 19:08:55 +02:00
Kristian Nielsen
c36620ddc3 MDEV-12179 post-merge fixes.
Fix LEX_STRING -> LEX_CSTRING issues.
2017-07-03 10:36:09 +02:00
Monty
f8dadbdf24 Ensure that we have LOG_log locked when relay_log.close is called
If open of the relay log failed, we got an assert in MYSQL_BIN_LOG::close
This only affected DEBUG systems
2017-07-03 11:16:13 +03:00
Kristian Nielsen
0db2cd7c76 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Fix compilation failure with different my_atomic implementation.

The my_atomic_loadptr* takes void ** as first argument, so variables
updated with it needs to be void * (it is not legal C to cast
some_type ** to void **).
2017-05-10 09:56:31 +02:00
Kristian Nielsen
89aad233de MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Move the discovery of mysql.gtid_slave_pos* tables into the SQL thread.

This avoids doing things like opening tables and scanning the mysql
schema for tables inside of the START SLAVE statement, which might
interact badly with existing transaction or table locks.

(Even though START SLAVE is documented to implicitly commit any active
transactions, this appears not to be the case in current code).

Table discovery fits naturally in the SQL thread init code, next to
the loading of mysql.gtid_slave_pos state.
2017-04-23 10:49:58 +02:00
Kristian Nielsen
fdf2d40770 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Implement auto-creation of mysql.gtid_slave_pos* tables with needed engines,
if listed in --gtid-pos-auto-engines.

Uses an asynchronous approach to minimise locking overhead.

The list of available tables is extended with a flag. Extra entries are
added for --gtid-pos-auto-engines tables that do not exist yet, marked as
not existing but ready for auto-creation.

If record_gtid() needs a table marked for auto-creation, it sends a request
to the slave background thread to create the table, and continues to use an
existing table for the current and immediately coming transactions.

As soon as the slave background thread has made the new table available, it
will be used for all subsequent relevant transactions in record_gtid().

This asynchronous approach also avoids a lot of complex issues around trying
to do DDL in the middle of an on-going transaction.
2017-04-21 10:30:16 +02:00
Kristian Nielsen
6a84473c28 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

This commit implements that record_gtid() selects a gtid_slave_posXXX table
with a storage engine already in use by current transaction, if any.

The default table mysql.gtid_slave_pos is used if no match can be found on
storage engine, or for GTID position updates with no specific storage
engine.

Table discovery of mysql.gtid_slave_pos* happens on initial GTID state load
as well as on every START SLAVE. Some effort is made to make this possible
without additional locking. New tables are added using lock-free atomics.
Removing tables requires stopping all slaves first. A warning is given in
the error log when a table is removed but a non-stopped slave still has a
reference to it.

If multiple mysql.gtid_slave_posXXX tables with same storage engine exist,
one is chosen arbitrarily to be used, with a warning in the error log. GTID
data from all tables is still read, but only one among redundant tables with
same storage engine will be updated.
2017-04-21 10:30:14 +02:00
Kristian Nielsen
c995ecbe98 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

For each GTID recorded in mysq.gtid_slave_pos, keep track of which
engine the update was made in.

This will be later used to know which rows can be deleted in the table
of a given engine.
2017-04-21 08:00:06 +02:00
Kristian Nielsen
087cf02328 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Keep track of which mysql.gtid_slave_posXXX tables are available for each
engine, by searching for all tables in the mysql schema with names that
start with "gtid_slave_pos".

The list is computed at server start when the GTID position is loaded, and
it is re-computed on every START SLAVE command. This way, the DBA can
manually add a table for a new engine, and it will be automatically picked
up on next START SLAVE, so a full server restart is not needed.

The list is not yet actually used in the code.
2017-04-21 08:00:06 +02:00
Kristian Nielsen
141a1b09e6 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Refactor scan_all_gtid_slave_pos_table() so it can do generic processing on
the list of mysql.gtid_slave_pos_XXX tables.
2017-04-21 07:52:20 +02:00
Kristian Nielsen
d3837c69a2 MDEV-12179: Per-engine mysql.gtid_slave_pos table.
Intermediate commit.

On server start, look for and read all tables mysql.gtid_slave_pos* to
restore the GTID position.

Simple test case that moves the data to a new
mysql.gtid_slave_pos_innodb table and verifies that the new table is
read at server start.
2017-04-21 07:52:20 +02:00
Sergei Golubchik
da4d71d10d Merge branch '10.1' into 10.2 2017-03-30 12:48:42 +02:00
Marko Mäkelä
adc91387e3 Merge 10.0 into 10.1 2017-03-03 13:27:12 +02:00
Monty
f3c65ce951 Add protection to not access is_open() without LOCK_log mutex
Protection added to reopen_file() and new_file_impl().

Without this we could get an assert in fn_format() as name == 0,
because the file was closed and name reset, atthe same time
new_file_impl() was called.
2017-02-28 16:10:47 +01:00
Monty
4bad74e139 Added error checking for all calls to flush_relay_log_info() and stmt_done() 2017-02-28 16:10:47 +01:00
Sujatha Sivakumar
e619295e1b Bug#24901077: RESET SLAVE ALL DOES NOT ALWAYS RESET SLAVE
Description:
============
If you have a relay log index file that has ended up with
some relay log files that do not exists, then RESET SLAVE
ALL is not enough to get back to a clean state.

Analysis:
=========
In the bug scenario slave server is in stopped state and
some of the relay logs got deleted but the relay log index
file is not updated.

During slave server restart replication initialization fails
as some of the required relay logs are missing. User
executes RESET SLAVE/RESET SLAVE ALL command to start a
clean slave. As per the documentation RESET SLAVE command
clears the master info and relay log info repositories,
deletes all the relay log files, and starts a new relay log
file. But in a scenario where the slave server's
Relay_log_info object is not initialized slave will not
purge the existing relay logs. Hence the index file still
remains in a bad state. Users will not be able to start
the slave unless these files are cleared.

Fix:
===
RESET SLAVE/RESET SLAVE ALL commands should do the cleanup
even in a scenario where Relay_log_info object
initialization failed.

Backported a flag named 'error_on_rli_init_info' which is
required to identify slave's Relay_log_info object
initialization failure. This flag exists in MySQL-5.6
onwards as part of BUG#14021292 fix.

During RESET SLAVE/RESET SLAVE ALL execution this flag
indicates the Relay_log_info initialization failure.
In such a case open the relay log index/relay log files
and do the required clean up.
2017-02-28 10:00:51 +05:30
Sergei Golubchik
4a5d25c338 Merge branch '10.1' into 10.2 2016-12-29 13:23:18 +01:00
Sergei Golubchik
2f20d297f8 Merge branch '10.0' into 10.1 2016-12-11 09:53:42 +01:00
Kristian Nielsen
f1fcc1fc10 Back-port Master_info::using_parallel() to 10.0.
This has no functional changes, but it helps avoid merge problems from 10.0
to 10.1. In 10.0, code that checks for parallel replication uses
opt_slave_parallel_threads > 0, but this check needs to be
mi->using_parallel() in 10.1. By using the same check in 10.0 (with
unchanged semantics), merge problems to 10.1 are avoided.
2016-11-15 23:00:11 +01:00
vinchen
640051e06a Binlog compressed
Add some event types for the compressed event, there are:
     QUERY_COMPRESSED_EVENT,
     WRITE_ROWS_COMPRESSED_EVENT_V1,
     UPDATE_ROWS_COMPRESSED_EVENT_V1,
     DELETE_POWS_COMPRESSED_EVENT_V1,
     WRITE_ROWS_COMPRESSED_EVENT,
     UPDATE_ROWS_COMPRESSED_EVENT,
     DELETE_POWS_COMPRESSED_EVENT.
These events inheritance the uncompressed editor events. One of their constructor functions and write
function have been overridden for uncompressing and compressing. Anything but this is totally the same.

On slave, The IO thread will uncompress and convert them When it receiving the events from the master.
So the SQL and worker threads can be stay unchanged.

Now we use zlib as compress algorithm. It maybe support other algorithm in the future.
2016-10-19 20:20:35 +02:00
Kristian Nielsen
e1ef99c3dc MDEV-7145: Delayed replication
Merge feature into 10.2 from feature branch.

Delayed replication adds an option

  CHANGE MASTER TO master_delay=<seconds>

Replication will then delay applying events with that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.

Feature is ported from MySQL source tree.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-16 23:44:44 +02:00
Kristian Nielsen
b2bc6dadee MDEV-7145: Delayed replication, cleanup some code
The original MySQL patch left some refactoring todo's, possibly
because of known conflicts with other parallel development (like
info-repository feature perhaps).

This patch fixes those todos/refactorings.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
19abe79fd1 MDEV-7145: Delayed replication, intermediate commit.
Initial merge of delayed replication from MySQL git.

The code from the initial push into MySQL is merged, and the
associated test case passes. A number of tasks are still pending:

1. Check full test suite run for any regressions or .result file updates.

2. Extend the feature to also work for parallel replication.

3. There are some todo-comments about future refactoring left from
MySQL, these should be located and merged on top.

4. There are some later related MySQL commits, these should be checked
and merged. These include:
    e134b9362ba0b750d6ac1b444780019622d14aa5
    b38f0f7857c073edfcc0a64675b7f7ede04be00f
    fd2b210383358fe7697f201e19ac9779879ba72a
    afc397376ec50e96b2918ee64e48baf4dda0d37d

5. The testcase from MySQL relies heavily on sleep and timing for
testing, and seems likely to sporadically fail on heavily loaded test
servers in buildbot or distro build farms.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:58 +02:00
Sergei Golubchik
06b7fce9f2 Merge branch '10.1' into 10.2 2016-09-09 08:33:08 +02:00
Sergei Golubchik
6b1863b830 Merge branch '10.0' into 10.1 2016-08-25 12:40:09 +02:00
Monty
b51109693e MDEV-10630 rpl.rpl_mdev6020 fails in buildbot with timeout
The issue was that when running with valgrind the wait for master_pos_Wait()
was not long enough.

This patch also fixes two other failures that could affect rpl_mdev6020:
- check_if_conflicting_replication_locks() didn't properly check domains
- 'did_mark_start_commit' was after signals to other threads was sent which could
  get the variable read too early.
2016-08-22 10:16:00 +03:00
Michael Widenius
db7edfed17 MDEV-7563 Support CHECK constraint as in (or close to) SQL Standard
MDEV-10134 Add full support for DEFAULT

- Added support for using tables with MySQL 5.7 virtual fields,
  including MySQL 5.7 syntax
- Better error messages also for old cases
- CREATE ... SELECT now also updates timestamp columns
- Blob can now have default values
- Added new system variable "check_constraint_checks", to turn of
  CHECK constraint checking if needed.
- Removed some engine independent tests in suite vcol to only test myisam
- Moved some tests from 'include' to 't'. Should some day be done for all tests.
- FRM version increased to 11 if one uses virtual fields or constraints
- Changed to use a bitmap to check if a field has got a value, instead of
  setting HAS_EXPLICIT_VALUE bit in field flags
- Expressions can now be up to 65K in total
- Ensure we are not refering to uninitialized fields when handling virtual fields or defaults
- Changed check_vcol_func_processor() to return a bitmap of used types
- Had to change some functions that calculated cached value in fix_fields to do
  this in val() or getdate() instead.
- store_now_in_TIME() now takes a THD argument
- fill_record() now updates default values
- Add a lookahead for NOT NULL, to be able to handle DEFAULT 1+1 NOT NULL
- Automatically generate a name for constraints that doesn't have a name
- Added support for ALTER TABLE DROP CONSTRAINT
- Ensure that partition functions register virtual fields used. This fixes
  some bugs when using virtual fields in a partitioning function
2016-06-30 11:43:02 +02:00
Nirbhay Choubey
7305be2f7e MDEV-5535: Cannot reopen temporary table
mysqld maintains a list of TABLE objects for all temporary
tables created within a session in THD. Here each table is
represented by a TABLE object.

A query referencing a particular temporary table for more
than once, however, failed with ER_CANT_REOPEN_TABLE error
because a TABLE_SHARE was allocate together with the TABLE,
so temporary tables always had only one TABLE per TABLE_SHARE.

This patch lift this restriction by separating TABLE and
TABLE_SHARE objects and storing TABLE_SHAREs for temporary
tables in a list in THD, and TABLEs in a list within their
respective TABLE_SHAREs.
2016-06-10 18:39:43 -04:00
Monty
636bb59034 Final fixes for Memory_used
- Change some static variables to dynamic to ensure that we don't do any memory
  allocations before server starts or stops
- Print more memory information on SIGHUP. Fixed output.
- Write out if memory was lost if run with --debug-at-exit
- Fixed wrong #ifdef in sql_cache.cc
2016-04-28 17:15:38 +03:00
Sergei Golubchik
f4faac4d6a Merge branch '10.0' into 10.1 2016-01-25 22:58:57 +01:00
Kristian Nielsen
2f88b14acd Merge branch 'tmp' into tmp-10.1
Conflicts:
	sql/slave.cc
2016-01-15 13:01:19 +01:00
Kristian Nielsen
74b1af19e9 Merge branch 'tmp' into tmp-10.0
Conflicts:
	sql/slave.cc
2016-01-15 12:50:23 +01:00
Kristian Nielsen
06b2e327fc Fix error handling for GTID and domain-based parallel replication
This occurs when replication stops with an error, domain-based parallel
replication is used, and the GTID position contains more than one domain.
Furthermore, it relates to the case where the SQL thread is restarted
without first stopping the IO thread.

In this case, the file/offset relay-log position does not correctly
represent the slave's multi-dimensional position, because other domains may
be far ahead of, or behind, the domain with the failing event. So the code
reverts the relay log position back to the start of a relay log file that is
known to be before all active domains.

There was a bug that when the SQL thread was restarted, the
rli->relay_log_state was incorrectly initialised from @@gtid_slave_pos. This
position will likely be too far ahead, due to reverting the relay log
position. Thus, if the replication fails again after the SQL thread restart,
the rli->restart_gtid_pos might be updated incorrectly. This in turn would
cause a second SQL thread restart to replicate from the wrong position, if
the IO thread was still left running.

The fix is to initialise rli->relay_log_state from @@gtid_slave_pos only
when we actually purge and re-fetch relay logs from the master, not at every
SQL thread start.

A related problem is the use of sql_slave_skip_counter to resolve
replication failures in this kind of scenario. Since the slave position is
multi-dimensional, sql_slave_skip_counter can not work properly - it is
indeterminate exactly which event is to be skipped, and is unlikely to work
as expected for the user. So make this an error in the case where
domain-based parallel replication is used with multiple domains, suggesting
instead the user to set @@gtid_slave_pos to reliably skip the desired event.
2016-01-15 12:48:14 +01:00
Monty
661a6d8906 Cleanup of slave code:
- Added testing if connection is killed to shortcut reading of connection data
  This will allow us later in 10.2 to do a cleaner shutdown of slaves (less errors in the log)
- Add new status variables: Slaves_connected, Slaves_running and Slave_connections.
- Use MYSQL_SLAVE_NOT_RUN instead of 0 with slave_running.
- Don't print obvious extra warnings to the error log when slave is shut down normally.
2016-01-03 13:20:07 +02:00
Sergei Golubchik
a2bcee626d Merge branch '10.0' into 10.1 2015-12-21 21:24:22 +01:00
Monty
c3018b0ff4 Fixes to get all test to run on MacosX Lion 10.7
This includes fixing all utilities to not have any memory leaks,
as safemalloc warnings stopped tests from passing on MacOSX.

- Ensure that all clients takes character-set-dir, as the
  libmysqlclient library will use it.
- mysql-test-run now passes character-set-dir to all external clients.
- Changed dynstr_free() so that it can be called twice (made freeing code easier)
- Changed rpl_global_gtid_slave_state to be allocated dynamicly as it
  includes a mutex that needs to be initizlied/destroyed before my_end() is called.
- Removed rpl_slave_state::init() and rpl_slave_stage::deinit() as
  their job are better handling by constructor and delete.
- Print alias instead of table_name in check_duplicate_key as
  table_name may have been converted to lower case.

Other things:
- Fixed a case in time_to_datetime_with_warn() where we where
  using && instead of & in tests
2015-11-29 17:51:23 +02:00
Kristian Nielsen
8f2e05f41c Merge branch 'mdev7818-4' into 10.1
Conflicts:
	mysql-test/suite/perfschema/r/stage_mdl_global.result
	sql/rpl_rli.cc
	sql/sql_parse.cc
2015-11-13 14:24:40 +01:00
Kristian Nielsen
75dc267101 Change Seconds_behind_master to be updated only at commit in parallel replication
Before, the Seconds_behind_master was updated already when an event
was queued for a worker thread to execute later. This might lead users
to interpret a low value as the slave being almost up to date with the
master, while in reality there might still be lots and lots of events
still queued up waiting to be applied by the slave.

See https://lists.launchpad.net/maria-developers/msg08958.html for
more detailed discussions.
2015-11-13 10:24:53 +01:00
Kristian Nielsen
244f043e6e Merge MDEV-8193 into 10.0 2015-09-11 12:03:04 +02:00
Kristian Nielsen
df9b8aee58 Merge MDEV-8193 into 10.1
Conflicts:
	sql/rpl_rli.cc
2015-09-11 12:01:48 +02:00
Kristian Nielsen
51eaa7fe53 MDEV-8193: UNTIL clause in START SLAVE is sporadically disobeyed by parallel replication
The code was using the wrong variable when comparing the binlog name
for the UNTIL position. This could cause the comparison to fail after
binlog rotation, in turn causing the UNTIL clause to not trigger slave
stop.
2015-09-11 10:51:56 +02:00
Sergei Golubchik
b85a00161e MDEV-8264 encryption for binlog
* Start_encryption_log_event
* --encrypt-binlog command line option

based on google patches.
2015-09-04 10:33:55 +02:00
Sergei Golubchik
c862c15bba cleanup: [partial] removal of llstr()
now when my_vsnprintf() supports %llu for a few years already.
2015-09-04 10:33:54 +02:00
Sergei Golubchik
fff6f4278b Revert f1abd015, make a smaller fix
commit f1abd015dc
Author: Andrei Elkin <aelkin@mysql.com>
Date:   Thu Nov 12 17:10:19 2009 +0200

    Bug #47210   first execution of "start slave until" stops too early
2015-09-04 10:33:54 +02:00
Sergei Golubchik
7b54dec1c6 cleanup: comments 2015-09-04 10:33:52 +02:00
Monty
872a953b22 MDEV-8469 Add RESET MASTER TO x to allow specification of binlog file nr
Other things:
- Avoid calling init_and_set_log_file_name() when opening binary log.
- Remove newlines early when reading from index file.
- Ensure that reset_logs() will work even if thd is 0 (Can happen on startup)
- Added thd to sart_slave_threads() for better error handling.
2015-07-16 10:36:58 +03:00
Kristian Nielsen
565960816e Merge MDEV-8354 into 10.1 2015-06-24 17:18:12 +02:00
Kristian Nielsen
8af5ab405a Merge MDEV-8354 into 10.0 2015-06-24 16:53:41 +02:00
Kristian Nielsen
b89de2b2ce MDEV-8354: out-of-order error with --gtid-ignore-duplicates and row-based replication
The --gtid-ignore-duplicates option was not working correctly with row-based
replication. When a row event was completed, but before committing, there
was a small window where another multi-source SQL thread could wrongly try
to re-execute the same transaction, without properly ignoring the duplicate
GTID. This would lead to duplicate key error or out-of-order GTID error or
similar.

Thanks to Matt Neth for reporting this and giving an easy way to reproduce
the issue.
2015-06-24 16:52:50 +02:00
Sergei Golubchik
5091a4ba75 Merge tag 'mariadb-10.0.19' into 10.1 2015-06-01 15:51:25 +02:00
Sergey Vojtovich
7cfa803d8e MDEV-8001 - mysql_reset_thd_for_next_command() takes 0.04% in OLTP RO
Removed mysql_reset_thd_for_next_command(). Call THD::reset_for_next_command()
directly instead.

mysql_reset_thd_for_next_command() overhead dropped 0.04% -> out of radar.
THD::reset_for_next_command() overhead didn't increase.
2015-05-13 10:43:14 +04:00
Sergei Golubchik
f875c9f2a0 MDEV-5114 seconds_behind_master flips to 0 & spikes back, when running show slaves status
1. After a period of wait (where last_master_timestamp=0)
   do NOT restore the last_master_timestamp to the timestamp
   of the last executed event (which would mean we've just
   executed it, and we're that much behind the master).

2. Update last_master_timestamp before executing the event,
   not after.

Take the approach from the this commit (but with a different test
case that actually makes sense):

commit 0c75ab453fb8c5439576af8fe5add7a1b89f1569
Author: Luis Soares <luis.soares@sun.com>
Date:   Thu Apr 15 17:39:31 2010 +0100

    BUG#52166: Seconds_Behind_Master spikes after long idle period
2015-05-03 11:21:55 +02:00
Kristian Nielsen
791b0ab5db Merge 10.0 -> 10.1 2015-04-20 13:21:58 +02:00
Kristian Nielsen
2e82a8233c MDEV-7785: errorneous -> erroneous spelling mistake 2015-03-16 10:54:47 +01:00
Kristian Nielsen
2e4dc5a370 after-merge fixes 2015-03-04 14:12:48 +01:00
Kristian Nielsen
95d7208859 Merge MDEV-6589 and MDEV-6403 into 10.1.
Conflicts:
	sql/log.cc
	sql/rpl_rli.cc
	sql/sql_repl.cc
2015-03-04 13:49:37 +01:00
Kristian Nielsen
ad0d203f2e MDEV-6589: Incorrect relay log start position when restarting SQL thread after error in parallel replication
The problem occurs in parallel replication in GTID mode, when we are using
multiple replication domains. In this case, if the SQL thread stops, the
slave GTID position may refer to a different point in the relay log for each
domain.

The bug was that when the SQL thread was stopped and restarted (but the IO
thread was kept running), the SQL thread would resume applying the relay log
from the point of the most advanced replication domain, silently skipping all
earlier events within other domains. This caused replication corruption.

This patch solves the problem by storing, when the SQL thread stops with
multiple parallel replication domains active, the current GTID
position. Additionally, the current position in the relay logs is moved back
to a point known to be earlier than the current position of any replication
domain. Then when the SQL thread restarts from the earlier position, GTIDs
encountered are compared against the stored GTID position. Any GTID that was
already applied before the stop is skipped to avoid duplicate apply.

This patch should have no effect if multi-domain GTID parallel replication is
not used. Similarly, if both SQL and IO thread are stopped and restarted, the
patch has no effect, as in this case the existing relay logs are removed and
re-fetched from the master at the current global @@gtid_slave_pos.
2015-03-04 13:36:04 +01:00
Sergei Golubchik
4b21cd21fe Merge branch '10.0' into merge-wip 2015-01-31 21:48:47 +01:00
Sergei Golubchik
e695db0f2d MDEV-7437 remove suport for "atomics" with rwlocks 2015-01-13 10:15:21 +01:00
Kristian Nielsen
f27817c1d0 MDEV-7326: Server deadlock in connection with parallel replication
The bug occurs when a transaction does a retry after all transactions have
done mark_start_commit() in a batch of group commit from the master. In this
case, the retrying transaction can unmark_start_commit() after the following
batch has already started running and de-allocated the GCO. Then after retry,
the transaction will re-do mark_start_commit() on a de-allocated GCO, and also
wakeup of later GCOs can be lost.

This was seen "in the wild" by a user, even though it is not known exactly
what circumstances can lead to retry of one transaction after all transactions
in a group have reached the commit phase.

The lifetime around GCO was somewhat clunky anyway. With this patch, a GCO
lives until rpl_parallel_entry::last_committed_sub_id has reached the last
transaction in the GCO. This guarantees that the GCO will still be alive when
a transaction does mark_start_commit(). Also, we now loop over the list of
active GCOs for wakeup, to ensure we do not lose a wakeup even in the
problematic case.
2015-01-07 14:45:39 +01:00
Kristian Nielsen
db21fddc37 MDEV-6676: Optimistic parallel replication
Implement a new mode for parallel replication. In this mode, all transactions
are optimistically attempted applied in parallel. In case of conflicts, the
offending transaction is rolled back and retried later non-parallel.

This is an early-release patch to facilitate testing, more changes to user
interface / options will be expected. The new mode is not enabled by default.
2014-12-06 08:49:50 +01:00
Kristian Nielsen
52b25934d7 MDEV-7237: Parallel replication: incorrect relaylog position after stop/start the slave
The replication relay log position was sometimes updated incorrectly at the
end of a transaction in parallel replication. This happened because the relay
log file name was taken from the current Relay_log_info (SQL driver thread),
not the correct value for the transaction in question.

The result was that if a transaction was applied while the SQL driver thread
was at least one relay log file ahead, _and_ the SQL thread was subsequently
stopped before applying any events from the most recent relay log file, then
the relay log position would be incorrect - wrong relay log file name. Thus,
when the slave was started again, usually a relay log read error would result,
or in rare cases, if the position happened to be readable, the slave might
even skip arbitrary amounts of events.

In GTID mode, the relay log position is reset when both slave threads are
restarted, so this bug would only be seen in non-GTID mode, or in GTID mode
when only the SQL thread, not the IO thread, was stopped.
2014-12-01 13:53:57 +01:00
Kristian Nielsen
b79685902d MDEV-6903: gtid_slave_pos is incorrect after master crash
When a master slave restarts, it logs a special restart format description
event in its binlog. When the slave sees this event, it knows it needs to roll
back any active partial transaction, in case the master crashed previously in
the middle of writing such transaction to its binlog.

However, there was a bug where this rollback did not reset rgi->pending_gtid.
This caused the @@gtid_slave_pos to be updated incorrectly with the GTID of
the partial transaction that was rolled back.

Fix this by always clearing rgi->pending_gtid in cleanup_context(), hopefully
preventing similar bugs from turning up in other special cases where a
transaction is rolled back during replication.

Thanks to Pavel Ivanov for tracking down the issue and providing a test case.
2014-11-25 12:19:48 +01:00
Kristian Nielsen
eec04fb4f6 MDEV-6680: Performance of domain_parallel replication is disappointing
The code that handles free lists of various objects passed to worker threads
in parallel replication handles freeing in batches, to avoid taking and
releasing LOCK_rpl_thread too often. However, it was possible for freeing to
be delayed to the point where one thread could stall the SQL driver thread due
to full queue, while other worker threads might be idle. This could
significantly degrade possible parallelism and thus performance.

Clean up the batch freeing code so that it is more robust and now able to
regularly free batches of object, so that normally the queue will not run full
unless the SQL driver thread is really far ahead of the worker threads.
2014-11-13 10:20:48 +01:00
Kristian Nielsen
8a3e2f29bb MDEV-6718: Server crashed in Gtid_log_event::Gtid_log_event with parallel replication
The bug occured in parallel replication when re-trying transactions that
failed due to deadlock. In this case, the relay log file is re-opened and the
events are read out again. This reading requires a format description event of
the appropriate version. But the code was using a description event stored in
rli, which is not thread-safe. This could lead to various rare races if the
format description event was replaced by the SQL driver thread at the exact
moment where a worker thread was trying to use it.

The fix is to instead make the retry code create and maintain its own format
description event. When the relay log file is opened, we first read the format
description event from the start of the file, before seeking to the current
position. This now uses the same code as when the SQL driver threads starts
from a given relay log position. This also makes sure that the correct format
description event version will be used in cases where the version of the
binlog could change during replication.
2014-11-13 10:09:46 +01:00
Michael Widenius
70823e1d91 MDEV-5120 Test suite test maria-no-logging fails
The reason for the failure was a bug in an include file on debian that causes 'struct stat'
to have different sized depending on the environment.

This patch fixes so that we always include my_global.h or my_config.h before we include any other files.

Other things:
- Removed #include <my_global.h> in some include files; Better to always do this at the top level to have as few
  "always-include-this-file-first' files as possible.
- Removed usage of some include files that where already included by my_global.h or by other files.


client/mysql_plugin.c:
  Use my_global.h first
client/mysqlslap.c:
  Remove duplicated include files
extra/comp_err.c:
  Remove duplicated include files
include/m_string.h:
  Remove duplicated include files
include/maria.h:
  Remove duplicated include files
libmysqld/emb_qcache.cc:
  Use my_global.h first
plugin/semisync/semisync.h:
  Use my_pthread.h first
sql/datadict.cc:
  Use my_global.h first
sql/debug_sync.cc:
  Use my_global.h first
sql/derror.cc:
  Use my_global.h first
sql/des_key_file.cc:
  Use my_global.h first
sql/discover.cc:
  Use my_global.h first
sql/event_data_objects.cc:
  Use my_global.h first
sql/event_db_repository.cc:
  Use my_global.h first
sql/event_parse_data.cc:
  Use my_global.h first
sql/event_queue.cc:
  Use my_global.h first
sql/event_scheduler.cc:
  Use my_global.h first
sql/events.cc:
  Use my_global.h first
sql/field.cc:
  Use my_global.h first
  Remove duplicated include files
sql/field_conv.cc:
  Use my_global.h first
sql/filesort.cc:
  Use my_global.h first
  Remove duplicated include files
sql/gstream.cc:
  Use my_global.h first
sql/ha_ndbcluster.cc:
  Use my_global.h first
sql/ha_ndbcluster_binlog.cc:
  Use my_global.h first
sql/ha_ndbcluster_cond.cc:
  Use my_global.h first
sql/ha_partition.cc:
  Use my_global.h first
sql/handler.cc:
  Use my_global.h first
sql/hash_filo.cc:
  Use my_global.h first
sql/hostname.cc:
  Use my_global.h first
sql/init.cc:
  Use my_global.h first
sql/item.cc:
  Use my_global.h first
sql/item_buff.cc:
  Use my_global.h first
sql/item_cmpfunc.cc:
  Use my_global.h first
sql/item_create.cc:
  Use my_global.h first
sql/item_geofunc.cc:
  Use my_global.h first
sql/item_inetfunc.cc:
  Use my_global.h first
sql/item_row.cc:
  Use my_global.h first
sql/item_strfunc.cc:
  Use my_global.h first
sql/item_subselect.cc:
  Use my_global.h first
sql/item_sum.cc:
  Use my_global.h first
sql/item_timefunc.cc:
  Use my_global.h first
sql/item_xmlfunc.cc:
  Use my_global.h first
sql/key.cc:
  Use my_global.h first
sql/lock.cc:
  Use my_global.h first
sql/log.cc:
  Use my_global.h first
sql/log_event.cc:
  Use my_global.h first
sql/log_event_old.cc:
  Use my_global.h first
sql/mf_iocache.cc:
  Use my_global.h first
sql/mysql_install_db.cc:
  Remove duplicated include files
sql/mysqld.cc:
  Remove duplicated include files
sql/net_serv.cc:
  Remove duplicated include files
sql/opt_range.cc:
  Use my_global.h first
sql/opt_subselect.cc:
  Use my_global.h first
sql/opt_sum.cc:
  Use my_global.h first
sql/parse_file.cc:
  Use my_global.h first
sql/partition_info.cc:
  Use my_global.h first
sql/procedure.cc:
  Use my_global.h first
sql/protocol.cc:
  Use my_global.h first
sql/records.cc:
  Use my_global.h first
sql/records.h:
  Don't include my_global.h
  Better to do this at the upper level
sql/repl_failsafe.cc:
  Use my_global.h first
sql/rpl_filter.cc:
  Use my_global.h first
sql/rpl_gtid.cc:
  Use my_global.h first
sql/rpl_handler.cc:
  Use my_global.h first
sql/rpl_injector.cc:
  Use my_global.h first
sql/rpl_record.cc:
  Use my_global.h first
sql/rpl_record_old.cc:
  Use my_global.h first
sql/rpl_reporting.cc:
  Use my_global.h first
sql/rpl_rli.cc:
  Use my_global.h first
sql/rpl_tblmap.cc:
  Use my_global.h first
sql/rpl_utility.cc:
  Use my_global.h first
sql/set_var.cc:
  Added comment
sql/slave.cc:
  Use my_global.h first
sql/sp.cc:
  Use my_global.h first
sql/sp_cache.cc:
  Use my_global.h first
sql/sp_head.cc:
  Use my_global.h first
sql/sp_pcontext.cc:
  Use my_global.h first
sql/sp_rcontext.cc:
  Use my_global.h first
sql/spatial.cc:
  Use my_global.h first
sql/sql_acl.cc:
  Use my_global.h first
sql/sql_admin.cc:
  Use my_global.h first
sql/sql_analyse.cc:
  Use my_global.h first
sql/sql_audit.cc:
  Use my_global.h first
sql/sql_base.cc:
  Use my_global.h first
sql/sql_binlog.cc:
  Use my_global.h first
sql/sql_bootstrap.cc:
  Use my_global.h first
  Use my_global.h first
sql/sql_cache.cc:
  Use my_global.h first
sql/sql_class.cc:
  Use my_global.h first
sql/sql_client.cc:
  Use my_global.h first
sql/sql_connect.cc:
  Use my_global.h first
sql/sql_crypt.cc:
  Use my_global.h first
sql/sql_cursor.cc:
  Use my_global.h first
sql/sql_db.cc:
  Use my_global.h first
sql/sql_delete.cc:
  Use my_global.h first
sql/sql_derived.cc:
  Use my_global.h first
sql/sql_do.cc:
  Use my_global.h first
sql/sql_error.cc:
  Use my_global.h first
sql/sql_explain.cc:
  Use my_global.h first
sql/sql_expression_cache.cc:
  Use my_global.h first
sql/sql_handler.cc:
  Use my_global.h first
sql/sql_help.cc:
  Use my_global.h first
sql/sql_insert.cc:
  Use my_global.h first
sql/sql_lex.cc:
  Use my_global.h first
sql/sql_load.cc:
  Use my_global.h first
sql/sql_locale.cc:
  Use my_global.h first
sql/sql_manager.cc:
  Use my_global.h first
sql/sql_parse.cc:
  Use my_global.h first
sql/sql_partition.cc:
  Use my_global.h first
sql/sql_plugin.cc:
  Added comment
sql/sql_prepare.cc:
  Use my_global.h first
sql/sql_priv.h:
  Added error if we use this before including my_global.h
  This check is here becasue so many files includes sql_priv.h first.
sql/sql_profile.cc:
  Use my_global.h first
sql/sql_reload.cc:
  Use my_global.h first
sql/sql_rename.cc:
  Use my_global.h first
sql/sql_repl.cc:
  Use my_global.h first
sql/sql_select.cc:
  Use my_global.h first
sql/sql_servers.cc:
  Use my_global.h first
sql/sql_show.cc:
  Added comment
sql/sql_signal.cc:
  Use my_global.h first
sql/sql_statistics.cc:
  Use my_global.h first
sql/sql_table.cc:
  Use my_global.h first
sql/sql_tablespace.cc:
  Use my_global.h first
sql/sql_test.cc:
  Use my_global.h first
sql/sql_time.cc:
  Use my_global.h first
sql/sql_trigger.cc:
  Use my_global.h first
sql/sql_udf.cc:
  Use my_global.h first
sql/sql_union.cc:
  Use my_global.h first
sql/sql_update.cc:
  Use my_global.h first
sql/sql_view.cc:
  Use my_global.h first
sql/sys_vars.cc:
  Added comment
sql/table.cc:
  Use my_global.h first
sql/thr_malloc.cc:
  Use my_global.h first
sql/transaction.cc:
  Use my_global.h first
sql/uniques.cc:
  Use my_global.h first
sql/unireg.cc:
  Use my_global.h first
sql/unireg.h:
  Removed inclusion of my_global.h
storage/archive/ha_archive.cc:
  Added comment
storage/blackhole/ha_blackhole.cc:
  Use my_global.h first
storage/csv/ha_tina.cc:
  Use my_global.h first
storage/csv/transparent_file.cc:
  Use my_global.h first
storage/federated/ha_federated.cc:
  Use my_global.h first
storage/federatedx/federatedx_io.cc:
  Use my_global.h first
storage/federatedx/federatedx_io_mysql.cc:
  Use my_global.h first
storage/federatedx/federatedx_io_null.cc:
  Use my_global.h first
storage/federatedx/federatedx_txn.cc:
  Use my_global.h first
storage/heap/ha_heap.cc:
  Use my_global.h first
storage/innobase/handler/handler0alter.cc:
  Use my_global.h first
storage/maria/ha_maria.cc:
  Use my_global.h first
storage/maria/unittest/ma_maria_log_cleanup.c:
  Remove duplicated include files
storage/maria/unittest/test_file.c:
  Added comment
storage/myisam/ha_myisam.cc:
  Move sql_plugin.h first as this includes my_global.h
storage/myisammrg/ha_myisammrg.cc:
  Use my_global.h first
storage/oqgraph/oqgraph_thunk.cc:
  Use my_config.h and my_global.h first
  One could not include my_global.h before oqgraph_thunk.h (don't know why)
storage/spider/ha_spider.cc:
  Use my_global.h first
storage/spider/hs_client/config.cpp:
  Use my_global.h first
storage/spider/hs_client/escape.cpp:
  Use my_global.h first
storage/spider/hs_client/fatal.cpp:
  Use my_global.h first
storage/spider/hs_client/hstcpcli.cpp:
  Use my_global.h first
storage/spider/hs_client/socket.cpp:
  Use my_global.h first
storage/spider/hs_client/string_util.cpp:
  Use my_global.h first
storage/spider/spd_conn.cc:
  Use my_global.h first
storage/spider/spd_copy_tables.cc:
  Use my_global.h first
storage/spider/spd_db_conn.cc:
  Use my_global.h first
storage/spider/spd_db_handlersocket.cc:
  Use my_global.h first
storage/spider/spd_db_mysql.cc:
  Use my_global.h first
storage/spider/spd_db_oracle.cc:
  Use my_global.h first
storage/spider/spd_direct_sql.cc:
  Use my_global.h first
storage/spider/spd_i_s.cc:
  Use my_global.h first
storage/spider/spd_malloc.cc:
  Use my_global.h first
storage/spider/spd_param.cc:
  Use my_global.h first
storage/spider/spd_ping_table.cc:
  Use my_global.h first
storage/spider/spd_sys_table.cc:
  Use my_global.h first
storage/spider/spd_table.cc:
  Use my_global.h first
storage/spider/spd_trx.cc:
  Use my_global.h first
storage/xtradb/handler/handler0alter.cc:
  Use my_global.h first
storage/xtradb/handler/i_s.cc:
  Use my_global.h first
2014-09-30 20:31:14 +03:00
Sergei Golubchik
4d4ce59d2b compilation fixes for WITH_ATOMIC_OPS=rwlocks 2014-09-08 12:59:57 +02:00
Kristian Nielsen
501c56ef1e MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
Merge the patches into MariaDB 10.0 main.

With this patch, parallel replication will now automatically retry a
transaction that fails due to deadlock or other temporary error, same as
single-threaded replication.

We catch deadlocks with InnoDB transactions due to enforced commit order. If
T1 must commit before T2 in parallel replication and T1 ends up waiting for T2
inside InnoDB, we kill T2 and retry it later to resolve the deadlock
automatically.
2014-07-11 12:06:47 +02:00
Kristian Nielsen
ba4e56d8d7 Fix small merge errors after rebase 2014-07-08 15:59:03 +02:00
Kristian Nielsen
98fc5b3af8 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes.

For this patch in 10.0, we do not introduce a new public storage engine API,
we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
API that can be used for all storage engines (MDEV-6429).

Eliminate the background thread that did deadlock kills asynchroneously.
Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
inside the deadlock detection code (when thd_report_wait_for() needs to kill a
later thread to resolve a deadlock).

(We preserve the part of the original patch that introduces dedicated mutex
and condition for the slave init thread, to remove the abuse of
LOCK_thread_count for start/stop synchronisation of the slave init thread).
2014-07-08 12:54:47 +02:00
Kristian Nielsen
9150a0c7cb MDEV-4937: sql_slave_skip_counter does not work with GTID
The sql_slave_skip_counter is important to be able to recover replication from
certain errors. Often, an appropriate solution is to set
sql_slave_skip_counter to skip over a problem event. But setting
sql_slave_skip_counter produced an error in GTID mode, with a suggestion to
instead set @@gtid_slave_pos to point past the problem event. This however is
not always possible; for example, in case of an INCIDENT event, that event
does not have any GTID to assign to @@gtid_slave_pos.

With this patch, sql_slave_skip_counter now works in GTID mode the same was as
in non-GTID mode. When set, that many initial events are skipped when the SQL
thread starts, plus as many extra events are needed to completely skip any
partially skipped event group. The GTID position is updated to point past the
skipped event(s).
2014-06-25 15:24:11 +02:00
Kristian Nielsen
86362129a2 MDEV-6120: When slave stops with error, error message should indicate the failing GTID
If replication breaks in GTID mode, it is not trivial to determine the GTID of
the failing event group. This is a problem, as such GTID is needed eg. to
explicitly set @@gtid_slave_pos to skip to after that event group, or to
compare errors on different servers, etc.

Fix by ensuring that relevant slave errors logged to the error log include the
GTID of the event group containing the problem event.
2014-06-25 15:17:03 +02:00
Kristian Nielsen
312219cc63 MDEV-6364: Migrate a slave from MySQL 5.6 to MariaDB 10 break replication
MySQL 5.6 implemented WL#344, which is about a MASTER_DELAY option to CHANGE
MASTER. But as part of this worklog, the format of the realy-log.info file was
changed. The new format is not understood by earlier versions, and nor by
MariaDB 10.0, so changing server to those versions would cause the slave to
abort with an error due to reading incorrect data out of relay-log.info.

Fix this by backporting from the WL#344 patch just the code that understands
the new relay-log.info format. We still write out the old format, and none of
the MASTER_DELAY feature is backported with this commit.
2014-06-24 14:43:08 +02:00
unknown
bd4153a8c2 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.

The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.

With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.

Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.

Also some misc. fixes found during development and testing:

 - Remove thd_rpl_is_parallel(), it is not used or needed.

 - Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
   worker thread is killed to resolve a deadlock with fixed commit
   ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
   can be ignored if the query otherwise completed successfully, and this
   could cause the deadlock kill to be lost, so that the deadlock was not
   correctly resolved.

 - Fix random test failure due to missing wait_for_binlog_checkpoint.inc.

 - Make sure that deadlock or other temporary errors during parallel
   replication are not printed to the the error log; there were some places
   around the replication code with extra error logging. These conditions can
   occur occasionally and are handled automatically without breaking
   replication, so they should not pollute the error log.

 - Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
   the end of a transaction, to be able to detect and resolve deadlocks due to
   commit ordering. But this value was also used as a flag to mark whether
   record_gtid() had been called, by being set to zero, losing the value. Now,
   introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
   valid for the entire duration of the transaction.

 - Fix one place where the code to handle ignored errors called reset_killed()
   unconditionally, even if no error was caught that should be ignored. This
   could cause loss of a deadlock kill signal, breaking deadlock detection and
   resolution.

 - Fix a couple of missing mysql_reset_thd_for_next_command(). This could
   cause a prior error condition to remain for the next event executed,
   causing assertions about errors already being set and possibly giving
   incorrect error handling for following event executions.

 - Fix code that cleared thd->rgi_slave in the parallel replication worker
   threads after each event execution; this caused the deadlock detection and
   handling code to not be able to correctly process the associated
   transactions as belonging to replication worker threads.

 - Remove useless error code in slave_background_kill_request().

 - Fix bug where wfc->wakeup_error was not cleared at
   wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
   error condition to wrongly propagate to a later wait_for_prior_commit(),
   causing spurious ER_PRIOR_COMMIT_FAILED errors.

 - Do not put the binlog background thread into the processlist. It causes
   too many result differences in mtr, but also it probably is not useful
   for users to pollute the process list with a system thread that does not
   really perform any user-visible tasks...
2014-06-10 10:13:15 +02:00
unknown
629b822913 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

In parallel replication, we run transactions from the master in parallel, but
force them to commit in the same order they did on the master. If we force T1
to commit before T2, but T2 holds eg. a row lock that is needed by T1, we get
a deadlock when T2 waits until T1 has committed.

Usually, we do not run T1 and T2 in parallel if there is a chance that they
can have conflicting locks like this, but there are certain edge cases where
it can occasionally happen (eg. MDEV-5914, MDEV-5941, MDEV-6020). The bug was
that this would cause replication to hang, eventually getting a lock timeout
and causing the slave to stop with error.

With this patch, InnoDB will report back to the upper layer whenever a
transactions T1 is about to do a lock wait on T2. If T1 and T2 are parallel
replication transactions, and T2 needs to commit later than T1, we can thus
detect the deadlock; we then kill T2, setting a flag that causes it to catch
the kill and convert it to a deadlock error; this error will then cause T2 to
roll back and release its locks (so that T1 can commit), and later T2 will be
re-tried and eventually also committed.

The kill happens asynchroneously in a slave background thread; this is
necessary, as the reporting from InnoDB about lock waits happen deep inside
the locking code, at a point where it is not possible to directly call
THD::awake() due to mutexes held.

Deadlock is assumed to be (very) rarely occuring, so this patch tries to
minimise the performance impact on the normal case where no deadlocks occur,
rather than optimise the handling of the occasional deadlock.

Also fix transaction retry due to deadlock when it happens after a transaction
already signalled to later transactions that it started to commit. In this
case we need to undo this signalling (and later redo it when we commit again
during retry), so following transactions will not start too early.

Also add a missing thd->send_kill_message() that got triggered during testing
(this corrects an incorrect fix for MySQL Bug#58933).
2014-06-03 10:31:11 +02:00
unknown
787c470cef MDEV-5262: Missing retry after temp error in parallel replication
Handle retry of event groups that span multiple relay log files.

 - If retry reaches the end of one relay log file, move on to the next.

 - Handle refcounting of relay log files, and avoid purging relay log
   files until all event groups have completed that might have needed
   them for transaction retry.
2014-05-15 15:52:08 +02:00
unknown
b0b60f2498 MDEV-5262: Missing retry after temp error in parallel replication
Start implementing that an event group can be re-tried in parallel replication
if it fails with a temporary error (like deadlock).

Patch is very incomplete, just some very basic retry works.

Stuff still missing (not complete list):

 - Handle moving to the next relay log file, if event group to be retried
   spans multiple relay log files.

 - Handle refcounting of relay log files, to ensure that we do not purge a
   relay log file and then later attempt to re-execute events out of it.

 - Handle description_event_for_exec - we need to save this somehow for the
   possible retry - and use the correct one in case it differs between relay
   logs.

 - Do another retry attempt in case the first retry also fails.

 - Limit the max number of retries.

 - Lots of testing will be needed for the various edge cases.
2014-05-08 14:20:18 +02:00
unknown
64923bb653 MDEV-6156: Parallel replication incorrectly caches charset between worker threads
The previous patch for this bug was unfortunately completely wrong.

The purpose of cached_charset is to remember which character set we
have installed currently in the THD, so that in the common case where
charset does not change between queries, we do not need to update it
in the THD. Thus, it is important that the cached_charset field is
tightly coupled to the THD for which it handles caching.

Thus the right place to put cached_charset seems to be in the THD.
This patch introduces a field THD:system_thread_info where such info
in general can be placed without further inflating the THD with unused
data for other threads (THD is already far too big as it is). It then
moves the cached_charset into this slot for the SQL driver thread and
for the parallel replication worker threads.

The THD::rpl_filter field is also moved inside system_thread_info, to
keep the size of THD unchanged. Moving further fields in to reduce the
size of THD is a separate task, filed as MDEV-6164.
2014-04-25 12:58:31 +02:00
unknown
010971a761 MDEV-6156: Parallel replication incorrectly caches charset between worker threads
Replication caches the character sets used in a query, to be able to quickly
reuse them for the next query in the common case of them not having changed.

In parallel replication, this caching needs to be per-worker-thread. The
code was not modified to handle this correctly, so the caching in one worker
could cause another worker to run a query using the wrong character set,
causing replication corruption.
2014-04-23 16:06:06 +02:00
unknown
a5418c5540 MDEV-5921: In parallel replication, an error is not correctly signalled to the next transaction
When a transaction fails in parallel replication, it should signal the error
to any following transactions doing wait_for_prior_commit() on it. But the
code for this was incorrect, and would not correctly remember a prior error
when sending the signal. This caused corruption when slave stopped due to an
error.

Fix by remembering the error code when we first get an error, and passing the
saved error code to wakeup_subsequent_commits().

Thanks to nanyi607rao who reported this bug on
maria-developers@lists.launchpad.net and analysed the root cause.
2014-03-21 10:11:28 +01:00
unknown
8b9b7ec395 MDEV-5804: If same GTID is received on multiple master connections in multi-source replication, the event is double-executed causing corruption or replication failure
Some fixes, mainly to make it work in non-parallel replication mode also
(--slave-parallel-threads=0).

Patch should be fairly complete now.
2014-03-12 00:14:49 +01:00
unknown
2c2478b822 MDEV-5804: If same GTID is received on multiple master connections in multi-source replication, the event is double-executed causing corruption or replication failure
Before, the arrival of same GTID twice in multi-source replication
would cause double-apply or in gtid strict mode an error.

Keep the behaviour, but add an option --gtid-ignore-duplicates which
allows to correctly handle duplicates, ignoring all but the first.
This relies on the user ensuring correct configuration so that
sequence numbers are strictly increasing within each replication
domain; then duplicates can be detected simply by comparing the
sequence numbers against what is already applied.

Only one master connection (but possibly multiple parallel worker
threads within that connection) is allowed to apply events within
one replication domain at a time; any other connection that
receives a GTID in the same domain either discards it (if it is
already applied) or waits for the other connection to not have
any events to apply.

Intermediate patch, as proof-of-concept for testing. The main limitation
is that currently it is only implemented for parallel replication,
@@slave_parallel_threads > 0.
2014-03-09 10:27:38 +01:00
unknown
5ec49e6452 Merge MDEV-5754, MDEV-5769, and MDEV-5764 into 10.0 2014-03-04 14:32:42 +01:00
unknown
641feed481 MDEV-5764: START SLAVE UNTIL does not work with parallel replication
With parallel replication, there can be any number of events queued on
in-memory lists in the worker threads.

For normal STOP SLAVE, we want to skip executing any remaining events on those
lists and stop as quickly as possible.

However, for START SLAVE UNTIL, when the UNTIL position is reached in the SQL
driver thread, we must _not_ stop until all already queued events for the
workers have been executed - otherwise we would stop too early, before the
actual UNTIL position had been completely reached.

The code did not handle UNTIL correctly, stopping too early due to not
executing the queued events to completion. Fix this, and also implement that
an explicit STOP SLAVE in the middle (when the SQL driver thread has reached
the UNTIL position but the workers have not) _will_ cause an immediate stop.
2014-03-03 12:13:55 +01:00
unknown
20959fa09c Merge MDEV-5657 (parallel replication) to 10.0 2014-02-26 16:38:42 +01:00
unknown
e90f68c0ba MDEV-5657: Parallel replication.
Clean up and improve the parallel implementation code, mainly related to
scheduling of work to threads and handling of stop and errors.

Fix a lot of bugs in various corner cases that could lead to crashes or
corruption.

Fix that a single replication domain could easily grab all worker threads and
stall all other domains; now a configuration variable
--slave-domain-parallel-threads allows to limit the number of
workers.

Allow next event group to start as soon as previous group begins the commit
phase (as opposed to when it ends it); this allows multiple event groups on
the slave to participate in group commit, even when no other opportunities for
parallelism are available.

Various fixes:

 - Fix some races in the rpl.rpl_parallel test case.

 - Fix an old incorrect assertion in Log_event iocache read.

 - Fix repeated malloc/free of wait_for_commit and rpl_group_info objects.

 - Simplify wait_for_commit wakeup logic.

 - Fix one case in queue_for_group_commit() where killing one thread would
   fail to correctly signal the error to the next, causing loss of the
   transaction after slave restart.

 - Fix leaking of pthreads (and their allocated stack) due to missing
   PTHREAD_CREATE_DETACHED attribute.

 - Fix how one batch of group-committed transactions wait for the previous
   batch before starting to execute themselves. The old code had a very
   complex scheduling where the first transaction was handled differently,
   with subtle bugs in corner cases. Now each event group is always scheduled
   for a new worker (in a round-robin fashion amongst available workers).
   Keep a count of how many transactions have started to commit, and wait for
   that counter to reach the appropriate value.

 - Fix slave stop to wait for all workers to actually complete processing;
   before, the wait was for update of last_committed_sub_id, which happens a
   bit earlier, and could leave worker threads potentially accessing bits of
   the replication state that is no longer valid after slave stop.

 - Fix a couple of places where the test suite would kill a thread waiting
   inside enter_cond() in connection with debug_sync; debug_sync + kill can
   crash in rare cases due to a race with mysys_var_current_mutex in this
   case.

 - Fix some corner cases where we had enter_cond() but no exit_cond().

 - Fix that we could get failure in wait_for_prior_commit() but forget to flag
   the error with my_error().

 - Fix slave stop (both for normal stop and stop due to error). Now, at stop
   we pick a specific safe point (in terms of event groups executed) and make
   sure that all event groups before that point are executed to completion,
   and that no event group after start executing; this ensures a safe place to
   restart replication, even for non-transactional stuff/DDL. In error stop,
   make sure that all prior event groups are allowed to execute to completion,
   and that any later event groups that have started are rolled back, if
   possible. The old code could leave eg. T1 and T3 committed but T2 not, or
   it could even leave half a transaction not rolled back in some random
   worker, which would cause big problems when that worker was later reused
   after slave restart.

 - Fix the accounting of amount of events queued for one worker. Before, the
   amount was reduced immediately as soon as the events were dequeued (which
   happens all at once); this allowed twice the amount of events to be queued
   in memory for each single worker, which is not what users would expect.

 - Fix that an error set during execution of one event was sometimes not
   cleared before executing the next, causing problems with the error
   reporting.

 - Fix incorrect handling of thd->killed in worker threads.
2014-02-26 15:02:09 +01:00
unknown
dd93ec5633 Merge MariaDB 10.0-base to 10.0. 2014-02-10 15:12:17 +01:00
unknown
fefdb576bb Merge of MDEV-4984, MDEV-4726, and MDEV-5636 into 10.0-base.
MDEV-4984: Implement MASTER_GTID_WAIT() and @@LAST_GTID.
    MDEV-4726: Race in mysql-test/suite/rpl/t/rpl_gtid_stop_start.test
    MDEV-5636: Deadlock in RESET MASTER
2014-02-10 12:39:26 +01:00
unknown
4e6606acad MDEV-4984: Implement MASTER_GTID_WAIT() and @@LAST_GTID.
MASTER_GTID_WAIT() is similar to MASTER_POS_WAIT(), but works with a
GTID position rather than an old-style filename/offset.

@@LAST_GTID gives the GTID assigned to the last transaction written
into the binlog.

Together, the two can be used by applications to obtain the GTID of
an update on the master, and then do a MASTER_GTID_WAIT() for that
position on any read slave where it is important to get results that
are caught up with the master at least to the point of the update.

The implementation of MASTER_GTID_WAIT() is implemented in a way
that tries to minimise the performance impact on the SQL threads,
even in the presense of many waiters on single GTID positions (as
from @@LAST_GTID).
2014-02-07 19:15:28 +01:00
Michael Widenius
10001c8e4f Automatic merge 2014-02-05 19:23:11 +02:00
Michael Widenius
5426facdcb Replication changes for CREATE OR REPLACE TABLE
- CREATE TABLE is by default executed on the slave as CREATE OR REPLACE
- DROP TABLE is by default executed on the slave as DROP TABLE IF NOT EXISTS

This means that a slave will by default continue even if we try to create
a table that existed on the slave (the table will be deleted and re-created) or
if we try to drop a table that didn't exist on the slave.
This should be safe as instead of having the slave stop because of an inconsistency between
master and slave, it will fix the inconsistency.
Those that would prefer to get a stopped slave instead for the above cases can set slave_ddl_exec_mode to STRICT. 

- Ensure that a CREATE OR REPLACE TABLE which dropped a table is replicated
- DROP TABLE that generated an error on master is handled as an identical DROP TABLE on the slave (IF NOT EXISTS is not added in this case)
- Added slave_ddl_exec_mode variable to decide how DDL's are replicated

New logic for handling BEGIN GTID ... COMMIT from the binary log:

- When we find a BEGIN GTID, we start a transaction and set OPTION_GTID_BEGIN
- When we find COMMIT, we reset OPTION_GTID_BEGIN and execute the normal COMMIT code.
- While OPTION_GTID_BEGIN is set:
  - We don't generate implict commits before or after statements
  - All tables are regarded as transactional tables in the binary log (to ensure things are executed exactly as on the master)
- We reset OPTION_GTID_BEGIN also on rollback

This will help ensuring that we don't get any sporadic commits (and thus new GTID's) on the slave and will help keep the GTID's between master and slave in sync.


mysql-test/extra/rpl_tests/rpl_log.test:
  Added testing of mode slave_ddl_exec_mode=STRICT
mysql-test/r/mysqld--help.result:
  New help messages
mysql-test/suite/rpl/r/create_or_replace_mix.result:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_row.result:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_statement.result:
  Testing replication of create or replace
mysql-test/suite/rpl/r/rpl_gtid_startpos.result:
  Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/r/rpl_row_log.result:
  Updated result
mysql-test/suite/rpl/r/rpl_row_log_innodb.result:
  Updated result
mysql-test/suite/rpl/r/rpl_row_show_relaylog_events.result:
  Updated result
mysql-test/suite/rpl/r/rpl_stm_log.result:
  Updated result
mysql-test/suite/rpl/r/rpl_stm_mix_show_relaylog_events.result:
  Updated result
mysql-test/suite/rpl/r/rpl_temp_table_mix_row.result:
  Updated result
mysql-test/suite/rpl/t/create_or_replace.inc:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.cnf:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.test:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.cnf:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.test:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.cnf:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.test:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/rpl_gtid_startpos.test:
  Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/t/rpl_stm_log.test:
  Removed some lines
mysql-test/suite/sys_vars/r/slave_ddl_exec_mode_basic.result:
  Testing of slave_ddl_exec_mode
mysql-test/suite/sys_vars/t/slave_ddl_exec_mode_basic.test:
  Testing of slave_ddl_exec_mode
sql/handler.cc:
  Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
  This is to ensure that statments are not commited too early if non transactional tables are used.
sql/log.cc:
  Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
  Also treat 'direct' log events as transactional (to get them logged as they where on the master)
sql/log_event.cc:
  Ensure that the new error from DROP TABLE when trying to drop a view is treated same as the old one.
  Store error code that slave expects in THD.
  Set OPTION_GTID_BEGIN if we find a BEGIN.
  Reset OPTION_GTID_BEGIN if we find a COMMIT.
sql/mysqld.cc:
  Added slave_ddl_exec_mode_options
sql/mysqld.h:
  Added slave_ddl_exec_mode_options
sql/rpl_gtid.cc:
  Reset OPTION_GTID_BEGIN if we record a gtid (safety)
sql/sql_class.cc:
  Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
sql/sql_class.h:
  Added to THD: log_current_statement and slave_expected_error
sql/sql_insert.cc:
  Ensure that CREATE OR REPLACE is logged if table was deleted.
  Don't do implicit commit for CREATE if we are under OPTION_GTID_BEGIN
sql/sql_parse.cc:
  Change CREATE TABLE -> CREATE OR REPLACE TABLE for slaves
  Change DROP TABLE -> DROP TABLE IF EXISTS for slaves
  CREATE TABLE doesn't force implicit commit in case of OPTION_GTID_BEGIN
  Don't do commits before or after any statement if OPTION_GTID_BEGIN was set.
sql/sql_priv.h:
  Added OPTION_GTID_BEGIN
sql/sql_show.cc:
  Enhanced store_create_info() to also be able to handle CREATE OR REPLACE
sql/sql_show.h:
  Updated prototype
sql/sql_table.cc:
  Ensure that CREATE OR REPLACE is logged if table was deleted.
sql/sys_vars.cc:
  Added slave_ddl_exec_mode
sql/transaction.cc:
  Added warning if we got a GTID under OPTION_GTID_BEGIN
2014-02-05 19:01:59 +02:00