Commit graph

2449 commits

Author SHA1 Message Date
Monty
4bad74e139 Added error checking for all calls to flush_relay_log_info() and stmt_done() 2017-02-28 16:10:47 +01:00
Monty
c5e25c8b40 Added a separate lock for start/stop/reset slave.
This solves some possible dead locks when one calls stop slave while slave
is starting.
2017-02-28 16:10:46 +01:00
Monty
e65f667bb6 MDEV-9573 'Stop slave' hangs on replication slave
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation while some slave operations will also need LOCK_active_mi
which causes deadlocks.

Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()

Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error condtions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
  of stop_slave() to not have to use LOCK_active_mi inside
  terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
  have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
  replication. Problem was that waiting for pause_for_ftwrl was not done
  when deleting rpt->current_owner after a force_abort.
2017-02-28 16:10:46 +01:00
Sujatha Sivakumar
e619295e1b Bug#24901077: RESET SLAVE ALL DOES NOT ALWAYS RESET SLAVE
Description:
============
If you have a relay log index file that has ended up with
some relay log files that do not exists, then RESET SLAVE
ALL is not enough to get back to a clean state.

Analysis:
=========
In the bug scenario slave server is in stopped state and
some of the relay logs got deleted but the relay log index
file is not updated.

During slave server restart replication initialization fails
as some of the required relay logs are missing. User
executes RESET SLAVE/RESET SLAVE ALL command to start a
clean slave. As per the documentation RESET SLAVE command
clears the master info and relay log info repositories,
deletes all the relay log files, and starts a new relay log
file. But in a scenario where the slave server's
Relay_log_info object is not initialized slave will not
purge the existing relay logs. Hence the index file still
remains in a bad state. Users will not be able to start
the slave unless these files are cleared.

Fix:
===
RESET SLAVE/RESET SLAVE ALL commands should do the cleanup
even in a scenario where Relay_log_info object
initialization failed.

Backported a flag named 'error_on_rli_init_info' which is
required to identify slave's Relay_log_info object
initialization failure. This flag exists in MySQL-5.6
onwards as part of BUG#14021292 fix.

During RESET SLAVE/RESET SLAVE ALL execution this flag
indicates the Relay_log_info initialization failure.
In such a case open the relay log index/relay log files
and do the required clean up.
2017-02-28 10:00:51 +05:30
Nirbhay Choubey
ee8b5c305a Merge tag 'mariadb-10.0.29' into 10.0-galera 2017-01-13 13:53:59 -05:00
Marko Mäkelä
5044dae239 Merge 10.0 into 10.1 2017-01-10 14:30:11 +02:00
Kristian Nielsen
43378f367c MDEV-10271: Stopped SQL slave thread doesn't print a message to error log like IO thread does
Make the slave SQL thread always output to the error log the message "Slave
SQL thread exiting, replication stopped in ..." whenever it previously
outputted "Slave SQL thread initialized, starting replication ...".

Before this patch, it was somewhat inconsistent in which cases the message
would be output and in which not, depending on the exact time and cause of
the condition that caused the SQL thread to stop.
2017-01-06 10:46:20 +01:00
Sergei Golubchik
4a5d25c338 Merge branch '10.1' into 10.2 2016-12-29 13:23:18 +01:00
Sergei Golubchik
2f20d297f8 Merge branch '10.0' into 10.1 2016-12-11 09:53:42 +01:00
kevg
780db8e252 fix build and some warnings 2016-11-24 17:36:02 +03:00
Kristian Nielsen
390f2a013b Fix incorrect reading of events from relaylog in parallel replication.
The SQL thread keeps track of the position in the current relay log from
which to read the next event. This position is not normally used, but a
certain interaction with the IO thread can cause the SQL thread to re-open
the relay log and seek to the stored position.

In parallel replication, there were a couple of places where the position
was not updated. This created a race where a re-open of the relay log could
seek to the wrong position and start re-reading and processing events
already handled once, causing various kinds of problems.

Fix this by moving the position update into a single place in
apply_event_and_update_pos(), which should ensure that the position is
always updated in the parallel replication case.

This problem was found from the testcase of MDEV-10863, but it is logically
a separate problem.
2016-11-16 11:00:38 +01:00
Kristian Nielsen
f1fcc1fc10 Back-port Master_info::using_parallel() to 10.0.
This has no functional changes, but it helps avoid merge problems from 10.0
to 10.1. In 10.0, code that checks for parallel replication uses
opt_slave_parallel_threads > 0, but this check needs to be
mi->using_parallel() in 10.1. By using the same check in 10.0 (with
unchanged semantics), merge problems to 10.1 are avoided.
2016-11-15 23:00:11 +01:00
Kristian Nielsen
bccd0b5e0e Merge branch 'mdev10863' into 10.1 2016-11-15 13:10:21 +01:00
Kristian Nielsen
717f212840 MDEV-10863: parallel replication tries to continue from wrong position
This occured when the SQL thread (but not the IO thread) stops while
GTID and parallel replication are used with multiple domain ids in the
GTID position, and is restarted.

In this case, the SQL needs to start some way back in the relay log,
applying or skipping events within each replication domain as
appropriate.

The SQL threads starts at the beginning of an old relay log file, and
this position may be in the middle of an event group. The bug was that
such partial event group could be re-applied, causing replication
corruption.

This patch fixes the issue, by making sure to skip any initial events
that were part of an earlier (already applied) event group.
2016-11-04 12:33:42 +01:00
Kristian Nielsen
b002509b67 MDEV-11065: Compressed binary log. Merge code into current 10.2.
Conflicts:
	sql/share/errmsg-utf8.txt
2016-11-03 14:48:51 +01:00
Sergei Golubchik
a98c85bb50 Merge branch '10.0-galera' into 10.1 2016-11-02 13:44:07 +01:00
vinchen
0e380c3bfe two fix:
1.Avoid overflowing buffers in case of corrupt events
2.Check the compressed algorithm.
2016-10-29 21:59:20 +08:00
Nirbhay Choubey
5db2195a35 Merge tag 'mariadb-10.0.28' into 10.0-galera 2016-10-28 15:50:13 -04:00
Sergei Golubchik
22490a0d70 MDEV-8345 STOP SLAVE should not cause an ERROR to be logged to the error log
cherry-pick from 5.7:
  commit 6b24763
  Author: Manish Kumar <manish.4.kumar@oracle.com>
  Date:   Tue Mar 27 13:10:42 2012 +0530

  BUG#12977988 - ON STOP SLAVE: ERROR READING PACKET FROM SERVER: LOST CONNECTION
                 TO MYSQL SERVER
  BUG#11761457 - ERROR 2013 + "ERROR READING RELAY LOG EVENT" ON STOP SLAVEBUG#12977988 - ON STOP SLAVE: ERROR READING PACKET FROM SERVER: LOST CONNECTION
               TO MYSQL SERVER
2016-10-26 18:44:34 +02:00
vinchen
07f09df92b fix the ABI and stop slave hang problem 2016-10-21 13:37:48 +02:00
Kristian Nielsen
c06bc66816 MDEV-11065: Compressed binary log
Minor review comments/changes:

 - A bunch of style-fixes.

 - Change macros to static inline functions.

 - Update check_event_type() with compressed event types.

 - Small .result file update.
2016-10-20 18:00:59 +02:00
vinchen
d4b2c9bb1a optimize the memory allocation for compressed binlog event 2016-10-19 20:20:47 +02:00
vinchen
640051e06a Binlog compressed
Add some event types for the compressed event, there are:
     QUERY_COMPRESSED_EVENT,
     WRITE_ROWS_COMPRESSED_EVENT_V1,
     UPDATE_ROWS_COMPRESSED_EVENT_V1,
     DELETE_POWS_COMPRESSED_EVENT_V1,
     WRITE_ROWS_COMPRESSED_EVENT,
     UPDATE_ROWS_COMPRESSED_EVENT,
     DELETE_POWS_COMPRESSED_EVENT.
These events inheritance the uncompressed editor events. One of their constructor functions and write
function have been overridden for uncompressing and compressing. Anything but this is totally the same.

On slave, The IO thread will uncompress and convert them When it receiving the events from the master.
So the SQL and worker threads can be stay unchanged.

Now we use zlib as compress algorithm. It maybe support other algorithm in the future.
2016-10-19 20:20:35 +02:00
vinchen
0fa39ffba7 fix code style.. 2016-10-19 13:52:17 +02:00
vinchen
c334f4fe46 fix the code style for read_binlog_speed_limit 2016-10-19 13:52:17 +02:00
vinchen
43789901c7 Control the binlog read speed for compressed protocol 2016-10-19 13:51:08 +02:00
vinchen
8eb0f5ca1a Control the Maximum speed(KB/s) to read binlog from master 2016-10-19 13:51:08 +02:00
Kristian Nielsen
e1ef99c3dc MDEV-7145: Delayed replication
Merge feature into 10.2 from feature branch.

Delayed replication adds an option

  CHANGE MASTER TO master_delay=<seconds>

Replication will then delay applying events with that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.

Feature is ported from MySQL source tree.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-16 23:44:44 +02:00
Kristian Nielsen
3011060b2a MDEV-7145: Delayed slave.
Extend to work also for parallel replication.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
814880711f BUG#56442: Slave executes delayed statements when STOP SLAVE is issued
Problem:
When using the delayed slave feature, and the SQL thread is delaying,
and the user issues STOP SLAVE, the event we wait for was executed.
It should not be executed.
Fix:
Check the return value from the delay function,
slave.cc:slave_sleep(). If the return value is 1, it means the thread
has been stopped, in this case we don't execute the statement.

Also, refactored the test case for delayed slave a little: added the
test script include/rpl_assert.inc, which asserts that a condition holds
and prints a message if not. Made rpl_delayed_slave.test use this. The
advantage is that the test file is much easier to read and maintain,
because it is clear what is an assertion and what is not, and also the
expected result can be found in the test file, you don't have to compare
it to the result file.

Manually merged into MariaDB from MySQL commit
fd2b210383358fe7697f201e19ac9779879ba72a

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
b2bc6dadee MDEV-7145: Delayed replication, cleanup some code
The original MySQL patch left some refactoring todo's, possibly
because of known conflicts with other parallel development (like
info-repository feature perhaps).

This patch fixes those todos/refactorings.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
a9fb480fd6 MDEV-7145: Delayed replication, fixing test failures.
Two merge error fixed, and testsuite updated to removed some other
test failues.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:58 +02:00
Kristian Nielsen
19abe79fd1 MDEV-7145: Delayed replication, intermediate commit.
Initial merge of delayed replication from MySQL git.

The code from the initial push into MySQL is merged, and the
associated test case passes. A number of tasks are still pending:

1. Check full test suite run for any regressions or .result file updates.

2. Extend the feature to also work for parallel replication.

3. There are some todo-comments about future refactoring left from
MySQL, these should be located and merged on top.

4. There are some later related MySQL commits, these should be checked
and merged. These include:
    e134b9362ba0b750d6ac1b444780019622d14aa5
    b38f0f7857c073edfcc0a64675b7f7ede04be00f
    fd2b210383358fe7697f201e19ac9779879ba72a
    afc397376ec50e96b2918ee64e48baf4dda0d37d

5. The testcase from MySQL relies heavily on sleep and timing for
testing, and seems likely to sporadically fail on heavily loaded test
servers in buildbot or distro build farms.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:58 +02:00
Kristian Nielsen
50f19ca809 Remove unnecessary global mutex in parallel replication.
The function apply_event_and_update_pos() is called with the
rli->data_lock mutex held. However, there seems to be nothing in the
function actually needing the mutex to be held. Certainly not in the
parallel replication case, where sql_slave_skip_counter is always 0
since the non-zero case is handled by the SQL driver thread.

So this patch makes parallel replication use a variant of
apply_event_and_update_pos() without the need to take the
rli->data_lock mutex. This avoids one contended global mutex for each
event executed, which might improve performance on CPU-bound workloads
somewhat.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 22:44:40 +02:00
Sergei Golubchik
ec59220f2c post-merge fixes for ec47bea 2016-09-12 13:54:44 +02:00
Kristian Nielsen
ec47beaba6 Merge parallel replication async deadlock kill into 10.2.
Conflicts:
	sql/mysqld.cc
	sql/slave.cc
2016-09-09 12:15:53 +02:00
Sergei Golubchik
06b7fce9f2 Merge branch '10.1' into 10.2 2016-09-09 08:33:08 +02:00
Kristian Nielsen
7e0c9de864 Parallel replication async deadlock kill
When a deadlock kill is detected inside the storage engine, the kill
is not done immediately, to avoid calling back into the storage engine
kill_query method with various lock subsystem mutexes held. Instead the
kill is queued and done later by a slave background thread.

This patch in preparation for fixing TokuDB optimistic parallel
replication, as well as for removing locking hacks in InnoDB/XtraDB in
10.2.

Signed-off-by: Kristian Nielsen <knielsen at knielsen-hq.org>
2016-09-08 15:25:40 +02:00
Monty
96e95b5465 Better SHOW PROCESSLIST for replication
- When waiting for events, start time is now counted from start of wait
- Instead of having "Connect" as "Command" for all replication threads we
  now have:
  - Slave_IO for Slave thread reading relay log
  - Slave_SQL for slave executing SQL commands or distribution queries to
    Slave workers
  - Slave_worker for slave threads executin SQL commands in parallel replication
2016-08-29 13:10:17 +03:00
Sergei Golubchik
6b1863b830 Merge branch '10.0' into 10.1 2016-08-25 12:40:09 +02:00
Nirbhay Choubey
c309e99ff9 Merge branch '10.0' into 10.0-galera 2016-08-24 19:30:32 -04:00
Vicențiu Ciorbaru
4eb898bb16 MDEV-10563 Crash during shutdown in Master_info_index::any_slave_sql_running
In well defined C code, the "this" pointer is never NULL. Currently, we
were potentially dereferencing a NULL pointer (master_info_index). GCC v6
removes any "if (!this)" conditions as it assumes this is always a
non-null pointer. In order to prevent undefined behaviour, check the
pointer before dereferencing and remove the check within member
functions.
2016-08-23 21:24:36 +03:00
Monty
8d5a0d650b Cleanups and minor fixes
- Fixed typos
- Added --core-on-failure to mysql-test-run
- More DBUG_PRINT in viosocket.c
- Don't forget CLIENT_REMEMBER_OPTIONS for compressed slave protocol
- Removed not used stage variables
2016-08-21 20:14:13 +03:00
Vladislav Vaintroub
31a8cf54c8 Revert "MDEV-9293 Connector/C integration"
This reverts commit 7b89b9f510.
2016-08-19 15:46:27 +00:00
Vladislav Vaintroub
7b89b9f510 MDEV-9293 Connector/C integration 2016-08-19 15:27:37 +00:00
Oleksandr Byelkin
66ac894c40 MDEV-10455: libmariadbclient18 + MySQL-python leaks memory on failed connections
Support of CLIENT_REMEMBER_OPTIONS and freeing options added.
2016-08-11 17:50:21 +02:00
Kristian Nielsen
fb076581f6 MDEV-10271: Stopped SQL slave thread doesn't print a message to error log like IO thread does
Make the slave SQL thread always output to the error log the message "Slave
SQL thread exiting, replication stopped in ..." whenever it previously
outputted "Slave SQL thread initialized, starting replication ...".

Before this patch, it was somewhat inconsistent in which cases the message
would be output and in which not, depending on the exact time and cause of
the condition that caused the SQL thread to stop.
2016-07-25 13:07:50 +02:00
Sergei Golubchik
932646b1ff Merge branch '10.1' into 10.2 2016-06-30 16:38:05 +02:00
Alexander Barkov
3f32bf627f More tests for "MDEV-7563 Support CHECK constraint".
Testing non-ASCII string literals.
2016-06-30 11:43:02 +02:00
Sergei Golubchik
62e0a4552f Merge branch '10.0-galera' into 10.1 2016-06-28 22:06:22 +02:00
Sergei Golubchik
3361aee591 Merge branch '10.0' into 10.1 2016-06-28 22:01:55 +02:00
Nirbhay Choubey
14d62505d9 Merge tag 'mariadb-10.0.26' into 10.0-galera 2016-06-24 12:01:22 -04:00
Nirbhay Choubey
ecdb2b6e86 Merge tag 'mariadb-5.5.50' into 5.5-galera 2016-06-23 12:54:38 -04:00
Sergei Golubchik
a10fd659aa Fixed for failures in buildbot: Replication
1. remove unnecessary rpl-tokudb combination file.
2. fix rpl_ignore_table to cleanup properly (not leave test
   grants in memory)
3. check_temp_dir() is supposed to set the error in stmt_da - do
   it even when called multiple times, this fixes a crash when
   rpl.rpl_slave_load_tmpdir_not_exist is run twice.
2016-06-22 10:40:43 +02:00
Sergei Golubchik
c081c978a2 Merge branch '5.5' into bb-10.0 2016-06-21 14:11:02 +02:00
Sergei Golubchik
ae29ea2d86 Merge branch 'mysql/5.5' into 5.5 2016-06-14 13:55:28 +02:00
Nirbhay Choubey
868c2ceb01 MDEV-9083: Slave IO thread does not handle autoreconnect to restarting Galera Cluster node
Chery-picked commits from codership/mysql-wsrep.

MW-284: Slave I/O retry on ER_COM_UNKNOWN_ERROR

Slave would treat ER_COM_UNKNOWN_ERROR as fatal error and stop.
The fix here is to treat it as a network error and rely on the
built-in mechanism to retry.

MW-284: Add an MTR test
2016-06-12 19:28:56 -04:00
Nirbhay Choubey
7305be2f7e MDEV-5535: Cannot reopen temporary table
mysqld maintains a list of TABLE objects for all temporary
tables created within a session in THD. Here each table is
represented by a TABLE object.

A query referencing a particular temporary table for more
than once, however, failed with ER_CANT_REOPEN_TABLE error
because a TABLE_SHARE was allocate together with the TABLE,
so temporary tables always had only one TABLE per TABLE_SHARE.

This patch lift this restriction by separating TABLE and
TABLE_SHARE objects and storing TABLE_SHAREs for temporary
tables in a list in THD, and TABLEs in a list within their
respective TABLE_SHAREs.
2016-06-10 18:39:43 -04:00
Monty
89685d55d7 Reuse THD for new user connections
- To ensure that mallocs are marked for the correct THD, even if it's
  allocated in another thread, I added the thread_id to the THD constructor
- Added st_my_thread_var to thr_lock_info_init() to avoid a call to my_thread_var
- Moved things from THD::THD() to THD::init()
- Moved some things to THD::cleanup()
- Added THD::free_connection() and THD::reset_for_reuse()
- Added THD to CONNECT::create_thd()
- Added THD::thread_dbug_id and st_my_thread_var->dbug_id. These are needed
  to ensure that we have a constant thread_id used for debugging with a THD,
  even if it changes thread_id (=connection_id)
- Set variables.pseudo_thread_id in constructor. Removed not needed sets.
2016-06-04 09:06:00 +02:00
Sujatha Sivakumar
ef3f09f0c9 Bug#23251517: SEMISYNC REPLICATION HANGING
Revert following bug fix:

Bug#20685029: SLAVE IO THREAD SHOULD STOP WHEN DISK IS
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE

This fix results in a deadlock between slave IO thread
and SQL thread.

(cherry picked from commit e3fea6c6dbb36c6ab21c4ab777224560e9608b53)
2016-05-16 11:34:20 +02:00
Sujatha Sivakumar
df7ecf64f5 Bug#23251517: SEMISYNC REPLICATION HANGING
Revert following bug fix:

Bug#20685029: SLAVE IO THREAD SHOULD STOP WHEN DISK IS
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE

This fix results in a deadlock between slave IO thread
and SQL thread.
2016-05-13 16:42:45 +05:30
Nirbhay Choubey
8a1efa1bdd Merge branch '10.0' into 10.0-galera 2016-04-29 16:50:58 -04:00
Monty
9c846373f0 Merge commit 'd5822a3ad0657040114cdc185c6387b9eb3a12b2' into 10.2 2016-04-28 16:59:33 +03:00
Monty
732adec0a4 Removed some not needed when doing delete thd, which caused warnings about
wrong mutex usage from safe_mutex.
Ensure that LOCK_status is always taken before LOCK_thread_count
2016-04-28 13:39:55 +03:00
Sergei Golubchik
f67a2211ec Merge branch '10.1' into 10.2 2016-03-23 22:36:46 +01:00
Sergei Golubchik
3b0c7ac1f9 Merge branch '10.0' into 10.1 2016-03-21 13:02:53 +01:00
Kristian Nielsen
f8251911a4 MDEV-9595: Shutdown takes forever with many replication channels
There was a race between end_slave() and cleanup code at the end of
handle_slave_sql(). This could cause access to master_info_index and
global_rpl_thread_pool after they had been freed.

Fix by skipping that cleanup if server shutdown is in progress, as is done
in other parts of the code as well (the cleanup, which stops worker threads
that are not needed anymore, is redundant anyway when the server is shutting
down).
2016-03-03 08:53:42 +01:00
Sujatha Sivakumar
8361151765 Bug#20685029: SLAVE IO THREAD SHOULD STOP WHEN DISK IS
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE

Problem:
========
Currently SHOW SLAVE STATUS blocks if IO thread waits for
disk space. This makes automation tools verifying
server health block on taking relevant action. Finally this
will create SHOW SLAVE STATUS piles.

Analysis:
=========
SHOW SLAVE STATUS hangs on mi->data_lock if relay log write
is waiting for free disk space while holding mi->data_lock.
mi->data_lock is needed to protect the format description
event (mi->format_description_event) which is accessed by
the clients running FLUSH LOGS and slave IO thread. Note
relay log writes don't need to be protected by
mi->data_lock, LOCK_log is used to protect relay log between
IO and SQL thread (see MYSQL_BIN_LOG::append_event). The
code takes mi->data_lock to protect
mi->format_description_event during relay log rotate which
might get triggered right after relay log write.

Fix:
====
Release the data_lock just for the duration of writing into
relay log.

Made change to ensure the following lock order is maintained
to avoid deadlocks.

data_lock, LOCK_log

data_lock is held during relay log rotations to protect
the description event.
2016-03-01 12:29:51 +05:30
Nirbhay Choubey
0d58323e26 Merge tag 'mariadb-10.0.24' into 10.0-galera 2016-02-23 20:53:29 -05:00
Monty
3d4a7390c1 MDEV-6150 Speed up connection speed by moving creation of THD to new thread
Creating a CONNECT object on client connect and pass this to the working thread which creates the THD.
Split LOCK_thread_count to different mutexes
Added LOCK_thread_start to syncronize threads
Moved most usage of LOCK_thread_count to dedicated functions
Use next_thread_id() instead of thread_id++

Other things:
- Thread id now starts from 1 instead of 2
- Added cast for thread_id as thread id is now of type my_thread_id
- Made THD->host const (To ensure it's not changed)
- Removed some DBUG_PRINT() about entering/exiting mutex as these was already logged by mutex code
- Fixed that aborted_connects and connection_errors_internal are counted in all cases
- Don't take locks for current_linfo when we set it (not needed as it was 0 before)
2016-02-07 10:34:03 +02:00
Alexey Botchkov
75a1d866dd MDEV-5273 Prepared statement doesn't return metadata after prepare.
SHOW SLAVE STATUS fixed.
2016-01-28 11:12:03 +04:00
Sergei Golubchik
f4faac4d6a Merge branch '10.0' into 10.1 2016-01-25 22:58:57 +01:00
Kristian Nielsen
2f88b14acd Merge branch 'tmp' into tmp-10.1
Conflicts:
	sql/slave.cc
2016-01-15 13:01:19 +01:00
Kristian Nielsen
74b1af19e9 Merge branch 'tmp' into tmp-10.0
Conflicts:
	sql/slave.cc
2016-01-15 12:50:23 +01:00
Kristian Nielsen
06b2e327fc Fix error handling for GTID and domain-based parallel replication
This occurs when replication stops with an error, domain-based parallel
replication is used, and the GTID position contains more than one domain.
Furthermore, it relates to the case where the SQL thread is restarted
without first stopping the IO thread.

In this case, the file/offset relay-log position does not correctly
represent the slave's multi-dimensional position, because other domains may
be far ahead of, or behind, the domain with the failing event. So the code
reverts the relay log position back to the start of a relay log file that is
known to be before all active domains.

There was a bug that when the SQL thread was restarted, the
rli->relay_log_state was incorrectly initialised from @@gtid_slave_pos. This
position will likely be too far ahead, due to reverting the relay log
position. Thus, if the replication fails again after the SQL thread restart,
the rli->restart_gtid_pos might be updated incorrectly. This in turn would
cause a second SQL thread restart to replicate from the wrong position, if
the IO thread was still left running.

The fix is to initialise rli->relay_log_state from @@gtid_slave_pos only
when we actually purge and re-fetch relay logs from the master, not at every
SQL thread start.

A related problem is the use of sql_slave_skip_counter to resolve
replication failures in this kind of scenario. Since the slave position is
multi-dimensional, sql_slave_skip_counter can not work properly - it is
indeterminate exactly which event is to be skipped, and is unlikely to work
as expected for the user. So make this an error in the case where
domain-based parallel replication is used with multiple domains, suggesting
instead the user to set @@gtid_slave_pos to reliably skip the desired event.
2016-01-15 12:48:14 +01:00
Monty
8fcc0bfefa Fixed bug in semi_sync replication tests.
The problem was that wait_for_slave_io_to_start reported that the io thread
was ready, when it was still initializing. This caused test suite to
continue too early, for example before the semi sync plugin was properly
enabled.

Fixed by introducing a new internal stage: "Preparing". Slave_IO_Running is
now set to "Yes" only when all initializing is done and the IO thread is
ready to read things from the master.

The only test affected by this change is rpl_flsh_tbls, which got stuck in
the preparing phase while trying to read the GTID position from a table.
Fixed by having this test waiting for Preparing instead of Yes.
2016-01-03 13:27:59 +02:00
Monty
661a6d8906 Cleanup of slave code:
- Added testing if connection is killed to shortcut reading of connection data
  This will allow us later in 10.2 to do a cleaner shutdown of slaves (less errors in the log)
- Add new status variables: Slaves_connected, Slaves_running and Slave_connections.
- Use MYSQL_SLAVE_NOT_RUN instead of 0 with slave_running.
- Don't print obvious extra warnings to the error log when slave is shut down normally.
2016-01-03 13:20:07 +02:00
Sergei Golubchik
a2bcee626d Merge branch '10.0' into 10.1 2015-12-21 21:24:22 +01:00
Nirbhay Choubey
dad555a09c Merge tag 'mariadb-10.0.23' into 10.0-galera 2015-12-19 14:24:38 -05:00
Monty
c3018b0ff4 Fixes to get all test to run on MacosX Lion 10.7
This includes fixing all utilities to not have any memory leaks,
as safemalloc warnings stopped tests from passing on MacOSX.

- Ensure that all clients takes character-set-dir, as the
  libmysqlclient library will use it.
- mysql-test-run now passes character-set-dir to all external clients.
- Changed dynstr_free() so that it can be called twice (made freeing code easier)
- Changed rpl_global_gtid_slave_state to be allocated dynamicly as it
  includes a mutex that needs to be initizlied/destroyed before my_end() is called.
- Removed rpl_slave_state::init() and rpl_slave_stage::deinit() as
  their job are better handling by constructor and delete.
- Print alias instead of table_name in check_duplicate_key as
  table_name may have been converted to lower case.

Other things:
- Fixed a case in time_to_datetime_with_warn() where we where
  using && instead of & in tests
2015-11-29 17:51:23 +02:00
Sergei Golubchik
7f19330c59 Merge branch 'github/10.0-galera' into 10.1 2015-11-19 17:48:36 +01:00
Nirbhay Choubey
f47124c9ef Incorrect statements binlogged on slave with do_domain_ids=(...)
In domain ID based filtering, a flag is used to filter-out
the events that belong to a particular domain. This flag gets
set when IO thread receives a GTID_EVENT for the domain on
filter list and its reset at the last event in the GTID group.
The resetting, however, was wrongly done before the decision to
write/filter the event from relay log is made. As a result, the
last event in the group will always pass through the filter.
Fixed by deferring the reset logic. Also added a test case.
2015-11-18 02:11:20 -05:00
Kristian Nielsen
8f2e05f41c Merge branch 'mdev7818-4' into 10.1
Conflicts:
	mysql-test/suite/perfschema/r/stage_mdl_global.result
	sql/rpl_rli.cc
	sql/sql_parse.cc
2015-11-13 14:24:40 +01:00
Kristian Nielsen
6bf88cdd9d Merge branch 'mdev7818-4' into bb-10.0-knielsen 2015-11-13 14:08:38 +01:00
Kristian Nielsen
75dc267101 Change Seconds_behind_master to be updated only at commit in parallel replication
Before, the Seconds_behind_master was updated already when an event
was queued for a worker thread to execute later. This might lead users
to interpret a low value as the slave being almost up to date with the
master, while in reality there might still be lots and lots of events
still queued up waiting to be applied by the slave.

See https://lists.launchpad.net/maria-developers/msg08958.html for
more detailed discussions.
2015-11-13 10:24:53 +01:00
Monty
e8c1b35f18 MDEV-8476 Race condition in slave SQL thread shutdown
Patch backported from MariaDB 10.1

- Ensure that we wait with cleanup() until slave thread has stopped.
- Added signal_thd_deleted() to signal close_connections() that all THD's has been freed.

Other things
- Removed not needed calls to THD_CHECK_SENTRY() when we are calling 'delete thd'.
2015-11-12 14:51:01 +02:00
Nirbhay Choubey
4d15112962 Merge tag 'mariadb-10.0.22' into 10.0-galera 2015-10-31 18:07:02 -04:00
Sergei Golubchik
dfb74dea30 Merge branch '10.0' into 10.1 2015-10-12 00:37:58 +02:00
Monty
a69a6ddac8 MDEV-4487 Allow replication from MySQL 5.6+ when GTID is enabled on the master
MDEV-8685 MariaDB fails to decode Anonymous_GTID entries
MDEV-5705 Replication testing: 5.6->10.0

- Ignoring GTID events from MySQL 5.6+ (Allows replication from MySQL 5.6+ with GTID enabled)
- Added ignorable events from MySQL 5.6
- mysqlbinlog now writes information about GTID and ignorable events.
- Added more information in error message when replication stops because of wrong information in binary log.
- Fixed wrong test when write_on_release() should flush cache.
2015-10-08 10:45:09 +03:00
Nirbhay Choubey
db66d2f92d refs codership/mysql-wsrep#188
- setting error code for slave, if mysql slave node dropped from cluster
2015-09-10 00:20:49 -04:00
Nirbhay Choubey
2012a810ab refs codership/mysql-wsrep#181
- Galera related errors in mysql slave applying will now cause slave to
abort
2015-09-10 00:14:24 -04:00
Sergei Golubchik
b85a00161e MDEV-8264 encryption for binlog
* Start_encryption_log_event
* --encrypt-binlog command line option

based on google patches.
2015-09-04 10:33:55 +02:00
Sergei Golubchik
41d68cabee cleanup: Log_event::write() and MYSQL_BIN_LOG::write_cache()
Introduce Log_event_writer() that encapsulates
writing data to an IO_CACHE with automatic checksum calculation.

Now all events properly checksum themselves as needed.

Use Log_event_writer in MYSQL_BIN_LOG::write_cache() instead
of copy-pasting its logic all over.

Later Log_event_writer will also do encryption.
2015-09-04 10:33:55 +02:00
Sergei Golubchik
c862c15bba cleanup: [partial] removal of llstr()
now when my_vsnprintf() supports %llu for a few years already.
2015-09-04 10:33:54 +02:00
Sergei Golubchik
fff6f4278b Revert f1abd015, make a smaller fix
commit f1abd015dc
Author: Andrei Elkin <aelkin@mysql.com>
Date:   Thu Nov 12 17:10:19 2009 +0200

    Bug #47210   first execution of "start slave until" stops too early
2015-09-04 10:33:54 +02:00
Sergei Golubchik
1720fcdcbc cleanup DBUG, DBUG_DUMP_EVENT_BUF
introduce DBUG_DUMP_EVENT_BUF,
remove few unused DBUG_EXECUTE_IF's
simplify few DBUG_PRINT's
remove few redundant #ifndef DBUG_OFF's
2015-09-04 10:33:53 +02:00
Sergei Golubchik
2d2286faf3 cleanup: use enum_binlog_checksum_alg, not uint8
* fix unireg.h includes
* use enum_binlog_checksum_alg for binlog checksum variables,
  not uint8
2015-09-04 10:33:52 +02:00
Sergei Golubchik
530a6e7481 Merge branch '10.0' into 10.1
referenced_by_foreign_key2(), needed for InnoDB to compile,
was taken from 10.0-galera
2015-09-03 12:58:41 +02:00
Monty
4f0255cbf9 Fixed errors and bugs found by valgrind:
- If run with valgrind, mysqltest will now wait longer when syncronizing slave with master
- Ensure that we wait with cleanup() until slave thread has stopped.
- Added signal_thd_deleted() to signal close_connections() that all THD's has been freed.
- Check in handle_fatal_signal() that we don't use variables that has been freed.
- Increased some timeouts when run with --valgrind

Other things:
- Fixed wrong test in one_thread_per_connection_end() if galera is used.
- Removed not needed calls to THD_CHECK_SENTRY() when we are calling 'delete thd'.
2015-09-01 18:42:02 +03:00
Monty
56aa19989f MDEV-6152: Remove calls to current_thd while creating Item
Part 5: Removing calls to current_thd in net_read calls, creating fields,
        query_cache, acl and some other places where thd was available
2015-09-01 18:42:02 +03:00