Commit graph

2664 commits

Author SHA1 Message Date
Marko Mäkelä
f740d23ce6 Merge 10.1 into 10.2 2017-04-28 12:22:32 +03:00
Monty
5a759d31f7 Changing field::field_name and Item::name to LEX_CSTRING
Benefits of this patch:
- Removed a lot of calls to strlen(), especially for field_string
- Strings generated by parser are now const strings, less chance of
  accidently changing a string
- Removed a lot of calls with LEX_STRING as parameter (changed to pointer)
- More uniform code
- Item::name_length was not kept up to date. Now fixed
- Several bugs found and fixed (Access to null pointers,
  access of freed memory, wrong arguments to printf like functions)
- Removed a lot of casts from (const char*) to (char*)

Changes:
- This caused some ABI changes
  - lex_string_set now uses LEX_CSTRING
  - Some fucntions are now taking const char* instead of char*
- Create_field::change and after changed to LEX_CSTRING
- handler::connect_string, comment and engine_name() changed to LEX_CSTRING
- Checked printf() related calls to find bugs. Found and fixed several
  errors in old code.
- A lot of changes from LEX_STRING to LEX_CSTRING, especially related to
  parsing and events.
- Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING*
- Some changes for char* to const char*
- Added printf argument checking for my_snprintf()
- Introduced null_clex_str, star_clex_string, temp_lex_str to simplify
  code
- Added item_empty_name and item_used_name to be able to distingush between
  items that was given an empty name and items that was not given a name
  This is used in sql_yacc.yy to know when to give an item a name.
- select table_name."*' is not anymore same as table_name.*
- removed not used function Item::rename()
- Added comparision of item->name_length before some calls to
  my_strcasecmp() to speed up comparison
- Moved Item_sp_variable::make_field() from item.h to item.cc
- Some minimal code changes to avoid copying to const char *
- Fixed wrong error message in wsrep_mysql_parse()
- Fixed wrong code in find_field_in_natural_join() where real_item() was
  set when it shouldn't
- ER_ERROR_ON_RENAME was used with extra arguments.
- Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already
  give the error.

TODO:
- Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c
- Change code to not modify LEX_CSTRING for database name
  (as part of lower_case_table_names)
2017-04-23 22:35:46 +03:00
Kristian Nielsen
89aad233de MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Move the discovery of mysql.gtid_slave_pos* tables into the SQL thread.

This avoids doing things like opening tables and scanning the mysql
schema for tables inside of the START SLAVE statement, which might
interact badly with existing transaction or table locks.

(Even though START SLAVE is documented to implicitly commit any active
transactions, this appears not to be the case in current code).

Table discovery fits naturally in the SQL thread init code, next to
the loading of mysql.gtid_slave_pos state.
2017-04-23 10:49:58 +02:00
Marko Mäkelä
8c38147cdd Merge 10.0 into 10.1 2017-04-21 12:46:12 +03:00
Kristian Nielsen
fdf2d40770 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Implement auto-creation of mysql.gtid_slave_pos* tables with needed engines,
if listed in --gtid-pos-auto-engines.

Uses an asynchronous approach to minimise locking overhead.

The list of available tables is extended with a flag. Extra entries are
added for --gtid-pos-auto-engines tables that do not exist yet, marked as
not existing but ready for auto-creation.

If record_gtid() needs a table marked for auto-creation, it sends a request
to the slave background thread to create the table, and continues to use an
existing table for the current and immediately coming transactions.

As soon as the slave background thread has made the new table available, it
will be used for all subsequent relevant transactions in record_gtid().

This asynchronous approach also avoids a lot of complex issues around trying
to do DDL in the middle of an on-going transaction.
2017-04-21 10:30:16 +02:00
Kristian Nielsen
88613e1df6 MDEV-11201: gtid_ignore_duplicates incorrectly ignores statements when GTID replication is not enabled
When master_use_gtid=no, the IO thread loads the slave GTID state from
the master during connect. This races with the SQL thread when
gtid_ignore_duplicates=1. If an event is in the relay log from before
the new connect and has not been applied yet, moving the slave
position causes the SQL thread to think that event should be skipped
due to gtid_ignore_duplicates=1.

This patch simply disables gtid_ignore_duplicates when not using GTID,
which seems to be what one would expect.
2017-04-10 07:53:27 +02:00
Sergei Golubchik
da4d71d10d Merge branch '10.1' into 10.2 2017-03-30 12:48:42 +02:00
Sergei Golubchik
09a2107b1b Merge branch '10.0' into 10.1 2017-03-21 19:20:44 +01:00
Sachin Setiya
9cf499724f Merge branch '10.0' into bb-10.0-galera 2017-03-20 18:11:56 +05:30
Sachin Setiya
f66395f7c0 Merge tag 'mariadb-10.0.30' into bb-sachin-10.0-galera-merge
Signed-off-by: Sachin Setiya <sachin.setiya@mariadb.com>
2017-03-17 02:05:20 +05:30
Monty
2d0c579a86 Wait for slave threads to start during startup
- Before this patch during startup all slave threads was started without
  any check that they had started properly.
- If one did a START SLAVE, STOP SLAVE or CHANGE MASTER as first command to the server
  there was a chance that server could access structures that where not
  properly  initialized which could lead to crashes in
  Log_event::read_log_event
- Fixed by waiting for slave threads to start up properly also during
  server startup, like we do with START SLAVE.
2017-03-16 14:21:33 +02:00
Vladislav Vaintroub
f2fe5cb282 Fix several compile warnings on Windows 2017-03-10 19:07:07 +00:00
Marko Mäkelä
adc91387e3 Merge 10.0 into 10.1 2017-03-03 13:27:12 +02:00
Monty
f3c65ce951 Add protection to not access is_open() without LOCK_log mutex
Protection added to reopen_file() and new_file_impl().

Without this we could get an assert in fn_format() as name == 0,
because the file was closed and name reset, atthe same time
new_file_impl() was called.
2017-02-28 16:10:47 +01:00
Monty
b624b41abb Don't allow one to kill START SLAVE while the slaves IO_THREAD or SQL_THREAD
is starting.

This is needed as if we kill the START SLAVE thread too early during
shutdown then the IO_THREAD or SQL_THREAD will not have time to properly
initlize it's replication or THD structures and clean_up() will try to
delete master_info structures that are still in use.
2017-02-28 16:10:47 +01:00
Monty
4bad74e139 Added error checking for all calls to flush_relay_log_info() and stmt_done() 2017-02-28 16:10:47 +01:00
Monty
c5e25c8b40 Added a separate lock for start/stop/reset slave.
This solves some possible dead locks when one calls stop slave while slave
is starting.
2017-02-28 16:10:46 +01:00
Monty
e65f667bb6 MDEV-9573 'Stop slave' hangs on replication slave
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation while some slave operations will also need LOCK_active_mi
which causes deadlocks.

Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()

Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error condtions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
  of stop_slave() to not have to use LOCK_active_mi inside
  terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
  have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
  replication. Problem was that waiting for pause_for_ftwrl was not done
  when deleting rpt->current_owner after a force_abort.
2017-02-28 16:10:46 +01:00
Sujatha Sivakumar
e619295e1b Bug#24901077: RESET SLAVE ALL DOES NOT ALWAYS RESET SLAVE
Description:
============
If you have a relay log index file that has ended up with
some relay log files that do not exists, then RESET SLAVE
ALL is not enough to get back to a clean state.

Analysis:
=========
In the bug scenario slave server is in stopped state and
some of the relay logs got deleted but the relay log index
file is not updated.

During slave server restart replication initialization fails
as some of the required relay logs are missing. User
executes RESET SLAVE/RESET SLAVE ALL command to start a
clean slave. As per the documentation RESET SLAVE command
clears the master info and relay log info repositories,
deletes all the relay log files, and starts a new relay log
file. But in a scenario where the slave server's
Relay_log_info object is not initialized slave will not
purge the existing relay logs. Hence the index file still
remains in a bad state. Users will not be able to start
the slave unless these files are cleared.

Fix:
===
RESET SLAVE/RESET SLAVE ALL commands should do the cleanup
even in a scenario where Relay_log_info object
initialization failed.

Backported a flag named 'error_on_rli_init_info' which is
required to identify slave's Relay_log_info object
initialization failure. This flag exists in MySQL-5.6
onwards as part of BUG#14021292 fix.

During RESET SLAVE/RESET SLAVE ALL execution this flag
indicates the Relay_log_info initialization failure.
In such a case open the relay log index/relay log files
and do the required clean up.
2017-02-28 10:00:51 +05:30
Nirbhay Choubey
ee8b5c305a Merge tag 'mariadb-10.0.29' into 10.0-galera 2017-01-13 13:53:59 -05:00
Marko Mäkelä
5044dae239 Merge 10.0 into 10.1 2017-01-10 14:30:11 +02:00
Kristian Nielsen
43378f367c MDEV-10271: Stopped SQL slave thread doesn't print a message to error log like IO thread does
Make the slave SQL thread always output to the error log the message "Slave
SQL thread exiting, replication stopped in ..." whenever it previously
outputted "Slave SQL thread initialized, starting replication ...".

Before this patch, it was somewhat inconsistent in which cases the message
would be output and in which not, depending on the exact time and cause of
the condition that caused the SQL thread to stop.
2017-01-06 10:46:20 +01:00
Sergei Golubchik
4a5d25c338 Merge branch '10.1' into 10.2 2016-12-29 13:23:18 +01:00
Sergei Golubchik
2f20d297f8 Merge branch '10.0' into 10.1 2016-12-11 09:53:42 +01:00
kevg
780db8e252 fix build and some warnings 2016-11-24 17:36:02 +03:00
Kristian Nielsen
390f2a013b Fix incorrect reading of events from relaylog in parallel replication.
The SQL thread keeps track of the position in the current relay log from
which to read the next event. This position is not normally used, but a
certain interaction with the IO thread can cause the SQL thread to re-open
the relay log and seek to the stored position.

In parallel replication, there were a couple of places where the position
was not updated. This created a race where a re-open of the relay log could
seek to the wrong position and start re-reading and processing events
already handled once, causing various kinds of problems.

Fix this by moving the position update into a single place in
apply_event_and_update_pos(), which should ensure that the position is
always updated in the parallel replication case.

This problem was found from the testcase of MDEV-10863, but it is logically
a separate problem.
2016-11-16 11:00:38 +01:00
Kristian Nielsen
f1fcc1fc10 Back-port Master_info::using_parallel() to 10.0.
This has no functional changes, but it helps avoid merge problems from 10.0
to 10.1. In 10.0, code that checks for parallel replication uses
opt_slave_parallel_threads > 0, but this check needs to be
mi->using_parallel() in 10.1. By using the same check in 10.0 (with
unchanged semantics), merge problems to 10.1 are avoided.
2016-11-15 23:00:11 +01:00
Kristian Nielsen
bccd0b5e0e Merge branch 'mdev10863' into 10.1 2016-11-15 13:10:21 +01:00
Kristian Nielsen
717f212840 MDEV-10863: parallel replication tries to continue from wrong position
This occured when the SQL thread (but not the IO thread) stops while
GTID and parallel replication are used with multiple domain ids in the
GTID position, and is restarted.

In this case, the SQL needs to start some way back in the relay log,
applying or skipping events within each replication domain as
appropriate.

The SQL threads starts at the beginning of an old relay log file, and
this position may be in the middle of an event group. The bug was that
such partial event group could be re-applied, causing replication
corruption.

This patch fixes the issue, by making sure to skip any initial events
that were part of an earlier (already applied) event group.
2016-11-04 12:33:42 +01:00
Kristian Nielsen
b002509b67 MDEV-11065: Compressed binary log. Merge code into current 10.2.
Conflicts:
	sql/share/errmsg-utf8.txt
2016-11-03 14:48:51 +01:00
Sergei Golubchik
a98c85bb50 Merge branch '10.0-galera' into 10.1 2016-11-02 13:44:07 +01:00
vinchen
0e380c3bfe two fix:
1.Avoid overflowing buffers in case of corrupt events
2.Check the compressed algorithm.
2016-10-29 21:59:20 +08:00
Nirbhay Choubey
5db2195a35 Merge tag 'mariadb-10.0.28' into 10.0-galera 2016-10-28 15:50:13 -04:00
Sergei Golubchik
22490a0d70 MDEV-8345 STOP SLAVE should not cause an ERROR to be logged to the error log
cherry-pick from 5.7:
  commit 6b24763
  Author: Manish Kumar <manish.4.kumar@oracle.com>
  Date:   Tue Mar 27 13:10:42 2012 +0530

  BUG#12977988 - ON STOP SLAVE: ERROR READING PACKET FROM SERVER: LOST CONNECTION
                 TO MYSQL SERVER
  BUG#11761457 - ERROR 2013 + "ERROR READING RELAY LOG EVENT" ON STOP SLAVEBUG#12977988 - ON STOP SLAVE: ERROR READING PACKET FROM SERVER: LOST CONNECTION
               TO MYSQL SERVER
2016-10-26 18:44:34 +02:00
vinchen
07f09df92b fix the ABI and stop slave hang problem 2016-10-21 13:37:48 +02:00
Kristian Nielsen
c06bc66816 MDEV-11065: Compressed binary log
Minor review comments/changes:

 - A bunch of style-fixes.

 - Change macros to static inline functions.

 - Update check_event_type() with compressed event types.

 - Small .result file update.
2016-10-20 18:00:59 +02:00
vinchen
d4b2c9bb1a optimize the memory allocation for compressed binlog event 2016-10-19 20:20:47 +02:00
vinchen
640051e06a Binlog compressed
Add some event types for the compressed event, there are:
     QUERY_COMPRESSED_EVENT,
     WRITE_ROWS_COMPRESSED_EVENT_V1,
     UPDATE_ROWS_COMPRESSED_EVENT_V1,
     DELETE_POWS_COMPRESSED_EVENT_V1,
     WRITE_ROWS_COMPRESSED_EVENT,
     UPDATE_ROWS_COMPRESSED_EVENT,
     DELETE_POWS_COMPRESSED_EVENT.
These events inheritance the uncompressed editor events. One of their constructor functions and write
function have been overridden for uncompressing and compressing. Anything but this is totally the same.

On slave, The IO thread will uncompress and convert them When it receiving the events from the master.
So the SQL and worker threads can be stay unchanged.

Now we use zlib as compress algorithm. It maybe support other algorithm in the future.
2016-10-19 20:20:35 +02:00
vinchen
0fa39ffba7 fix code style.. 2016-10-19 13:52:17 +02:00
vinchen
c334f4fe46 fix the code style for read_binlog_speed_limit 2016-10-19 13:52:17 +02:00
vinchen
43789901c7 Control the binlog read speed for compressed protocol 2016-10-19 13:51:08 +02:00
vinchen
8eb0f5ca1a Control the Maximum speed(KB/s) to read binlog from master 2016-10-19 13:51:08 +02:00
Kristian Nielsen
e1ef99c3dc MDEV-7145: Delayed replication
Merge feature into 10.2 from feature branch.

Delayed replication adds an option

  CHANGE MASTER TO master_delay=<seconds>

Replication will then delay applying events with that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.

Feature is ported from MySQL source tree.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-16 23:44:44 +02:00
Kristian Nielsen
3011060b2a MDEV-7145: Delayed slave.
Extend to work also for parallel replication.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
814880711f BUG#56442: Slave executes delayed statements when STOP SLAVE is issued
Problem:
When using the delayed slave feature, and the SQL thread is delaying,
and the user issues STOP SLAVE, the event we wait for was executed.
It should not be executed.
Fix:
Check the return value from the delay function,
slave.cc:slave_sleep(). If the return value is 1, it means the thread
has been stopped, in this case we don't execute the statement.

Also, refactored the test case for delayed slave a little: added the
test script include/rpl_assert.inc, which asserts that a condition holds
and prints a message if not. Made rpl_delayed_slave.test use this. The
advantage is that the test file is much easier to read and maintain,
because it is clear what is an assertion and what is not, and also the
expected result can be found in the test file, you don't have to compare
it to the result file.

Manually merged into MariaDB from MySQL commit
fd2b210383358fe7697f201e19ac9779879ba72a

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
b2bc6dadee MDEV-7145: Delayed replication, cleanup some code
The original MySQL patch left some refactoring todo's, possibly
because of known conflicts with other parallel development (like
info-repository feature perhaps).

This patch fixes those todos/refactorings.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
a9fb480fd6 MDEV-7145: Delayed replication, fixing test failures.
Two merge error fixed, and testsuite updated to removed some other
test failues.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:58 +02:00
Kristian Nielsen
19abe79fd1 MDEV-7145: Delayed replication, intermediate commit.
Initial merge of delayed replication from MySQL git.

The code from the initial push into MySQL is merged, and the
associated test case passes. A number of tasks are still pending:

1. Check full test suite run for any regressions or .result file updates.

2. Extend the feature to also work for parallel replication.

3. There are some todo-comments about future refactoring left from
MySQL, these should be located and merged on top.

4. There are some later related MySQL commits, these should be checked
and merged. These include:
    e134b9362ba0b750d6ac1b444780019622d14aa5
    b38f0f7857c073edfcc0a64675b7f7ede04be00f
    fd2b210383358fe7697f201e19ac9779879ba72a
    afc397376ec50e96b2918ee64e48baf4dda0d37d

5. The testcase from MySQL relies heavily on sleep and timing for
testing, and seems likely to sporadically fail on heavily loaded test
servers in buildbot or distro build farms.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:58 +02:00
Kristian Nielsen
50f19ca809 Remove unnecessary global mutex in parallel replication.
The function apply_event_and_update_pos() is called with the
rli->data_lock mutex held. However, there seems to be nothing in the
function actually needing the mutex to be held. Certainly not in the
parallel replication case, where sql_slave_skip_counter is always 0
since the non-zero case is handled by the SQL driver thread.

So this patch makes parallel replication use a variant of
apply_event_and_update_pos() without the need to take the
rli->data_lock mutex. This avoids one contended global mutex for each
event executed, which might improve performance on CPU-bound workloads
somewhat.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 22:44:40 +02:00
Sergei Golubchik
ec59220f2c post-merge fixes for ec47bea 2016-09-12 13:54:44 +02:00
Kristian Nielsen
ec47beaba6 Merge parallel replication async deadlock kill into 10.2.
Conflicts:
	sql/mysqld.cc
	sql/slave.cc
2016-09-09 12:15:53 +02:00
Sergei Golubchik
06b7fce9f2 Merge branch '10.1' into 10.2 2016-09-09 08:33:08 +02:00
Kristian Nielsen
7e0c9de864 Parallel replication async deadlock kill
When a deadlock kill is detected inside the storage engine, the kill
is not done immediately, to avoid calling back into the storage engine
kill_query method with various lock subsystem mutexes held. Instead the
kill is queued and done later by a slave background thread.

This patch in preparation for fixing TokuDB optimistic parallel
replication, as well as for removing locking hacks in InnoDB/XtraDB in
10.2.

Signed-off-by: Kristian Nielsen <knielsen at knielsen-hq.org>
2016-09-08 15:25:40 +02:00
Monty
96e95b5465 Better SHOW PROCESSLIST for replication
- When waiting for events, start time is now counted from start of wait
- Instead of having "Connect" as "Command" for all replication threads we
  now have:
  - Slave_IO for Slave thread reading relay log
  - Slave_SQL for slave executing SQL commands or distribution queries to
    Slave workers
  - Slave_worker for slave threads executin SQL commands in parallel replication
2016-08-29 13:10:17 +03:00
Sergei Golubchik
6b1863b830 Merge branch '10.0' into 10.1 2016-08-25 12:40:09 +02:00
Nirbhay Choubey
c309e99ff9 Merge branch '10.0' into 10.0-galera 2016-08-24 19:30:32 -04:00
Vicențiu Ciorbaru
4eb898bb16 MDEV-10563 Crash during shutdown in Master_info_index::any_slave_sql_running
In well defined C code, the "this" pointer is never NULL. Currently, we
were potentially dereferencing a NULL pointer (master_info_index). GCC v6
removes any "if (!this)" conditions as it assumes this is always a
non-null pointer. In order to prevent undefined behaviour, check the
pointer before dereferencing and remove the check within member
functions.
2016-08-23 21:24:36 +03:00
Monty
8d5a0d650b Cleanups and minor fixes
- Fixed typos
- Added --core-on-failure to mysql-test-run
- More DBUG_PRINT in viosocket.c
- Don't forget CLIENT_REMEMBER_OPTIONS for compressed slave protocol
- Removed not used stage variables
2016-08-21 20:14:13 +03:00
Vladislav Vaintroub
31a8cf54c8 Revert "MDEV-9293 Connector/C integration"
This reverts commit 7b89b9f510.
2016-08-19 15:46:27 +00:00
Vladislav Vaintroub
7b89b9f510 MDEV-9293 Connector/C integration 2016-08-19 15:27:37 +00:00
Oleksandr Byelkin
66ac894c40 MDEV-10455: libmariadbclient18 + MySQL-python leaks memory on failed connections
Support of CLIENT_REMEMBER_OPTIONS and freeing options added.
2016-08-11 17:50:21 +02:00
Kristian Nielsen
fb076581f6 MDEV-10271: Stopped SQL slave thread doesn't print a message to error log like IO thread does
Make the slave SQL thread always output to the error log the message "Slave
SQL thread exiting, replication stopped in ..." whenever it previously
outputted "Slave SQL thread initialized, starting replication ...".

Before this patch, it was somewhat inconsistent in which cases the message
would be output and in which not, depending on the exact time and cause of
the condition that caused the SQL thread to stop.
2016-07-25 13:07:50 +02:00
Sergei Golubchik
932646b1ff Merge branch '10.1' into 10.2 2016-06-30 16:38:05 +02:00
Alexander Barkov
3f32bf627f More tests for "MDEV-7563 Support CHECK constraint".
Testing non-ASCII string literals.
2016-06-30 11:43:02 +02:00
Sergei Golubchik
62e0a4552f Merge branch '10.0-galera' into 10.1 2016-06-28 22:06:22 +02:00
Sergei Golubchik
3361aee591 Merge branch '10.0' into 10.1 2016-06-28 22:01:55 +02:00
Nirbhay Choubey
14d62505d9 Merge tag 'mariadb-10.0.26' into 10.0-galera 2016-06-24 12:01:22 -04:00
Nirbhay Choubey
ecdb2b6e86 Merge tag 'mariadb-5.5.50' into 5.5-galera 2016-06-23 12:54:38 -04:00
Sergei Golubchik
a10fd659aa Fixed for failures in buildbot: Replication
1. remove unnecessary rpl-tokudb combination file.
2. fix rpl_ignore_table to cleanup properly (not leave test
   grants in memory)
3. check_temp_dir() is supposed to set the error in stmt_da - do
   it even when called multiple times, this fixes a crash when
   rpl.rpl_slave_load_tmpdir_not_exist is run twice.
2016-06-22 10:40:43 +02:00
Sergei Golubchik
c081c978a2 Merge branch '5.5' into bb-10.0 2016-06-21 14:11:02 +02:00
Sergei Golubchik
ae29ea2d86 Merge branch 'mysql/5.5' into 5.5 2016-06-14 13:55:28 +02:00
Nirbhay Choubey
868c2ceb01 MDEV-9083: Slave IO thread does not handle autoreconnect to restarting Galera Cluster node
Chery-picked commits from codership/mysql-wsrep.

MW-284: Slave I/O retry on ER_COM_UNKNOWN_ERROR

Slave would treat ER_COM_UNKNOWN_ERROR as fatal error and stop.
The fix here is to treat it as a network error and rely on the
built-in mechanism to retry.

MW-284: Add an MTR test
2016-06-12 19:28:56 -04:00
Nirbhay Choubey
7305be2f7e MDEV-5535: Cannot reopen temporary table
mysqld maintains a list of TABLE objects for all temporary
tables created within a session in THD. Here each table is
represented by a TABLE object.

A query referencing a particular temporary table for more
than once, however, failed with ER_CANT_REOPEN_TABLE error
because a TABLE_SHARE was allocate together with the TABLE,
so temporary tables always had only one TABLE per TABLE_SHARE.

This patch lift this restriction by separating TABLE and
TABLE_SHARE objects and storing TABLE_SHAREs for temporary
tables in a list in THD, and TABLEs in a list within their
respective TABLE_SHAREs.
2016-06-10 18:39:43 -04:00
Monty
89685d55d7 Reuse THD for new user connections
- To ensure that mallocs are marked for the correct THD, even if it's
  allocated in another thread, I added the thread_id to the THD constructor
- Added st_my_thread_var to thr_lock_info_init() to avoid a call to my_thread_var
- Moved things from THD::THD() to THD::init()
- Moved some things to THD::cleanup()
- Added THD::free_connection() and THD::reset_for_reuse()
- Added THD to CONNECT::create_thd()
- Added THD::thread_dbug_id and st_my_thread_var->dbug_id. These are needed
  to ensure that we have a constant thread_id used for debugging with a THD,
  even if it changes thread_id (=connection_id)
- Set variables.pseudo_thread_id in constructor. Removed not needed sets.
2016-06-04 09:06:00 +02:00
Sujatha Sivakumar
ef3f09f0c9 Bug#23251517: SEMISYNC REPLICATION HANGING
Revert following bug fix:

Bug#20685029: SLAVE IO THREAD SHOULD STOP WHEN DISK IS
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE

This fix results in a deadlock between slave IO thread
and SQL thread.

(cherry picked from commit e3fea6c6dbb36c6ab21c4ab777224560e9608b53)
2016-05-16 11:34:20 +02:00
Sujatha Sivakumar
df7ecf64f5 Bug#23251517: SEMISYNC REPLICATION HANGING
Revert following bug fix:

Bug#20685029: SLAVE IO THREAD SHOULD STOP WHEN DISK IS
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE

This fix results in a deadlock between slave IO thread
and SQL thread.
2016-05-13 16:42:45 +05:30
Nirbhay Choubey
8a1efa1bdd Merge branch '10.0' into 10.0-galera 2016-04-29 16:50:58 -04:00
Monty
9c846373f0 Merge commit 'd5822a3ad0657040114cdc185c6387b9eb3a12b2' into 10.2 2016-04-28 16:59:33 +03:00
Monty
732adec0a4 Removed some not needed when doing delete thd, which caused warnings about
wrong mutex usage from safe_mutex.
Ensure that LOCK_status is always taken before LOCK_thread_count
2016-04-28 13:39:55 +03:00
Sergei Golubchik
f67a2211ec Merge branch '10.1' into 10.2 2016-03-23 22:36:46 +01:00
Sergei Golubchik
3b0c7ac1f9 Merge branch '10.0' into 10.1 2016-03-21 13:02:53 +01:00
Kristian Nielsen
f8251911a4 MDEV-9595: Shutdown takes forever with many replication channels
There was a race between end_slave() and cleanup code at the end of
handle_slave_sql(). This could cause access to master_info_index and
global_rpl_thread_pool after they had been freed.

Fix by skipping that cleanup if server shutdown is in progress, as is done
in other parts of the code as well (the cleanup, which stops worker threads
that are not needed anymore, is redundant anyway when the server is shutting
down).
2016-03-03 08:53:42 +01:00
Sujatha Sivakumar
8361151765 Bug#20685029: SLAVE IO THREAD SHOULD STOP WHEN DISK IS
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE

Problem:
========
Currently SHOW SLAVE STATUS blocks if IO thread waits for
disk space. This makes automation tools verifying
server health block on taking relevant action. Finally this
will create SHOW SLAVE STATUS piles.

Analysis:
=========
SHOW SLAVE STATUS hangs on mi->data_lock if relay log write
is waiting for free disk space while holding mi->data_lock.
mi->data_lock is needed to protect the format description
event (mi->format_description_event) which is accessed by
the clients running FLUSH LOGS and slave IO thread. Note
relay log writes don't need to be protected by
mi->data_lock, LOCK_log is used to protect relay log between
IO and SQL thread (see MYSQL_BIN_LOG::append_event). The
code takes mi->data_lock to protect
mi->format_description_event during relay log rotate which
might get triggered right after relay log write.

Fix:
====
Release the data_lock just for the duration of writing into
relay log.

Made change to ensure the following lock order is maintained
to avoid deadlocks.

data_lock, LOCK_log

data_lock is held during relay log rotations to protect
the description event.
2016-03-01 12:29:51 +05:30
Nirbhay Choubey
0d58323e26 Merge tag 'mariadb-10.0.24' into 10.0-galera 2016-02-23 20:53:29 -05:00
Monty
3d4a7390c1 MDEV-6150 Speed up connection speed by moving creation of THD to new thread
Creating a CONNECT object on client connect and pass this to the working thread which creates the THD.
Split LOCK_thread_count to different mutexes
Added LOCK_thread_start to syncronize threads
Moved most usage of LOCK_thread_count to dedicated functions
Use next_thread_id() instead of thread_id++

Other things:
- Thread id now starts from 1 instead of 2
- Added cast for thread_id as thread id is now of type my_thread_id
- Made THD->host const (To ensure it's not changed)
- Removed some DBUG_PRINT() about entering/exiting mutex as these was already logged by mutex code
- Fixed that aborted_connects and connection_errors_internal are counted in all cases
- Don't take locks for current_linfo when we set it (not needed as it was 0 before)
2016-02-07 10:34:03 +02:00
Alexey Botchkov
75a1d866dd MDEV-5273 Prepared statement doesn't return metadata after prepare.
SHOW SLAVE STATUS fixed.
2016-01-28 11:12:03 +04:00
Sergei Golubchik
f4faac4d6a Merge branch '10.0' into 10.1 2016-01-25 22:58:57 +01:00
Kristian Nielsen
2f88b14acd Merge branch 'tmp' into tmp-10.1
Conflicts:
	sql/slave.cc
2016-01-15 13:01:19 +01:00
Kristian Nielsen
74b1af19e9 Merge branch 'tmp' into tmp-10.0
Conflicts:
	sql/slave.cc
2016-01-15 12:50:23 +01:00
Kristian Nielsen
06b2e327fc Fix error handling for GTID and domain-based parallel replication
This occurs when replication stops with an error, domain-based parallel
replication is used, and the GTID position contains more than one domain.
Furthermore, it relates to the case where the SQL thread is restarted
without first stopping the IO thread.

In this case, the file/offset relay-log position does not correctly
represent the slave's multi-dimensional position, because other domains may
be far ahead of, or behind, the domain with the failing event. So the code
reverts the relay log position back to the start of a relay log file that is
known to be before all active domains.

There was a bug that when the SQL thread was restarted, the
rli->relay_log_state was incorrectly initialised from @@gtid_slave_pos. This
position will likely be too far ahead, due to reverting the relay log
position. Thus, if the replication fails again after the SQL thread restart,
the rli->restart_gtid_pos might be updated incorrectly. This in turn would
cause a second SQL thread restart to replicate from the wrong position, if
the IO thread was still left running.

The fix is to initialise rli->relay_log_state from @@gtid_slave_pos only
when we actually purge and re-fetch relay logs from the master, not at every
SQL thread start.

A related problem is the use of sql_slave_skip_counter to resolve
replication failures in this kind of scenario. Since the slave position is
multi-dimensional, sql_slave_skip_counter can not work properly - it is
indeterminate exactly which event is to be skipped, and is unlikely to work
as expected for the user. So make this an error in the case where
domain-based parallel replication is used with multiple domains, suggesting
instead the user to set @@gtid_slave_pos to reliably skip the desired event.
2016-01-15 12:48:14 +01:00
Monty
8fcc0bfefa Fixed bug in semi_sync replication tests.
The problem was that wait_for_slave_io_to_start reported that the io thread
was ready, when it was still initializing. This caused test suite to
continue too early, for example before the semi sync plugin was properly
enabled.

Fixed by introducing a new internal stage: "Preparing". Slave_IO_Running is
now set to "Yes" only when all initializing is done and the IO thread is
ready to read things from the master.

The only test affected by this change is rpl_flsh_tbls, which got stuck in
the preparing phase while trying to read the GTID position from a table.
Fixed by having this test waiting for Preparing instead of Yes.
2016-01-03 13:27:59 +02:00
Monty
661a6d8906 Cleanup of slave code:
- Added testing if connection is killed to shortcut reading of connection data
  This will allow us later in 10.2 to do a cleaner shutdown of slaves (less errors in the log)
- Add new status variables: Slaves_connected, Slaves_running and Slave_connections.
- Use MYSQL_SLAVE_NOT_RUN instead of 0 with slave_running.
- Don't print obvious extra warnings to the error log when slave is shut down normally.
2016-01-03 13:20:07 +02:00
Sergei Golubchik
a2bcee626d Merge branch '10.0' into 10.1 2015-12-21 21:24:22 +01:00
Nirbhay Choubey
dad555a09c Merge tag 'mariadb-10.0.23' into 10.0-galera 2015-12-19 14:24:38 -05:00
Monty
c3018b0ff4 Fixes to get all test to run on MacosX Lion 10.7
This includes fixing all utilities to not have any memory leaks,
as safemalloc warnings stopped tests from passing on MacOSX.

- Ensure that all clients takes character-set-dir, as the
  libmysqlclient library will use it.
- mysql-test-run now passes character-set-dir to all external clients.
- Changed dynstr_free() so that it can be called twice (made freeing code easier)
- Changed rpl_global_gtid_slave_state to be allocated dynamicly as it
  includes a mutex that needs to be initizlied/destroyed before my_end() is called.
- Removed rpl_slave_state::init() and rpl_slave_stage::deinit() as
  their job are better handling by constructor and delete.
- Print alias instead of table_name in check_duplicate_key as
  table_name may have been converted to lower case.

Other things:
- Fixed a case in time_to_datetime_with_warn() where we where
  using && instead of & in tests
2015-11-29 17:51:23 +02:00
Sergei Golubchik
7f19330c59 Merge branch 'github/10.0-galera' into 10.1 2015-11-19 17:48:36 +01:00
Nirbhay Choubey
f47124c9ef Incorrect statements binlogged on slave with do_domain_ids=(...)
In domain ID based filtering, a flag is used to filter-out
the events that belong to a particular domain. This flag gets
set when IO thread receives a GTID_EVENT for the domain on
filter list and its reset at the last event in the GTID group.
The resetting, however, was wrongly done before the decision to
write/filter the event from relay log is made. As a result, the
last event in the group will always pass through the filter.
Fixed by deferring the reset logic. Also added a test case.
2015-11-18 02:11:20 -05:00
Kristian Nielsen
8f2e05f41c Merge branch 'mdev7818-4' into 10.1
Conflicts:
	mysql-test/suite/perfschema/r/stage_mdl_global.result
	sql/rpl_rli.cc
	sql/sql_parse.cc
2015-11-13 14:24:40 +01:00
Kristian Nielsen
6bf88cdd9d Merge branch 'mdev7818-4' into bb-10.0-knielsen 2015-11-13 14:08:38 +01:00
Kristian Nielsen
75dc267101 Change Seconds_behind_master to be updated only at commit in parallel replication
Before, the Seconds_behind_master was updated already when an event
was queued for a worker thread to execute later. This might lead users
to interpret a low value as the slave being almost up to date with the
master, while in reality there might still be lots and lots of events
still queued up waiting to be applied by the slave.

See https://lists.launchpad.net/maria-developers/msg08958.html for
more detailed discussions.
2015-11-13 10:24:53 +01:00
Monty
e8c1b35f18 MDEV-8476 Race condition in slave SQL thread shutdown
Patch backported from MariaDB 10.1

- Ensure that we wait with cleanup() until slave thread has stopped.
- Added signal_thd_deleted() to signal close_connections() that all THD's has been freed.

Other things
- Removed not needed calls to THD_CHECK_SENTRY() when we are calling 'delete thd'.
2015-11-12 14:51:01 +02:00
Nirbhay Choubey
4d15112962 Merge tag 'mariadb-10.0.22' into 10.0-galera 2015-10-31 18:07:02 -04:00
Sergei Golubchik
dfb74dea30 Merge branch '10.0' into 10.1 2015-10-12 00:37:58 +02:00
Monty
a69a6ddac8 MDEV-4487 Allow replication from MySQL 5.6+ when GTID is enabled on the master
MDEV-8685 MariaDB fails to decode Anonymous_GTID entries
MDEV-5705 Replication testing: 5.6->10.0

- Ignoring GTID events from MySQL 5.6+ (Allows replication from MySQL 5.6+ with GTID enabled)
- Added ignorable events from MySQL 5.6
- mysqlbinlog now writes information about GTID and ignorable events.
- Added more information in error message when replication stops because of wrong information in binary log.
- Fixed wrong test when write_on_release() should flush cache.
2015-10-08 10:45:09 +03:00
Nirbhay Choubey
db66d2f92d refs codership/mysql-wsrep#188
- setting error code for slave, if mysql slave node dropped from cluster
2015-09-10 00:20:49 -04:00
Nirbhay Choubey
2012a810ab refs codership/mysql-wsrep#181
- Galera related errors in mysql slave applying will now cause slave to
abort
2015-09-10 00:14:24 -04:00
Sergei Golubchik
b85a00161e MDEV-8264 encryption for binlog
* Start_encryption_log_event
* --encrypt-binlog command line option

based on google patches.
2015-09-04 10:33:55 +02:00
Sergei Golubchik
41d68cabee cleanup: Log_event::write() and MYSQL_BIN_LOG::write_cache()
Introduce Log_event_writer() that encapsulates
writing data to an IO_CACHE with automatic checksum calculation.

Now all events properly checksum themselves as needed.

Use Log_event_writer in MYSQL_BIN_LOG::write_cache() instead
of copy-pasting its logic all over.

Later Log_event_writer will also do encryption.
2015-09-04 10:33:55 +02:00
Sergei Golubchik
c862c15bba cleanup: [partial] removal of llstr()
now when my_vsnprintf() supports %llu for a few years already.
2015-09-04 10:33:54 +02:00
Sergei Golubchik
fff6f4278b Revert f1abd015, make a smaller fix
commit f1abd015dc
Author: Andrei Elkin <aelkin@mysql.com>
Date:   Thu Nov 12 17:10:19 2009 +0200

    Bug #47210   first execution of "start slave until" stops too early
2015-09-04 10:33:54 +02:00
Sergei Golubchik
1720fcdcbc cleanup DBUG, DBUG_DUMP_EVENT_BUF
introduce DBUG_DUMP_EVENT_BUF,
remove few unused DBUG_EXECUTE_IF's
simplify few DBUG_PRINT's
remove few redundant #ifndef DBUG_OFF's
2015-09-04 10:33:53 +02:00
Sergei Golubchik
2d2286faf3 cleanup: use enum_binlog_checksum_alg, not uint8
* fix unireg.h includes
* use enum_binlog_checksum_alg for binlog checksum variables,
  not uint8
2015-09-04 10:33:52 +02:00
Sergei Golubchik
530a6e7481 Merge branch '10.0' into 10.1
referenced_by_foreign_key2(), needed for InnoDB to compile,
was taken from 10.0-galera
2015-09-03 12:58:41 +02:00
Monty
4f0255cbf9 Fixed errors and bugs found by valgrind:
- If run with valgrind, mysqltest will now wait longer when syncronizing slave with master
- Ensure that we wait with cleanup() until slave thread has stopped.
- Added signal_thd_deleted() to signal close_connections() that all THD's has been freed.
- Check in handle_fatal_signal() that we don't use variables that has been freed.
- Increased some timeouts when run with --valgrind

Other things:
- Fixed wrong test in one_thread_per_connection_end() if galera is used.
- Removed not needed calls to THD_CHECK_SENTRY() when we are calling 'delete thd'.
2015-09-01 18:42:02 +03:00
Monty
56aa19989f MDEV-6152: Remove calls to current_thd while creating Item
Part 5: Removing calls to current_thd in net_read calls, creating fields,
        query_cache, acl and some other places where thd was available
2015-09-01 18:42:02 +03:00
Monty
3cb578c001 MDEV-6152: Remove calls to current_thd while creating Item
- Part 3: Adding mem_root to push_back() and push_front()

Other things:
- Added THD as an argument to some partition functions.
- Added memory overflow checking for XML tag's in read_xml()
2015-08-27 22:21:08 +03:00
Monty
1bae0d9e56 Stage 2 of MDEV-6152:
- Added mem_root to all calls to new Item
- Added private method operator new(size_t size) to Item to ensure that
  we always use a mem_root when creating an item.

This saves use once call to current_thd per Item creation
2015-08-21 10:40:51 +04:00
Sergey Vojtovich
31e365efae MDEV-8010 - Avoid sql_alloc() in Items (Patch #1)
Added mandatory thd parameter to Item (and all derivative classes) constructor.
Added thd parameter to all routines that may create items.
Also removed "current_thd" from Item::Item. This reduced number of
pthread_getspecific() calls from 290 to 177 per OLTP RO transaction.
2015-08-21 10:40:39 +04:00
Nirbhay Choubey
91acc8b16f Merge tag 'mariadb-10.0.21' into 10.0-galera 2015-08-08 14:21:22 -04:00
Nirbhay Choubey
5b9dd459fb Merge tag 'mariadb-5.5.45' into 5.5-galera 2015-08-07 17:02:51 -04:00
Jan Lindström
9a5787db51 Merge commit '96badb16afcf' into 10.0
Conflicts:
	client/mysql_upgrade.c
	mysql-test/r/func_misc.result
	mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result
	mysql-test/suite/innodb/r/innodb-fk.result
	mysql-test/t/subselect_sj_mat.test
	sql/item.cc
	sql/item_func.cc
	sql/log.cc
	sql/log_event.cc
	sql/rpl_utility.cc
	sql/slave.cc
	sql/sql_class.cc
	sql/sql_class.h
	sql/sql_select.cc
	storage/innobase/dict/dict0crea.c
	storage/innobase/dict/dict0dict.c
	storage/innobase/handler/ha_innodb.cc
	storage/xtradb/dict/dict0crea.c
	storage/xtradb/dict/dict0dict.c
	storage/xtradb/handler/ha_innodb.cc
	vio/viosslfactories.c
2015-08-03 23:09:43 +03:00
Monty
f3e578ab30 Fixed MDEV-8428: Mangled DML statements on 2nd level slave when enabling binlog checksums
Fix was to add a test in Query_log_event::Query_log_event() if we are using
CREATE ... SELECT and in this case use trans cache, like we do on the master.
This avoid using (with doesn't have checksum)

Other things:
- Removed dummy call my_checksum(0L, NULL, 0)
- More DBUG_PRINT
- Cleaned up Log_event::need_checksum() to make it more readable (similar as in MySQL 5.6)
- Renamed variable that was hiding another one in create_table_imp()
2015-07-26 14:32:45 +03:00
Monty
7115341473 Fixed warnings and errors found by buildbot
field.cc
- Fixed warning about overlapping memory copy (backport from 10.0)

Item_subselect.cc
- Fixed core dump in main.view
- Problem was that thd->lex->current_select->master_unit()->item was not set, which caused crash in maxr_as_dependent

sql/mysqld.cc
- Got error on shutdown as we where freeing mutex before all THD objects was freed
  (~THD uses some mutex). Fixed by during shutdown freeing THD inside mutex.

sql/log.cc
- log_space_lock and LOCK_log where locked in inconsistenly. Fixed by not having a log_space_lock around purge_logs.

sql/slave.cc
- Remove unnecessary log_space_lock
- Move cond_broadcast inside lock to ensure we don't miss the signal
2015-07-25 15:15:52 +03:00
Monty
872a953b22 MDEV-8469 Add RESET MASTER TO x to allow specification of binlog file nr
Other things:
- Avoid calling init_and_set_log_file_name() when opening binary log.
- Remove newlines early when reading from index file.
- Ensure that reset_logs() will work even if thd is 0 (Can happen on startup)
- Added thd to sart_slave_threads() for better error handling.
2015-07-16 10:36:58 +03:00
Monty
7332af49e4 - Renaming variables so that they don't shadow others (After this patch one can compile with -Wshadow and get much fewer warnings)
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd

Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)

Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
   examined_rows -> join_examined_rows
   record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()

Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
2015-07-06 20:24:14 +03:00
Nirbhay Choubey
46024098be Merge tag 'mariadb-10.0.20' into 10.0-galera 2015-06-21 23:54:55 -04:00
Nirbhay Choubey
327409443f Merge tag 'mariadb-5.5.44' into 5.5-galera 2015-06-21 21:50:43 -04:00
Kristian Nielsen
b1b0db294f Merge MDEV-8294 into 10.1 2015-06-10 12:42:18 +02:00
Kristian Nielsen
36f37a4890 Merge MDEV-8294 into 10.0 2015-06-10 12:01:06 +02:00
Kristian Nielsen
682ed005c5 MDEV-8294: Inconsistent behavior of slave parallel threads at runtime
There were some cases where the slave SQL thread could stop without
the pool of parallel replication worker threads being correctly
de-activated.
2015-06-10 11:57:42 +02:00
Nirbhay Choubey
f965cae5fb MDEV-7110 : Add missing MySQL variable log_bin_basename and log_bin_index
Add log_bin_index, log_bin_basename and relay_log_basename system
variables. Also, convert relay_log_index system variable to
NO_CMD_LINE and implement --relay-log-index as a command line
option.
2015-06-09 13:38:29 -04:00
Sergei Golubchik
8e7d6652ad CRLF->LF 2015-06-02 22:07:47 +02:00
Sergei Golubchik
5091a4ba75 Merge tag 'mariadb-10.0.19' into 10.1 2015-06-01 15:51:25 +02:00
Kristian Nielsen
6e49201644 Fix compilation warnings in -DWITH_WSREP=OFF build. 2015-05-11 12:47:43 +02:00
Nirbhay Choubey
e11cad9e9d Merge tag 'mariadb-10.0.19' into 10.0-galera 2015-05-09 17:09:21 -04:00
Sergei Golubchik
49c853fb94 Merge branch '5.5' into 10.0 2015-05-04 22:00:24 +02:00
Nirbhay Choubey
d2562004c5 Merge tag 'mariadb-5.5.43' into 5.5-galera 2015-05-04 13:50:52 -04:00
Sergei Golubchik
f875c9f2a0 MDEV-5114 seconds_behind_master flips to 0 & spikes back, when running show slaves status
1. After a period of wait (where last_master_timestamp=0)
   do NOT restore the last_master_timestamp to the timestamp
   of the last executed event (which would mean we've just
   executed it, and we're that much behind the master).

2. Update last_master_timestamp before executing the event,
   not after.

Take the approach from the this commit (but with a different test
case that actually makes sense):

commit 0c75ab453fb8c5439576af8fe5add7a1b89f1569
Author: Luis Soares <luis.soares@sun.com>
Date:   Thu Apr 15 17:39:31 2010 +0100

    BUG#52166: Seconds_Behind_Master spikes after long idle period
2015-05-03 11:21:55 +02:00
Kristian Nielsen
9cdf5c2bfd Merge branch '10.0' into 10.1 2015-04-29 11:30:26 +02:00
Kristian Nielsen
ed701c6a23 MDEV-7864: Slave SQL: stopping on non-last RBR event with annotations results in SEGV (signal 11)
The slave SQL thread was clearing serial_rgi->thd before deleting
serial_rgi, which could cause access to NULL THD.

The clearing was introduced in commit
2e100cc5a4 and is just plain wrong. So revert
that part (single line) of that commit.

Thanks to Daniel Black for bug analysis and test case.
2015-04-28 11:56:54 +02:00
Sergei Golubchik
0f12ada6b6 Merge remote-tracking branch 'mysql/5.5' into 5.5 2015-04-27 21:04:06 +02:00
Kristian Nielsen
791b0ab5db Merge 10.0 -> 10.1 2015-04-20 13:21:58 +02:00
Kristian Nielsen
167332597f Merge 10.0 -> 10.1.
Conflicts:
	mysql-test/suite/multi_source/multisource.result
	sql/sql_base.cc
2015-04-17 15:18:44 +02:00
Kristian Nielsen
a8523559e9 Merge MDEV-7975 into 10.0 2015-04-14 14:23:35 +02:00
Kristian Nielsen
0c6904258b Merge MDEV-7975 into 10.1 2015-04-14 14:10:37 +02:00
Kristian Nielsen
5d2b85a297 MDEV-7975: sporadic failure in test case rpl.rpl_gtid_startpos
Add some suppressions that were missing. They are for if a STOP SLAVE is
executed early during IO thread startup, when it is negotiating with the
master. The master connection may be killed in the middle of a
mysql_real_query(), which is not a test failure if it is a network error.

This also caught one real code error, fixed with this commit: The I/O thread
would fail to automatically reconnect if a network error happened while
fetching the value of @@GLOBAL.gtid_domain_id.
2015-04-14 13:03:11 +02:00
Sergey Vojtovich
18e9c314e4 MDEV-6650 - LINT_INIT emits code in non-debug builds
Replaced all references to LINT_INIT with UNINIT_VAR and LINT_INIT_STRUCT.
Removed LINT_INIT macro.
2015-03-16 14:48:22 +04:00
Kristian Nielsen
2e82a8233c MDEV-7785: errorneous -> erroneous spelling mistake 2015-03-16 10:54:47 +01:00
Nirbhay Choubey
7a6cad5221 Backport fix for MDEV-7673, MDEV-7203 and MDEV-7192 from 10.0-galera 2015-03-11 12:36:00 -04:00
Kristian Nielsen
ed04c40b01 MDEV-5289: master server starts slave parallel threads
Delay spawning parallel replication worker threads until a slave SQL
thread is running, and de-spawn them when the last SQL thread stops.

This is especially useful to avoid needless threads on a master in a
setup where same my.cnf is used on masters and slaves.
2015-03-11 09:18:16 +01:00
Sergei Golubchik
2db62f686e Merge branch '10.0' into 10.1 2015-03-07 13:21:02 +01:00
Kristian Nielsen
2e4dc5a370 after-merge fixes 2015-03-04 14:12:48 +01:00
Kristian Nielsen
95d7208859 Merge MDEV-6589 and MDEV-6403 into 10.1.
Conflicts:
	sql/log.cc
	sql/rpl_rli.cc
	sql/sql_repl.cc
2015-03-04 13:49:37 +01:00
Kristian Nielsen
3ef0b9b235 Merge MDEV-6589 and MDEV-6403 into 10.0. 2015-03-04 13:36:54 +01:00
Kristian Nielsen
ad0d203f2e MDEV-6589: Incorrect relay log start position when restarting SQL thread after error in parallel replication
The problem occurs in parallel replication in GTID mode, when we are using
multiple replication domains. In this case, if the SQL thread stops, the
slave GTID position may refer to a different point in the relay log for each
domain.

The bug was that when the SQL thread was stopped and restarted (but the IO
thread was kept running), the SQL thread would resume applying the relay log
from the point of the most advanced replication domain, silently skipping all
earlier events within other domains. This caused replication corruption.

This patch solves the problem by storing, when the SQL thread stops with
multiple parallel replication domains active, the current GTID
position. Additionally, the current position in the relay logs is moved back
to a point known to be earlier than the current position of any replication
domain. Then when the SQL thread restarts from the earlier position, GTIDs
encountered are compared against the stored GTID position. Any GTID that was
already applied before the stop is skipped to avoid duplicate apply.

This patch should have no effect if multi-domain GTID parallel replication is
not used. Similarly, if both SQL and IO thread are stopped and restarted, the
patch has no effect, as in this case the existing relay logs are removed and
re-fetched from the master at the current global @@gtid_slave_pos.
2015-03-04 13:36:04 +01:00
Nirbhay Choubey
af651c80f7 Merge tag 'mariadb-10.0.17' into 10.0-galera
Conflicts:
	storage/innobase/include/trx0trx.h
2015-02-27 17:36:54 -05:00
Kristian Nielsen
a227cf8046 MDEV-7335: Potential parallel slave deadlock with specific binlog corruption
If somehow the COMMIT or XID event in an event group was missing, the code in
parallel replication to handle this was not sufficient, leading to server
deadlock.
2015-02-24 14:39:15 +01:00
Kristian Nielsen
004dd0aaa8 MDEV-7568: STOP SLAVE crashes the server
The order of initialisation during server startup was incorrect. The slave
threads were started before the parallel replication worker thread pool was
initialised, allowing a race where uninitialised data could be accessed.
2015-02-19 15:43:27 +01:00
Kristian Nielsen
8672339328 MDEV-6676: Optimistic parallel replication
Adjust the configuration options, as discussed on the
maria-developers@ mailing list.

The option to hint a transaction to not be replicated in parallel is
now called @@skip_parallel_replication, consistent with
@@skip_replication.

And the --slave-parallel-mode is now simplified to have just one of
the following values:

  none
  minimal
  conservative
  optimistic
  aggressive

This reflects successively harder efforts to find opportunities to run
things in parallel on the slave. It allows to extend the server with
more automatic heuristics in the future without having to introduce a
new configuration option for each and every one.
2015-02-07 09:42:58 +01:00
Nirbhay Choubey
7cda4bee0e maria-10.0.16 merge
bzr merge -r4588 maria/10.0
2015-01-26 22:54:27 -05:00
Venkatesh Duggirala
ebb2a3f5e1 Problem: IO thread fails to connect to master if servers are configured with
special character sets like utf16, utf32, ucs2.

Analysis: MySQL server does not support few special character sets like
  utf16,utf32 and ucs2 as "client's character set"(eg: utf16,utf32, ucs2).
  It is known limitation listed in the documentation
  http://dev.mysql.com/doc/refman/5.5/en/charset-connection.html.

  The default value for default-character-set parameter is 'auto'
  which means that if the server's character set is not supported,
  then server automatically changes client's character set to
  predefined character-set which is 'latin1' in the current code.

  Eg:
  $ ./mysql -uroot -S$SOCKET_FILE --default-character-set=utf16
  ERROR 1231 (42000): Variable 'character_set_client' can't be set to the value of 'utf16'

  $ ./mysql -uroot -S$SOCKET_FILE will be successfully connected to
  server with 'latin1' as default client side character set.

  When IO thread is trying to connect to Master, it sets server's character
  set as client's character set. When Slave server is started with these
  special character sets, IO thread (which is like a connection to Master)
  fails because of the above said limitation.

 Fix: Now even IO thread also behaves the same as a regular client behaves.
  i.e., If server's character set is not supported as client's character set,
  then set default's client character set(latin1) as client's character set.
2015-01-14 14:13:52 +05:30
Sergei Golubchik
e695db0f2d MDEV-7437 remove suport for "atomics" with rwlocks 2015-01-13 10:15:21 +01:00
Kristian Nielsen
db21fddc37 MDEV-6676: Optimistic parallel replication
Implement a new mode for parallel replication. In this mode, all transactions
are optimistically attempted applied in parallel. In case of conflicts, the
offending transaction is rolled back and retried later non-parallel.

This is an early-release patch to facilitate testing, more changes to user
interface / options will be expected. The new mode is not enabled by default.
2014-12-06 08:49:50 +01:00
Nirbhay Choubey
3bb02f3e6d bzr merge -rtag:mariadb-10.0.15 maria/10.0 2014-12-05 12:33:02 -05:00
Nirbhay Choubey
a50ddebb5c MDEV-6593 : domain_id based replication filters
Implementation for domain ID based filtering of replication events.
2014-12-03 22:30:48 -05:00
Sergei Golubchik
853077ad7e Merge branch '10.0' into bb-10.1-merge
Conflicts:
	.bzrignore
	VERSION
	cmake/plugin.cmake
	debian/dist/Debian/control
	debian/dist/Ubuntu/control
	mysql-test/r/join_outer.result
	mysql-test/r/join_outer_jcl6.result
	mysql-test/r/null.result
	mysql-test/r/old-mode.result
	mysql-test/r/union.result
	mysql-test/t/join_outer.test
	mysql-test/t/null.test
	mysql-test/t/old-mode.test
	mysql-test/t/union.test
	packaging/rpm-oel/mysql.spec.in
	scripts/mysql_config.sh
	sql/ha_ndbcluster.cc
	sql/ha_ndbcluster_binlog.cc
	sql/ha_ndbcluster_cond.cc
	sql/item_cmpfunc.h
	sql/lock.cc
	sql/sql_select.cc
	sql/sql_show.cc
	sql/sql_update.cc
	sql/sql_yacc.yy
	storage/innobase/buf/buf0flu.cc
	storage/innobase/fil/fil0fil.cc
	storage/innobase/include/srv0srv.h
	storage/innobase/lock/lock0lock.cc
	storage/tokudb/CMakeLists.txt
	storage/xtradb/buf/buf0flu.cc
	storage/xtradb/fil/fil0fil.cc
	storage/xtradb/include/srv0srv.h
	storage/xtradb/lock/lock0lock.cc
	support-files/mysql.spec.sh
2014-12-02 22:25:16 +01:00
Kristian Nielsen
1eed274848 Fix wording in error log message, to be consistent with other messages ("IO thread" -> "I/O thread"). 2014-12-02 12:11:07 +01:00
Sergei Golubchik
f62c12b405 Merge 10.0.14 into 10.1 2014-10-15 12:59:13 +02:00
Sergei Golubchik
7f5e51b940 MDEV-34 delete storage/ndb and sql/*ndb* (and collateral changes)
remove:
* NDB from everywhere
* IM from mtr-v1
* packaging/rpm-oel and packaging/rpm-uln
* few unused spec files
* plug.in file
* .bzrignore
2014-10-11 18:53:06 +02:00
Sergei Golubchik
03ec3511a8 cleanup: galera misc cleanups
also disable galera-specific output in mysql_tzinfo_to_sql,
it'll be enabled later.
2014-10-10 22:27:36 +02:00
Sergei Golubchik
3620910eea cleanup: galera merge, simple changes 2014-10-01 23:38:27 +02:00
Michael Widenius
70823e1d91 MDEV-5120 Test suite test maria-no-logging fails
The reason for the failure was a bug in an include file on debian that causes 'struct stat'
to have different sized depending on the environment.

This patch fixes so that we always include my_global.h or my_config.h before we include any other files.

Other things:
- Removed #include <my_global.h> in some include files; Better to always do this at the top level to have as few
  "always-include-this-file-first' files as possible.
- Removed usage of some include files that where already included by my_global.h or by other files.


client/mysql_plugin.c:
  Use my_global.h first
client/mysqlslap.c:
  Remove duplicated include files
extra/comp_err.c:
  Remove duplicated include files
include/m_string.h:
  Remove duplicated include files
include/maria.h:
  Remove duplicated include files
libmysqld/emb_qcache.cc:
  Use my_global.h first
plugin/semisync/semisync.h:
  Use my_pthread.h first
sql/datadict.cc:
  Use my_global.h first
sql/debug_sync.cc:
  Use my_global.h first
sql/derror.cc:
  Use my_global.h first
sql/des_key_file.cc:
  Use my_global.h first
sql/discover.cc:
  Use my_global.h first
sql/event_data_objects.cc:
  Use my_global.h first
sql/event_db_repository.cc:
  Use my_global.h first
sql/event_parse_data.cc:
  Use my_global.h first
sql/event_queue.cc:
  Use my_global.h first
sql/event_scheduler.cc:
  Use my_global.h first
sql/events.cc:
  Use my_global.h first
sql/field.cc:
  Use my_global.h first
  Remove duplicated include files
sql/field_conv.cc:
  Use my_global.h first
sql/filesort.cc:
  Use my_global.h first
  Remove duplicated include files
sql/gstream.cc:
  Use my_global.h first
sql/ha_ndbcluster.cc:
  Use my_global.h first
sql/ha_ndbcluster_binlog.cc:
  Use my_global.h first
sql/ha_ndbcluster_cond.cc:
  Use my_global.h first
sql/ha_partition.cc:
  Use my_global.h first
sql/handler.cc:
  Use my_global.h first
sql/hash_filo.cc:
  Use my_global.h first
sql/hostname.cc:
  Use my_global.h first
sql/init.cc:
  Use my_global.h first
sql/item.cc:
  Use my_global.h first
sql/item_buff.cc:
  Use my_global.h first
sql/item_cmpfunc.cc:
  Use my_global.h first
sql/item_create.cc:
  Use my_global.h first
sql/item_geofunc.cc:
  Use my_global.h first
sql/item_inetfunc.cc:
  Use my_global.h first
sql/item_row.cc:
  Use my_global.h first
sql/item_strfunc.cc:
  Use my_global.h first
sql/item_subselect.cc:
  Use my_global.h first
sql/item_sum.cc:
  Use my_global.h first
sql/item_timefunc.cc:
  Use my_global.h first
sql/item_xmlfunc.cc:
  Use my_global.h first
sql/key.cc:
  Use my_global.h first
sql/lock.cc:
  Use my_global.h first
sql/log.cc:
  Use my_global.h first
sql/log_event.cc:
  Use my_global.h first
sql/log_event_old.cc:
  Use my_global.h first
sql/mf_iocache.cc:
  Use my_global.h first
sql/mysql_install_db.cc:
  Remove duplicated include files
sql/mysqld.cc:
  Remove duplicated include files
sql/net_serv.cc:
  Remove duplicated include files
sql/opt_range.cc:
  Use my_global.h first
sql/opt_subselect.cc:
  Use my_global.h first
sql/opt_sum.cc:
  Use my_global.h first
sql/parse_file.cc:
  Use my_global.h first
sql/partition_info.cc:
  Use my_global.h first
sql/procedure.cc:
  Use my_global.h first
sql/protocol.cc:
  Use my_global.h first
sql/records.cc:
  Use my_global.h first
sql/records.h:
  Don't include my_global.h
  Better to do this at the upper level
sql/repl_failsafe.cc:
  Use my_global.h first
sql/rpl_filter.cc:
  Use my_global.h first
sql/rpl_gtid.cc:
  Use my_global.h first
sql/rpl_handler.cc:
  Use my_global.h first
sql/rpl_injector.cc:
  Use my_global.h first
sql/rpl_record.cc:
  Use my_global.h first
sql/rpl_record_old.cc:
  Use my_global.h first
sql/rpl_reporting.cc:
  Use my_global.h first
sql/rpl_rli.cc:
  Use my_global.h first
sql/rpl_tblmap.cc:
  Use my_global.h first
sql/rpl_utility.cc:
  Use my_global.h first
sql/set_var.cc:
  Added comment
sql/slave.cc:
  Use my_global.h first
sql/sp.cc:
  Use my_global.h first
sql/sp_cache.cc:
  Use my_global.h first
sql/sp_head.cc:
  Use my_global.h first
sql/sp_pcontext.cc:
  Use my_global.h first
sql/sp_rcontext.cc:
  Use my_global.h first
sql/spatial.cc:
  Use my_global.h first
sql/sql_acl.cc:
  Use my_global.h first
sql/sql_admin.cc:
  Use my_global.h first
sql/sql_analyse.cc:
  Use my_global.h first
sql/sql_audit.cc:
  Use my_global.h first
sql/sql_base.cc:
  Use my_global.h first
sql/sql_binlog.cc:
  Use my_global.h first
sql/sql_bootstrap.cc:
  Use my_global.h first
  Use my_global.h first
sql/sql_cache.cc:
  Use my_global.h first
sql/sql_class.cc:
  Use my_global.h first
sql/sql_client.cc:
  Use my_global.h first
sql/sql_connect.cc:
  Use my_global.h first
sql/sql_crypt.cc:
  Use my_global.h first
sql/sql_cursor.cc:
  Use my_global.h first
sql/sql_db.cc:
  Use my_global.h first
sql/sql_delete.cc:
  Use my_global.h first
sql/sql_derived.cc:
  Use my_global.h first
sql/sql_do.cc:
  Use my_global.h first
sql/sql_error.cc:
  Use my_global.h first
sql/sql_explain.cc:
  Use my_global.h first
sql/sql_expression_cache.cc:
  Use my_global.h first
sql/sql_handler.cc:
  Use my_global.h first
sql/sql_help.cc:
  Use my_global.h first
sql/sql_insert.cc:
  Use my_global.h first
sql/sql_lex.cc:
  Use my_global.h first
sql/sql_load.cc:
  Use my_global.h first
sql/sql_locale.cc:
  Use my_global.h first
sql/sql_manager.cc:
  Use my_global.h first
sql/sql_parse.cc:
  Use my_global.h first
sql/sql_partition.cc:
  Use my_global.h first
sql/sql_plugin.cc:
  Added comment
sql/sql_prepare.cc:
  Use my_global.h first
sql/sql_priv.h:
  Added error if we use this before including my_global.h
  This check is here becasue so many files includes sql_priv.h first.
sql/sql_profile.cc:
  Use my_global.h first
sql/sql_reload.cc:
  Use my_global.h first
sql/sql_rename.cc:
  Use my_global.h first
sql/sql_repl.cc:
  Use my_global.h first
sql/sql_select.cc:
  Use my_global.h first
sql/sql_servers.cc:
  Use my_global.h first
sql/sql_show.cc:
  Added comment
sql/sql_signal.cc:
  Use my_global.h first
sql/sql_statistics.cc:
  Use my_global.h first
sql/sql_table.cc:
  Use my_global.h first
sql/sql_tablespace.cc:
  Use my_global.h first
sql/sql_test.cc:
  Use my_global.h first
sql/sql_time.cc:
  Use my_global.h first
sql/sql_trigger.cc:
  Use my_global.h first
sql/sql_udf.cc:
  Use my_global.h first
sql/sql_union.cc:
  Use my_global.h first
sql/sql_update.cc:
  Use my_global.h first
sql/sql_view.cc:
  Use my_global.h first
sql/sys_vars.cc:
  Added comment
sql/table.cc:
  Use my_global.h first
sql/thr_malloc.cc:
  Use my_global.h first
sql/transaction.cc:
  Use my_global.h first
sql/uniques.cc:
  Use my_global.h first
sql/unireg.cc:
  Use my_global.h first
sql/unireg.h:
  Removed inclusion of my_global.h
storage/archive/ha_archive.cc:
  Added comment
storage/blackhole/ha_blackhole.cc:
  Use my_global.h first
storage/csv/ha_tina.cc:
  Use my_global.h first
storage/csv/transparent_file.cc:
  Use my_global.h first
storage/federated/ha_federated.cc:
  Use my_global.h first
storage/federatedx/federatedx_io.cc:
  Use my_global.h first
storage/federatedx/federatedx_io_mysql.cc:
  Use my_global.h first
storage/federatedx/federatedx_io_null.cc:
  Use my_global.h first
storage/federatedx/federatedx_txn.cc:
  Use my_global.h first
storage/heap/ha_heap.cc:
  Use my_global.h first
storage/innobase/handler/handler0alter.cc:
  Use my_global.h first
storage/maria/ha_maria.cc:
  Use my_global.h first
storage/maria/unittest/ma_maria_log_cleanup.c:
  Remove duplicated include files
storage/maria/unittest/test_file.c:
  Added comment
storage/myisam/ha_myisam.cc:
  Move sql_plugin.h first as this includes my_global.h
storage/myisammrg/ha_myisammrg.cc:
  Use my_global.h first
storage/oqgraph/oqgraph_thunk.cc:
  Use my_config.h and my_global.h first
  One could not include my_global.h before oqgraph_thunk.h (don't know why)
storage/spider/ha_spider.cc:
  Use my_global.h first
storage/spider/hs_client/config.cpp:
  Use my_global.h first
storage/spider/hs_client/escape.cpp:
  Use my_global.h first
storage/spider/hs_client/fatal.cpp:
  Use my_global.h first
storage/spider/hs_client/hstcpcli.cpp:
  Use my_global.h first
storage/spider/hs_client/socket.cpp:
  Use my_global.h first
storage/spider/hs_client/string_util.cpp:
  Use my_global.h first
storage/spider/spd_conn.cc:
  Use my_global.h first
storage/spider/spd_copy_tables.cc:
  Use my_global.h first
storage/spider/spd_db_conn.cc:
  Use my_global.h first
storage/spider/spd_db_handlersocket.cc:
  Use my_global.h first
storage/spider/spd_db_mysql.cc:
  Use my_global.h first
storage/spider/spd_db_oracle.cc:
  Use my_global.h first
storage/spider/spd_direct_sql.cc:
  Use my_global.h first
storage/spider/spd_i_s.cc:
  Use my_global.h first
storage/spider/spd_malloc.cc:
  Use my_global.h first
storage/spider/spd_param.cc:
  Use my_global.h first
storage/spider/spd_ping_table.cc:
  Use my_global.h first
storage/spider/spd_sys_table.cc:
  Use my_global.h first
storage/spider/spd_table.cc:
  Use my_global.h first
storage/spider/spd_trx.cc:
  Use my_global.h first
storage/xtradb/handler/handler0alter.cc:
  Use my_global.h first
storage/xtradb/handler/i_s.cc:
  Use my_global.h first
2014-09-30 20:31:14 +03:00
Nirbhay Choubey
c916085e27 bzr merge -rtag:mariadb-10.0.14 maria/10.0/ 2014-09-28 20:43:56 -04:00
Michael Widenius
2362d98470 MDEV-6698: safe_mutex: Found wrong usage of mutex 'log_space_lock' and 'LOCK_log'
Moved freeing of mutex earlier, as we don't need to have log_space_cond locked for doing rotate_relay_log()

sql/slave.cc:
  Moved freeing of mutex earlier, as we don't need to have log_space_cond locked for doing rotate_relay_log()
2014-09-09 17:37:08 +03:00
Sergei Golubchik
3da761912a MDEV-6616 Server crashes in my_hash_first if shutdown is performed when FLUSH LOGS is running
master_info_index becomes zero during shutdown.
check that it's valid (under a mutex) before dereferencing.
2014-09-06 08:33:56 +02:00
Kristian Nielsen
36f50be970 MDEV-6462: Slave replicating using GTID doesn't recover correctly when master crashes in the middle of transaction
If the slave gets a reconnect in the middle of a GTID event group, normally
it will re-fetch that event group, skipping the first part that was already
queued for the SQL thread.

However, if the master crashed while writing the event group, the group is
incomplete. This patch detects this case and makes sure that the
transaction is rolled back and nothing is skipped from any following
event groups.

Similarly, a network proxy might cause the reconnect to end up on a
different master server. Detect this by noticing a different server_id,
and similarly in this case roll back the partially received group.
2014-09-02 14:07:01 +02:00
Jan Lindström
ab150128ce MDEV-6247: Merge 10.0-galera to 10.1.
Merged lp:maria/maria-10.0-galera up to revision 3880.

    Added a new functions to handler API to forcefully abort_transaction,
    producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These
    were added for future possiblity to add more storage engines that
    could use galera replication.
2014-08-27 13:15:37 +03:00
Jan Lindström
df4dd593f2 MDEV-6247: Merge 10.0-galera to 10.1.
Merged lp:maria/maria-10.0-galera up to revision 3879.

Added a new functions to handler API to forcefully abort_transaction,
producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These
were added for future possiblity to add more storage engines that
could use galera replication.
2014-08-26 15:43:46 +03:00
Nirbhay Choubey
8358dd53b7 bzr merge -r4346 maria/10.0 (maria-10.0.13) 2014-08-11 23:55:41 -04:00
Monty
e2b2bde358 Made sql_log_slow a session variable
mysqldump:
- Added --log-queries to allow one to disable logging for the dump

sql/log_event.cc:
- Removed setting of enable_slow_log as it's not required anymore.

sql/sql_parse.cc:
- Set enable_slow_log to value of thd->variables.sql_log_slow as this will speed up tests if slow log is disabled.
- opt_log_slow_admin_statements can now only disable slow log, not enable it.

sql/sql_explain.cc:
- Minor cleanup

Other things:
- Added sql_log_slow to system variables.
- Changed opt_slow_log to global_system_variables.sql_log_slow in all files
- Updated tests to reflect changes
2014-08-09 13:22:01 +03:00
Sergei Golubchik
9fffe990c4 buildbot found failures
config.h.cmake:
  define NOMINMAX, otherwise Windows system headers define min() and max() macros
sql/slave.cc:
  mi->report() has one more argument in MariaDB
storage/xtradb/buf/buf0flu.cc:
  xtradb fixes for windows, again
2014-08-08 13:53:43 +02:00
Sergei Golubchik
6fb17a0601 5.5.39 merge 2014-08-07 18:06:56 +02:00
Nirbhay Choubey
ec91eea8db Local merge of mariadb-5.5.39
bzr merge -r4264 maria/5.5

Text conflict in sql/mysqld.cc
Text conflict in storage/xtradb/btr/btr0cur.c
Text conflict in storage/xtradb/buf/buf0buf.c
Text conflict in storage/xtradb/buf/buf0lru.c
Text conflict in storage/xtradb/handler/ha_innodb.cc
5 conflicts encountered.
2014-08-06 14:06:11 -04:00
Sergei Golubchik
50e192a04f Bug#17638477 UNINSTALL AND INSTALL SEMI-SYNC PLUGIN CAUSES SLAVES TO BREAK
Fix the bug properly (plugin cannot be unloaded as long as it's locked).
Enable and fix the test case.
Significantly reduce number of LOCK_plugin locks for semisync
(practically all locks were removed)
2014-08-03 12:45:14 +02:00
Kristian Nielsen
501c56ef1e MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
Merge the patches into MariaDB 10.0 main.

With this patch, parallel replication will now automatically retry a
transaction that fails due to deadlock or other temporary error, same as
single-threaded replication.

We catch deadlocks with InnoDB transactions due to enforced commit order. If
T1 must commit before T2 in parallel replication and T1 ends up waiting for T2
inside InnoDB, we kill T2 and retry it later to resolve the deadlock
automatically.
2014-07-11 12:06:47 +02:00
Kristian Nielsen
e81ecc9c72 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
Fix a bug discovered in Buildbot valgrind. The logic in checking for slave
init thread completion was reversed, so depending on thread scheduling server
startup could hang.

Also add another variant of SSL valgrind suppression, needed for different
library version.
2014-07-11 10:54:43 +02:00
Kristian Nielsen
98fc5b3af8 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes.

For this patch in 10.0, we do not introduce a new public storage engine API,
we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
API that can be used for all storage engines (MDEV-6429).

Eliminate the background thread that did deadlock kills asynchroneously.
Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
inside the deadlock detection code (when thd_report_wait_for() needs to kill a
later thread to resolve a deadlock).

(We preserve the part of the original patch that introduces dedicated mutex
and condition for the slave init thread, to remove the abuse of
LOCK_thread_count for start/stop synchronisation of the slave init thread).
2014-07-08 12:54:47 +02:00
Kristian Nielsen
9150a0c7cb MDEV-4937: sql_slave_skip_counter does not work with GTID
The sql_slave_skip_counter is important to be able to recover replication from
certain errors. Often, an appropriate solution is to set
sql_slave_skip_counter to skip over a problem event. But setting
sql_slave_skip_counter produced an error in GTID mode, with a suggestion to
instead set @@gtid_slave_pos to point past the problem event. This however is
not always possible; for example, in case of an INCIDENT event, that event
does not have any GTID to assign to @@gtid_slave_pos.

With this patch, sql_slave_skip_counter now works in GTID mode the same was as
in non-GTID mode. When set, that many initial events are skipped when the SQL
thread starts, plus as many extra events are needed to completely skip any
partially skipped event group. The GTID position is updated to point past the
skipped event(s).
2014-06-25 15:24:11 +02:00
Kristian Nielsen
86362129a2 MDEV-6120: When slave stops with error, error message should indicate the failing GTID
If replication breaks in GTID mode, it is not trivial to determine the GTID of
the failing event group. This is a problem, as such GTID is needed eg. to
explicitly set @@gtid_slave_pos to skip to after that event group, or to
compare errors on different servers, etc.

Fix by ensuring that relevant slave errors logged to the error log include the
GTID of the event group containing the problem event.
2014-06-25 15:17:03 +02:00
Nirbhay Choubey
a76a6601ec bzr merge -rtag:mariadb-10.0.12 maria/10.0 2014-06-19 13:12:38 -04:00
Sergey Vojtovich
b6c175aad4 MDEV-6039 - WebScaleSQL patches
Preserve CLIENT_REMEMBER_OPTIONS flag for compressed connections

Code cleanup: removed reference to CLIENT_REMEMBER_OPTIONS from server
code. This flag is ignored in MariaDB.
2014-06-18 12:12:43 +04:00
unknown
081926f3d8 MDEV-6188: master_retry_count (ignored if disconnect happens on SET master_heartbeat_period)
That particular part of slave connect to master was missing code to handle
retry in case of network errors. The same problem is present in MySQL 5.5, but
fixed in MySQL 5.6.

Fixed with this patch, by adding the code (mostly identical to MySQL 5.6), and
also adding a test case.

I checked other queries done towards master during slave connect, and they now
all seem to handle reconnect in case of network failures.
2014-06-17 14:10:13 +02:00
unknown
bd4153a8c2 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.

The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.

With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.

Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.

Also some misc. fixes found during development and testing:

 - Remove thd_rpl_is_parallel(), it is not used or needed.

 - Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
   worker thread is killed to resolve a deadlock with fixed commit
   ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
   can be ignored if the query otherwise completed successfully, and this
   could cause the deadlock kill to be lost, so that the deadlock was not
   correctly resolved.

 - Fix random test failure due to missing wait_for_binlog_checkpoint.inc.

 - Make sure that deadlock or other temporary errors during parallel
   replication are not printed to the the error log; there were some places
   around the replication code with extra error logging. These conditions can
   occur occasionally and are handled automatically without breaking
   replication, so they should not pollute the error log.

 - Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
   the end of a transaction, to be able to detect and resolve deadlocks due to
   commit ordering. But this value was also used as a flag to mark whether
   record_gtid() had been called, by being set to zero, losing the value. Now,
   introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
   valid for the entire duration of the transaction.

 - Fix one place where the code to handle ignored errors called reset_killed()
   unconditionally, even if no error was caught that should be ignored. This
   could cause loss of a deadlock kill signal, breaking deadlock detection and
   resolution.

 - Fix a couple of missing mysql_reset_thd_for_next_command(). This could
   cause a prior error condition to remain for the next event executed,
   causing assertions about errors already being set and possibly giving
   incorrect error handling for following event executions.

 - Fix code that cleared thd->rgi_slave in the parallel replication worker
   threads after each event execution; this caused the deadlock detection and
   handling code to not be able to correctly process the associated
   transactions as belonging to replication worker threads.

 - Remove useless error code in slave_background_kill_request().

 - Fix bug where wfc->wakeup_error was not cleared at
   wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
   error condition to wrongly propagate to a later wait_for_prior_commit(),
   causing spurious ER_PRIOR_COMMIT_FAILED errors.

 - Do not put the binlog background thread into the processlist. It causes
   too many result differences in mtr, but also it probably is not useful
   for users to pollute the process list with a system thread that does not
   really perform any user-visible tasks...
2014-06-10 10:13:15 +02:00
unknown
629b822913 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

In parallel replication, we run transactions from the master in parallel, but
force them to commit in the same order they did on the master. If we force T1
to commit before T2, but T2 holds eg. a row lock that is needed by T1, we get
a deadlock when T2 waits until T1 has committed.

Usually, we do not run T1 and T2 in parallel if there is a chance that they
can have conflicting locks like this, but there are certain edge cases where
it can occasionally happen (eg. MDEV-5914, MDEV-5941, MDEV-6020). The bug was
that this would cause replication to hang, eventually getting a lock timeout
and causing the slave to stop with error.

With this patch, InnoDB will report back to the upper layer whenever a
transactions T1 is about to do a lock wait on T2. If T1 and T2 are parallel
replication transactions, and T2 needs to commit later than T1, we can thus
detect the deadlock; we then kill T2, setting a flag that causes it to catch
the kill and convert it to a deadlock error; this error will then cause T2 to
roll back and release its locks (so that T1 can commit), and later T2 will be
re-tried and eventually also committed.

The kill happens asynchroneously in a slave background thread; this is
necessary, as the reporting from InnoDB about lock waits happen deep inside
the locking code, at a point where it is not possible to directly call
THD::awake() due to mutexes held.

Deadlock is assumed to be (very) rarely occuring, so this patch tries to
minimise the performance impact on the normal case where no deadlocks occur,
rather than optimise the handling of the occasional deadlock.

Also fix transaction retry due to deadlock when it happens after a transaction
already signalled to later transactions that it started to commit. In this
case we need to undo this signalling (and later redo it when we commit again
during retry), so following transactions will not start too early.

Also add a missing thd->send_kill_message() that got triggered during testing
(this corrects an incorrect fix for MySQL Bug#58933).
2014-06-03 10:31:11 +02:00
Nirbhay Choubey
645d402544 Fix for a build failure. 2014-05-21 15:16:15 -04:00
Nirbhay Choubey
99df0fbad5 bzr merge -r3968..3984 codership/5.5 (non-Innodb changes only). 2014-05-21 14:32:57 -04:00
Nirbhay Choubey
086af8367e bzr merge -r4209 maria/10.0. 2014-05-21 11:09:55 -04:00
unknown
787c470cef MDEV-5262: Missing retry after temp error in parallel replication
Handle retry of event groups that span multiple relay log files.

 - If retry reaches the end of one relay log file, move on to the next.

 - Handle refcounting of relay log files, and avoid purging relay log
   files until all event groups have completed that might have needed
   them for transaction retry.
2014-05-15 15:52:08 +02:00
Sergei Golubchik
edf1fbd25b MDEV-6153 Trivial Lintian errors in MariaDB sources: spelling errors and wrong executable bits 2014-05-13 11:53:30 +02:00
Venkatesh Duggirala
33f15dc7ac Bug#17283409 4-WAY DEADLOCK: ZOMBIES, PURGING BINLOGS,
SHOW PROCESSLIST, SHOW BINLOGS

Problem:  A deadlock was occurring when 4 threads were
involved in acquiring locks in the following way
Thread 1: Dump thread ( Slave is reconnecting, so on
              Master, a new dump thread is trying kill
              zombie dump threads. It acquired thread's
              LOCK_thd_data and it is about to acquire
              mysys_var->current_mutex ( which LOCK_log)
Thread 2: Application thread is executing show binlogs and
               acquired LOCK_log and it is about to acquire
               LOCK_index.
Thread 3: Application thread is executing Purge binary logs
               and acquired LOCK_index and it is about to
               acquire LOCK_thread_count.
Thread 4: Application thread is executing show processlist
               and acquired LOCK_thread_count and it is
               about to acquire zombie dump thread's
               LOCK_thd_data.
Deadlock Cycle:
     Thread 1 -> Thread 2 -> Thread 3-> Thread 4 ->Thread 1

The same above deadlock was observed even when thread 4 is
executing 'SELECT * FROM information_schema.processlist' command and
acquired LOCK_thread_count and it is about to acquire zombie
dump thread's LOCK_thd_data.

Analysis:
There are four locks involved in the deadlock.  LOCK_log,
LOCK_thread_count, LOCK_index and LOCK_thd_data.
LOCK_log, LOCK_thread_count, LOCK_index are global mutexes
where as LOCK_thd_data is local to a thread.
We can divide these four locks in two groups.
Group 1 consists of LOCK_log and LOCK_index and the order
should be LOCK_log followed by LOCK_index.
Group 2 consists of other two mutexes
LOCK_thread_count, LOCK_thd_data and the order should
be LOCK_thread_count followed by LOCK_thd_data.
Unfortunately, there is no specific predefined lock order defined
to follow in the MySQL system when it comes to locks across these
two groups. In the above problematic example,
there is no problem in the way we are acquiring the locks
if you see each thread individually.
But If you combine all 4 threads, they end up in a deadlock.

Fix: 
Since everything seems to be fine in the way threads are taking locks,
In this patch We are changing the duration of the locks in Thread 4
to break the deadlock. i.e., before the patch, Thread 4
('show processlist' command) mysqld_list_processes()
function acquires LOCK_thread_count for the complete duration
of the function and it also acquires/releases
each thread's LOCK_thd_data.

LOCK_thread_count is used to protect addition and
deletion of threads in global threads list. While show
process list is looping through all the existing threads,
it will be a problem if a thread is exited but there is no problem
if a new thread is added to the system. Hence a new mutex is
introduced "LOCK_thd_remove" which will protect deletion
of a thread from global threads list. All threads which are
getting exited should acquire LOCK_thd_remove
followed by LOCK_thread_count. (It should take LOCK_thread_count
also because other places of the code still thinks that exit thread
is protected with LOCK_thread_count. In this fix, we are changing
only 'show process list' query logic )
(Eg: unlink_thd logic will be protected with
LOCK_thd_remove).

Logic of mysqld_list_processes(or file_schema_processlist)
will now be protected with 'LOCK_thd_remove' instead of
'LOCK_thread_count'.

Now the new locking order after this patch is:
LOCK_thd_remove -> LOCK_thd_data -> LOCK_log ->
LOCK_index -> LOCK_thread_count
2014-05-08 18:13:01 +05:30
unknown
b0b60f2498 MDEV-5262: Missing retry after temp error in parallel replication
Start implementing that an event group can be re-tried in parallel replication
if it fails with a temporary error (like deadlock).

Patch is very incomplete, just some very basic retry works.

Stuff still missing (not complete list):

 - Handle moving to the next relay log file, if event group to be retried
   spans multiple relay log files.

 - Handle refcounting of relay log files, to ensure that we do not purge a
   relay log file and then later attempt to re-execute events out of it.

 - Handle description_event_for_exec - we need to save this somehow for the
   possible retry - and use the correct one in case it differs between relay
   logs.

 - Do another retry attempt in case the first retry also fails.

 - Limit the max number of retries.

 - Lots of testing will be needed for the various edge cases.
2014-05-08 14:20:18 +02:00
unknown
64923bb653 MDEV-6156: Parallel replication incorrectly caches charset between worker threads
The previous patch for this bug was unfortunately completely wrong.

The purpose of cached_charset is to remember which character set we
have installed currently in the THD, so that in the common case where
charset does not change between queries, we do not need to update it
in the THD. Thus, it is important that the cached_charset field is
tightly coupled to the THD for which it handles caching.

Thus the right place to put cached_charset seems to be in the THD.
This patch introduces a field THD:system_thread_info where such info
in general can be placed without further inflating the THD with unused
data for other threads (THD is already far too big as it is). It then
moves the cached_charset into this slot for the SQL driver thread and
for the parallel replication worker threads.

The THD::rpl_filter field is also moved inside system_thread_info, to
keep the size of THD unchanged. Moving further fields in to reduce the
size of THD is a separate task, filed as MDEV-6164.
2014-04-25 12:58:31 +02:00
unknown
010971a761 MDEV-6156: Parallel replication incorrectly caches charset between worker threads
Replication caches the character sets used in a query, to be able to quickly
reuse them for the next query in the common case of them not having changed.

In parallel replication, this caching needs to be per-worker-thread. The
code was not modified to handle this correctly, so the caching in one worker
could cause another worker to run a query using the wrong character set,
causing replication corruption.
2014-04-23 16:06:06 +02:00
Jan Lindström
fa18dc3944 Merge lp:codership-mysql/5.5 -r3961..3980. 2014-04-16 13:08:29 +03:00
Jan Lindström
150e88e8c9 Merge from lp:maria/5.5 to maria-5.5.37 release revision 4154. 2014-04-16 12:13:43 +03:00
Nirbhay Choubey
4213263101 Merging mariadb-10.0.10.
* bzr merge -rtag:mariadb-10.0.10 maria/10.0.
2014-04-08 10:36:34 -04:00
unknown
584c2d0ae9 MDEV-6040: MariaDB hangs if terminated quickly after start
We need to use mysql_cond_broadcast() rather than _signal for
COND_thread_count, as there can be multiple waiters.

Thanks to Pavel Ivanov for reporting both the problem and the
solution.
2014-04-10 09:38:57 +02:00
Nirbhay Choubey
02ba2bfdb4 Merging revision from codership-mysql/5.5 (r3928..3968) and
codership-mysql/5.6 (r4021..4065).
- Also contains fixes for some build failures.
2014-03-27 16:26:00 -04:00
Nirbhay Choubey
90e4f7f9d3 * bzr merge -rtag:mariadb-10.0.9 maria/10.0
* Fix for post-merge build failures.
2014-03-26 14:27:24 -04:00
Nirbhay Choubey
c5f7486654 bzr merge -r4062..4065 codership/5.6 2014-03-26 14:13:12 -04:00
Nirbhay Choubey
899f9801d4 bzr merge -r3946..3968 codership/5.5 2014-03-25 17:01:05 -04:00
Sergei Golubchik
10740939eb 5.5 merge 2014-03-26 22:25:38 +01:00
Michael Widenius
dd13db6f4a MDEV-5829: STOP SLAVE resets global status variables
Reason for the bug was an optimization for higher connect speed where we moved when global status was updated,
but forgot to update states when slave thread dies.
Fixed by adding thd->add_status_to_global() before deleting slave thread's thd.


mysys/my_delete.c:
  Added missing newline
sql/mysqld.cc:
  Use add_status_to_global()
sql/slave.cc:
  Added missing add_status_to_global()
sql/sql_class.cc:
  Use add_status_to_global()
sql/sql_class.h:
  Simplify adding local status to global by adding add_status_to_global()
2014-03-14 16:29:23 +02:00
unknown
8b9b7ec395 MDEV-5804: If same GTID is received on multiple master connections in multi-source replication, the event is double-executed causing corruption or replication failure
Some fixes, mainly to make it work in non-parallel replication mode also
(--slave-parallel-threads=0).

Patch should be fairly complete now.
2014-03-12 00:14:49 +01:00
unknown
2c2478b822 MDEV-5804: If same GTID is received on multiple master connections in multi-source replication, the event is double-executed causing corruption or replication failure
Before, the arrival of same GTID twice in multi-source replication
would cause double-apply or in gtid strict mode an error.

Keep the behaviour, but add an option --gtid-ignore-duplicates which
allows to correctly handle duplicates, ignoring all but the first.
This relies on the user ensuring correct configuration so that
sequence numbers are strictly increasing within each replication
domain; then duplicates can be detected simply by comparing the
sequence numbers against what is already applied.

Only one master connection (but possibly multiple parallel worker
threads within that connection) is allowed to apply events within
one replication domain at a time; any other connection that
receives a GTID in the same domain either discards it (if it is
already applied) or waits for the other connection to not have
any events to apply.

Intermediate patch, as proof-of-concept for testing. The main limitation
is that currently it is only implemented for parallel replication,
@@slave_parallel_threads > 0.
2014-03-09 10:27:38 +01:00
Sergei Golubchik
4c788b06d4 10.0-base merge 2014-03-05 23:20:10 +01:00
unknown
cfc2eb9dd2 MDEV-5703: [PATCH] Slave disconnects and fails to reconnect on Error_code: 1159
Patch from Tomas Matejicek

Add missing error code to is_networ_error(), to allow slave to
automatically attempt reconnection also in this case.
2014-03-04 14:37:09 +01:00
unknown
5ec49e6452 Merge MDEV-5754, MDEV-5769, and MDEV-5764 into 10.0 2014-03-04 14:32:42 +01:00
unknown
bd2a0a2389 Merge MDEV-5754, MDEV-5769, and MDEV-5764 into 10.0-base 2014-03-04 14:21:00 +01:00
unknown
ec374f1e53 MDEV-5769: Slave crashes on attempt to do parallel replication from an older master
Older master has no GTID events, so such events are not available for
deciding on scheduling of event groups and so on.

With this patch, we run such events from old masters single-threaded, in the
sql driver thread.

This seems better than trying to make the parallel code handle the data from
older masters; while possible, this would require a lot of testing (as well as
possibly some extra overhead in the scheduling of events), which hardly seems
worthwhile.
2014-03-04 08:48:32 +01:00
unknown
641feed481 MDEV-5764: START SLAVE UNTIL does not work with parallel replication
With parallel replication, there can be any number of events queued on
in-memory lists in the worker threads.

For normal STOP SLAVE, we want to skip executing any remaining events on those
lists and stop as quickly as possible.

However, for START SLAVE UNTIL, when the UNTIL position is reached in the SQL
driver thread, we must _not_ stop until all already queued events for the
workers have been executed - otherwise we would stop too early, before the
actual UNTIL position had been completely reached.

The code did not handle UNTIL correctly, stopping too early due to not
executing the queued events to completion. Fix this, and also implement that
an explicit STOP SLAVE in the middle (when the SQL driver thread has reached
the UNTIL position but the workers have not) _will_ cause an immediate stop.
2014-03-03 12:13:55 +01:00
Sergei Golubchik
41c760b121 merge 2014-02-28 10:00:31 +01:00
Sergei Golubchik
05aba79e98 MDEV-4447 MariaDB sources should have unix-style line endings everywhere 2014-02-27 12:00:16 +01:00
unknown
6d3150c48d Merge MDEV-5657 (parallel replication) to 10.0-base 2014-02-26 17:03:32 +01:00
unknown
20959fa09c Merge MDEV-5657 (parallel replication) to 10.0 2014-02-26 16:38:42 +01:00
Sergei Golubchik
0dc23679c8 10.0-base merge 2014-02-26 15:28:07 +01:00
unknown
e90f68c0ba MDEV-5657: Parallel replication.
Clean up and improve the parallel implementation code, mainly related to
scheduling of work to threads and handling of stop and errors.

Fix a lot of bugs in various corner cases that could lead to crashes or
corruption.

Fix that a single replication domain could easily grab all worker threads and
stall all other domains; now a configuration variable
--slave-domain-parallel-threads allows to limit the number of
workers.

Allow next event group to start as soon as previous group begins the commit
phase (as opposed to when it ends it); this allows multiple event groups on
the slave to participate in group commit, even when no other opportunities for
parallelism are available.

Various fixes:

 - Fix some races in the rpl.rpl_parallel test case.

 - Fix an old incorrect assertion in Log_event iocache read.

 - Fix repeated malloc/free of wait_for_commit and rpl_group_info objects.

 - Simplify wait_for_commit wakeup logic.

 - Fix one case in queue_for_group_commit() where killing one thread would
   fail to correctly signal the error to the next, causing loss of the
   transaction after slave restart.

 - Fix leaking of pthreads (and their allocated stack) due to missing
   PTHREAD_CREATE_DETACHED attribute.

 - Fix how one batch of group-committed transactions wait for the previous
   batch before starting to execute themselves. The old code had a very
   complex scheduling where the first transaction was handled differently,
   with subtle bugs in corner cases. Now each event group is always scheduled
   for a new worker (in a round-robin fashion amongst available workers).
   Keep a count of how many transactions have started to commit, and wait for
   that counter to reach the appropriate value.

 - Fix slave stop to wait for all workers to actually complete processing;
   before, the wait was for update of last_committed_sub_id, which happens a
   bit earlier, and could leave worker threads potentially accessing bits of
   the replication state that is no longer valid after slave stop.

 - Fix a couple of places where the test suite would kill a thread waiting
   inside enter_cond() in connection with debug_sync; debug_sync + kill can
   crash in rare cases due to a race with mysys_var_current_mutex in this
   case.

 - Fix some corner cases where we had enter_cond() but no exit_cond().

 - Fix that we could get failure in wait_for_prior_commit() but forget to flag
   the error with my_error().

 - Fix slave stop (both for normal stop and stop due to error). Now, at stop
   we pick a specific safe point (in terms of event groups executed) and make
   sure that all event groups before that point are executed to completion,
   and that no event group after start executing; this ensures a safe place to
   restart replication, even for non-transactional stuff/DDL. In error stop,
   make sure that all prior event groups are allowed to execute to completion,
   and that any later event groups that have started are rolled back, if
   possible. The old code could leave eg. T1 and T3 committed but T2 not, or
   it could even leave half a transaction not rolled back in some random
   worker, which would cause big problems when that worker was later reused
   after slave restart.

 - Fix the accounting of amount of events queued for one worker. Before, the
   amount was reduced immediately as soon as the events were dequeued (which
   happens all at once); this allowed twice the amount of events to be queued
   in memory for each single worker, which is not what users would expect.

 - Fix that an error set during execution of one event was sometimes not
   cleared before executing the next, causing problems with the error
   reporting.

 - Fix incorrect handling of thd->killed in worker threads.
2014-02-26 15:02:09 +01:00
Nirbhay Choubey
ae6e1548cb Merge from maria/5.5 (-rtag:mariadb-5.5.36). 2014-02-25 17:49:41 -05:00
Sergei Golubchik
0b9a0a3517 5.5 merge 2014-02-25 16:04:35 +01:00
Sergey Vojtovich
d12c7adf71 MDEV-5314 - Compiling fails on OSX using clang
This is port of fix for MySQL BUG#17647863.

revno: 5572
revision-id: jon.hauglid@oracle.com-20131030232243-b0pw98oy72uka2sj
committer: Jon Olav Hauglid <jon.hauglid@oracle.com>
timestamp: Thu 2013-10-31 00:22:43 +0100
message:
  Bug#17647863: MYSQL DOES NOT COMPILE ON OSX 10.9 GM

  Rename test() macro to MY_TEST() to avoid conflict with libc++.
2014-02-19 14:05:15 +04:00
Sergei Golubchik
84651126c0 MySQL-5.5.36 merge
(without few incorrect bugfixes and with 1250 files where only a copyright year was changed)
2014-02-17 11:00:51 +01:00
unknown
33a8afd8ef MDEV-5703: [PATCH] Slave disconnects and fails to reconnect on Error_code: 1159
Patch from Tomas Matejicek

Add missing error code to is_networ_error(), to allow slave to
automatically attempt reconnection also in this case.
2014-03-04 20:50:19 +01:00
unknown
dd93ec5633 Merge MariaDB 10.0-base to 10.0. 2014-02-10 15:12:17 +01:00
Michael Widenius
10001c8e4f Automatic merge 2014-02-05 19:23:11 +02:00
Michael Widenius
5426facdcb Replication changes for CREATE OR REPLACE TABLE
- CREATE TABLE is by default executed on the slave as CREATE OR REPLACE
- DROP TABLE is by default executed on the slave as DROP TABLE IF NOT EXISTS

This means that a slave will by default continue even if we try to create
a table that existed on the slave (the table will be deleted and re-created) or
if we try to drop a table that didn't exist on the slave.
This should be safe as instead of having the slave stop because of an inconsistency between
master and slave, it will fix the inconsistency.
Those that would prefer to get a stopped slave instead for the above cases can set slave_ddl_exec_mode to STRICT. 

- Ensure that a CREATE OR REPLACE TABLE which dropped a table is replicated
- DROP TABLE that generated an error on master is handled as an identical DROP TABLE on the slave (IF NOT EXISTS is not added in this case)
- Added slave_ddl_exec_mode variable to decide how DDL's are replicated

New logic for handling BEGIN GTID ... COMMIT from the binary log:

- When we find a BEGIN GTID, we start a transaction and set OPTION_GTID_BEGIN
- When we find COMMIT, we reset OPTION_GTID_BEGIN and execute the normal COMMIT code.
- While OPTION_GTID_BEGIN is set:
  - We don't generate implict commits before or after statements
  - All tables are regarded as transactional tables in the binary log (to ensure things are executed exactly as on the master)
- We reset OPTION_GTID_BEGIN also on rollback

This will help ensuring that we don't get any sporadic commits (and thus new GTID's) on the slave and will help keep the GTID's between master and slave in sync.


mysql-test/extra/rpl_tests/rpl_log.test:
  Added testing of mode slave_ddl_exec_mode=STRICT
mysql-test/r/mysqld--help.result:
  New help messages
mysql-test/suite/rpl/r/create_or_replace_mix.result:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_row.result:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_statement.result:
  Testing replication of create or replace
mysql-test/suite/rpl/r/rpl_gtid_startpos.result:
  Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/r/rpl_row_log.result:
  Updated result
mysql-test/suite/rpl/r/rpl_row_log_innodb.result:
  Updated result
mysql-test/suite/rpl/r/rpl_row_show_relaylog_events.result:
  Updated result
mysql-test/suite/rpl/r/rpl_stm_log.result:
  Updated result
mysql-test/suite/rpl/r/rpl_stm_mix_show_relaylog_events.result:
  Updated result
mysql-test/suite/rpl/r/rpl_temp_table_mix_row.result:
  Updated result
mysql-test/suite/rpl/t/create_or_replace.inc:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.cnf:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.test:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.cnf:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.test:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.cnf:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.test:
  Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/rpl_gtid_startpos.test:
  Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/t/rpl_stm_log.test:
  Removed some lines
mysql-test/suite/sys_vars/r/slave_ddl_exec_mode_basic.result:
  Testing of slave_ddl_exec_mode
mysql-test/suite/sys_vars/t/slave_ddl_exec_mode_basic.test:
  Testing of slave_ddl_exec_mode
sql/handler.cc:
  Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
  This is to ensure that statments are not commited too early if non transactional tables are used.
sql/log.cc:
  Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
  Also treat 'direct' log events as transactional (to get them logged as they where on the master)
sql/log_event.cc:
  Ensure that the new error from DROP TABLE when trying to drop a view is treated same as the old one.
  Store error code that slave expects in THD.
  Set OPTION_GTID_BEGIN if we find a BEGIN.
  Reset OPTION_GTID_BEGIN if we find a COMMIT.
sql/mysqld.cc:
  Added slave_ddl_exec_mode_options
sql/mysqld.h:
  Added slave_ddl_exec_mode_options
sql/rpl_gtid.cc:
  Reset OPTION_GTID_BEGIN if we record a gtid (safety)
sql/sql_class.cc:
  Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
sql/sql_class.h:
  Added to THD: log_current_statement and slave_expected_error
sql/sql_insert.cc:
  Ensure that CREATE OR REPLACE is logged if table was deleted.
  Don't do implicit commit for CREATE if we are under OPTION_GTID_BEGIN
sql/sql_parse.cc:
  Change CREATE TABLE -> CREATE OR REPLACE TABLE for slaves
  Change DROP TABLE -> DROP TABLE IF EXISTS for slaves
  CREATE TABLE doesn't force implicit commit in case of OPTION_GTID_BEGIN
  Don't do commits before or after any statement if OPTION_GTID_BEGIN was set.
sql/sql_priv.h:
  Added OPTION_GTID_BEGIN
sql/sql_show.cc:
  Enhanced store_create_info() to also be able to handle CREATE OR REPLACE
sql/sql_show.h:
  Updated prototype
sql/sql_table.cc:
  Ensure that CREATE OR REPLACE is logged if table was deleted.
sql/sys_vars.cc:
  Added slave_ddl_exec_mode
sql/transaction.cc:
  Added warning if we got a GTID under OPTION_GTID_BEGIN
2014-02-05 19:01:59 +02:00
Sergei Golubchik
72c20282db 10.0-base merge 2014-02-03 15:22:39 +01:00
Sergei Golubchik
59d9d08e2b 5.5 merge 2014-02-01 00:54:03 +01:00
Nirbhay Choubey
ecc2c96c9d Merge of maria/5.5 into maria-5.5-galera.
bzr merge -r tag:mariadb-5.5.35 maria/5.5
2014-01-29 19:00:43 -05:00
Michael Widenius
7ffc9da093 Implementation of MDEV-5491: CREATE OR REPLACE TABLE
Using CREATE OR REPLACE TABLE is be identical to

DROP TABLE IF EXISTS table_name;
CREATE TABLE ...;

Except that:

* CREATE OR REPLACE is be atomic (now one can create the same table between drop and create).
* Temporary tables will not shadow the table name for the DROP as the CREATE TABLE tells us already if we are using a temporary table or not.
* If the table was locked with LOCK TABLES, the new table will be locked with the same lock after it's created.

Implementation details:
- We don't anymore open the to-be-created table during CREATE TABLE, which the original code did.
  - There is no need to open a table we are planning to create. It's enough to check if the table exists or not.
- Removed some of duplicated code for CREATE IF NOT EXISTS.
- Give an error when using CREATE OR REPLACE with IF NOT EXISTS (conflicting options).
- As a side effect of the code changes, we don't anymore have to internally re-prepare prepared statements with CREATE TABLE if the table exists.
- Made one code path for all testing if log table are in use.
- Better error message if one tries to create/drop/alter a log table in use
- Added back disabled rpl_row_create_table test as it now seams to work and includes a lot of interesting tests.
- Added HA_LEX_CREATE_REPLACE to mark if we are using CREATE OR REPLACE
- Aligned CREATE OR REPLACE parsing code in sql_yacc.yy for TABLE and VIEW
- Changed interface for drop_temporary_table() to make it more reusable
- Changed Locked_tables_list::init_locked_tables() to work on the table object instead of the table list object. Before this it used a mix of both, which was not good.
- Locked_tables_list::unlock_locked_tables(THD *thd) now requires a valid thd argument. Old usage of calling this with 0 i changed to instead call Locked_tables_list::reset()
- Added functions Locked_tables_list:restore_lock() and Locked_tables_list::add_back_last_deleted_lock() to be able to easily add back a locked table to the lock list.
- Added restart_trans_for_tables() to be able to restart a transaction.
- DROP_ACL is required if one uses CREATE TABLE OR REPLACE.
- Added drop of normal and temporary tables in create_table_imp() if CREATE OR REPLACE was used.
- Added reacquiring of table locks in mysql_create_table() and mysql_create_like_table()




mysql-test/include/commit.inc:
  With new code we get fewer status increments
mysql-test/r/commit_1innodb.result:
  With new code we get fewer status increments
mysql-test/r/create.result:
  Added testing of create or replace with timeout
mysql-test/r/create_or_replace.result:
  Basic testing of CREATE OR REPLACE TABLE
mysql-test/r/partition_exchange.result:
  New error message
mysql-test/r/ps_ddl.result:
  Fewer reprepares with new code
mysql-test/suite/archive/discover.result:
  Don't rediscover archive tables if the .frm file exists
  (Sergei will look at this if there is a better way...)
mysql-test/suite/archive/discover.test:
  Don't rediscover archive tables if the .frm file exists
  (Sergei will look at this if there is a better way...)
mysql-test/suite/funcs_1/r/innodb_views.result:
  New error message
mysql-test/suite/funcs_1/r/memory_views.result:
  New error message
mysql-test/suite/rpl/disabled.def:
  rpl_row_create_table should now be safe to use
mysql-test/suite/rpl/r/rpl_row_create_table.result:
  Updated results after adding back disabled test
mysql-test/suite/rpl/t/rpl_create_if_not_exists.test:
  Added comment
mysql-test/suite/rpl/t/rpl_row_create_table.test:
  Added CREATE OR REPLACE TABLE test
mysql-test/t/create.test:
  Added CREATE OR REPLACE TABLE test
mysql-test/t/create_or_replace-master.opt:
  Create logs
mysql-test/t/create_or_replace.test:
  Basic testing of CREATE OR REPLACE TABLE
mysql-test/t/partition_exchange.test:
  Error number changed as we are now using same code for all log table change issues
mysql-test/t/ps_ddl.test:
  Fewer reprepares with new code
sql/handler.h:
  Moved things around a bit in a structure to get better alignment.
  Added HA_LEX_CREATE_REPLACE to mark if we are using CREATE OR REPLACE
  Added 3 elements to end of HA_CREATE_INFO to be able to store state to add backs locks in case of LOCK TABLES.
sql/log.cc:
  Reimplemented check_if_log_table():
  - Simpler and faster usage
  - Can give error messages
  
  This gives us one code path for allmost all error messages if log tables are in use
sql/log.h:
  New interface for check_if_log_table()
sql/slave.cc:
  More logging
sql/sql_alter.cc:
  New interface for check_if_log_table()
sql/sql_base.cc:
  More documentation
  Changed interface for drop_temporary_table() to make it more reusable
  Changed Locked_tables_list::init_locked_tables() to work on the table object instead of the table list object. Before this it used a mix of both, which was not good.
  Locked_tables_list::unlock_locked_tables(THD *thd) now requires a valid thd argument.  Old usage of calling this with 0 i changed to instead call Locked_tables_list::reset()
  Added functions Locked_tables_list:restore_lock() and Locked_tables_list::add_back_last_deleted_lock() to be able to easily add back a locked table to the lock list.
  Check for command number instead of open_strategy of CREATE TABLE was used.
  Added restart_trans_for_tables() to be able to restart a transaction.  This was needed in "create or replace ... select" between the drop table and the select.
sql/sql_base.h:
  Added and updated function prototypes
sql/sql_class.h:
  Added new prototypes to Locked_tables_list class
  Added extra argument to select_create to avoid double call to eof() or send_error()
  - I needed this in some edge case where the table was not created against expections.
sql/sql_db.cc:
  New interface for check_if_log_table()
sql/sql_insert.cc:
  Remember position to lock information so that we can reaquire table lock for LOCK TABLES + CREATE OR REPLACE TABLE SELECT. Later add back the lock by calling restore_lock().
  Removed one not needed indentation level in create_table_from_items()
  Ensure we don't call send_eof() or abort_result_set() twice.
sql/sql_lex.h:
  Removed variable that I temporarly added in an earlier changeset
sql/sql_parse.cc:
  Removed old test code (marked with QQ)
  Ensure that we have open_strategy set as TABLE_LIST::OPEN_STUB in CREATE TABLE
  Removed some IF NOT EXISTS code as this is now handled in create_table_table_impl().
  Set OPTION_KEEP_LOGS later. This code had to be moved as the test for IF EXISTS has changed place.
  DROP_ACL is required if one uses CREATE TABLE OR REPLACE.
sql/sql_partition_admin.cc:
  New interface for check_if_log_table()
sql/sql_rename.cc:
  New interface for check_if_log_table()
sql/sql_table.cc:
  New interface for check_if_log_table()
  Moved some code in mysql_rm_table() under a common test.
  - Safe as temporary tables doesn't have statistics.
  - !is_temporary_table(table) test was moved out from drop_temporary_table() and merged with upper level code.
  - Added drop of normal and temporary tables in create_table_imp() if CREATE OR REPLACE was used.
  - Added reacquiring of table locks in mysql_create_table() and mysql_create_like_table()
  - In mysql_create_like_table(), restore table->open_strategy() if it was changed.
  - Re-test if table was a view after opening it.
sql/sql_table.h:
  New prototype for mysql_create_table_no_lock()
sql/sql_yacc.yy:
  Added syntax for CREATE OR REPLACE TABLE
  Reuse new code for CREATE OR REPLACE VIEW
sql/table.h:
  Added name for enum type
sql/table_cache.cc:
  More DBUG
2014-01-29 15:37:17 +02:00
Jan Lindström
d43afb8828 Merge MariaDB-10.0.7 revision 3961. 2014-01-25 11:02:49 +02:00
unknown
8cc6e90d74 MDEV-5509: Seconds_behind_master incorrect in parallel replication
The problem was a race between the SQL driver thread and the worker threads.
The SQL driver thread would set rli->last_master_timestamp to zero to
mark that it has caught up with the master, while the worker threads would
set it to the timestamp of the executed event. This can happen out-of-order
in parallel replication, causing the "caught up" status to be overwritten
and Seconds_Behind_Master to wrongly grow when the slave is idle.

To fix, introduce a separate flag rli->sql_thread_caught_up to mark that the
SQL driver thread is caught up. This avoids issues with worker threads
overwriting the SQL driver thread status. In parallel replication, we then
make SHOW SLAVE STATUS check in addition that all worker threads are idle
before showing Seconds_Behind_Master as 0 due to slave idle.
2014-01-08 11:00:44 +01:00
Murthy Narkedimilli
c92223e198 Updated/added copyright headers 2014-01-06 10:52:35 +05:30
Murthy Narkedimilli
496abd0814 Updated/added copyright headers 2014-01-06 10:52:35 +05:30
Michael Widenius
4e9a2d5469 Don't writing entries to slave log about binlog_checksum not existing on master if log_warnings is <=1.
This solves the issue of getting a lot of unnecessary errors logged on the slave when connecting to MySQL or an old MariaDB version.


sql/slave.cc:
  Don't write that binlog_checksum doesn't exists on the master if log_warnings <= 1
2014-01-05 15:21:58 +02:00
Michael Widenius
273078c5fa Fixes to get valgrind to work with jemalloc
- Added MALLOC_LIBRARY variable to hold name of malloc library
- Back ported valgrind related fixes from jemalloc 3.4.1 to the included jemalloc 3.3.1
- Renamed bitmap_init() and bitmap_free() to my_bitmap_init() and my_bitmap_free() to avoid clash with jemalloc 3.4.1
- Use option --soname-synonyms=somalloc=NON to valgrind when using jemalloc
- Show version related variables in mysqld --help
  -- Added SHOW_VALUE_IN_HELP marker

Increased back_log to 150 as the original value was a bit too small


CMakeLists.txt:
  Added MALLOC_LIBRARY variable to hold name of malloc library
cmake/jemalloc.cmake:
  Added MALLOC_LIBRARY variable to hold name of malloc library
config.h.cmake:
  Added MALLOC_LIBRARY variable to hold name of malloc library
extra/jemalloc/ChangeLog:
  Updates changelog
extra/jemalloc/include/jemalloc/internal/arena.h:
  Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/include/jemalloc/internal/jemalloc_internal.h.in:
  Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/include/jemalloc/internal/private_namespace.h:
  Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/include/jemalloc/internal/tcache.h:
  Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/src/arena.c:
  Backported valgrind fixes from jemalloc 3.4.1
include/my_bitmap.h:
  Renamed bitmap_init() and bitmap_free() to my_bitmap_init() and my_bitmap_free() to avoid clash with jemalloc 3.4.1
mysql-test/mysql-test-run.pl:
  Use option --soname-synonyms=somalloc=NON to valgrind when using jemalloc
mysql-test/valgrind.supp:
  Supression of memory leak in OpenSuse 12.3
mysys/my_bitmap.c:
  Renamed bitmap_init() and bitmap_free() to my_bitmap_init() and my_bitmap_free()
sql/ha_ndbcluster_binlog.cc:
  Renames
sql/ha_ndbcluster_cond.h:
  Renames
sql/ha_partition.cc:
  Renames
sql/handler.cc:
  Renames
sql/item_subselect.cc:
  Renames
sql/log_event.cc:
  Renames
sql/log_event_old.cc:
  Renames
sql/mysqld.cc:
  Renames
  Show version related variables in mysqld --help
sql/opt_range.cc:
  Renames
sql/opt_table_elimination.cc:
  Renames
sql/partition_info.cc:
  Renames
sql/rpl_injector.h:
  Renames
sql/set_var.h:
  Renames
sql/slave.cc:
  Renames
sql/sql_bitmap.h:
  Renames
sql/sql_insert.cc:
  Renames
sql/sql_lex.h:
  Renames
sql/sql_parse.cc:
  Renames
sql/sql_partition.cc:
  Renames
sql/sql_select.cc:
  Renames
sql/sql_show.cc:
  Renames
sql/sql_update.cc:
  Renames
sql/sys_vars.cc:
  Show version related variables in mysqld --help
sql/sys_vars.h:
  Added SHOW_VALUE_IN_HELP marker for variables that should be shown in --help
sql/table.cc:
  Renames
sql/table.h:
  Removed not used bitmap_init_value
storage/connect/ha_connect.cc:
  Removed compiler warning
storage/maria/ma_open.c:
  Renames
unittest/mysys/bitmap-t.c:
  Renames
2014-01-02 11:19:19 +02:00
unknown
f2c4765c16 MDEV-5363: Make parallel replication waits killable
Fix locking bug / race introduced by previous patch: If we
bail out of next_event() due to kill, there was a small window
where we could exit without rli->data_lock held, causing an
unlock-twice bug.
2013-12-16 13:48:32 +01:00
Sergei Golubchik
d28d3ba40d 10.0-base merge 2013-12-16 13:02:21 +01:00
unknown
dbfe5f4774 MDEV-5363: Make parallel replication waits killable
Add a test case for killing a waiting query in parallel replication.

Fix several bugs found:

 - We should not wakeup_subsequent_commits() in ha_rollback_trans(), since we
   do not know the right wakeup_error() to give.

 - When a wait_for_prior_commit() is killed, we must unregister from the
   waitee so we do not race and get an extra (non-kill) wakeup.

 - We need to deal with error propagation correctly in queue_for_group_commit
   when one thread is killed.

 - Fix one locking issue in queue_for_group_commit(), we could unlock the
   waitee lock too early and this end up processing wakeup() with insufficient
   locking.

 - Fix Xid_log_event::do_apply_event; if commit fails it must not update the
   in-memory @@gtid_slave_pos state.

 - Fix and cleanup some things in the rpl_parallel.cc error handling.

 - Add a missing check for killed in the slave sql driver thread, to avoid a
   race.
2013-12-13 14:26:51 +01:00
Seppo Jaakola
496e22cf3b merge with MariaDB 5.6 bzr merge lp:maria --rtag:mariadb-10.0.6
and a number of fixes to make this buildable. 
Run also few short multi-master high conflict rate tests, with no issues
2013-12-04 10:32:43 +02:00
unknown
4bce09c104 MDEV-4986: GTID - do not do on-disk update of master.info after every event group
This was actually implemented as part of MDEV-4506, parallel replication.
Unfortunately, one of the conditionals was reversed. So fsync of master.info
was disabled in non-gtid mode, instead of in gtid mode.

So fix the conditional to be correct.
2013-11-29 15:46:09 +01:00
Sergei Golubchik
ab3604989c MDEV-4243 [PATCH] Warnings/errors while compiling with clang
fix the code to compile with clang. fix warnings too.

include/probes_mysql_nodtrace.h:
  clang++ doesn't like numeric _constants_ being used in ||
  (it suspects that the intention was | ). Boolean constants are ok.
sql/hostname.cc:
  only used in DBUG_ASSERT
sql/item.cc:
  str_to_time and str_to_datetime return bool, not MYSQL_TIMESTAMP_xxx
sql/item_func.cc:
  str_to_datetime_with_warn() returns bool, not MYSQL_TIMESTAMP_xxx
storage/cassandra/CMakeLists.txt:
  CMAKE_CXX_FLAGS can be empty
storage/connect/odbconn.cpp:
  HWND is void*
storage/connect/user_connect.h:
  deprecated on FreeBSD and unused anyway
storage/connect/value.cpp:
  bad characters inside. unused.
storage/spider/spd_trx.cc:
  clang++ warns that memset will also overwrite vtbl. it might be as well a good idea,
  as it asserts that the object will only be used as a storage.
  silence the warning.
2013-11-28 22:35:59 +01:00
Jan Lindström
071edcfea0 Merge with MariaDB 5.5.34. 2013-11-25 17:14:08 +02:00
unknown
7a1f31866c MDEV-4946: IO thread should expose its current GTID position
Add another column to SHOW SLAVE STATUS, and adjust test suite
to cope.
2013-11-25 15:21:25 +01:00
Sergei Golubchik
c6d30805db 5.5 merge 2013-11-23 00:50:54 +01:00
unknown
55a7159f53 MDEV-4982: GTID looses all binlog state after crash if InnoDB is disabled
MDEV-4725: Incorrect binlog state recovery if crash while writing event group

The binlog state was not recovered correctly if XA is not used (eg. InnoDB
disabled), or if server crashed in the middle of writing an event group to the
binlog.

With this patch, we ensure that recovery of binlog state is done even if we do
not do the full XA binlog recovery, and we ensure that we only recover fully
written event groups into the binlog state.
2013-11-21 14:42:25 +01:00
Sergei Golubchik
fa3f8a18b2 mysql-5.5.34 merge
(some patches reverted, test case added)
2013-11-19 13:16:25 +01:00
unknown
def3c98af4 MDEV-5291: Slave transaction retry on temporary error leaves dangling error in SHOW SLAVE STATUS
Make sure to clear the temporary error before the retry.
2013-11-14 15:08:29 +01:00
unknown
55de9b0468 merge 10-base->10.0 2013-11-11 23:40:53 +02:00
Alexander Barkov
4bf339453d A patch from Kristian:
Remove rpl_group_info from THD before freeing it,
to avoid access-after-free in THD.
2013-11-08 16:20:58 +04:00
unknown
e05962000d MDEV-4506: Parallel replication
Fix access of freed memory in debug builds. When deleting serial_rgi,
safe_mutex was trying to access current_thd, when that thd had just been
deleted (I hate all this current_thd and other magic thread local storage crap
used all over the code). Now delete the serial_rgi before the thd.
2013-11-07 11:56:06 +01:00
unknown
273bcb92c1 Merge 10.0-base to 10.0 2013-11-07 07:52:40 +01:00
unknown
bdbf90b969 MDEV-4506: Parallel replication
MDEV-5217: Incorrect event pos update leading to corruption of reading of events from relay log

The rli->event_relay_log_pos was sometimes undated incorrectly when using
parallel replication, especially around relay log rotates. This could cause
the SQL thread to seek into an invalid position in the relay log, resulting in
errors about invalid events or even random corruption in some cases.
2013-11-06 10:18:04 +01:00
Michael Widenius
de9d9792ab Fixed things missing in merge between 10.0-base and 10.0
Updated --help text to declare --slave-parallel-threads as an alpha feature

mysql-test/r/mysqld--help.result:
  Updated --help text
sql/slave.cc:
  Added missing trans_retries++ that caused rpl_deadlock_innodb.test to fail.
  This is safe as this part is never run in parallel.
sql/sql_base.cc:
  Fixed temporary table handling (part of merge)
sql/sys_vars.cc:
  Updated --help text to declare --slave-parallel-threads as an alpha feature
2013-11-03 22:26:44 +02:00
unknown
57a267a8c0 Merge from 10.0-base to 10.0 the feature MDEV-4506: Parallel replication.
The merge is still missing a few hunks related to temporary tables and
InnoDB log file size. The associated code did not seem to exist in
10.0, so the merge of that needs more work. Until this is fixed, there
are a number of test failures as a result.
2013-11-01 12:00:11 +01:00
unknown
cb86ce60b9 Merge MDEV-4506: Parallel replication into 10.0-base. 2013-11-01 09:17:06 +01:00
Sergei Golubchik
0fdb3bcfdb 10.0-base merge (roles) 2013-10-29 15:08:44 +01:00
unknown
2fbd1c7307 MDEV-4506: Parallel replication.
MDEV-5189: Error handling in parallel replication.

Fix error handling in parallel worker threads when a query fails:

 - Report the error to the error log.

 - Return the error back, and set rli->abort_slave.

 - Stop executing more events after the error.
2013-10-28 13:24:56 +01:00
unknown
80d0dd7bab MDEV-4506: Parallel replication.
Do not update relay-log.info and master.info on disk after every event
when using GTID mode:

 - relay-log.info and master.info are not crash-safe, and are not used
   when slave restarts in GTID mode (slave connects with GTID position
   instead and immediately rewrites the file with the new, correct
   information found).

 - When using GTID and parallel replication, the position in
   relay-log.info is misleading at best and simply wrong at worst.

 - When using parallel replication, the fact that every single
   transaction needs to do a write() syscall to the same file is
   likely to become a serious bottleneck.

The files are still written at normal slave stop.

In non-GTID mode, the files are written as normal (this is needed to
be able to restart after slave crash, even if such restart is then not
crash-safe, no change).
2013-10-25 12:56:12 +02:00
unknown
7a22b6a655 MDEV-4506: Parallel replication.
Fix uninitialised variable.
2013-10-24 14:37:45 +02:00
unknown
ee8a816208 MDEV-4506: Parallel replication.
Implement --slave-parallel-max-queue to limit memory usage
of SQL thread read-ahead in the relay log.
2013-10-24 12:44:21 +02:00
unknown
a09d2b105f MDEV-4506: Parallel replication.
Fix some more parts of old-style position updates.
Now we save in rgi some coordinates for master log and relay log, so
that in do_update_pos() we can use the right set of coordinates with
the right events.

The Rotate_log_event::do_update_pos() is fixed in the parallel case
to not directly update relay-log.info (as Rotate event runs directly
in the driver SQL thread, ahead of actual event execution). Instead,
group_master_log_file is updated as part of do_update_pos() in each
event execution.

In the parallel case, position updates happen in parallel without
any ordering, but taking care that position is not updated backwards.
Since position update happens only after event execution this leads
to the right result.

Also fix an access-after-free introduced in an earlier commit.
2013-10-23 15:03:03 +02:00
Alexander Barkov
a06cd2cbe5 Merge 5.3 -> 5.5 2013-10-21 13:37:17 +04:00
Alexander Barkov
046fe91161 Merge 5.2 -> 5.3 2013-10-21 13:36:29 +04:00
Alexander Barkov
c63b72c968 Merge 5.1 -> 5.2 2013-10-21 13:35:43 +04:00
Alexander Barkov
11d141004a A clean-up for DEV-4890 Valgrind warnings on shutdown on a build with openSSL 2013-10-21 13:34:18 +04:00
Alexander Barkov
adbb439358 Merge 5.5 -> 10.0-base 2013-10-21 13:43:45 +04:00
unknown
7681c6aa78 MDEV-4506: Parallel replication: Intermediate commit.
Fix some part of update of old-style coordinates in parallel replication:

 - Ignore XtraDB request for old-style coordinates, not meaningful for
   parallel replication (must use GTID to get crash-safe parallel slave).

 - Only update relay log coordinates forward, not backwards, to ensure
   that parallel threads do not conflict with each other.

 - Move future_event_relay_log_pos to rgi.
2013-10-17 14:11:19 +02:00
Alexander Barkov
b88bf50ec1 Merge 5.5 -> 10.0-base 2013-10-16 20:41:50 +04:00
Alexander Barkov
11005f3413 Merge 5.3->5.5 2013-10-16 18:17:51 +04:00
Alexander Barkov
2b60ad3637 Merge 5.1->5.2 2013-10-16 17:58:15 +04:00
Alexander Barkov
049000101a Merge 5.1 -> 5.3 2013-10-16 17:48:31 +04:00
Alexander Barkov
de4ca539b8 MDEV-4890 Valgrind warnings on shutdown on a build with openSSL 2013-10-16 17:37:11 +04:00
Michael Widenius
5748eb3ec6 Moved the remaining variables, that depends on sql execution, from Relay_log_info to rpl_group_info:
-row_stmt_start_timestamp
-last_event_start_time
-long_find_row_note
-trans_retries

Added slave_executed_entries_lock to protect rli->executed_entries
Added primitives for thread safe 64 bit increment
Update rli->executed_entries when event has executed, not when event has been sent to sql execution thread


sql/log_event.cc:
  row_stmt_start and long_find_row_note is now in rpl_group_info
sql/mysqld.cc:
  Added slave_executed_entries_lock to protect rli->executed_entries
sql/mysqld.h:
  Added slave_executed_entries_lock to protect rli->executed_entries
  Added primitives for thread safe 64 bit increment
sql/rpl_parallel.cc:
  Update rli->executed_entries when event has executed, not when event has been sent to sql execution thread
sql/rpl_rli.cc:
  Moved row_stmt_start_timestamp, last_event_start_time and long_find_row_note from Relay_log_info to rpl_group_info
sql/rpl_rli.h:
  Moved trans_retries, row_stmt_start_timestamp, last_event_start_time and long_find_row_note from Relay_log_info to rpl_group_info
sql/slave.cc:
  Use rgi for trans_retries and last_event_start_time
  Update rli->executed_entries when event has executed, not when event has been sent to sql execution thread
  Reset trans_retries when object is created
2013-10-15 00:17:16 +03:00
Michael Widenius
2e100cc5a4 Fixes for parallel slave:
- Made slaves temporary table multi-thread slave safe by adding mutex around save_temporary_table usage.
  - rli->save_temporary_tables is the active list of all used temporary tables
  - This is copied to THD->temporary_tables when temporary tables are opened and updated when temporary tables are closed
  - Added THD->lock_temporary_tables() and THD->unlock_temporary_tables() to simplify this.
- Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code.
- Added is_part_of_group() to mark functions that are part of the next function. This replaces setting IN_STMT when events are executed.
- Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code.
- If slave_skip_counter is set run things in single threaded mode. This simplifies code for skipping events.
- Updating state of relay log (IN_STMT and IN_TRANSACTION) is moved to one single function: update_state_of_relay_log()
  We can't use OPTION_BEGIN to check for the state anymore as the sql_driver and sql execution threads may be different.
  Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts
  is_in_group() is now independent of state of executed transaction.
- Reset thd->transaction.all.modified_non_trans_table() if we did set it for single table row events.
  This was mainly for keeping the flag as documented.
- Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it.
- Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
- Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
- Changed some functions to take rpl_group_info instead of Relay_log_info to make them multi-slave safe and to simplify usage
  - do_shall_skip()
  - continue_group()
  - sql_slave_killed()
  - next_event()
- Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure.
- set_thd_in_use_temporary_tables() removed as in_use is set on usage
- Added information to thd_proc_info() which thread is waiting for slave mutex to exit.
- In open_table() reuse code from find_temporary_table()

Other things:
- More DBUG statements
- Fixed the rpl_incident.test can be run with --debug
- More comments
- Disabled not used function rpl_connect_master()

mysql-test/suite/perfschema/r/all_instances.result:
  Moved sleep_lock and sleep_cond to rpl_group_info
mysql-test/suite/rpl/r/rpl_incident.result:
  Updated result
mysql-test/suite/rpl/t/rpl_incident-master.opt:
  Not needed anymore
mysql-test/suite/rpl/t/rpl_incident.test:
  Fixed that test can be run with --debug
sql/handler.cc:
  More DBUG_PRINT
sql/log.cc:
  More comments
sql/log_event.cc:
  Added DBUG statements
  do_shall_skip(), continue_group() now takes rpl_group_info param
  Use is_begin(), is_commit() and is_rollback() functions instead of inspecting query string
  We don't have set slaves temporary tables 'in_use' as this is now done when tables are opened.
  Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
  Use IN_TRANSACTION flag to test state of relay log.
  In rows_event_stmt_cleanup() reset thd->transaction.all.modified_non_trans_table if we had set this before.
sql/log_event.h:
  do_shall_skip(), continue_group() now takes rpl_group_info param
  Added is_part_of_group() to mark events that are part of the next event. This replaces setting IN_STMT when events are executed.
  Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code.
sql/log_event_old.cc:
  Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
  do_shall_skip(), continue_group() now takes rpl_group_info param
sql/log_event_old.h:
  Added is_part_of_group() to mark events that are part of the next event.
  do_shall_skip(), continue_group() now takes rpl_group_info param
sql/mysqld.cc:
  Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it.
  Relay_log_info::sleep_lock -> Rpl_group_info::sleep_lock
  Relay_log_info::sleep_cond -> Rpl_group_info::sleep_cond
sql/mysqld.h:
  Updated types and names
sql/rpl_gtid.cc:
  More DBUG
sql/rpl_parallel.cc:
  Updated TODO section
  Set thd for event that is execution
  Use new  is_begin(), is_commit() and is_rollback() functions.
  More comments
sql/rpl_rli.cc:
  sql_thd -> sql_driver_thd
  Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
  Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
  Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts.
  Reset table->in_use for temporary tables as the table may have been used by another THD.
  Use IN_TRANSACTION instead of OPTION_BEGIN to check state of relay log.
  Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
sql/rpl_rli.h:
  Changed relay log state flags to bit masks instead of bit positions (most other code we have uses bit masks)
  Added IN_TRANSACTION to mark if we are in a BEGIN ... COMMIT section.
  save_temporary_tables is now thread safe
  Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
  Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
  Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code
  is_in_group() is now independent of state of executed transaction.
sql/slave.cc:
  Simplifed arguments to io_salve_killed(), sql_slave_killed() and check_io_slave_killed(); No reason to supply THD as this is part of the given structure.
  set_thd_in_use_temporary_tables() removed as in_use is set on usage in sql_base.cc
  sql_thd -> sql_driver_thd
  More DBUG
  Added update_state_of_relay_log() which will calculate the IN_STMT and IN_TRANSACTION state of the relay log after the current element is executed.
  If slave_skip_counter is set run things in single threaded mode.
  Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure.
  Added information to thd_proc_info() which thread is waiting for slave mutex to exit.
  Disabled not used function rpl_connect_master()
  Updated argument to next_event()
sql/sql_base.cc:
  Added mutex around usage of slave's temporary tables. The active list is always kept up to date in sql->rgi_slave->save_temporary_tables.
  Clear thd->temporary_tables after query (safety)
  More DBUG
  When using temporary table, set table->in_use to current thd as the THD may be different for slave threads.
  Some code is ifdef:ed with REMOVE_AFTER_MERGE_WITH_10 as the given code in 10.0 is not yet in this tree.
  In open_table() reuse code from find_temporary_table()
sql/sql_binlog.cc:
  rli->sql_thd -> rli->sql_driver_thd
  Remove duplicate setting of rgi->rli
sql/sql_class.cc:
  Added helper functions rgi_lock_temporary_tables() and rgi_unlock_temporary_tables()
  Would have been nicer to have these inline, but there was no easy way to do that
sql/sql_class.h:
  Added functions to protect slaves temporary tables
sql/sql_parse.cc:
  Added DBUG_PRINT
sql/transaction.cc:
  Added comment
2013-10-14 00:24:05 +03:00
unknown
d61cffa6b1 MDEV-5130: More precise binlog position reporting for IO thread when reconnecting with GTID
Make IO thread check for end of event group, so that upon
disconnect at the end of an event group it can report the last
read GTID as expected. Also inject a fake Rotate event at
reconnect when skipping part of an initial event group, to
give SQL thread the correct Read_Master_Log_Pos.

Reported by Pavel Ivanov.
2013-10-11 11:21:18 +02:00
unknown
8f4eb208d2 merge 10.0-base -> 10.0 2013-10-17 19:01:57 +03:00
Alexander Barkov
a9240dce9e Merge 10.0-base -> 10.0 2013-10-15 10:26:08 +04:00
Sergei Golubchik
9af177042e 10.0-base merge.
Partitioning/InnoDB changes are *not* merged (they'll come from 5.6)
TokuDB does not compile (not updated to 10.0 SE API)
2013-09-21 10:14:42 +02:00
unknown
39794dc72c MDEV-4506: Parallel replication.
Add another test case, using DEBUG_SYNC.
Fix one bug found.
2013-09-17 14:07:21 +02:00
unknown
5633dd8227 MDEV-4506: parallel replication.
Add a simple test case.
Fix bugs found.
2013-09-16 14:33:49 +02:00
unknown
d107bdaa01 MDEV-4506, parallel replication.
Some after-review fixes.
2013-09-13 15:09:57 +02:00
Jan Lindström
9c85ced30d Merge fix. 2013-09-09 10:38:58 +03:00
unknown
ada15c7a0f Fix various places where code would work incorrectly if the common_header_len of events is different on master and slave
Patch developed with the help of Pavel Ivanov.

Also fix an uninitialised variable in queue_event().
2013-09-04 12:22:09 +02:00
Jan Lindström
ba3ff50ab2 Merge 10.0 to galera-10.0 2013-09-03 17:50:36 +03:00
unknown
62d358295b Fix embedded link error and uninitialised variable following previous push. 2013-08-23 10:16:43 +02:00
unknown
f74c745a99 MDEV-4488: When master is on the list of ignore_server_ids, GTID position on slave is not updated
The ignored events are not written to the relay log, but instead a fake
Rotate event is generated to handle update of position.

Extend this for Gtid so we similarly generate a fake Gtid_list event
to update the GTID position.

Also fix an unrelated test issue that got triggered by the added test cases.
2013-08-22 12:36:42 +02:00
Seppo Jaakola
4222b2520b Merge with mariadb 5.5: bzr merge lp:maria/5.5 --rtag:mariadb-5.5.32 2013-08-21 16:34:31 +03:00
unknown
0903a40d09 bMDEV-4906: When event apply fails, next SQL thread start errorneously commits the failing GTID to gtid_slave_pos
When a GTID event is executed, we remember the contained GTID position so that
when we have applied the entire event group we can commit it to
gtid_slave_pos.

However, if the event group fails to apply due to some error and the SQL
thread aborts, the code did not correctly clear the remembered GTID. Thus,
when SQL thread was restarted, the old GTID of the failing event group was
incorrectly updated to gtid_slave_pos when the initial rotate event was
executed, corrupting the GTID position.
2013-08-20 13:44:50 +02:00
Praveenkumar Hulakund
03940a7bd8 Bug#16865959 - PLEASE BACKPORT BUG 14749800.
Since log_throttle is not available in 5.5. Logging of
error message for failure of thread to create new connection
in "create_thread_to_handle_connection" is not backported.

Since, function "my_plugin_log_message" is not available in 
5.5 version and since there is incompatibility between
sql_print_XXX function compiled with g++ and alog files with
gcc to use sql_print_error, changes related to audit log
plugin is not backported.
2013-07-24 15:44:41 +05:30
Praveenkumar Hulakund
0ae219cd75 Bug#16865959 - PLEASE BACKPORT BUG 14749800.
Since log_throttle is not available in 5.5. Logging of
error message for failure of thread to create new connection
in "create_thread_to_handle_connection" is not backported.

Since, function "my_plugin_log_message" is not available in 
5.5 version and since there is incompatibility between
sql_print_XXX function compiled with g++ and alog files with
gcc to use sql_print_error, changes related to audit log
plugin is not backported.
2013-07-24 15:44:41 +05:30