Commit graph

2321 commits

Author SHA1 Message Date
Sergei Golubchik
08687f7ef3 cleanup: my_checksum
remove my_crc_dbug_check (gdb can do it itself).
use 0 instead of my_checkum(0, 0, 0) - just as 10.0 does now.
2015-09-04 10:33:50 +02:00
Sergei Golubchik
530a6e7481 Merge branch '10.0' into 10.1
referenced_by_foreign_key2(), needed for InnoDB to compile,
was taken from 10.0-galera
2015-09-03 12:58:41 +02:00
Kristian Nielsen
ef82cb7c2c Merge MDEV-8725 into 10.1 2015-09-02 10:53:37 +02:00
Kristian Nielsen
83c7b1e95b Merge MDEV-8725 into 10.0 2015-09-02 10:40:34 +02:00
Kristian Nielsen
09bfaf3a13 Fix a potential lost wakeup for binlog_commit_wait_usec
If a transaction T1 needs to wait for a transaction T2, T2's commit will
skip the normal binlog_commit_wait_usec delay, in order not to needlessly
stall throughput.

This works by checking if T2 is already ready to commit. If so, it is woken
up. If not, we set a flag in T2 so that when it gets ready to commit, it
will do so immediately.

But there was a potential race due to insufficient locking, if T2 gets ready
to commit just at the point where T1 does the check. If the race hits, the
wakeup (and early commit) of T2 might be lost.

The race is only theoretical (from code inspection, no known test case), but
seems best to fix it anyway, by properly locking LOCK_prepare_ordered around
the check.
2015-09-02 10:08:09 +02:00
Nirbhay Choubey
cd1a11ace3 MDEV-7205 : Galera cluster & sql_log_bin = off don't work
While sql_bin_log=1(0) is meant to control binary logging for the
current session so that the updates to do(not) get logged into the
binary log to be replicated to the async MariaDB slave. The same
should not affect galera replication.

That is, the updates should always get replicated to other galera
nodes regardless of sql_bin_log's value.

Fixed by making sure that the updates are written to binlog cache
irrespective of sql_bin_log.

Added test cases.
2015-08-08 15:04:15 -04:00
Jan Lindström
9a5787db51 Merge commit '96badb16afcf' into 10.0
Conflicts:
	client/mysql_upgrade.c
	mysql-test/r/func_misc.result
	mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result
	mysql-test/suite/innodb/r/innodb-fk.result
	mysql-test/t/subselect_sj_mat.test
	sql/item.cc
	sql/item_func.cc
	sql/log.cc
	sql/log_event.cc
	sql/rpl_utility.cc
	sql/slave.cc
	sql/sql_class.cc
	sql/sql_class.h
	sql/sql_select.cc
	storage/innobase/dict/dict0crea.c
	storage/innobase/dict/dict0dict.c
	storage/innobase/handler/ha_innodb.cc
	storage/xtradb/dict/dict0crea.c
	storage/xtradb/dict/dict0dict.c
	storage/xtradb/handler/ha_innodb.cc
	vio/viosslfactories.c
2015-08-03 23:09:43 +03:00
Monty
f3e578ab30 Fixed MDEV-8428: Mangled DML statements on 2nd level slave when enabling binlog checksums
Fix was to add a test in Query_log_event::Query_log_event() if we are using
CREATE ... SELECT and in this case use trans cache, like we do on the master.
This avoid using (with doesn't have checksum)

Other things:
- Removed dummy call my_checksum(0L, NULL, 0)
- More DBUG_PRINT
- Cleaned up Log_event::need_checksum() to make it more readable (similar as in MySQL 5.6)
- Renamed variable that was hiding another one in create_table_imp()
2015-07-26 14:32:45 +03:00
Monty
7115341473 Fixed warnings and errors found by buildbot
field.cc
- Fixed warning about overlapping memory copy (backport from 10.0)

Item_subselect.cc
- Fixed core dump in main.view
- Problem was that thd->lex->current_select->master_unit()->item was not set, which caused crash in maxr_as_dependent

sql/mysqld.cc
- Got error on shutdown as we where freeing mutex before all THD objects was freed
  (~THD uses some mutex). Fixed by during shutdown freeing THD inside mutex.

sql/log.cc
- log_space_lock and LOCK_log where locked in inconsistenly. Fixed by not having a log_space_lock around purge_logs.

sql/slave.cc
- Remove unnecessary log_space_lock
- Move cond_broadcast inside lock to ensure we don't miss the signal
2015-07-25 15:15:52 +03:00
Monty
872a953b22 MDEV-8469 Add RESET MASTER TO x to allow specification of binlog file nr
Other things:
- Avoid calling init_and_set_log_file_name() when opening binary log.
- Remove newlines early when reading from index file.
- Ensure that reset_logs() will work even if thd is 0 (Can happen on startup)
- Added thd to sart_slave_threads() for better error handling.
2015-07-16 10:36:58 +03:00
Nirbhay Choubey
dced5146bd Merge branch '10.0-galera' into 10.1 2015-07-14 16:05:29 -04:00
Monty
7332af49e4 - Renaming variables so that they don't shadow others (After this patch one can compile with -Wshadow and get much fewer warnings)
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd

Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)

Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
   examined_rows -> join_examined_rows
   record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()

Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
2015-07-06 20:24:14 +03:00
Nirbhay Choubey
f965cae5fb MDEV-7110 : Add missing MySQL variable log_bin_basename and log_bin_index
Add log_bin_index, log_bin_basename and relay_log_basename system
variables. Also, convert relay_log_index system variable to
NO_CMD_LINE and implement --relay-log-index as a command line
option.
2015-06-09 13:38:29 -04:00
Sergei Golubchik
6309a30dc9 my_b_fill, inline my_b_* functions instead of hairy macros 2015-06-02 18:53:37 +02:00
Sergei Golubchik
d602574542 Merge remote-tracking branch 'github/10.1' into 10.1 2015-06-01 16:01:42 +02:00
Sergei Golubchik
5091a4ba75 Merge tag 'mariadb-10.0.19' into 10.1 2015-06-01 15:51:25 +02:00
Jan Lindström
59815a268b MDEV-7484: Log Time not consistent with InnoDB errors nor with MySQL error log time format 2015-05-30 20:45:17 +03:00
Sergei Golubchik
49c853fb94 Merge branch '5.5' into 10.0 2015-05-04 22:00:24 +02:00
Kristian Nielsen
9088f26f20 MDEV-7802: group commit status variable addition
Backport into 10.0
2015-04-29 11:29:25 +02:00
Sergei Golubchik
8e781601f4 MDEV-6870 Not possible to use FIFO file as a general_log file
Remove the too restrictive bugfix for bug#67088.
FIFO can be used for general/slow logs, but lseek() and fsync() on
FIFO fail. And open() needs to be non-blocking, in case the other
end isn't reading.
2015-04-27 15:42:12 +02:00
Sergei Golubchik
c05d43100b bug: crash when sync() or close() of a log file fails on shutdown
because current_thd is NULL and ER() causes sigsegv
2015-04-27 15:42:12 +02:00
Sergei Golubchik
8f499c395c bug: debug assert crash when seek on log file fails 2015-04-27 15:42:11 +02:00
Kristian Nielsen
a15a4d674d Merge MDEV-7802 into 10.1 2015-04-20 13:22:51 +02:00
Kristian Nielsen
791b0ab5db Merge 10.0 -> 10.1 2015-04-20 13:21:58 +02:00
Daniel Black
85660d7397 MDEV-7977 MYSQL_BIN_LOG::write_incident failing to release LOCK_log
This adds a unlock(LOCK_log) for the unlikely(!is_open()) branch
2015-04-11 18:13:08 +10:00
Kristian Nielsen
accdabd668 Merge MDEV-7888 and MDEV-7929 into 10.0. 2015-04-08 13:19:22 +02:00
Kristian Nielsen
48c10fb5f7 Merge MDEV-7888 and MDEV-7929 into 10.1. 2015-04-08 11:04:24 +02:00
Kristian Nielsen
3b961347db MDEV-7888, MDEV-7929: Parallel replication hangs sometimes on ANALYZE TABLE or DDL
The hangs occur when the group_commit_orderer object is freed before the last
mark_start_commit() call on it - this loses the wakeup to other waiting worker
threads, causing them to hang until killed manually.

The object was freed because wakeup_subsequent_commits() was called two early
in two places. For MDEV-7888, during ANALYZE TABLE, and for MDEV-7929 during
record_gtid() after processing a DDL event. The group_commit_orderer object
can be freed when its last transaction has called wait_for_prior_commit().

Fix by implementing a suspend/resume mechanism for wakeup_subsequent_commits()
that can be used in places where a transaction is committed without this being
the commit of the actual replication event group.

Also add a protection mechanism (that asserts in debug builds) which can
prevent the too-early free and hang if other similar bugs should remain in
other parts of the code.
2015-04-08 11:01:18 +02:00
Daniel Black
1d5220d112 binlog_group_commit_* status variables update
remove group_commit_reason_immediate
rename group_commit_reason_transaction to group_commit_trigger_lock_wait
rename group_commit_reason_usec to group_commit_trigger_timeout
rename group_commit_reason_count to group_commit_triggger_count
2015-04-01 22:47:36 +11:00
Daniel Black
dd7026a703 All updates to binlog_status_group_commit_reason* are under LOCK_prepare_ordered 2015-04-01 21:51:55 +11:00
Kristian Nielsen
f573b65e41 Merge MDEV-7847 and MDEV-7882 into 10.0.
Conflicts:
	mysql-test/suite/rpl/r/rpl_parallel.result
	sql/rpl_parallel.cc
2015-03-30 15:10:29 +02:00
Kristian Nielsen
c41e4d3b49 Merge MDEV-7847 and MDEV-7882 into 10.0.
Conflicts:
	mysql-test/suite/rpl/r/rpl_parallel.result
	mysql-test/suite/rpl/t/rpl_parallel.test
2015-03-30 14:51:25 +02:00
Kristian Nielsen
880f2273fd MDEV-7847: "Slave worker thread retried transaction 10 time(s) in vain, giving up", followed by replication hanging
This patch fixes a bug in the error handling in parallel replication, when one
worker thread gets a failure and other worker threads processing later
transactions have to rollback and abort.

The problem was with the lifetime of group_commit_orderer objects (GCOs).
A GCO is freed when we register that its last event group has committed. This
relies on register_wait_for_prior_commit() and wait_for_prior_commit() to
ensure that the fact that T2 has committed implies that any earlier T1 has
also committed, and can thus no longer execute mark_start_commit().

However, in the error case, the code was skipping the
register_wait_for_prior_commit() and wait_for_prior_commit() calls. Thus
commit ordering was not guaranteed, and a GCO could be freed too early. Then a
later mark_start_commit() would reference deallocated GCO, which could lead to
lost wakeup (causing slave threads to hang) or other corruption.

This patch makes also the error case respect commit order. This way, also the
error case gets the GCO lifetime correct, and the hang no longer occurs.
2015-03-30 14:33:44 +02:00
Daniel Black
54287adc27 MDEV-7802 Add status binlog_group_commit_reason_*
The following global status variables where added:
* binlog_group_commit_reason_count
* binlog_group_commit_reason_usec
* binlog_group_commit_reason_transaction
* binlog_group_commit_reason_immediate

binlog_group_commit_reason_count corresponds to group commits made by
virtue of the binlog_commit_wait_count variable.

binlog_group_commit_reason_usec corresponds to the binlog_commit_wait_usec
variable.

binlog_group_commit_reason_transaction is a result of ordered
transaction that need to occur in the same order on the slave and can't
be parallelised.

binlog_group_commit_reason_immediate is caused to prevent stalls with
row locks as described in log.cc:binlog_report_wait_for. This immediate
count is also counted a second time in binlog_group_commit_reason_transaction.

Overall binlog_group_commits = binlog_group_commit_reason_count +
binlog_group_commit_reason_usec + binlog_group_commit_reason_transaction

This work was funded thanks to Open Source Developers Club Australia.
2015-03-19 15:26:58 +11:00
Sergey Vojtovich
18e9c314e4 MDEV-6650 - LINT_INIT emits code in non-debug builds
Replaced all references to LINT_INIT with UNINIT_VAR and LINT_INIT_STRUCT.
Removed LINT_INIT macro.
2015-03-16 14:48:22 +04:00
Kristian Nielsen
184f718fef MDEV-7249: Performance problem in parallel replication with multi-level slaves
Parallel replication (in 10.0 / "conservative" mode) relies on binlog group
commits to group transactions that can be safely run in parallel on the
slave. The --binlog-commit-wait-count and --binlog-commit-wait-usec options
exist to increase the number of commits per group. But in case of conflicts
between transactions, this can cause unnecessary delay and reduced througput,
especially on a slave where commit order is fixed.

This patch adds a heuristics to reduce this problem. When transaction T1 goes
to commit, it will first wait for N transactions to queue up for a group
commit. However, if we detect that another transaction T2 is waiting for a row
lock held by T1, then we will skip the wait and let T1 commit immediately,
releasing locks and let T2 continue.

On a slave, this avoids the unfortunate situation where T1 is waiting for T2
to join the group commit, but T2 is waiting for T1 to release locks, causing
no work to be done for the duration of the --binlog-commit-wait-usec timeout.

(The heuristic seems reasonable on the master as well, so it is enabled for
all transactions, not just replication transactions).
2015-03-13 14:01:52 +01:00
Sergei Golubchik
2db62f686e Merge branch '10.0' into 10.1 2015-03-07 13:21:02 +01:00
Sergei Golubchik
5f510a9175 Merge branch '5.5' into 10.0 2015-03-06 18:41:32 +01:00
Kristian Nielsen
95d7208859 Merge MDEV-6589 and MDEV-6403 into 10.1.
Conflicts:
	sql/log.cc
	sql/rpl_rli.cc
	sql/sql_repl.cc
2015-03-04 13:49:37 +01:00
Kristian Nielsen
3ef0b9b235 Merge MDEV-6589 and MDEV-6403 into 10.0. 2015-03-04 13:36:54 +01:00
Kristian Nielsen
ad0d203f2e MDEV-6589: Incorrect relay log start position when restarting SQL thread after error in parallel replication
The problem occurs in parallel replication in GTID mode, when we are using
multiple replication domains. In this case, if the SQL thread stops, the
slave GTID position may refer to a different point in the relay log for each
domain.

The bug was that when the SQL thread was stopped and restarted (but the IO
thread was kept running), the SQL thread would resume applying the relay log
from the point of the most advanced replication domain, silently skipping all
earlier events within other domains. This caused replication corruption.

This patch solves the problem by storing, when the SQL thread stops with
multiple parallel replication domains active, the current GTID
position. Additionally, the current position in the relay logs is moved back
to a point known to be earlier than the current position of any replication
domain. Then when the SQL thread restarts from the earlier position, GTIDs
encountered are compared against the stored GTID position. Any GTID that was
already applied before the stop is skipped to avoid duplicate apply.

This patch should have no effect if multi-domain GTID parallel replication is
not used. Similarly, if both SQL and IO thread are stopped and restarted, the
patch has no effect, as in this case the existing relay logs are removed and
re-fetched from the master at the current global @@gtid_slave_pos.
2015-03-04 13:36:04 +01:00
Nirbhay Choubey
34d86ac9ff MDEV-6594: Use separate domain_id for Galera transactions 2015-02-27 22:33:41 -05:00
Kristian Nielsen
aa845d123c MDEV-6391: GTID binlog state not recovered if mariadb-bin.state is removed
When the server starts up, check if the master-bin.state file was lost.
If it was, recover its contents by scanning the last binlog file, thus
avoiding running with a corrupt binlog state.
2015-02-27 14:34:52 +01:00
Kristian Nielsen
b5d6aa5517 MDEV-7310: last_commit_pos_offset set to wrong value after binlog rotate in group commit
When the binlog was rotated due to @@max_binlog_size, the values of the
binlog_shapshot_file and binlog_snapshot_position were inconsistent in case of
non-transactional DML. The position was refering to the old file, while the
filename was of the new file after rotation. This patch makes them consistent
by making sure the position is also refering to the new file.
2015-02-23 13:27:51 +01:00
Sergei Golubchik
863cfb3fa5 small cleanup, remove a useless function 2015-01-31 21:51:45 +01:00
Sergei Golubchik
4b21cd21fe Merge branch '10.0' into merge-wip 2015-01-31 21:48:47 +01:00
Sergei Golubchik
510ca9b697 MDEV-7402 'reset master' hangs, waits for signalled COND_xid_list
Using a boolean flag for 'there is a RESET MASTER in progress' doesn't
work very well for multiple concurrent RESET MASTER statements.
Changed to a counter.
2015-01-19 14:32:28 +01:00
Kristian Nielsen
2de9427ccf MDEV-7391: rpl.rpl_semi_sync, rpl.rpl_semi_sync_after_sync_row fail in buildbot
The problem was caused by a merge error (incorrect conflict resolution) when
the MDEV-7257 patch was merged into 10.1.

The incorrect merge put two code blocks in the wrong order. This caused a race
that was seen as sporadic test failures.

(The problem was that binlog end position was updated before running
after_flush hook; this way it was possible for the binlog dump thread to send
a transaction to a slave without requesting semi-sync acknowledgement. Then
when no acknowledgement was received, semisync replication would be disabled
on the master.)
2015-01-13 14:19:07 +01:00
Sergei Golubchik
e695db0f2d MDEV-7437 remove suport for "atomics" with rwlocks 2015-01-13 10:15:21 +01:00
Nirbhay Choubey
6f4f8c5f8a MDEV-7374 : Losing connection to MySQL while running ALTER TABLE
In the special case of ALTER TABLE with >10K rows, wsrep commit
should skip if wsrep is not enabled. Added a test case.
2014-12-31 20:58:54 -05:00