Commit graph

461 commits

Author SHA1 Message Date
Alfranio Correia
508fe9dd15 BUG#44581 Slave stops when transaction with non-transactional table gets lock wait
timeout
            
In STMT and MIXED modes, a statement that changes both non-transactional and
transactional tables must be written to the binary log whenever there are
changes to non-transactional tables. This means that the statement gets into the
binary log even when the changes to the transactional tables fail. In particular
, in the presence of a failure such statement is annotated with the error number
and wrapped in a begin/rollback. On the slave, while applying the statement, it
is expected the same failure and the rollback prevents the transactional changes
to be persisted.
            
Unfortunately, statements that fail due to concurrency issues (e.g. deadlocks,
timeouts) are logged in the same way causing the slave to stop as the statements
are applied sequentially by the SQL Thread. To fix this bug, we automatically
ignore concurrency failures on the slave. Specifically, the following failures
are ignored: ER_LOCK_WAIT_TIMEOUT, ER_LOCK_DEADLOCK and ER_XA_RBDEADLOCK.
2009-07-06 09:02:14 +01:00
Matthias Leich
7029f8bbe8 Merge 5.0 -> 5.1 of fix for bug 45902 2009-07-02 15:40:27 +02:00
Alfranio Correia
ac1b464a33 BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
Large transactions and statements may corrupt the binary log if the size of the
cache, which is set by the max_binlog_cache_size, is not enough to store the
the changes.

In a nutshell, to fix the bug, we save the position of the next character in the
cache before starting processing a statement. If there is a problem, we simply
restore the position thus removing any effect of the statement from the cache.
Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
on non-transactional tables if they do not fit in the cache. In such cases, we
store an Incident_log_event in order to stop the slave and alert users that some
changes were not logged.

Precisely, for every non-transactional changes that do not fit into the cache,
we do the following:
  a) the statement is *not* logged
  b) an incident event is logged after committing/rolling back the transaction,
  if any. Note that if a failure happens before writing the incident event to
  the binary log, the slave will not stop and the master will not have reported
  any error.
  c) its respective statement gives an error

For transactional changes that do not fit into the cache, we do the following:
  a) the statement is *not* logged
  b) its respective statement gives an error

To work properly, this patch requires two additional things. Firstly, callers to
MYSQL_BIN_LOG::write and THD::binlog_query must handle any error returned and
take the appropriate actions such as undoing the effects of a statement. We
already changed some calls in the sql_insert.cc, sql_update.cc and sql_insert.cc
modules but the remaining calls spread all over the code should be handled in
BUG#37148. Secondly, statements must be either classified as DDL or DML because
DDLs that do not get into the cache must generate an incident event since they
cannot be rolled back.
2009-06-18 14:52:46 +01:00
Patrick Crews
8f2ff69434 Bug#44920: MTR2 is not processing master.opt input properly on Windows
Re-enabled tests main.init_connect and rpl.rpl_init_slave.test for non-Windows
platforms.

Please remove this code upon fixing the bug.
2009-06-12 14:40:02 +01:00
He Zhenxing
6f84951044 Merge BUG#43263 from 5.0-bugteam to 5.1-bugteam 2009-05-31 13:44:41 +08:00
Luis Soares
e2aad850ed BUG#41725: upmerge: 5.0-bt --> 5.1-bt 2009-05-23 00:29:41 +01:00
Alfranio Correia
41fdbbf586 auto-merge 5.1-bugteam (local) --> 5.1-bugteam 2009-05-21 09:36:38 +01:00
Patrick Crews
189f8ebd97 Bug#44920 - MTR2 is not processing master.opt input properly on Windows
Disabling these two tests as they are affected by this bug / causing PB2 failures
on Windows platforms.  Can always disable via include/not_windows.inc if
the bug fix looks like it will take some time.
2009-05-18 12:53:06 -04:00
Serge Kozlov
52d3373e75 Bug#38077.
1. Replace waiting of SQL thread stop by waiting of SQL error on slave and stopped
SQL thread.
2. Remove debug code because it already implemented in MTR2.
2009-05-02 23:28:54 +04:00
Andrei Elkin
f683c5e0b6 moving bug#38694 test files into rpl suite 2009-04-30 16:20:38 +03:00
Alfranio Correia
8126db7c96 Fixed rpl_innodb_mixed_ddl and rpl_000015.
Respectively, replaced "--exec diff" by "--diff_files" which is a mysqltest command to run a
non-operating system specific diff. Removed the file rpl_000015-slave.sh as it is not
necessary in the new MTR.
2009-04-26 22:21:01 +01:00
Alfranio Correia
99d1233a72 BUG#44389 rpl_row_mysqlbinlog fails on windows due to operating system
specifc command

Replaced "--exec rm" by "remove_file" which is a mysqltest command to
erase a file.
2009-04-24 02:02:07 +01:00
Andrei Elkin
21a06c9c55 merge bug#38205 fixes to 5.1-bt 2009-04-21 11:30:40 +03:00
Alfranio Correia
8caf4bfc52 BUG#43949 Initialization of slave produces a warning message in Valgrind
In order to define the --slave-load-tmpdir, the init_relay_log_file()
was calling fn_format(MY_PACK_FILENAME) which internally was indirectly
calling strmov_overlapp() (through pack_dirname) and the following
warning message was being printed out while running in Valgrind:
"source and destination overlap in strcpy".

We fixed the issue by removing the flag MY_PACK_FILENAME as it was not
necessary. In a nutshell, with this flag the function fn_format() tried
to replace a directory by either "~", "." or "..". However, we wanted
exactly to remove such strings.

In this patch, we also refactored the functions init_relay_log_file()
and check_temp_dir(). The former was refactored to call the fn_format()
with the flag MY_SAFE_PATH along with the MY_RETURN_REAL_PATH,  in order
to avoid issues with long directories and return an absolute path,
respectively. The flag MY_SAFE_UNPACK_FILENAME was removed too as it was
responsible for removing "~", "." or ".." only from the file parameter
and we wanted to remove such strings from the directory parameter in
the fn_format(). This result is stored in an rli variable, which is then
processed by the other function in order to verify if the directory exists
and if we are able to create files in it.
2009-04-19 02:21:33 +01:00
Alfranio Correia
a8f7d47e6a BUG#41793 rpl_binlog_corruption disabled in main (needs new mtr)
The test case was missing "let $slave_sql_errno= 1594;".
2009-04-15 12:43:17 +01:00
Andrei Elkin
ae758cbd52 Bug #38205 Row-based Replication (RBR) causes inconsistencies: HA_ERR_FOUND_DUPP_KEY
Bug#319  if while a non-transactional slave is replicating a transaction possible problem

only testing related: addressing reviewers' comments.
2009-04-09 16:05:41 +03:00
He Zhenxing
56230bd67d Auto merge 2009-04-09 14:31:09 +08:00
He Zhenxing
435d6631aa Manually merge BUG#37145 to 5.1-bugteam 2009-04-09 07:42:51 +08:00
Alfranio Correia
65ff9e1b0b BUG#39393. Post-fix for test rpl_skip_error.
The result set for multi-row statements is not the same between STMT and
RBR and among different versions. Thus to avoid test failures, we are not
printing out such result sets. Note, however, that this does not have
impact on coverage and accuracy since the execution is able to continue
without further issues when an error is found on the master and such error
is set to be skipped.
2009-04-08 22:02:19 +01:00
Alfranio Correia
0ac476e3a8 merge 5.1-bugteam --> 5.1-bugteam (local) 2009-04-08 11:07:24 +01:00
Alfranio Correia
14870022ad merge 5.1-bugteam --> 5.1-bugteam (local) 2009-04-06 01:22:34 +01:00
Alfranio Correia
1287d8c53a BUG#39393 slave-skip-errors does not work when using ROW based replication
RBR was not considering the option --slave-skip-errors.
                              
To fix the problem, we are reporting the ignored ERROR(s) as warnings thus avoiding 
stopping the SQL Thread. Besides, it fixes the output of "SHOW VARIABLES LIKE 
'slave_skip_errors'" which was showing nothing when the value "all" was assigned 
to --slave-skip-errors.
                  
@sql/log_event.cc
  skipped rbr errors when the option skip-slave-errors is set.
@sql/slave.cc
  fixed the output of for SHOW VARIABLES LIKE 'slave_skip_errors'"
@test-cases
  fixed the output of rpl.rpl_idempotency
  updated the test case rpl_skip_error
2009-04-05 13:03:04 +01:00
Georgi Kodinov
dccbde7a11 merged 5.1-main -> 5.1-bugteam 2009-04-01 12:57:34 +03:00
Georgi Kodinov
124d5e730c auto-merge 2009-03-27 14:15:50 +02:00
Georgi Kodinov
1b74eac6cb disabled a failing test suite due to bug #42311 2009-03-27 14:12:33 +02:00
Tatiana A. Nurnberg
5d0564ea48 auto-merge 2009-03-27 12:40:53 +01:00
Tatiana A. Nurnberg
858b8af739 Bug#43748: crash when non-super user tries to kill the replication threads
Test was flakey on some machines and showed spurious
reds for races.

New-and-improved test makes do with fewer statements,
no mysqltest-variables, and no backticks. Should hope-
fully be more robust. Heck, it's debatable whether we
should have a test for this, anyway.
2009-03-27 12:20:37 +01:00
Georgi Kodinov
b1bc018253 Worked around the problem described in bug #43884. 2009-03-27 12:59:31 +02:00
Andrei Elkin
badc6a127d Bug#38205 Row-based Replication (RBR) causes inconsistencies: HA_ERR_FOUND_DUP
Bug#319  if while a non-transactional slave is replicating a transaction possible problem 

It is impossible to roll back a mixed engines transaction when one of the engine is
non-transaction. In replication that fact is crucial because the slave can not safely
re-apply a transction that was interrupted with STOP SLAVE.

Fixed with making STOP SLAVE not be effective immediately in the case the current
group of replication events has modified a non-transaction table. In order for slave to leave
either the group needs finishing or the user issues KILL QUERY|CONNECTION slave_thread_id.
2009-03-26 10:25:06 +02:00
Leonard Zhou
9be507ca1f Merge 2009-03-26 12:37:24 +08:00
Tatiana A. Nurnberg
bafe7f60c6 Bug#43748: crash when non-super user tries to kill the replication threads
manual merge. also adds test specific to 5.1+
2009-03-25 17:42:34 +01:00
Andrei Elkin
921e3fe8bf Bug#42977 RBR logs for rows with more than 250 column results in corrupt binlog
The issue happened to be two-fold.
The table map event was recorded into binlog having
an incorrect size when number of columns exceeded 251. 
The Row-based event had incorrect recording and restoring m_width member within
the same as above conditions.

Fixed with correcting m_data_size and m_width.
2009-03-25 12:53:56 +02:00
Leonard Zhou
9095a69eda Fix test case erro in sles10-ia64-a.
Reset master before next test.
2009-03-25 14:19:42 +08:00
Luis Soares
6dff801284 BUG#39701: Mixed binlog format does not switch to row mode on
LOAD_FILE
            
LOAD_FILE is not safe to replicate in STATEMENT mode, because it
depends on a file (which is loaded on master and may not exist in
slave(s)). This leads to scenarios on which the slave replicates the
statement with 'load_file' and it will try to load the file from local
file system. Given that the file may not exist in the slave filesystem
the operation will not succeed (probably returning NULL), causing
master and slave(s) to diverge. However, when using MIXED mode
replication, this can be made to work, if the statement including
LOAD_FILE is marked as unsafe, triggering a switch to ROW mode,
meaning that the contents of the file are written to binlog as row
events. Consequently, the contents from the file in the master will
reach the slave via the binlog.
           
This patch addresses this bug by marking the load_file function as
unsafe. When in mixed mode and when LOAD_FILE is issued, there will be
a switch to row mode. Furthermore, when in statement mode, the
LOAD_FILE will raise a warning that the statement is unsafe in that
mode.
2009-03-24 18:27:33 +00:00
Leonard Zhou
d0cb03b860 Bug#43440 rpl.rpl_temp_table_mix_row fails sporadicly
The problem is that after disconnect, the DOPR TEMPORARY TABLE event didn't been
written into binlog. So after syncing with slave, the TEMPORARY table on slave 
is not removed.
      
Waiting DROP TEMPORARY TABLE event to be written into binlog before sync slave with
master.
2009-03-24 16:55:03 +08:00
Leonard Zhou
56184684f4 Merge 2009-03-24 14:24:27 +08:00
Georgi Kodinov
30f1fc229e Disabled the failing test case until bug #43440 is resolved 2009-03-23 11:38:54 +02:00
Alfranio Correia
b580f3b60b Post-fix BUG#42861. 2009-03-23 01:07:25 +00:00
Alfranio Correia
0a8f0d280c auto-merge 5.1-bugteam (local) --> 5.1-bugteam 2009-03-22 19:46:57 +00:00
Alfranio Correia
2f16f07054 Bug #42861 Assigning invalid directories to --slave-load-tmpdir crashes the slave
Compiling with debug and assigning an invalid directory to --slave-load-tmpdir
was crashing the slave due to the following assertion DBUG_ASSERT(! is_set() ||
can_overwrite_status). This assertion assumes that a thread can change its
state once (i.e. ok,error, etc) before aborting, cleaning/resuming or completing
its execution unless the overwrite flag (i.e. can_overwrite_status) is true.

The Append_block_log_event::do_apply_event which is responsible for creating
temporary file(s) was not cleaning the thread state. Thus a failure while
trying to create a file in an invalid temporary directory was causing the crash.

To fix the problem we check if the temporary directory is valid before starting
the SQL Thread and reset the thread state before creating a file in
Append_block_log_event::do_apply_event.
2009-03-18 10:31:17 +00:00
Leonard Zhou
b843ea8bf8 Merge 2009-03-16 17:06:22 +08:00
Guangbao Ni
647821432c Auto-merge from 5.1-bugteam 2009-03-18 15:02:06 +00:00
Guangbao Ni
15d24779e1 Bug #42217 mysql.procs_priv does not get replicated
mysql.procs_priv table itself does not get replicated.
Inserting routine privilege record into mysql.procs_priv table
is triggered by creating function/procedure statements
according to current user's privileges.
Because the current user of SQL thread has GLOBAL_ACL,
which doesn't need any check mysql.procs_priv privilege
when create/alter/execute routines.
Corresponding GLOBAL_ACL privilege user
doesn't insert routine privilege record into
mysql.procs_priv when creating a routine.

Fixed by switching the current user of SQL thread to definer user if
the definer user exists on slave.
That populates procs_priv, otherwise to keep the SQL thread
user and procs_priv remains unchanged.
2009-03-18 13:48:23 +00:00
Leonard Zhou
650a5722b3 BUG#22504 load data infile sql statement in replication architecture get error
The problem is issued because we set wrong start position and stop position of query string into binlog.
That two values are stored as part of head info of query string.
When we parse binlog, we first get position values then get the query string according position values.
But seems that two values are not calculated correctly after the parse of Yacc.

We don't want to touch so much of yacc because it may influence other codes.
So just add one space after 'INTO' key word when parsing.
This can easily resolve the problem.
2009-03-16 16:21:29 +08:00
Leonard Zhou
6db08f8df5 BUG#39858 rpl.rpl_rotate (rpl.rpl_rotate_logs) failed on pushbuild: result mismatch
The method to purge binary log files produces different results in some platforms.
The reason is that the purge time is calculated based on table modified time and
that can't guarantee to purge master-bin.000002 in all platforms.(eg. windows)

Use a new way that sets the time to purge binlog file 1 second after the last
modified time of master-bin.000002.
That can be sure that the file is always deleted in any platform.
2009-03-12 17:48:41 +08:00
Georgi Kodinov
3fb74f93d8 Revert the push for bug #39858 2009-03-11 17:19:18 +02:00
Leonard Zhou
77ffa795bd Merge 2009-03-11 14:10:50 +08:00
Leonard Zhou
cd80cee780 BUG#39858 rpl.rpl_rotate (rpl.rpl_rotate_logs) failed on pushbuild: result mismatch
The method to purge binary log files produces different results in some platforms.
The reason is that the purge time is calculated based on table modified time and
that can't guarantee to purge master-bin.000002 in all platforms.(eg. windows)

Use a new way that sets the time to purge binlog file 1 second after the last modified time of master-bin.000002.
That can be sure that the file is always deleted in any platform.
2009-03-11 13:35:58 +08:00
Luis Soares
5d5f0fcd47 BUG#39753: Replication failure on MIXED + bit + myisam + no PK
When using mixed mode the record values stored inside the storage
engine differed from the ones computed from the row event. This
happened because the prepare_record function was calling
empty_record macro causing some don't care bits to be left set.
                          
Replacing the empty_record plus explicitly setting defaults with 
restore_record to restore the record default values fixes this.
2009-03-05 20:54:53 +01:00
Leonard Zhou
5c5ace67d1 merge 2009-02-23 16:29:39 +08:00