Commit graph

58192 commits

Author SHA1 Message Date
He Zhenxing
6ae50d8aed Postfix after merge semi-sync with heartbeat
Use ev_offset instead of 1 as the packet header offset when getting
log position from events for heartbeat

call reset_transmit_packet before calling send_heartbeat_event


sql/sql_repl.cc:
  Use ev_offset instead of 1 as the packet header offset when getting log position from events for heartbeat
  call reset_transmit_packet before calling send_heartbeat_event
2009-10-14 13:24:47 +08:00
He Zhenxing
e5e8f1a47f Auto merge 5.1-rep-semisync 2009-10-13 10:28:02 +08:00
He Zhenxing
48e98026e2 Backport post fix for semisync
Remove functions that no longer needed
Fix warning suppressions

mysql-test/suite/rpl/t/rpl_semi_sync.test:
  Fix warning suppressions
plugin/semisync/semisync_slave.cc:
  Remove functions that no longer needed
plugin/semisync/semisync_slave.h:
  Remove functions that no longer needed
2009-10-12 21:21:13 +08:00
He Zhenxing
1a7c7a4066 Backport BUG#47298 Semisync: always wait until timeout if no semi-sync slave available
Add an option to control whether the master should keep waiting
until timeout when it detected that there is no semi-sync slave
available.

The bool option 'rpl_semi_sync_master_wait_no_slave' is 1 by
defalt, and will keep waiting until timeout. When set to 0, the
master will switch to asynchronous replication immediately when
no semi-sync slave is available.
2009-10-12 21:15:32 +08:00
He Zhenxing
26b47d9347 BUG#45674 FLUSH STATUS does not reset semisynchronous counters
Semi-sync status were not reset by FLUSH STATUS, this was because
all semi-sync status variables are defined as SHOW_FUNC and FLUSH
STATUS could only reset SHOW_LONG type variables.

This problem is fixed by change all status variables that should
be reset by FLUSH STATUS from SHOW_FUNC to SHOW_LONG.

After the fix, the following status variables will be reset by
FLUSH STATUS:
  Rpl_semi_sync_master_yes_tx
  Rpl_semi_sync_master_no_tx

Note: normally, FLUSH STATUS itself will be written into binlog
and be replicated, so after FLUSH STATS, one of
  Rpl_semi_sync_master_yes_tx
  Rpl_semi_sync_master_no_tx
can be 1 dependent on the semi-sync status. So it's recommended
to use FLUSH NO_WRITE_TO_BINLOG STATUS to avoid this.
2009-10-12 21:03:04 +08:00
He Zhenxing
825dce21b3 Backport Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event' hook
Errors when send reply to master should never cause the IO thread
to stop, because master can fall back to async replication if it
does not get reply from slave.

The problem is fixed by deliberately ignoring the return value of
slaveReply.
2009-10-12 20:57:30 +08:00
He Zhenxing
f8155de078 Backport BUG#45848 Semisynchronous replication internals are visible in SHOW PROCESSLIST and logs
Semi-sync uses an extra connection from slave to master to send
replies, this is a normal client connection, and used a normal
SET query to set the reply information on master, which is visible
to user and may cause some confusion and complaining.

This problem is fixed by using the method of sending reply by
using the same connection that is used by master dump thread to
send binlog to slave. Since now the semi-sync plugins are integrated
with the server code, it is not a problem to use the internal net
interfaces to do this.

The master dump thread will mark the event requires a reply and
wait for the reply when the event just sent is the last event
of a transaction and semi-sync status is ON; And the slave will
send a reply to master when it received such an event that requires
a reply.
2009-10-12 20:55:01 +08:00
Serge Kozlov
a8a96593ca Bug#42891: Tests cleanup
Fix for backport into mysql-5.1-rep+2
2009-10-10 14:27:07 +04:00
Andrei Elkin
6a541923c3 backporting fixes of bug@45940 to 5.1-rpl+2 to cover failures rpl_heartbeat_* as well.
mysql-test/suite/rpl/r/rpl_stop_middle_group.result:
  the new result file
mysql-test/suite/rpl/t/rpl_stop_middle_group.test:
  renamed from rpl_row_stop_middle_update and added a regression test for bug#45940.
2009-10-09 16:26:37 +03:00
Luis Soares
66b869d637 BUG#40611, BUG#44779: reverted in mysql-5.1-rep+2. 2009-10-07 22:13:07 +01:00
Serge Kozlov
3c98da71ae WL#4641, Bug#43828: fix for rpl_heartbeat_basic.test:
The issue appears when number of heartbeat events non-zero before start of test
block. But really we need to check that no new events has received during test block.
So I did following:
1. Replace absolute values by diff of values
2. Increase heartbeat period from 1.5 to 5 sec
2009-10-07 19:44:01 +04:00
Andrei Elkin
f02ec2b5a7 correcting rpl_partition.result 2009-10-07 01:05:27 +03:00
Serge Kozlov
494cb46d83 WL#3788
It is backport patch.
This adds new test case for testing affects of some variables to replication.
2009-10-03 22:21:44 +04:00
He Zhenxing
f108d05932 Manual merge semi-sync to 5.1-rep+2 2009-10-03 18:50:25 +08:00
He Zhenxing
d8724a4538 Fix semisync master/slave status always showed as OFF on sparc
On sparc, semisync master/slave status is always showed as OFF, this
is fixed by change rpl_semisync_master/slave_status variables from
long to char.

plugin/semisync/semisync_master.cc:
  Change rpl_semisync_master_status variables from long to char
plugin/semisync/semisync_master.h:
  Change rpl_semisync_master_status variables from long to char
plugin/semisync/semisync_slave.cc:
  Change rpl_semisync_slave_status variables from long to char
plugin/semisync/semisync_slave.h:
  Change rpl_semisync_slave_status variables from long to char
2009-10-03 13:00:05 +08:00
He Zhenxing
a8c14d9e0e Auto merge 2009-10-03 10:07:03 +08:00
He Zhenxing
d51bd44479 Post fix result file 2009-10-03 09:40:32 +08:00
Serge Kozlov
3aae87ad92 WL#4641 Heartbeat testing
This is backport for next-mr.

The patch adds new test cases that cover replication heartbeat testing.
2009-10-02 23:24:40 +04:00
Andrei Elkin
db72514486 fixing tests results: rpl_ndb_log, rpl_ndb_multi, sp_trans_log; adding replicate-ignore_server_ids specific tests 2009-10-02 16:15:54 +03:00
He Zhenxing
328dabf473 Post fix SEMISYNC_PLUGIN_OPT when semi-sync plugins are not found
mysql-test/mysql-test-run.pl:
  Set SEMISYNC_PLUGIN_OPT to '--plugin-dir=' when semi-sync plugins are not found
2009-10-02 19:16:06 +08:00
He Zhenxing
54120363ef Backport fixes for the follow tests
binlog_tmp_table
rpl_row_sp006_InnoDB
rpl_slave_status
2009-10-02 17:24:21 +08:00
He Zhenxing
280bf1cee6 Backport Post fix of result files after push of BUG#34227 2009-10-02 17:12:10 +08:00
He Zhenxing
dfbac1dd1d Backport BUG#34227 Replication permission error message is misleading
According to Jon's comment, add (at least one of) to the error message.
2009-10-02 16:50:05 +08:00
He Zhenxing
4381f7ed90 Backport post fix compiler warnings and test failures for BUG#25192 BUG#12190 2009-10-02 16:40:06 +08:00
He Zhenxing
228ae2bf50 Backport BUG#12190 CHANGE MASTER has differ path requiremts on MASTER_LOG_FILE and RELAY_LOG_FILE
CHANGE MASTER TO command required the value for RELAY_LOG_FILE to
be an absolute path, which was different from the requirement of
MASTER_LOG_FILE.

This patch fixed the problem by changing the value for RELAY_LOG_FILE
to be the basename of the log file as that for MASTER_LOG_FILE.
2009-10-02 16:35:03 +08:00
He Zhenxing
9739efbfec Backport BUG#25192 Using relay-log and relay-log-index without values produces unexpected results.
Options loaded from config files were added before command line
arguments, and they were parsed together, which could interprete
the following:
option-a
option-b
as --option-a=--option-b if 'option-a' requires a value, and 
caused confusing.

Because all options that requires a value are always given in
the form '--option=value', so it's an error if there is no 
'=value' part for such an option read from config file.

This patch added a separator to separate the arguments from 
config files and that from command line, so that they can be
handled differently. And report an error for options loaded
from config files that requires a value and is not given in the
form '--option=value'.
2009-10-02 16:25:53 +08:00
He Zhenxing
bb6953d1d8 Backport BUG#38468 Memory leak detected when using mysqlbinlog utility
There were two memory leaks in mysqlbinlog command, one was already
fixed by previous patches, another one was that defaults_argv was
set to the value of argv after parse_args, in which called
handle_options after calling load_defaults and changed the value
of argv, and caused the memory allocated for defaults arguments
not freed.

Fixed the problem by setting defaults_argv right after calling
load_defaults.
2009-10-02 16:18:40 +08:00
He Zhenxing
f509a35896 Backport BUG#41613 Slave I/O status inconsistent between SHOW SLAVE STATUS and START SLAVE
There are three internal status for slave I/O thread, both
MYSQL_SLAVE_RUN_NOT_CONNECT and MYSQL_SLAVE_NOT_RUN are reported
as 'No' for Slave_IO_running of command SHOW SLAVE STATUS.

Change MYSQL_SLAVE_RUN_NOT_CONNECT to be reported as 'Connecting'.
2009-10-02 16:07:33 +08:00
He Zhenxing
ebca60c1ff Backport BUG#41013 main.bootstrap coredumps in 6.0-rpl
When a storage engine failed to initialize before allocated slot number,
the slot number would be 0, and when later finalizing this plugin, it would
accidentally unplug the storage engine currently uses slot 0.

This patch fixed this problem by add a new macro value HA_SLOT_UNDEF to
distinguish undefined slot number from slot 0.
2009-10-02 13:59:42 +08:00
He Zhenxing
d9286fbc22 Post fix backporting wl#1720
Fix mtr semisync plugin option paths
2009-10-02 12:11:50 +08:00
Andrei Elkin
737910fb11 merge from 5.1-rpl+2 repo to a local branch with HB and bug@27808 fixes 2009-10-01 20:22:44 +03:00
Andrei Elkin
d91aa57c38 backporting bug@27808 fixes 2009-10-01 19:44:53 +03:00
Luis Soares
a367c88c9e Partial backport for BUG#41399, more precisely, the changes to
wait_until_disconnected.inc.
2009-10-01 00:32:15 +01:00
Alfranio Correia
5c25d17c4e BUG#43075 rpl.rpl_sync fails sporadically on pushbuild
NOTE: Backporting the patch to next-mr.
      
The slave was crashing while failing to execute the init_slave() function.
      
The issue stems from two different reasons:
      
1 - A failure while allocating the master info structure generated a
    segfault due to a NULL pointer.
      
2 - A failure while recovering generated a segfault due to a non-initialized
    relay log file. In other words, the mi->init and rli->init were both set to true
    before executing the recovery process thus creating an inconsistent state as the
    relay log file was not initialized.
      
To circumvent such problems, we refactored the recovery process which is now executed
while initializing the relay log. It is ensured that the master info structure is
created before accessing it and any error is propagated thus avoiding to set mi->init
and rli->init to true when for instance the relay log is not initialized or the relay
info is not flushed.
      
The changes related to the refactory are described below:
      
1 - Removed call to init_recovery from init_slave.
      
2 - Changed the signature of the function init_recovery.
      
3 - Removed flushes. They are called while initializing the relay log and master
    info.
      
4 - Made sure that if the relay info is not flushed the mi-init and rli-init are not
    set to true.
      
In this patch, we also replaced the exit(1) in the fault injection by DBUG_ABORT()
to make it compliant with the code guidelines.
2009-09-30 22:41:05 +01:00
Luis Soares
8f43e00841 BUG#47749: rpl_slave_skip fails sporadically on PB2 (mysql-5.1-rep+2 tree).
rpl_slave_skip fails randomly on PB2. This patch fixes the failure by
setting explicit wait for SQL thread to stop, instead of the 
wait_for_slave_to_stop mysqltest command, after a start until command 
is executed.
2009-09-30 17:42:25 +01:00
Alfranio Correia
0ebecf7f0a BUG#47741 rpl_ndb_extraCol fails in next-mr (mysql-5.1-rep+2) in RBR
This is a temporary fix.

NOTE: Backporting the patch to next-mr.
2009-09-30 16:25:01 +01:00
Alfranio Correia
9922788d7a Post-fix for BUG#43789
NOTE: Backporting the patch to next-mr.
2009-09-30 15:17:15 +01:00
Luis Soares
2f51f75825 Automerge: mysql-5.1-rep+2 (local backports) --> mysql-5.1-rep+2 (local latest) 2009-09-30 12:48:22 +01:00
He Zhenxing
43fe9e045c Backporting BUG#40244 Optimized build of mysqld crashes when built with Sun Studio on SPARC 2009-09-30 19:36:35 +08:00
He Zhenxing
6799db2504 Back porting the test case for semi-sync 2009-09-30 16:09:31 +08:00
Alfranio Correia
3ab71376ce BUG#40337 Fsyncing master and relay log to disk after every event is too slow
NOTE: Backporting the patch to next-mr.
      
The fix proposed in BUG#35542 and BUG#31665 introduces a performance issue
when fsyncing the master.info, relay.info and relay-log.bin* after #th events.
Although such solution has been proposed to reduce the probability of corrupted
files due to a slave-crash, the performance penalty introduced by it has
made the approach impractical for highly intensive workloads.
      
In a nutshell, the option --syn-relay-log proposed in BUG#35542 and BUG#31665
simultaneously fsyncs master.info, relay-log.info and relay-log.bin* and
this is the main source of performance issues.
      
This patch introduces new options that give more control to the user on
what should be fsynced and how often:
      
   1) (--sync-master-info, integer) which syncs the master.info after #th event;
   2) (--sync-relay-log, integer) which syncs the relay-log.bin* after #th
   events.
   3) (--sync-relay-log-info, integer) which syncs the relay.info after #th
   transactions.
      
   To provide both performance and increased reliability, we recommend the following
   setup:
      
   1) --sync-master-info = 0 eventually the operating system will fsync it;
   2) --sync-relay-log = 0 eventually the operating system will fsync it;
   3) --sync-relay-log-info = 1 fsyncs it after every transaction;
      
Notice, that the previous setup does not reduce the probability of
corrupted master.info and relay-log.bin*. To overcome the issue, this patch also
introduces a recovery mechanism that right after restart throws away relay-log.bin*
retrieved from a master and updates the master.info based on the relay.info:
      
      
   4) (--relay-log-recovery, boolean) which enables a recovery mechanism that
   throws away relay-log.bin* after a crash.
      
However, it can only recover the incorrect binlog file and position in master.info,
if other informations (host, port password, etc) are corrupted or incorrect,
then this recovery mechanism will fail to work.
2009-09-29 15:40:52 +01:00
Alfranio Correia
0110bd04d2 BUG#35542 Add option to sync master and relay log to disk after every event
BUG#31665 sync_binlog should cause relay logs to be synchronized

NOTE: Backporting the patch to next-mr.
      
Add sync_relay_log option to server, this option works for relay log 
the same as option sync_binlog for binlog. This option also synchronize
master info to disk when set to non-zero value.
            
Original patches from Sinisa and Mark, with some modifications
2009-09-29 15:27:12 +01:00
Alfranio Correia
25162d0166 BUG#43789 different master/slave table defs cause crash: text/varchar null
vs not null

NOTE: Backporting the patch to next-mr.
                        
The replication was generating corrupted data, warning messages on Valgrind
and aborting on debug mode while replicating a "null" to "not null" field.
Specifically the unpack_row routine, was considering the slave's table
definition and trying to retrieve a field value, where there was nothing to be
retrieved, ignoring the fact that the value was defined as "null" by the master.
                        
To fix the problem, we proceed as follows:
                        
1 - If it is not STRICT sql_mode, implicit default values are used, regardless
if it is multi-row or single-row statement.
                        
2 - However, if it is STRICT mode, then a we do what follows:
                        
2.1 If it is a transactional engine, we do a rollback on the first NULL that is
to be set into a NOT NULL column and return an error.
                        
2.2 If it is a non-transactional engine and it is the first row to be inserted
with multi-row, we also return the error. Otherwise, we proceed with the
execution, use implicit default values and print out warning messages.
                  
Unfortunately, the current patch cannot mimic the behavior showed by the master
for updates on multi-tables and multi-row inserts. This happens because such
statements are unfolded in different row events. For instance, considering the
following updates and strict mode:
                  
(master)
create table t1 (a int);
create table t2 (a int not null);
insert into t1 values (1);
insert into t2 values (2);
update t1, t2 SET t1.a=10, t2.a=NULL;
                  
t1 would have (10) and t2 would have (0) as this would be handled as a
multi-row update. On the other hand, if we had the following updates:
                  
(master)
create table t1 (a int);
create table t2 (a int);
                  
(slave)
create table t1 (a int);
create table t2 (a int not null);
                  
(master)
insert into t1 values (1);
insert into t2 values (2);
update t1, t2 SET t1.a=10, t2.a=NULL;
                  
On the master t1 would have (10) and t2 would have (NULL). On
the slave, t1 would have (10) but the update on t1 would fail.
2009-09-29 15:18:44 +01:00
Luis Soares
ba98c88272 BUG#42928: binlog-format setting prevents server from start if binary
logging is disabled
      
NOTE: this is the backport to next-mr.
            
If one sets binlog-format but does NOT enable binary log, server
refuses to start up. The following messages appears in the error log:
            
090217 12:47:14 [ERROR] You need to use --log-bin to make
--binlog-format work.  
090217 12:47:14 [ERROR] Aborting
            
This patch addresses this by making the server not to bail out if the
binlog-format is set without the log-bin option. Additionally, the
specified binlog-format is stored, in the global system variable
"binlog_format", and a warning is printed instead of an error.
2009-09-29 15:12:43 +01:00
Luis Soares
b79b335033 BUG#40611: MySQL cannot make a binary log after sequential number
beyond unsigned long.
BUG#44779: binlog.binlog_max_extension may be causing failure on 
next test in PB
      
NOTE1: this is the backport to next-mr.
NOTE2: already includes patch for BUG#44779.
      
Binlog file extensions would turn into negative numbers once the
variable used to hold the value reached maximum for signed
long. Consequently, incrementing value to the next (negative) number
would lead to .000000 extension, causing the server to fail.
                  
This patch addresses this issue by not allowing negative extensions
and by returning an error on find_uniq_filename, when the limit is
reached. Additionally, warnings are printed to the error log when the
limit is approaching. FLUSH LOGS will also report warnings to the
user, if the extension number has reached the limit. The limit has been
set to 0x7FFFFFFF as the maximum.

mysql-test/suite/binlog/t/binlog_max_extension.test:
  Test case added that checks the maximum available number for
  binlog extensions.
sql/log.cc:
  Changes to find_uniq_filename and test_if_number.
sql/log.h:
  Added macros with values for MAX_LOG_UNIQUE_FN_EXT and
  LOG_WARN_UNIQUE_FN_EXT_LEFT, as suggested in review.
2009-09-29 15:12:07 +01:00
Luis Soares
b538536718 Bug #30703 SHOW STATUS LIKE 'Slave_running' is not compatible with `SHOW SLAVE
STATUS'
      
NOTE: this is the backport to next-mr.
            
SHOW SHOW STATUS LIKE 'Slave_running' command believes that
if active_mi->slave_running != 0, then io thread is running normally.
But it isn't so in fact. When some errors happen to make io thread
try to reconnect master, then it will become transitional status
(MYSQL_SLAVE_RUN_NOT_CONNECT == 1), which also doesn't equal 0.
Yet, "SHOW SLAVE STATUS" believes that only if
active_mi->slave_running == MYSQL_SLAVE_RUN_CONNECT, then io thread is running.
So "SHOW SLAVE STATUS" can get the correct result.
            
            
Fixed to make SHOW SHOW STATUS LIKE 'Slave_running' command have the same
check condition with "SHOW SLAVE STATUS". It only believe that the io thread
is running when active_mi->slave_running == MYSQL_SLAVE_RUN_CONNECT.
2009-09-29 15:10:37 +01:00
Luis Soares
b43d30e43a BUG#28796: CHANGE MASTER TO MASTER_HOST="" leads to invalid master.info
NOTE: this is the backport to next-mr.
                
This patch addresses the bug reported by checking wether 
host argument is an empty string or not. If empty, an error is
reported to the client, otherwise continue normally.
                       
This commit is based on the originally proposed patch and adds 
a test case as requested during review as well as refines comments, 
and makes test case result file less verbose (compared to previous patch).
2009-09-29 15:09:46 +01:00
Luis Soares
01cdba58ba BUG#23300: Slow query log on slave does not log slow replicated statements
NOTE: this is the backport to next-mr.
      
When using replication, the slave will not log any slow query logs queries 
replicated from the master, even if the option "--log-slow-slave-statements" 
is set and these take more than "log_query_time" to execute.
              
In order to log slow queries in replicated thread one needs to set the
--log-slow-slave-statements, so that the SQL thread is initialized with the 
correct switch. Although setting this flag correctly configures the slave 
thread option to log slow queries, there is an issue with the condition that 
is used to check whether to log the slow query or not. When replaying binlog 
events the statement contains the SET TIMESTAMP clause which will force the 
slow logging condition check to fail. Consequently, the slow query logging will
not take place.
              
This patch addresses this issue by removing the second condition from the
log_slow_statements as it prevents slow queries to be binlogged and seems 
to be deprecated.
2009-09-29 15:09:01 +01:00
Alfranio Correia
cc9e25af54 BUG#38173 Field doesn't have a default value with row-based replication
NOTE: Backporting the patch to next-mr.
      
The reason of  the bug was incompatibile with the master side behaviour.
INSERT query on the master is allowed to insert into a table without specifying
values of DEFAULT-less fields if sql_mode is not strict.
            
Fixed with checking sql_mode by the sql thread to decide how to react.
Non-strict sql_mode should allow Write_rows event to complete.
            
todo: warnings can be shown via show slave status, still this is a 
separate rather general issue how to show warnings for the slave threads.
2009-09-29 15:04:21 +01:00
Alfranio Correia
b7f887652b WL#4828 and BUG#45747
NOTE: Backporting the patch to next-mr.

WL#4828 Augment DBUG_ENTER/DBUG_EXIT to crash MySQL in different functions
-------

The assessment of the replication code in the presence of faults is extremely
import to increase reliability. In particular, one needs to know if servers
will either correctly recovery or print out appropriate error messages thus
avoiding unexpected problems in a production environment.

In order to accomplish this, the current patch refactories the debug macros
already provided in the source code and introduces three new macros that
allows to inject faults, specifically crashes, while entering or exiting a
function or method. For instance, to crash a server while returning from
the init_slave function (see module sql/slave.cc), one needs to do what
follows:

1 - Modify the source replacing DBUG_RETURN by DBUG_CRASH_RETURN;

  DBUG_CRASH_RETURN(0);

2 - Use the debug variable to activate dbug instructions:

  SET SESSION debug="+d,init_slave_crash_return";

The new macros are briefly described below:

DBUG_CRASH_ENTER (function) is equivalent to DBUG_ENTER which registers the
beginning of a function but in addition to it allows for crashing the server
while entering the function if the appropriate dbug instruction is activate.
In this case, the dbug instruction should be "+d,function_crash_enter".

DBUG_CRASH_RETURN (value) is equivalent to DBUG_RETURN which notifies the
end of a function but in addition to it allows for crashing the server
while returning from the function if the appropriate dbug instruction is
activate. In this case, the dbug instruction should be
"+d,function_crash_return". Note that "function" should be the same string
used by either the DBUG_ENTER or DBUG_CRASH_ENTER.

DBUG_CRASH_VOID_RETURN (value) is equivalent to DBUG_VOID_RETURN which
notifies the end of a function but in addition to it allows for crashing
the server while returning from the function if the appropriate dbug
instruction is activate. In this case, the dbug instruction should be
"+d,function_crash_return". Note that "function" should be the same string
used by either the DBUG_ENTER or DBUG_CRASH_ENTER.

To inject other faults, for instance, wrong return values, one should rely
on the macros already available. The current patch also removes a set of
macros that were either not being used or were redundant as other macros
could be used to provide the same feature. In the future, we also consider
dynamic instrumentation of the code.


BUG#45747 DBUG_CRASH_* is not setting the strict option
---------
      
When combining DBUG_CRASH_* with "--debug=d:t:i:A,file" the server crashes
due to a call to the abort function in the DBUG_CRASH_* macro althought the
appropriate keyword has not been set.
2009-09-29 14:55:36 +01:00