The bug is caused by a race condition between the
INSERT DELAYED thread and the client thread's FLUSH TABLE. The
FLUSH TABLE does not guarantee (as is (wrongly) suggested in the
test case) that the INSERT DELAYED is ever executed. The
execution of the test case will thus not be deterministic.
The fix has been to do a deterministic verification that both
threads are complete by checking the content of the table.
This test case tests a circular replication of four hosts.
A--->B--->C--->D--->A
The replicate is slow and needs more time to replicate all data in the circle.
The time it spends to replicate, sometimes, is longer than the time that
wait_condition.inc spends to wait that all data has been replicated. This
cause sporadical failure of this test case.
This patch uses sync_slave_with_master to ensure that all data can be replicated
successfully in the circle.
'LOAD DATA CONCURRENT [LOCAL] INFILE ...' statment only is binlogged as
'LOAD DATA [LOCAL] INFILE ...' in SBR and MBR. As a result, if replication is on,
queries on slaves will be blocked by the replication SQL thread.
This patch write code to write 'CONCURRENT' into the log event if 'CONCURRENT' option
is in the original statement in SBR and MBR.
{PROCEDURE|FUNCTION} FROM ...'
The master would hit an assertion when binary log was
active. This was due to the fact that the thread's diagnostics
area was being cleared before writing to the binlog,
independently of mysql_routine_grant returning an error or
not. When mysql_routine_grant was to return an error, the return
value and the diagnostics area contents would
mismatch. Consequently, neither my_ok would be called nor an
error would be signaled in the diagnostics area, eventually
triggering the assertion in net_end_statement.
We fix this by not clearing the diagnostics area at binlogging
time.
Before this patch, semisync assumed transactions running in parallel
can not be larger than max_connections, but this is not true when
the event scheduler is executing events, and cause semisync run out
of preallocated transaction nodes.
Fix the problem by allocating transaction nodes dynamically.
This patch also fixed a possible deadlock when running UNINSTALL
PLUGIN rpl_semi_sync_master and updating in parallel. Fixed by
releasing the internal Delegate lock before unlock the plugins.
The 'slave_patternload_file' is assigned to the real path of the load data file
when initializing the object of Relay_log_info. But the path of the load data
file is not formatted to real path when executing event from relay log. So the
error will be encountered if the path of the load data file is a symbolic link.
Actually the global 'opt_secure_file_priv' is not formatted to real path when
loading data from file. So the same thing will happen too.
To fix these errors, the path of the load data file should be formatted to
real path when executing event from relay log. And the 'opt_secure_file_priv'
should be formatted to real path when loading data infile.
<tmp_tbl> with RBL
When binlogging the statement, the server always handle the existing
object as a table, even though it is a view. However a view is
handled differently in other parts of the code thus leading the
statement to crash in RBL if the view exists.
This happens because the underlying tables for the view are not opened
when we try to call store_create_info() on the view in order to build
a CREATE TABLE statement.
This patch will only address the crash problem, other binlogging
problems related to CREATE TABLE IF NOT EXISTS LIKE when the existing
object is a view will be solved by BUG 47442.
In RBR, All statements operating on temporary tables should not be binlogged.
Despite this fact, after executing 'TRUNCATE... ' on a temporary table,
the command is still logged, even if in row-based mode. Consequently, this raises
problems in the slave as the table may not exist, resulting in an
execution failure. Ultimately, this causes the slave to report
an error and abort.
After this patch, 'TRUNCATE ...' statement on a temporary table will not be
binlogged in RBR.
Problem: Some system functions that could return different values on
master and slave were not marked unsafe. In particular:
GET_LOCK
IS_FREE_LOCK
IS_USED_LOCK
MASTER_POS_WAIT
RELEASE_LOCK
SLEEP
SYSDATE
VERSION
Fix: Mark these functions unsafe.
The 'rpl_get_master_version_and_clock' test verifies if the slave I/O
thread tries to reconnect to master when it tries to get the values of
the UNIX_TIMESTAMP, SERVER_ID from master under network disconnection.
So the master server is restarted for making the transient network
disconnection. Restarting master server can bring two problems as following:
1. The time out error is encountered sporadically. The slave I/O thread tries
to reconnect master ten times, which is set in my.cnf. So in the test
framework sporadically the slave I/O thread really stoped when it can't
reconnect to master in the ten times successfully before the master starts,
then the time out error will be encountered while waiting for the slave to
start.
2. These warnings and errors are produced in server log file when
the slave I/O thread tries to get the values of the UNIX_TIMESTAMP,
SERVER_ID from master under the transient network disconnection.
To fix problem 1, increase the master retry count to sixty times,
so that the slave I/O thread has enough time to reconnect master
successfully.
To fix problem 2, suppress these warnings and errors by mtr suppression,
because they are expected.
Conflicts
=========
Text conflict in .bzr-mysql/default.conf
Text conflict in libmysqld/CMakeLists.txt
Text conflict in libmysqld/Makefile.am
Text conflict in mysql-test/collections/default.experimental
Text conflict in mysql-test/extra/rpl_tests/rpl_row_sp006.test
Text conflict in mysql-test/suite/binlog/r/binlog_tmp_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata_fatal.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_create_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_sp006_InnoDB.result
Text conflict in mysql-test/suite/rpl/r/rpl_stm_log.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_circular_simplex.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_sp006.result
Text conflict in mysql-test/t/mysqlbinlog.test
Text conflict in sql/CMakeLists.txt
Text conflict in sql/Makefile.am
Text conflict in sql/log_event_old.cc
Text conflict in sql/rpl_rli.cc
Text conflict in sql/slave.cc
Text conflict in sql/sql_binlog.cc
Text conflict in sql/sql_lex.h
21 conflicts encountered.
NOTE
====
mysql-5.1-rpl-merge has been made a mirror of mysql-next-mr:
- "mysql-5.1-rpl-merge$ bzr pull ../mysql-next-mr"
This is the first cset (merge/...) committed after pulling
from mysql-next-mr.
Backporting BUG#43789 to mysql-5.1-bugteam
The replication was generating corrupted data, warning messages on Valgrind
and aborting on debug mode while replicating a "null" to "not null" field.
Specifically the unpack_row routine, was considering the slave's table
definition and trying to retrieve a field value, where there was nothing to be
retrieved, ignoring the fact that the value was defined as "null" by the master.
To fix the problem, we proceed as follows:
1 - If it is not STRICT sql_mode, implicit default values are used, regardless
if it is multi-row or single-row statement.
2 - However, if it is STRICT mode, then a we do what follows:
2.1 If it is a transactional engine, we do a rollback on the first NULL that is
to be set into a NOT NULL column and return an error.
2.2 If it is a non-transactional engine and it is the first row to be inserted
with multi-row, we also return the error. Otherwise, we proceed with the
execution, use implicit default values and print out warning messages.
Unfortunately, the current patch cannot mimic the behavior showed by the master
for updates on multi-tables and multi-row inserts. This happens because such
statements are unfolded in different row events. For instance, considering the
following updates and strict mode:
(master)
create table t1 (a int);
create table t2 (a int not null);
insert into t1 values (1);
insert into t2 values (2);
update t1, t2 SET t1.a=10, t2.a=NULL;
t1 would have (10) and t2 would have (0) as this would be handled as a
multi-row update. On the other hand, if we had the following updates:
(master)
create table t1 (a int);
create table t2 (a int);
(slave)
create table t1 (a int);
create table t2 (a int not null);
(master)
insert into t1 values (1);
insert into t2 values (2);
update t1, t2 SET t1.a=10, t2.a=NULL;
On the master t1 would have (10) and t2 would have (NULL). On
the slave, t1 would have (10) but the update on t1 would fail.
Add an option to control whether the master should keep waiting
until timeout when it detected that there is no semi-sync slave
available.
The bool option 'rpl_semi_sync_master_wait_no_slave' is 1 by
defalt, and will keep waiting until timeout. When set to 0, the
master will switch to asynchronous replication immediately when
no semi-sync slave is available.
Semi-sync status were not reset by FLUSH STATUS, this was because
all semi-sync status variables are defined as SHOW_FUNC and FLUSH
STATUS could only reset SHOW_LONG type variables.
This problem is fixed by change all status variables that should
be reset by FLUSH STATUS from SHOW_FUNC to SHOW_LONG.
After the fix, the following status variables will be reset by
FLUSH STATUS:
Rpl_semi_sync_master_yes_tx
Rpl_semi_sync_master_no_tx
Note: normally, FLUSH STATUS itself will be written into binlog
and be replicated, so after FLUSH STATS, one of
Rpl_semi_sync_master_yes_tx
Rpl_semi_sync_master_no_tx
can be 1 dependent on the semi-sync status. So it's recommended
to use FLUSH NO_WRITE_TO_BINLOG STATUS to avoid this.
Semi-sync uses an extra connection from slave to master to send
replies, this is a normal client connection, and used a normal
SET query to set the reply information on master, which is visible
to user and may cause some confusion and complaining.
This problem is fixed by using the method of sending reply by
using the same connection that is used by master dump thread to
send binlog to slave. Since now the semi-sync plugins are integrated
with the server code, it is not a problem to use the internal net
interfaces to do this.
The master dump thread will mark the event requires a reply and
wait for the reply when the event just sent is the last event
of a transaction and semi-sync status is ON; And the slave will
send a reply to master when it received such an event that requires
a reply.
To-number conversion warnings work differenly with CHAR
and VARCHAR sp variables.
The original revision-IDs are:
staale.smedseng@sun.com-20081124095339-2qdvzkp0rn1ljs30staale.smedseng@sun.com-20081125104611-rtxic5d12e83ag2o
The patch provides ER_TRUNCATED_WRONG_VALUE warning messages
for conversion of VARCHAR to numberic values, in line with
messages provided for CHAR conversions. Conversions are
checked for success, and the message is emitted in case
failure.
The tests are amended to accept the added warning messages,
and explicit conversion of ON/OFF values is added for
statements checking system variables. In test
rpl.rpl_switch_stm_row_mixed checking for warnings is
temporarily disabled for one statement, as this generates
warning messages for strings that vary between executions.
The issue appears when number of heartbeat events non-zero before start of test
block. But really we need to check that no new events has received during test block.
So I did following:
1. Replace absolute values by diff of values
2. Increase heartbeat period from 1.5 to 5 sec