rpl_packet got a timeout failure sporadically on PB when stopping
slave. The real reason of this bug is that STOP SLAVE stopped
IO thread first and then stopped SQL thread. It was
possible that IO thread stopped after replicating part of a
transaction which SQL thread was executing. SQL thread would
be hung if the transaction could not be rolled back safely.
After this patch, STOP SLAVE will stop SQL thread first and then stop IO
thread, which guarantees that IO thread will fetch the reset of the
events of the transaction that SQL thread is executing, so that SQL
thread can finish the transaction if it cannot be rolled back safely.
Added below auxiliary files to make the test code neater.
restart_slave_sql.inc
rpl_connection_master.inc
rpl_connection_slave.inc
rpl_connection_slave1.inc
Undoing the patch, it complicates the code but is not the solution
I do not beleive newline mismatch could be the cause of this failure
First, I cannot see how this could be a problem, mtr ignores the newline
when reading the expect file, and the file is written and read on Windows.
Second, if this really was the problem it should have been deterministic:
either the newline is correctly interepreted or it is not.
Backported the fix to 5.1.
Problem: the auxiliary test files rpl_start_server.inc and rpl_stop_server.inc
write a file that is later read by mtr. The bug was that the file was written
with platform-dependent newline terminators, i.e., \r\n on windows, whereas mtr
only understands \n.
Fix: write the file so that it uses \n on all platforms.
Major replication test framework cleanup. This does the following:
- Ensure that all tests clean up the replication state when they
finish, by making check-testcase check the output of SHOW SLAVE STATUS.
This implies:
- Slave must not be running after test finished. This is good
because it removes the risk for sporadic errors in subsequent
tests when a test forgets to sync correctly.
- Slave SQL and IO errors must be cleared when test ends. This is
good because we will notice if a test gets an unexpected error in
the slave threads near the end.
- We no longer have to clean up before a test starts.
- Ensure that all tests that wait for an error in one of the slave
threads waits for a specific error. It is no longer possible to
source wait_for_slave_[sql|io]_to_stop.inc when there is an error
in one of the slave threads. This is good because:
- If a test expects an error but there is a bug that causes
another error to happen, or if it stops the slave thread without
an error, then we will notice.
- When developing tests, wait_for_*_to_[start|stop].inc will fail
immediately if there is an error in the relevant slave thread.
Before this patch, we had to wait for the timeout.
- Remove duplicated and repeated code for setting up unusual replication
topologies. Now, there is a single file that is capable of setting
up arbitrary topologies (include/rpl_init.inc, but
include/master-slave.inc is still available for the most common
topology). Tests can now end with include/rpl_end.inc, which will clean
up correctly no matter what topology is used. The topology can be
changed with include/rpl_change_topology.inc.
- Improved debug information when tests fail. This includes:
- debug info is printed on all servers configured by include/rpl_init.inc
- User can set $rpl_debug=1, which makes auxiliary replication files
print relevant debug info.
- Improved documentation for all auxiliary replication files. Now they
describe purpose, usage, parameters, and side effects.
- Many small code cleanups:
- Made have_innodb.inc output a sensible error message.
- Moved contents of rpl000017-slave.sh into rpl000017.test
- Added mysqltest variables that expose the current state of
disable_warnings/enable_warnings and friends.
- Too many to list here: see per-file comments for details.
when generating new name.
If find_uniq_filename returns an error, then this error is not
being propagated upwards, and execution does not report error to
the user (although a entry in the error log is generated).
Additionally, some more errors were ignored in new_file_impl:
- when writing the rotate event
- when reopening the index and binary log file
This patch addresses this by propagating the error up in the
execution stack. Furthermore, when rotation of the binary log
fails, an incident event is written, because there may be a
chance that some changes for a given statement, were not properly
logged. For example, in SBR, LOAD DATA INFILE statement requires
more than one event to be logged, should rotation fail while
logging part of the LOAD DATA events, then the logged data would
become inconsistent with the data in the storage engine.
Problem: MySQL cp1251 did not support 'U+20AC EURO SIGN'
which was assigned a few years ago to 0x88.
Fix: adding mapping: 0x88 <-> U+20AC
@ mysql-test/include/ctype_8bit.inc
New shared file to test 8bit character sets.
@ mysql-test/r/ctype_cp1251.result
@ mysql-test/t/ctype_cp1251.test
Adding tests
@ sql/share/charsets/cp1251.xml
Adding mapping
@ strings/ctype-extra.c
Regenerating ctype-extra.c using strings/conf_to_src
according to new cp1251.xml
This is a regression from the fix for bug no 38999. A storage engine capable
of reading only a subset of a table's columns updates corresponding bits in
the read buffer to signal that it has read NULL values for the corresponding
columns. It cannot, and should not, update any other bits. Bug no 38999
occurred because the implementation of UPDATE statements compare the NULL bits
using memcmp, inadvertently comparing bits that were never requested from the
storage engine. The regression was caused by the storage engine trying to
alleviate the situation by writing to all NULL bits, even those that it had no
knowledge of. This has devastating effects for the index merge algorithm,
which relies on all NULL bits, except those explicitly requested, being left
unchanged.
The fix reverts the fix for bug no 38999 in both InnoDB and InnoDB plugin and
changes the server's method of comparing records. For engines that always read
entire rows, we proceed as usual. For engines capable of reading only select
columns, the record buffers are now compared on a column by column basis. An
assertion was also added so that non comparable buffers are never read. Some
relevant copy-pasted code was also consolidated in a new function.
Subselect executes twice, at JOIN::optimize stage
and at JOIN::execute stage. At optimize stage
Innodb prebuilt struct which is used for the
retrieval of column values is initialized in.
ha_innobase::index_read(), prebuilt->sql_stat_start is true.
After QUICK_ROR_INTERSECT_SELECT finished his job it
restores read_set/write_set bitmaps with initial values
and deactivates one of the handlers used by
QUICK_ROR_INTERSECT_SELECT in JOIN::cleanup
(it's the case when we reuse original handler as one of
handlers required by QUICK_ROR_INTERSECT_SELECT object).
On second subselect execution inactive handler is activated
in QUICK_RANGE_SELECT::reset, file->ha_index_init().
In ha_index_init Innodb prebuilt struct is reinitialized
with inappropriate read_set/write_set bitmaps. Further
reinitialization in ha_innobase::index_read() does not
happen as prebuilt->sql_stat_start is false.
It leads to partial retrieval of required field values
and we get a mix of field values from different records
in the record buffer.
The fix is to reset
read_set/write_set bitmaps as these values
are required for proper intialization of
internal InnoDB struct which is used for
the retrieval of column values
(see build_template(), ha_innodb.cc)
'CREATE TABLE IF NOT EXISTS ... SELECT' behaviour
BUG#55474, BUG#55499, BUG#55598, BUG#55616 and BUG#55777 are fixed
in this patch too.
This is the 5.1 part.
It implements:
- if the table exists, binlog two events: CREATE TABLE IF NOT EXISTS
and INSERT ... SELECT
- Insert nothing and binlog nothing on master if the existing object
is a view. It only generates a warning that table already exists.
if() treated any non-numeric string as false
Fixed to treat those as true instead
Added some test cases
Fixed missing $ in variable name in include/mix2.inc
table with active trx
Essentially, the problem is that InnoDB does a implicit commit
when a cursor (table handler) is unlocked/closed, creating
a dissonance between the transaction state within the server
layer and the storage engine layer. Theoretically, a statement
transaction can encompass several table instances in a similar
manner to a multiple statement transaction, hence it does not
make sense to limit a statement transaction to the lifetime of
the table instances (cursors) used within it.
Since this particular instance of the problem is only triggerable
on 5.1 and is masked on 5.5 due 2PC being skipped (assertion is in
the prepare phase of a 2PC), the solution (which is less risky) is
to explicitly end the transaction before the cached table is unlock
on rename table.
The patch is to be null merged into trunk.
DROP USER
RENAME USER CURRENT_USER() ...
GRANT ... TO CURRENT_USER()
REVOKE ... FROM CURRENT_USER()
ALTER DEFINER = CURRENT_USER() EVENTbut, When these statements are binlogged, CURRENT_USER() just is binlogged
as 'CURRENT_USER()', it is not expanded to the real user name. When slave
executes the log event, 'CURRENT_USER()' is expand to the user of slave
SQL thread, but SQL thread's user name always NULL. This breaks the replication.
After this patch, session's user will be written into query log events
if these statements call CURREN_USER() or 'ALTER EVENT' does not assign a definer.
DROP USER
RENAME USER CURRENT_USER() ...
GRANT ... TO CURRENT_USER()
REVOKE ... FROM CURRENT_USER()
ALTER DEFINER = CURRENT_USER() EVENTbut, When these statements are binlogged, CURRENT_USER() just is binlogged
as 'CURRENT_USER()', it is not expanded to the real user name. When slave
executes the log event, 'CURRENT_USER()' is expand to the user of slave
SQL thread, but SQL thread's user name always NULL. This breaks the replication.
After this patch, session's user will be written into query log events
if these statements call CURREN_USER() or 'ALTER EVENT' does not assign a definer.
Add code to waiting for a set of errors.
Add code to waiting for an error instead of waiting for io thread to stop, as
after 'START SLAVE', the status of io thread is still not running.
But it doesn't mean slave io thread encounters an error.
without FOR UPDATE is causing a lock".
SELECT statements with subqueries referencing InnoDB tables
were acquiring shared locks on rows in these tables when they
were executed in REPEATABLE-READ mode and with statement or
mixed mode binary logging turned on.
This was a regression which were introduced when fixing
bug 39843.
The problem was that for tables belonging to subqueries
parser set TL_READ_DEFAULT as a lock type. In cases when
statement/mixed binary logging at open_tables() time this
type of lock was converted to TL_READ_NO_INSERT lock at
open_tables() time and caused InnoDB engine to acquire
shared locks on reads from these tables. Although in some
cases such behavior was correct (e.g. for subqueries in
DELETE) in case of SELECT it has caused unnecessary locking.
This patch implements minimal version of the fix for the
specific problem described in the bug-report which supposed
to be not too risky for pushing into 5.1 tree.
The 5.5 tree already contains a more appropriate solution
which also addresses other related issues like bug 53921
"Wrong locks for SELECTs used stored functions may lead
to broken SBR".
This patch tries to solve the problem by ensuring that
TL_READ_DEFAULT lock which is set in the parser for
tables participating in subqueries at open_tables()
time is interpreted as TL_READ_NO_INSERT or TL_READ.
TL_READ is used only if we know that this is a SELECT
and that this particular table is not used by a stored
function.
Test coverage is added for both InnoDB and MyISAM.
This patch introduces an "incompatible" change in locking
scheme for subqueries used in SELECT ... FOR UPDATE and
SELECT .. IN SHARE MODE.
In 4.1 (as well as in 5.0 and 5.1 before fix for bug 39843)
the server would use a snapshot InnoDB read for subqueries
in SELECT FOR UPDATE and SELECT .. IN SHARE MODE statements,
regardless of whether the binary log is on or off.
If the user required a different type of read (i.e. locking
read), he/she could request so explicitly by providing FOR
UPDATE/IN SHARE MODE clause for each individual subquery.
The patch for bug 39843 broke this behaviour (which was not
documented or tested), and started to use locking reads for
all subqueries in SELECT ... FOR UPDATE/IN SHARE MODE.
This patch restores 4.1 behaviour.
This patch should be mostly null-merged into 5.5 tree.
Some of the test cases reference to binlog position and
these position numbers are written into result explicitly.
It is difficult to maintain if log event format changes.
There are a couple of cases explicit position number appears,
we handle them in different ways
A. 'CHANGE MASTER ...' with MASTER_LOG_POS or/and RELAY_LOG_POS options
Use --replace_result to mask them.
B. 'SHOW BINLOG EVENT ...'
Replaced by show_binlog_events.inc or wait_for_binlog_event.inc.
show_binlog_events.inc file's function is enhanced by given
$binlog_file and $binlog_limit.
C. 'SHOW SLAVE STATUS', 'show_slave_status.inc' and 'show_slave_status2.inc'
For the test cases just care a few items in the result of 'SHOW SLAVE STATUS',
only the items related to each test case are showed.
'show_slave_status.inc' is rebuild, only the given items in $status_items
will be showed.
'check_slave_is_running.inc' and 'check_slave_no_error.inc'
and 'check_slave_param.inc' are auxiliary files helping
to show running status and error information easily.
they are ignored to a new test suite "innodb_plugin".
Remove a hack in mtr that was deployed to run the builtin InnoDB tests against
the InnoDB Plugin. Also detect if a test is an 'innodb plugin test' and if so
then transparently replace the builtin InnoDB with the InnoDB Plugin.
This has been back-ported from 6.0 as the problems proved to afflict
5.1 as well.
The fix exposed two new bugs. They were reported as follows.
Bug no 52174: Sometimes wrong plan when reading a MAX value
from non-NULL index
Bug no 52173: Reading NULL value from non-NULL index gives wrong
result in embedded server
Both bugs taken together affect a much smaller class of queries than #47762,
so the fix stays for now.
NULL column for NULL
The optimization to read MIN() and MAX() values from an
index did not properly handle comparisons with NULL
values. Fixed by giving up the particular optimization step
if there are non-NULL safe comparisons with NULL values, as
the result is NULL anyway.
Also, Oracle copyright notice was added to all files.
The problem is that not all column names retrieved from a SELECT
statement can be used as view column names due to length and format
restrictions. The server failed to properly check the conformity
of those automatically generated column names before storing the
final view definition on disk.
Since columns retrieved from a SELECT statement can be anything
ranging from functions to constants values of any format and length,
the solution is to rewrite to a pre-defined format any names that
are not acceptable as a view column name.
The name is rewritten to "Name_exp_%u" where %u translates to the
position of the column. To avoid this conversion scheme, define
explict names for the view columns via the column_list clause.
Also, aliases are now only generated for top level statements.
write()/read()
Sometimes stop/restart master or stop/restart salve can cause
network error, which can cause the 'invalid file descriptor
-1 in syscall write()/read()' warnings. All involved test
cases except rpl_slave_load_remove_tmpfile belong to the
kind of network error. So they are expected.
The 'rpl_slave_load_remove_tmpfile' belongs to file error,
but it is testing the file error as following code:
DBUG_EXECUTE_IF("remove_slave_load_file_before_write",
my_close(fd,MYF(0)); fd= -1; my_delete(fname, MYF(0)););
So it's expected too.
To fix the problem, add the valgrind warnings to the global
suppression list to suppress it.
The test case rpl_binlog_corruption fails on windows because when
adding a line to the binary log index file it gets terminated
with a CR+LF (which btw, is the normal case in windows, but not on
Unixes - LF). This causes mismatch between the relay log names,
causing mysqld to report that it cannot find the log file.
We fix this by creating the instrumented index file through
mysql, ie, using SELECT ... INTO DUMPFILE ..., as opposed on
relying on ultimatly OS commands like: -- echo "..." >
index. These changes go into the file and make the procedure
platform independent:
include/setup_fake_relay_log.inc
Side note: when using SELECT ... INTO DUMPFILE ..., one needs to
check if mysqld is running with secure_file_priv. If it is, we do
it in two steps: 1. create the file on the allowed location;
2. move it to the datadir. If it is not, then we just create the
file directly on the datadir (so previous step 2. is not needed).
Manually deleteing one or more entries from 'master-bin.index', will
cause master infinitely loop to send one binlog file.
When starting a dump session, master opens index file and search the binlog file
which is being requested by the slave. The position of the binlog file in the
index file is recorded. it will be used to find the next binlog file when current
binlog file has dumped completely. As only the position is used, it may
not get the correct file if some entries has been removed manually from the index file.
the master will reopen the current binlog file which has been dump completely
and redump it if it can not get the next binlog file's name from index file.
It obviously is a logical error.
Even though it is allowed to manually change index file,
but it is not recommended. so after this patch, master
sends a fatal error to slave and close the dump session if a new binlog file
has been generated and master can not get it from the index file.