a.k.a. Bug#7975 deadlock without any locking, simple select and update
Bug#7975 was reintroduced when the storage engine API was made
pluggable in MySQL 5.1. Instead of looking at thd->lex directly, we
rely on handler::extra(). But, we were looking at the wrong extra()
flag, and we were ignoring the TRX_DUP_REPLACE flag in places where we
should obey it.
innodb_replace.test: Add tests for hopefully all affected statement
types, so that bug should never ever resurface. This kind of tests
should have been added when fixing Bug#7975 in MySQL 5.0.3 in the
first place.
rb:806 approved by Sunny Bains
post-push fixes for show_slave_io_error= 1 of wait_for_slave_io_error.inc;
Unix and win format path specifically so few tests have to change show_slave_io_error
to zero.
The bug case is similar to one fixed earlier bug_49536.
Deadlock involving LOCK_log appears to be possible because the purge running thread
is holding LOCK_log whereas there is no sense of doing that and which fact was
exploited by the earlier bug fixes.
Fixed with small reengineering of rotate_and_purge(), adding two new methods and
setting up a policy to execute those instead of the former
rotate_and_purge(RP_LOCK_LOG_IS_ALREADY_LOCKED).
The policy for using rotate(), purge() is that if the caller acquires LOCK_log itself,
it should call rotate(), release the mutex and run purge().
Side effect of this patch is refining error message of bug@11747416 to print
the whole path.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_log_pos.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_packet.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_rotate_purge_deadlock.result:
new result file is added.
mysql-test/suite/rpl/t/rpl_cant_read_event_incident.test:
The test of that bug can't satisfy windows and unix backslash interpretation so windows
execution is chosen to bypass.
mysql-test/suite/rpl/t/rpl_rotate_purge_deadlock-master.opt:
new opt file is added.
mysql-test/suite/rpl/t/rpl_rotate_purge_deadlock.test:
regression test is added as well as verification of a
possible side effect of the fixes is tried.
sql/log.cc:
LOCK_log is never taken during execution of log purging routine.
The former MYSQL_BIN_LOG::rotate_and_purge is made to necessarily
acquiring and releasing LOCK_log.
If caller takes the mutex itself it has to use a new rotate(), purge()
methods combination and to never let purge() be run with LOCK_log grabbed.
split apart to allow
the caller to chose either it
Simulation of concurrently rotating/purging threads is added.
sql/log.h:
new rotate(), purge() methods are added to be used instead of
the former rotate_and_purge(RP_LOCK_LOG_IS_ALREADY_LOCKED).
rotate_and_purge() signature is changed. Caller should not call rotate_and_purge()
but rather {rotate(), purge()} if LOCK_log is acquired by it.
sql/rpl_injector.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_class.h:
unnecessary constants are removed.
sql/sql_parse.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_reload.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_repl.cc:
followup for bug@11747416: the file name printing is changed to a relative
path instead of just the file name.
This was an attempt to address problems with the Bug#12612184 fix.
Even with this follow-up fix, crash recovery can be broken.
Let us fix the bug later.
In the ON UPDATE CASCADE clause of FOREIGN KEY constraints, the
calculated update vector was not fully initialized. This bug was
introduced in the InnoDB Plugin when implementing support for
ROW_FORMAT=DYNAMIC.
Additionally, the data type information was not initialized, but
apparently it has never been needed in this case. Nevertheless, it is
not good programming practice to pass uninitialized values around.
calc_row_difference(): Declare the update field uninitialized in
Valgrind. Copy the data type information as well, except when the
field is SQL NULL. In the built-in InnoDB, initialize
ufield->extern_storage = FALSE (an initialization bug that had gone
unnoticed this far). The InnoDB Plugin and later have this flag to
dfield_t and have always initialized it properly.
row_ins_cascade_calc_update_vec(): Reduce the scope of some
pointers. Initialize orig_len. (This caused the bug in InnoDB Plugin
and later.)
row_ins_foreign_check_on_constraint(): Simplify a condition. Declare
the update vector uninitialized.
rb:771 approved by Jimmy Yang
PARENT FOR OTHER ONE
Do not try to lookup key_nr'th key in 'table' because there may not be such
a key there. key_nr is the number of the key in the _child_ table name, not
in the parent table.
Instead just print the fields of the record that are covered by the first key
defined on the parent table.
This bug gets a better fix in MySQL 5.6, which is too risky for 5.1 and 5.5.
Approved by: Jon Olav Hauglid (via IM)
This is a redo for 5.5
Added 'innodb_file_format_max' as variable to ignore change to.
Tests that had to restore this amended
Two tests assumed it to be Antelope, make sure these run on a freshly
started server
Binary log of master can get a partially logged event if the server
runs out of disk space and, while waiting for some space to be freed,
is shut down (or crashes). If the server is not stopped, it will just
wait endlessly for space to be freed, thus no partial event anomaly
occurs. The restarted master server has had a dubious policy to send
the incomplete event to slave which it apparently can't handle.
Although an error was printed out the fact of sending with unclear
error message is a source of confusion.
Actually the problem of presence an incomplete event in the binary log
was already fixed by WL 5493 (which was merged to our current trunk
branch, major version 5.6). The fix makes the server truncate the
binary log on server restart and recovery.
However 5.5 master can't do that. So the current issue is a problem of
sending incomplete events to the slave by 5.5 master.
It is fixed in this patch by changing the policy so that only complete
events are pushed by the dump thread to the IO thread. In addition,
the error text that master sends to the slave when an incomplete event
is found, now states that incomplete event may have been caused by an
out-of-disk space situation and provides coordinates of
the first and the last event bytes read.
mysql-test/std_data/bug11747416_32228_binlog.000001:
a binlog is added with the last event written partly.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
new result file is added.
mysql-test/suite/rpl/r/rpl_log_pos.result:
results updated.
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
results updated.
mysql-test/suite/rpl/r/rpl_packet.result:
results updated.
mysql-test/suite/rpl/t/rpl_cant_read_event_incident.test:
regression test for bug#11747416 : 32228 A disk full makes binary log corrupt
is added.
sql/share/errmsg-utf8.txt:
Increasing the explanatory part of ER_MASTER_FATAL_ERROR_READING_BINLOG error message twice
in order to fit to the updated version which carries some more info.
sql/sql_repl.cc:
Error text indicating a failure of reading from binlog that master delivers to the slave
is made more clear;
A policy to regard a partial event to send it out to the slave anyway is removed.
Problem: The following statements can cause the slave to go out of sync
if logged in statement format:
INSERT IGNORE...SELECT
INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
REPLACE ... SELECT
UPDATE IGNORE :
CREATE ... IGNORE SELECT
CREATE ... REPLACE SELECT
Background: Since the order of the rows returned by the SELECT
statement or otherwise may differ on master and slave, therefore
the above statements may cuase the salve to go out of sync with
the master.
Fix:
Issue a warning when statements like the above are exectued and
the bin-logging format is statement. If the logging format is mixed,
use row based logging. Marking a statement as unsafe has been
done in the sql/sql_parse.cc instead of sql/sql_yacc.cc, because while
parsing for a token has been done we cannot be sure if the parsing
of the other tokens has been done as well.
Six new warning messages has been added for each unsafe statement.
binlog.binlog_unsafe.test has been updated to incoporate these additional unsafe statments.
******
BUG#11758262 - 50439: MARK INSERT...SEL...ON DUP KEY UPD,REPLACE...SEL,CREATE...[IGN|REPL] SEL
Problem: The following statements can cause the slave to go out of sync
if logged in statement format:
INSERT IGNORE...SELECT
INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
REPLACE ... SELECT
UPDATE IGNORE :
CREATE ... IGNORE SELECT
CREATE ... REPLACE SELECT
Background: Since the order of the rows returned by the SELECT
statement or otherwise may differ on master and slave, therefore
the above statements may cuase the salve to go out of sync with
the master.
Fix:
Issue a warning when statements like the above are exectued and
the bin-logging format is statement. If the logging format is mixed,
use row based logging. Marking a statement as unsafe has been
done in the sql/sql_parse.cc instead of sql/sql_yacc.cc, because while
parsing for a token has been done we cannot be sure if the parsing
of the other tokens has been done as well.
Six new warning messages has been added for each unsafe statement.
binlog.binlog_unsafe.test has been updated to incoporate these additional unsafe statments.
mysql-test/extra/rpl_tests/rpl_insert_duplicate.test:
Test removed: Added the test to rpl.rpl_insert_ignore.test
******
Test removed: the test is redundant as the same is being tested in rpl.rpl_insert_ignore.
mysql-test/extra/rpl_tests/rpl_insert_id.test:
Warnings disabled for the unsafe statements.
mysql-test/extra/rpl_tests/rpl_insert_ignore.test:
1. Disabled warnings while for unsafe statements
2. As INSERT...IGNORE is an unsafe statement, an insert ignore not changing any rows,
will not be logged in the binary log, in the ROW and MIXED modes. It will however be logged
in STATEMENT mode.
mysql-test/r/commit_1innodb.result:
updated result file
******
updated result file
mysql-test/suite/binlog/r/binlog_stm_blackhole.result:
Updated result file.
mysql-test/suite/binlog/r/binlog_unsafe.result:
updated result file
mysql-test/suite/binlog/t/binlog_unsafe.test:
added tests for the statements marked as unsafe.
mysql-test/suite/rpl/r/rpl_insert_duplicate.result:
File Removed :Result file of rpl_insert_duplicate, which has been removed.
mysql-test/suite/rpl/r/rpl_insert_ignore.result:
Added the content of rpl.rpl_insert_duplicate here.
mysql-test/suite/rpl/r/rpl_insert_select.result:
Result file removed as the corresponding test has beenn removed.
mysql-test/suite/rpl/r/rpl_known_bugs_detection.result:
Updated result file.
mysql-test/suite/rpl/t/rpl_insert_duplicate.test:
File Removed: this was a wrapper for rpl.rpl_insert_duplicate.test, which has been removed.
mysql-test/suite/rpl/t/rpl_insert_select.test:
File Removed: This test became redundant after this fix, This test showed how INSERT IGNORE...SELECT break replication, which has been handled in this fix.
mysql-test/suite/rpl/t/rpl_known_bugs_detection.test:
Since all the tests are statement based bugs are being tested, having mixed format
forces the event to be written in row format. When the statement and causes the
test to fail as certain known bugs do not occur when the even is logged in row format.
sql/share/errmsg-utf8.txt:
added 6 new Warning messages.
******
added 6 new Warning messages.
sql/sql_lex.cc:
Added 6 new error Identifier [ER_BINLOG_STMT_UNSAFEE_*]
sql/sql_lex.h:
Added 6 new BINLOG_STMT_UNSAFE_* enums to identify the type of unsafe statement dealt with in this bug.
******
Added 6 new BINLOG_STMT_UNSAFE_* enums to identify the type of unsafe statement dealt with in this bug.
sql/sql_parse.cc:
added check for specific queries and marked them as unsafe.
******
added check for specific queries and marked them as unsafe.
SYSTEM VARIABLE NAME SQL_MAX_JOIN_SI
BACKGROUND:
ER_TOO_BIG_SELECT refers to SQL_MAX_JOIN_SIZE, which is the
old name for MAX_JOIN_SIZE.
FIX:
Support for old name SQL_MAX_JOIN_SIZE is removed in MySQL 5.6
and is renamed as MAX_JOIN_SIZE.So the errmsg.txt
and mysql.cc files have been updated and the corresponding result
files have also been updated.
Added 'innodb_file_format_check' as variable to ignore change to.
Tests that had to restore this amended
Two tests assumed it to be Antelope, make sure these run on a freshly
started server
For 5.5, apparently innodb_file_format_max is the one to ignore
The problem occurred when indexes are added between the time that an
UNDO record is created and the time that the purge thread comes around
and deletes the old secondary index entries. The purge thread would
hit an assert when trying to build a secondary index entry for
searching. The problem was that the old value of those fields were not
in the UNDO record since they were not part of an index when the UPDATE
occured.
A test case was added to innodb-index.test.
The problem occurred when indexes are added between the time that an
UNDO record is created and the time that the purge thread comes around
and deletes the old secondary index entries. The purge thread would
hit an assert when trying to build a secondary index entry for
searching. The problem was that the old value of those fields were not
in the UNDO record since they were not part of an index when the UPDATE
occured.
A test case was added to innodb-index.test.
(also 5.5+ solution for bug#11766879/bug#60106)
The valgrind warning was due to an unused 'new handler_add_index(...)'
which was never freed.
The error handling did not work (fails as in bug#11766879) and
the implementation was not as transparant as it could, therefore I
made it a bit simpler and more transparant to the underlying handlers.
This way it follows the api better and the error handling works and
is also now tested.
Also added a debug test to verify the error handling.
Improved according to Jon Olavs review:
Added class ha_partition_add_index.
Also added base class Sql_alloc to handler_add_index.
Update 3.
Quick fix: run mysql-stress-test.pl via a wrapper test
Amend mtr to run just that test when using --stress
Updated mysql-stress-test.pl to exit(1) if wrong options
This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.
In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)
In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.
trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.
trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.
trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.
trx_undo_truncate_end(): Wrapper for from trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.
trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.
trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.
rb:749 approved by Inaam Rana
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.
Because we will be redo logging the writes off the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().
btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.
btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.
btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).
btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or if this is a "fake allocation"
(init_mtr=NULL) by btr_mark_freed_leaves(nonfree=TRUE).
btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.
btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.
btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL (old behaviour in inserts) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.
fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.
fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.
fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".
mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.
row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.
row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.
There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.
rb:693 approved by Sunny Bains
Also addressed issues in bug #11745133, where we could mark a table
corrupted instead of crashing the server when found a corrupted buffer/page
if the table created with innodb_file_per_table on.
Undo previous fix, it is not reliable
Drop setting $MYSQL_LIBDIR, mtr can't be sure anyway
Test is set to override plugin-dir to some known existing dir
for compressed InnoDB tables
ha_innodb::info_low(): For calculating data_length or index_length,
use the compressed page size for compressed tables instead of UNIV_PAGE_SIZE.
rb:714 approved by Sunny Bains
Before BUG#28796, an empty host was used to identify that an instance was no
longer a slave. However, BUG#28796 changed this behavior and one cannot set
an empty host. Besides, a RESET SLAVE only cleans up information on the next
event to retrieve from the master, disables ssl and resets heartbeat period.
So a call to SHOW SLAVE STATUS after issuing a RESET SLAVE still returns some
valid information, such as host, port, user and password.
To fix this problem, we have introduced the command RESET SLAVE ALL that does
what a regular RESET SLAVE does and also clears host, port, user and password
information thus allowing users to identify when an instance is no longer a
slave.
The server crashes if it processes table map events that are
corrupted, especially if they map different tables to the same
identifier. This could happen, for instance, due to BUG 56226.
We fix this by checking whether the table map has already been
mapped before actually applying the event. If it has been mapped
with different settings an error is raised and the slave SQL
thread stops. If it has been mapped with same settings the event
is skipped. If the table is set to be ignored by the filtering
rules, there is no change in behavior: the event is skipped and
ids are not checked.
mysql-test/suite/rpl/t/rpl_row_corruption.test:
Added a simple test case that checks both cases:
- multiple table maps with the same identifier
- multiple table maps with the same identifier, but only one
is processed (the others are filtered out)
Bug#12637786 was fixed with rb:692 by marko. But that fix has a remaining
bug. It added this assert;
ut_ad(ind_field->prefix_len);
before a section of code that assumes there is a prefix_len.
The patch replaced code that explicitly avoided this with a check for
prefix_len. It turns out that the purge thread can get to that assert
without a prefix_len because it does not use a row_ext_t* .
When UNIV_DEBUG is not defined, the affect of this is that the purge thread
sets the dfield->len to zero and then cannot find the entry in the index to
purge. So secondary index entries remain unpurged.
This patch does not do the assert. Instead, it uses
'if (ind_field->prefix_len) {...}'
around the section of code that assumes a prefix_len. This is the way the
patch I provided to Marko did it.
The test case is simply modified to do a sleep(10) in order to give the
purge thread a chance to run. Without the code change to row0row.c, this
modified testcase will assert if InnoDB was compiled with UNIV_DEBUG.
I tried to sleep(5), but it did not always assert.
bug. It added this assert;
ut_ad(ind_field->prefix_len);
before a section of code that assumes there is a prefix_len.
The patch replaced code that explicitly avoided this with a check for
prefix_len. It turns out that the purge thread can get to that assert
without a prefix_len because it does not use a row_ext_t* .
When UNIV_DEBUG is not defined, the affect of this is that the purge thread
sets the dfield->len to zero and then cannot find the entry in the index to
purge. So secondary index entries remain unpurged.
This patch does not do the assert. Instead, it uses
'if (ind_field->prefix_len) {...}'
around the section of code that assumes a prefix_len. This is the way the
patch I provided to Marko did it.
The test case is simply modified to do a sleep(10) in order to give the
purge thread a chance to run. Without the code change to row0row.c, this
modified testcase will assert if InnoDB was compiled with UNIV_DEBUG.
I tried to sleep(5), but it did not always assert.
row_build_index_entry(): In innodb_file_format=Barracuda
(ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED), a secondary index on a
full column can refer to a field that is stored off-page in the
clustered index record. Take that into account.
rb:692 approved by Jimmy Yang
With this change, the index prefix column length lifted from 767 bytes
to 3072 bytes if "innodb_large_prefix" is set to "true".
rb://603 approved by Marko
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
Attempt to update an InnoDB temporary table under LOCK TABLES
led to assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The
same scenario works fine for MyISAM temporary tables.
The assertion failure was caused by discrepancy between lock
that was requested on the rows of temporary table at LOCK TABLES
time and by update operation. Since SQL-layer requested a
read-lock at LOCK TABLES time InnoDB engine assumed that upcoming
statements which are going to be executed under LOCK TABLES will
only read table and therefore should acquire only S-lock.
An update operation broken this assumption by requesting X-lock.
Possible approaches to fixing this problem are:
1) Skip locking of temporary tables as locking doesn't make any
sense for connection-local objects.
2) Prohibit changing of temporary table locked by LOCK TABLES ...
READ.
Unfortunately both of these approaches have drawbacks which make
them unviable for stable versions of server.
So this patch takes another approach and changes code in such way
that LOCK TABLES for a temporary table will always request write
lock. In 5.5 version of this patch switch from read lock to write
lock is done on SQL-layer.
mysql-test/suite/innodb/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
sql/sql_parse.cc:
Since a temporary table locked by LOCK TABLES can be updated even
if it was only locked for read we always request TL_WRITE locks
for such tables at LOCK TABLES time. This allows to avoid
discrepancy between locks acquired at LOCK TABLES time and by
a statement executed under LOCK TABLES. Such a discrepancy has
caused problems for InnoDB storage engine.
To support this change a part of code implementing LOCK TABLES
has been moved to a helper function.
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
Attempt to update an InnoDB temporary table under LOCK TABLES
led to assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The
same scenario works fine for MyISAM temporary tables.
The assertion failure was caused by discrepancy between lock
that was requested on the rows of temporary table at LOCK TABLES
time and by update operation. Since SQL-layer requested a
read-lock at LOCK TABLES time InnoDB engine assumed that upcoming
statements which are going to be executed under LOCK TABLES will
only read table and therefore should acquire only S-lock.
An update operation broken this assumption by requesting X-lock.
Possible approaches to fixing this problem are:
1) Skip locking of temporary tables as locking doesn't make any
sense for connection-local objects.
2) Prohibit changing of temporary table locked by LOCK TABLES ...
READ.
Unfortunately both of these approaches have drawbacks which make
them unviable for stable versions of server.
So this patch takes another approach and changes code in such way
that LOCK TABLES for a temporary table will always request write
lock. In 5.1 version of this patch switch from read lock to write
lock is done inside of InnoDBs handler methods as doing it on
SQL-layer causes compatibility troubles with FLUSH TABLES WITH
READ LOCK.
mysql-test/suite/innodb/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
storage/innobase/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
be always requested for such tables.
storage/innodb_plugin/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
be always requested for such tables.
Problem: MYSQL_BIN_LOG::reset_logs acquires mutexes in wrong order.
The correct order is first LOCK_thread_count and then LOCK_log. This function
does it the other way around. This leads to deadlock when run in parallel
with a thread that takes the two locks in correct order. For example, a thread
that disconnects will take the locks in the correct order.
Fix: change order of the locks in MYSQL_BIN_LOG::reset_logs:
first LOCK_thread_count and then LOCK_log.
mysql-test/suite/binlog/r/binlog_reset_master.result:
added result file
mysql-test/suite/binlog/t/binlog_reset_master.test:
Added test case that demonstrates deadlock because of wrong mutex order.
The deadlock is between two threads:
- RESET MASTER acquires mutexes in wrong order.
- client thread shutdown code acquires mutexes in right order.
Actually, this test case does not produce deadlock in 5.1, probably
the client thread shutdown code does not hold both mutexes at the same
time. However, the bug existed in 5.1 (mutexes are taken in the wrong
order) so we push the test case to 5.1 too, to prevent future
regressions.
sql/log.cc:
Change mutex acquisition to the correct order:
first LOCK_thread_count, then LOCK_log.
sql/mysqld.cc:
Add debug code to synchronize test case.
Manual merged mysql-5.1-gca into latest mysql-5.5.
Conflicts
=========
Text conflict in mysql-test/suite/rpl/r/rpl_relayspace.result
Text conflict in mysql-test/suite/rpl/t/rpl_relayspace.test
VM-WIN2003-32-A, SLES10-IA64-A
The test case waits for master_pos_wait not to timeout, which
means that the deadlock between SQL and IO threads was
succesfully and automatically dealt with.
However, very rarely, master_pos_wait reports a timeout. This
happens because the time set for master_pos_wait to wait was
too small (6 seconds). On slow test env this could be a
problem.
We fix this by setting the timeout inline with the one used
in sync_slave_with_master (300 seconds). In addition we
refactored the test case and refined some comments.
.-> USING PASSWORD: NO
The server was always setting the flag for using password to NO and
then relying on the server authentication plugin to update it if it uses
a password.
This creates compatibility problems with 5.1 when rejecting a
nonexistent user login.
Set the default for the password supplied flag for non-existing users
as the default plugin (native password authentication) would do it
for compatibility reasons.
Test case added.
federated.result updated with the correct error message.
Before this fix, the test performance_schema.relaylog would fail
with sporadic failures related to statistics on update_cond.
The reason for these failures is that thread scheduling makes
impossible to predict if instrumented conditions will be used on not.
The fix is to relax the test case, to not collect statistics about:
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
DB_COL_APPEARS_TWICE_IN_INDEX: Remove. This condition is already
checked and reported by MySQL before passing the index definition to
the storage engine.
row_create_index_for_mysql(): Remove the redundant check for
DB_COL_APPEARS_TWICE_IN_INDEX. When enforcing the column prefix index
limit, invoke dict_mem_index_free(index) to plug the memory leak. In
the loop, use index->n_def instead of dict_index_get_n_fields(index),
because the latter would be 0 for indexes that have not been copied to
the data dictionary cache.
innodb-use-sys-malloc.test:
Add test cases for attempting to trigger the error checks in
row_create_index_for_mysql(). Before MySQL 5.5 and WL#5743, the leak
is only reproducible if ha_innobase::max_supported_key_part_length()
returned a higher limit than the one used in
row_create_index_for_mysql().
In MySQL 5.5 and later, the leak is reproducible with
innodb_large_prefix=true.
rb:688 approved by Jimmy Yang
Fix a failure of the re-enabled innodb-index.test in the embedded server.
Apparently, the embedded server does not default to ENGINE=InnoDB when
copying an InnoDB table by CREATE TABLE t2 SELECT * FROM t1;
Re-enable a test that was disabled as collateral damage.
Starting with MySQL 5.5, queries will acquire and hold a shared meta-data lock
(MDL) on tables they process, until the transaction is committed or
rolled back. This will prevent DDL operations on the tables, such as creating
an index.
innodb-index.test: Use a second table for creating the index. The index will
still be "too new" for the transaction that was started before the index
creation was started.
Bug 12430414 - THE TEST PERFSCHEMA.SELECTS.TEST CAN AFFECT SUCCEEDING TESTS
Bug 12430599 - THE TEST PERFSCHEMA.ONE_THREAD_PER_CON. CAN AFFECT SUCCEEDING TESTS
Bug 12431153 - THE TEST PERFSCHEMA.PFS_UPGRADE CAN AFFECT SUCCEEDING TEST
This assert could be triggered during two phase commit if binary
log was used as transaction coordinator log. The triggered assert
checks that the same number of transaction IDs are processed in
the prepare and commit phases.
The reason it was triggered, was that the transaction consisted
of an INSERT/UPDATE IGNORE that had an ignorable error. Since it
had an error, no row log events were made and therefore
prepared_xids was 0. However, since it was an IGNORE statement,
the statement started a read/write statement transaction, committed
it and completed successfully.
This patch fixes the problem by adjusting the assert to take
this possibility into account.
Test case added to binlog.binlog_innodb_row.test.
If LOAD DATA INFILE featured a SET clause, the name=value pairs
would be regenerated using item::print. Unfortunately, that code
is mostly optimized for EXPLAIN EXTENDED output and such, and can
not be relied on to return valid SQL.
We now name each value its original, user-supplied form and use
that to create LOAD DATA INFILE statements for statement-based
replication.
mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result:
minor change in syntactic sugar
mysql-test/suite/rpl/r/rpl_loaddatalocal.result:
add test case
mysql-test/suite/rpl/t/rpl_loaddatalocal.test:
add test case
sql/sql_load.cc:
Do not try to item::print values in LOAD DATA INFILE's
SET clause; they might not even be valid SQL at this
point. Use our saved version instead.
sql/sql_yacc.yy:
If LOAD DATA INFILE has SET name=val clauses, tag the
individual val-parts with the user's version so we can
later replicate that, rather than the smashed pieces
we'd get from item::print once the optimizer's through
with our poor values.