Merge up to sunny.bains@oracle.com-20100625081841-ppulnkjk1qlazh82 .
There are 8 more changesets in mysql-5.1-innodb, but PB2 shows a
failure for a test added in one of them. If that is resolved quickly
then those 8 more changesets will be merged too.
and clarifies the invariant in dict_table_get_on_id().
In Mar 2007 Marko observed a crash during recovery, the crash resulted from
an UNDO operation on a system table. His solution was to acquire an X lock on
the data dictionary, this in hindsight was an overkill. It is unclear what
caused the crash, current hypothesis is that it was a memory corruption.
The X lock results in performance issues by when undoing changes due to
rollback during normal operation on regular tables.
Why the change is safe:
======================
The InnoDB code has changed since the original X lock change was made. In the
new code we always lock the data dictionary in X mode during startup when
UNDOing operations on the system tables (this is a given). This ensures that
the crash Marko observed cannot happen as long as all transactions that update
the system tables follow the standard rules by setting the appropriate DICT_OP
flag when writing the log records when they make the changes.
If transactions violate the above mentioned rule then during recovery (at
startup) the rollback code (see trx0roll.c) will not acquire the X lock
and we will see the crash again. This will however be a different bug.
Valgrind warning happpens because of uninitialized null bytes.
In row_sel_push_cache_row_for_mysql() function we fill fetch cache
with necessary field values, row_sel_store_mysql_rec() is called
for this and leaves null bytes untouched.
Later row_sel_pop_cached_row_for_mysql() rewrites table record
buffer with uninited null bytes. We can see the problem from the
test case:
At 'SELECT...' we call row_sel_push...->row_sel_store...->row_sel_pop_cached...
chain which rewrites table->record[0] buffer with uninitialized null bytes.
When we call 'UPDATE...' statement, compare_record uses this buffer and
valgrind warning occurs.
The fix is to init null bytes with default values.
mysql-test/suite/innodb/r/innodb_mysql.result:
test case
mysql-test/suite/innodb/t/innodb_mysql.test:
test case
mysql-test/t/ps_3innodb.test:
enable valgrind testing
storage/innobase/row/row0sel.c:
init null bytes with default values as they might be
left uninitialized in some cases and these uninited bytes
might be copied into mysql record buffer that leads to
valgrind warnings on next use of the buffer.
In semi-consistent read, only unlock freshly locked non-matching records.
Define DB_SUCCESS_LOCKED_REC for indicating a successful operation
where a record lock was created.
lock_rec_lock_fast(): Return LOCK_REC_SUCCESS,
LOCK_REC_SUCCESS_CREATED, or LOCK_REC_FAIL instead of TRUE/FALSE.
lock_sec_rec_read_check_and_lock(),
lock_clust_rec_read_check_and_lock(), lock_rec_enqueue_waiting(),
lock_rec_lock_slow(), lock_rec_lock(), row_ins_set_shared_rec_lock(),
row_ins_set_exclusive_rec_lock(), sel_set_rec_lock(),
row_sel_get_clust_rec_for_mysql(): Return DB_SUCCESS_LOCKED_REC if a
new record lock was created. Adjust callers.
row_unlock_for_mysql(): Correct the function documentation.
row_prebuilt_t::new_rec_locks: Correct the documentation.
BUILD/*: Add valgrind_configs=--with-valgrind.
BUILD/*: Remove -USAFEMALLOC from valgrind_flags.
configure.in: Add AC_ARG_WITH(valgrind) and HAVE_VALGRIND.
include/my_sys.h: Define a number of MEM_ wrappers for VALGRIND_ functions.
include/my_sys.h: Make TRASH do MEM_UNDEFINED().
include/m_string.h: Remove unused macro bzero_if_purify(A,B).
_mymalloc(): Declare MEM_UNDEFINED() on the allocated memory.
_myfree(): Declare MEM_NOACCESS() on the freed memory.
storage/innobase/include/univ.i: Enable UNIV_DEBUG_VALGRIND based on
HAVE_VALGRIND rather than HAVE_purify.
Possible things to do:
* In my_global.h, remove the defined(HAVE_purify) condition
from the _WIN32 uint3korr().
* In my_global.h *int*korr(), use | instead of +
in order to keep the Valgrind V bits accurate
* Consider replacing HAVE_purify with HAVE_VALGRIND
* Use VALGRIND_CREATE_BLOCK, VALGRIND_DISCARD in mem_root and similar places
------------------------------------------------------------
revno: 3094
revision-id: vasil.dimov@oracle.com-20100513074652-0cvlhgkesgbb2bfh
parent: vasil.dimov@oracle.com-20100512173700-byf8xntxjur1hqov
committer: Vasil Dimov <vasil.dimov@oracle.com>
branch nick: mysql-trunk-innodb
timestamp: Thu 2010-05-13 10:46:52 +0300
message:
Followup to Bug#51920, fix binlog.binlog_killed
This is a followup to the fix of
Bug#51920 Innodb connections in row lock wait ignore KILL until lock wait
timeout
in that fix (rb://279) the behavior was changed to honor when a trx is
interrupted during lock wait, but the returned error code was still
"lock wait timeout" when it should be "interrupted".
This change fixes the non-deterministically failing test binlog.binlog_killed,
that failed like this:
binlog.binlog_killed 'stmt' [ fail ]
Test ended at 2010-05-12 11:39:08
CURRENT_TEST: binlog.binlog_killed
mysqltest: At line 208: query 'reap' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 0...
Approved by: Sunny Bains (rb://344)
------------------------------------------------------------
This merge is non-trivial since it has to introduce the DB_INTERRUPTED
error code.
Also revert vasil.dimov@oracle.com-20100408165555-9rpjh24o0sa9ad5y
which adjusted the binlog.binlog_killed test to the new (wrong) behavior
Also make InnoDB thinks that /*/ only starts a comment. (Bug #53644).
struct trx_struct: Add mysql_query_len.
ha_innodb.cc: Use trx_query_string() instead of trx_query() and
initialize trx->mysql_query_len.
INNOBASE_COPY_STMT(thd, trx): New macro, to initialize
trx->mysql_query_str and trx->mysql_query_len.
dict_strip_comments(): Add and observe the parameter sql_length. Treat
/*/ as the start of a comment.
dict_create_foreign_constraints(), row_table_add_foreign_constraints():
Add the parameter sql_length.
buf_flush_insert_into_flush_list(),
buf_flush_insert_sorted_into_flush_list(),
buf_flush_post_to_doublewrite_buf(): Check that the page is initialized.
------------------------------------------------------------------------
r6103 | marko | 2009-10-26 15:46:18 +0200 (Mon, 26 Oct 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0ins.c
branches/zip: row_ins_alloc_sys_fields(): Zero out the system columns
DB_TRX_ID, DB_ROLL_PTR and DB_ROW_ID, in order to avoid harmless
Valgrind warnings about uninitialized data. (The warnings were
harmless, because the fields would be initialized at a later stage.)
------------------------------------------------------------------------
Detailed revision comments:
r6669 | jyang | 2010-02-11 12:24:19 +0200 (Thu, 11 Feb 2010) | 7 lines
branches/5.1: Fix bug #50691, AIX implementation of readdir_r
causes InnoDB errors. readdir_r() returns an non-NULL value
in the case of reaching the end of a directory. It should
not be treated as an error return.
rb://238 approved by Marko
Detailed revision comments:
r6613 | inaam | 2010-02-09 20:23:09 +0200 (Tue, 09 Feb 2010) | 11 lines
branches/5.1: Fix Bug #38901
InnoDB logs error repeatedly when trying to load page into buffer pool
In buf_page_get_gen() if we are unable to read a page (because of
corruption or some other reason) we keep on retrying. This fills up
error log with millions of entries in no time and we'd eventually run
out of disk space. This patch limits the number of attempts that we
make (currently set to 100) and after that we abort with a message.
rb://241 Approved by: Heikki
Detailed revision comments:
r6545 | jyang | 2010-02-03 03:57:32 +0200 (Wed, 03 Feb 2010) | 8 lines
branches/5.1: Fix bug #49001, "SHOW INNODB STATUS deadlock info
incorrect when deadlock detection aborts". Print the correct
lock owner when recursive function lock_deadlock_recursive()
exceeds its maximum depth LOCK_MAX_DEPTH_IN_DEADLOCK_CHECK.
rb://217, approved by Marko.
Detailed revision comments:
r6538 | sunny | 2010-01-30 00:43:06 +0200 (Sat, 30 Jan 2010) | 6 lines
branches/5.1: Check *first_value every time against the column max
value and set *first_value to next autoinc if it's > col max value.
ie. not rely on what is passed in from MySQL.
[49497] Error 1467 (ER_AUTOINC_READ_FAILED) on inserting a negative value
rb://236
Detailed revision comments:
r6536 | sunny | 2010-01-30 00:13:42 +0200 (Sat, 30 Jan 2010) | 6 lines
branches/5.1: Check *first_value everytime against the column max
value and set *first_value to next autoinc if it's > col max value.
ie. not rely on what is passed in from MySQL.
[49497] Error 1467 (ER_AUTOINC_READ_FAILED) on inserting a negative value
rb://236
Detailed revision comments:
r6535 | sunny | 2010-01-30 00:08:40 +0200 (Sat, 30 Jan 2010) | 11 lines
branches/5.1: Undo the change from r6424. We need to return DB_SUCCESS even
if we were unable to initialize the tabe autoinc value. This is required for
the open to succeed. The only condition we currently treat as a hard error
is if the autoinc field instance passed in by MySQL is NULL.
Previously if the table autoinc value was 0 and the next value was requested
we had an assertion that would fail. Change that assertion and treat a value
of 0 to mean that the autoinc system is unavailable. Generation of next
value will now return failure.
rb://237
Detailed revision comments:
r6424 | marko | 2010-01-12 12:22:19 +0200 (Tue, 12 Jan 2010) | 16 lines
branches/5.1: In innobase_initialize_autoinc(), do not attempt to read
the maximum auto-increment value from the table if
innodb_force_recovery is set to at least 4, so that writes are
disabled. (Bug #46193)
innobase_get_int_col_max_value(): Move the function definition before
ha_innobase::innobase_initialize_autoinc(), because that function now
calls this function.
ha_innobase::innobase_initialize_autoinc(): Change the return type to
void. Do not attempt to read the maximum auto-increment value from
the table if innodb_force_recovery is set to at least 4. Issue
ER_AUTOINC_READ_FAILED to the client when the auto-increment value
cannot be read.
rb://144 by Sunny, revised by Marko
Detailed revision comments:
r6422 | marko | 2010-01-12 11:34:27 +0200 (Tue, 12 Jan 2010) | 3 lines
branches/5.1: Non-functional change:
Make innobase_get_int_col_max_value() a static function.
It does not access any fields of class ha_innobase.
Detailed revision comments:
r6421 | jyang | 2010-01-12 07:59:16 +0200 (Tue, 12 Jan 2010) | 8 lines
branches/5.1: Fix bug #49238: Creating/Dropping a temporary table
while at 1023 transactions will cause assert. Handle possible
DB_TOO_MANY_CONCURRENT_TRXS when deleting metadata in
row_drop_table_for_mysql().
rb://220, approved by Marko
and also applying 5.1-ss6355
Detailed revision comments:
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
r6349 | marko | 2009-12-22 11:09:54 +0200 (Tue, 22 Dec 2009) | 3 lines
branches/5.1: lock_print_info_summary(): Remove a reference to
innobase_mysql_end_print_arbitrary_thd() that should have been
removed in r6347 when removing the function.
r6350 | marko | 2009-12-22 11:11:09 +0200 (Tue, 22 Dec 2009) | 1 line
branches/5.1: Remove an obsolete declaration of LOCK_thread_count.
not address the printouts issue
Detailed revision comments:
r6310 | marko | 2009-12-15 15:23:54 +0200 (Tue, 15 Dec 2009) | 30 lines
branches/5.1: Merge r4922 from branches/zip.
This the fix for the first part of Bug #41609 from InnoDB Plugin to
the built-in InnoDB in MySQL 5.1. This allows InnoDB Hot Backup to
back up a database while the built-in InnoDB in MySQL 5.1 is creating
temporary tables. (This fix does not address the printouts about
missing .ibd files for temporary tables at InnoDB startup, which was
committed to branches/zip in r6252.)
rb://219 approved by Sunny Bains.
branches/zip: Distinguish temporary tables in MLOG_FILE_CREATE.
This addresses Mantis Issue #23 in InnoDB Hot Backup and some
of MySQL Bug #41609.
In MLOG_FILE_CREATE, we need to distinguish temporary tables, so that
InnoDB Hot Backup can work correctly. It turns out that we can do this
easily, by using a bit of the previously unused parameter for page number.
(The page number parameter of MLOG_FILE_CREATE has been written as 0
ever since MySQL 4.1, which introduced MLOG_FILE_CREATE.)
MLOG_FILE_FLAG_TEMP: A flag for indicating a temporary table in
the page number parameter of MLOG_FILE_ operations.
fil_op_write_log(): Add the parameter log_flags.
fil_op_log_parse_or_replay(): Add the parameter log_flags.
Do not replay MLOG_FILE_CREATE when MLOG_FILE_FLAG_TEMP is set in log_flags.
This only affects ibbackup --apply-log. InnoDB itself never replays file
operations.
1. BUG#47720 - REPLACE INTO Autoincrement column with negative values.
Detailed revision comments:
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similiar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
1. BUG#49032 - auto_increment field does not initialize to last value
in InnoDB Storage Engine
2. Fix whitespace issues and fix tests and make read float/double arg const
Detailed revision comments:
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
branches/5.1: This is an interim fix, fix white space errors.
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
branches/5.1: This is an interim fix, fix whitepsace issues.
1. BUG#45961 - DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state
2. Fix formatting
Detailed revision comments:
r6205 | jyang | 2009-11-20 07:55:48 +0200 (Fri, 20 Nov 2009) | 11 lines
branches/5.1: Add a special case to handle the Duplicated Key error
and return DB_ERROR instead. This is to avoid a possible SIGSEGV
by mysql error handling re-entering the storage layer for dup key
info without proper table handle.
This is to prevent a server crash when error situation in bug
#45961 "DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state" happens.
rb://157 approved by Sunny Bains.
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
branches/5.1: Non-functional change, fix formatting.
1. BUG#48526 - Data type for float and double is incorrectly
reported in InnoDB table monitor
Detailed revision comments:
r6188 | jyang | 2009-11-18 07:14:23 +0200 (Wed, 18 Nov 2009) | 8 lines
branches/5.1: Fix bug #48526 "Data type for float and
double is incorrectly reported in InnoDB table monitor".
Certain datatypes are not printed correctly in
dtype_print().
rb://204 Approved by Marko.
1. BUG#4869 - when innodb tablespace is configured too small,
crash and corruption!
Detailed revision comments:
r6187 | jyang | 2009-11-18 05:27:30 +0200 (Wed, 18 Nov 2009) | 9 lines
branches/5.1: Fix bug #48469 "when innodb tablespace is
configured too small, crash and corruption!". Function
btr_create() did not check the return status of fseg_create(),
and continue the index creation even there is no sufficient
space.
rb://205 Approved by Marko
r6200 | vasil | 2009-11-19 12:14:23 +0200 (Thu, 19 Nov 2009) | 4 lines
branches/5.1:
White space fixup - indent under the opening (
r6203 | jyang | 2009-11-19 15:12:22 +0200 (Thu, 19 Nov 2009) | 8 lines
branches/5.1: Use btr_free_root() instead of fseg_free() for
the fix of bug #48469, because fseg_free() is not defined
in the zip branch. And we could save one mini-trasaction started
by fseg_free().
Approved by Marko.
1. BUG#3139 - Mysql crashes: "windows error 995" after several
selects on a large DB
Detailed revision comments:
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995'
after several selects on a large DB
During stress environment, Windows AIO may fail with error code
ERROR_OPERATION_ABORTED. InnoDB does not handle the error, rather
crashes. The cause of the error is unknown, but likely due to
faulty hardware or driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED,
which maps to Windows ERROR_OPERATION_ABORTED (995). When the error
is detected during AIO, the InnoDB will issue a synchronous retry
(read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
1. BUG#32430 - 'show innodb status' causes errors Invalid (old?) table
or database name in logs
2. White space fixup
Detailed revision comments:
r6136 | marko | 2009-11-04 12:28:10 +0200 (Wed, 04 Nov 2009) | 15 lines
branches/5.1: Port r6134 from branches/zip:
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 07:57:29 +0000 (Wed, 04 Nov 2009) | 5 lines
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
innobase_print_identifier(): Replace with innobase_convert_name().
innobase_convert_identifier(): New function, called by innobase_convert_name().
r6152 | vasil | 2009-11-10 15:30:20 +0200 (Tue, 10 Nov 2009) | 4 lines
branches/5.1:
White space fixup.