This is the 5.5 version of the fix. The 5.1 version was too complicated to
merge and was null merged.
This is a regression from the fix for bug no 38999. A storage engine capable
of reading only a subset of a table's columns updates corresponding bits in
the read buffer to signal that it has read NULL values for the corresponding
columns. It cannot, and should not, update any other bits. Bug no 38999
occurred because the implementation of UPDATE statements compare the NULL bits
using memcmp, inadvertently comparing bits that were never requested from the
storage engine. The regression was caused by the storage engine trying to
alleviate the situation by writing to all NULL bits, even those that it had no
knowledge of. This has devastating effects for the index merge algorithm,
which relies on all NULL bits, except those explicitly requested, being left
unchanged.
The fix reverts the fix for bug no 38999 in both InnoDB and InnoDB plugin and
changes the server's method of comparing records. For engines that always read
entire rows, we proceed as usual. For engines capable of reading only select
columns, the record buffers are now compared on a column by column basis. An
assertion was also added so that non comparable buffers are never read. Some
relevant copy-pasted code was also consolidated in a new function.
This is a manual merge from mysql-5.1-innodb of:
revno: 3618
revision-id: vasil.dimov@oracle.com-20100930124844-yglojy7c3vaji6dx
parent: vasil.dimov@oracle.com-20100930102618-s9f9ytbytr3eqw9h
committer: Vasil Dimov <vasil.dimov@oracle.com>
branch nick: mysql-5.1-innodb
timestamp: Thu 2010-09-30 15:48:44 +0300
message:
Fix Bug#56340 innodb updates index stats too frequently after non-index updates
This is a simple optimization issue. All stats are related to only indexed
columns, index size or number of rows in the whole table. UPDATEs that touch
only non-indexed columns cannot affect stats and we can avoid calling the
function row_update_statistics_if_needed() which may result in unnecessary I/O.
Approved by: Marko (rb://466)
In addition to the above message: we know that
row_update_cascade_for_mysql() (the other place where
row_update_statistics_if_needed is called) always updates indexed
columns (FK-related), so there is no need to add this cond there.
revno: 3545
revision-id: marko.makela@oracle.com-20100818110110-zfs0i1vfrccfb4yw
parent: vasil.dimov@oracle.com-20100817193934-1yl7zz2odikxauf8
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-08-18 14:01:10 +0300
message:
Bug#55626: MIN and MAX reading a delete-marked record from secondary index
Remove a bogus debug assertion that triggered the bug.
Add assertions precisely where records must not be delete-marked.
And a comment to clarify when the record is allowed to be delete-marked.
------------------------------------------------------------
revno: 3476
committer: Sunny Bains <Sunny.Bains@Oracle.Com>
branch nick: 5.1-security
timestamp: Thu 2010-08-05 19:18:17 +1000
message:
Fix bug# 55543 - InnoDB Plugin: Signal 6: Assertion failure in file fil/fil0fil.c line 4306
The bug is due to a double delete of a BLOB, once via:
rollback -> btr_cur_pessimistic_delete()
and the second time via purge.
The bug is in row_upd_clust_rec_by_insert(). There we relinquish ownership
of the non-updated BLOB columns in btr_cur_mark_extern_inherited_fields()
before building the row entry that will be inserted and whose contents will
be logged in the UNDO log. However, we don't set the BLOB column later to
INHERITED so that a possible rollback will not free the original row's
non-updated BLOB entries. This is because the condition that checks for
that is in :
if (node->upd_ext) {}.
node->upd_ext is non-NULL only if a BLOB column was updated and that column
is part of some key ordering (see row_upd_replace()). This results in the
non-update BLOB columns being deleted during a rollback and subsequently by
purge again.
rb://413
This change is for performance optimization.
Fixed the performance schema instrumentation interface as follows:
- simplified mysql_unlock_mutex()
- simplified mysql_unlock_rwlock()
- simplified mysql_cond_signal()
- simplified mysql_cond_broadcast()
Changed the get_thread_XXX_locker apis to have one extra parameter,
to provide memory to the instrumentation implementation.
This API change allows to use memory provided by the caller,
to avoid having to use thread local storage.
Using this extra parameter will be done in a separate fix,
this change is for the interface only.
Adjusted all the code and unit tests accordingly.
------------------------------------------------------------
revno: 3533
revision-id: marko.makela@oracle.com-20100630093149-wmc37t128gic933v
parent: marko.makela@oracle.com-20100629131219-pjbkpk5rsqztmw27
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-06-30 12:31:49 +0300
message:
Correct some comments that were added in the fix of Bug #54358
(READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED columns).
Records that lack incompletely written externally stored columns may
be accessed by READ UNCOMMITTED transaction even without involving a
crash during an INSERT or UPDATE operation. I verified this as follows.
(1) added a delay after the mini-transaction for writing the clustered
index 'stub' record was committed (patch attached)
(2) started mysqld in gdb, setting breakpoints to the where the
assertions about READ UNCOMMITTED were added in the bug fix
(3) invoked ibtest3 --create-options=key_block_size=2
to create BLOBs in a COMPRESSED table
(4) invoked the following:
yes 'set transaction isolation level read uncommitted;
checksum table blobt3;select sleep(1);'|mysql -uroot test
(5) noted that one of the breakpoints was triggered
(return(NULL) in btr_rec_copy_externally_stored_field())
=== modified file 'storage/innodb_plugin/row/row0ins.c'
--- storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
+++ storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
@@ -2120,6 +2120,7 @@ function_exit:
rec_t* rec;
ulint* offsets;
mtr_start(&mtr);
+ os_thread_sleep(5000000);
btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE,
BTR_MODIFY_TREE, &cursor, 0,
=== modified file 'storage/innodb_plugin/row/row0upd.c'
--- storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
+++ storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
@@ -1763,6 +1763,7 @@ row_upd_clust_rec(
rec_offs_init(offsets_);
mtr_start(mtr);
+ os_thread_sleep(5000000);
ut_a(btr_pcur_restore_position(BTR_MODIFY_TREE, pcur, mtr));
rec = btr_cur_get_rec(btr_cur);
------------------------------------------------------------
revno: 3531
revision-id: marko.makela@oracle.com-20100629130058-1ilqaj51u9sj9vqe
parent: marko.makela@oracle.com-20100629125653-t799e5x30h31cvrd
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Tue 2010-06-29 16:00:58 +0300
message:
Bug#54408: txn rollback after recovery: row0umod.c:673
dict_table_get_format(index->table)
The REDUNDANT and COMPACT formats store a local 768-byte prefix of
each externally stored column. No row_ext cache is needed, but we
initialized one nevertheless. When the BLOB pointer was zero, we would
ignore the locally stored prefix as well. This triggered an assertion
failure in row_undo_mod_upd_exist_sec().
row_build(): Allow ext==NULL when a REDUNDANT or COMPACT table
contains externally stored columns.
row_undo_search_clust_to_pcur(), row_upd_store_row(): Invoke
row_build() with ext==NULL on REDUNDANT and COMPACT tables.
rb://382 approved by Jimmy Yang
------------------------------------------------------------
revno: 3529
revision-id: marko.makela@oracle.com-20100629125518-m3am4ia1ffjr0d0j
parent: jimmy.yang@oracle.com-20100629024137-690sacm5sogruzvb
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Tue 2010-06-29 15:55:18 +0300
message:
Bug#54358: READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED
columns
When the server crashes after a record stub has been inserted and
before all its off-page columns have been written, the record will
contain incomplete off-page columns after crash recovery. Such records
may only be accessed at the READ UNCOMMITTED isolation level or when
rolling back a recovered transaction in recv_recovery_rollback_active().
Skip these records at the READ UNCOMMITTED isolation level.
TODO: Add assertions for checking the above assumptions hold when an
incomplete BLOB is encountered.
btr_rec_copy_externally_stored_field(): Return NULL if the field is
incomplete.
row_prebuilt_t::templ_contains_blob: Clarify what "BLOB" means in this
context. Hint: MySQL BLOBs are not the same as InnoDB BLOBs.
row_sel_store_mysql_rec(): Return FALSE if not all columns could be
retrieved. Previously this function always returned TRUE. Assert that
the record is not delete-marked.
row_sel_push_cache_row_for_mysql(): Return FALSE if not all columns
could be retrieved.
row_search_for_mysql(): Skip records containing incomplete off-page
columns. Assert that the transaction isolation level is READ
UNCOMMITTED.
rb://380 approved by Jimmy Yang
Merge and adjust a forgotten change to fix this bug.
rb://393 approved by Jimmy Yang
------------------------------------------------------------------------
r3794 | marko | 2009-01-07 14:14:53 +0000 (Wed, 07 Jan 2009) | 18 lines
branches/6.0: Allow the minimum length of a multi-byte character to be
up to 4 bytes. (Bug #35391)
dtype_t, dict_col_t: Replace mbminlen:2, mbmaxlen:3 with mbminmaxlen:5.
In this way, the 5 bits can hold two values of 0..4, and the storage size
of the fields will not cross the 64-bit boundary. Encode the values as
DATA_MBMAX * mbmaxlen + mbminlen. Define the auxiliary macros
DB_MBMINLEN(mbminmaxlen), DB_MBMAXLEN(mbminmaxlen), and
DB_MINMAXLEN(mbminlen, mbmaxlen).
Try to trim and pad UTF-16 and UTF-32 with spaces as appropriate.
Alexander Barkov suggested the use of cs->cset->fill(cs, buff, len, 0x20).
ha_innobase::store_key_val_for_row() now does that, but the added function
row_mysql_pad_col() does not, because it doesn't have the MySQL TABLE object.
rb://49 approved by Heikki Tuuri
------------------------------------------------------------------------
and clarifies the invariant in dict_table_get_on_id().
In Mar 2007 Marko observed a crash during recovery, the crash resulted from
an UNDO operation on a system table. His solution was to acquire an X lock on
the data dictionary, this in hindsight was an overkill. It is unclear what
caused the crash, current hypothesis is that it was a memory corruption.
The X lock results in performance issues by when undoing changes due to
rollback during normal operation on regular tables.
Why the change is safe:
======================
The InnoDB code has changed since the original X lock change was made. In the
new code we always lock the data dictionary in X mode during startup when
UNDOing operations on the system tables (this is a given). This ensures that
the crash Marko observed cannot happen as long as all transactions that update
the system tables follow the standard rules by setting the appropriate DICT_OP
flag when writing the log records when they make the changes.
If transactions violate the above mentioned rule then during recovery (at
startup) the rollback code (see trx0roll.c) will not acquire the X lock
and we will see the crash again. This will however be a different bug.
------------------------------------------------------------
revno: 3517
revision-id: vasil.dimov@oracle.com-20100622163043-dc0lxy0byg74viet
parent: marko.makela@oracle.com-20100621095148-8g73k8k68dpj080u
committer: Vasil Dimov <vasil.dimov@oracle.com>
branch nick: mysql-5.1-innodb
timestamp: Tue 2010-06-22 19:30:43 +0300
message:
Fix Bug#47991 InnoDB Dictionary Cache memory usage increases indefinitely
when renaming tables
Allocate the table name using ut_malloc() instead of table->heap because
the latter cannot be freed.
Adjust dict_sys->size calculations all over the code.
Change dict_table_t::name from const char* to char* because we need to
ut_malloc()/ut_free() it.
Reviewed by: Inaam, Marko, Heikki (rb://384)
Approved by: Heikki (rb://384)
------------------------------------------------------------
------------------------------------------------------------
revno: 3506
revision-id: sergey.glukhov@sun.com-20100609121718-04mpk5kjxvnrxdu8
parent: sergey.glukhov@sun.com-20100609120734-ndy2281wau9067zv
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.1-innodb
timestamp: Wed 2010-06-09 16:17:18 +0400
message:
Bug#38999 valgrind warnings for update statement in function compare_record()
(InnoDB plugin branch)
@ mysql-test/suite/innodb_plugin/r/innodb_mysql.result
test case
@ mysql-test/suite/innodb_plugin/t/innodb_mysql.test
test case
@ storage/innodb_plugin/row/row0sel.c
init null bytes with default values as they might be
left uninitialized in some cases and these uninited bytes
might be copied into mysql record buffer that leads to
valgrind warnings on next use of the buffer.
------------------------------------------------------------
revno: 3495
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-06-02 13:37:14 +0300
message:
Bug#53674: InnoDB: Error: unlock row could not find a 4 mode lock on the record
In semi-consistent read, only unlock freshly locked non-matching records.
lock_rec_lock_fast(): Return LOCK_REC_SUCCESS,
LOCK_REC_SUCCESS_CREATED, or LOCK_REC_FAIL instead of TRUE/FALSE.
enum db_err: Add DB_SUCCESS_LOCKED_REC for indicating a successful
operation where a record lock was created.
lock_sec_rec_read_check_and_lock(),
lock_clust_rec_read_check_and_lock(), lock_rec_enqueue_waiting(),
lock_rec_lock_slow(), lock_rec_lock(), row_ins_set_shared_rec_lock(),
row_ins_set_exclusive_rec_lock(), sel_set_rec_lock(),
row_sel_get_clust_rec_for_mysql(): Return DB_SUCCESS_LOCKED_REC if a
new record lock was created. Adjust callers.
row_unlock_for_mysql(): Correct the function documentation.
row_prebuilt_t::new_rec_locks: Correct the documentation.
------------------------------------------------------------
revno: 3488
revision-id: marko.makela@oracle.com-20100601103738-upm8awahesmeh9dr
parent: vasil.dimov@oracle.com-20100531163540-9fu3prbn2asqwdi5
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Tue 2010-06-01 13:37:38 +0300
message:
Bug#53812: assert row/row0umod.c line 660 in txn rollback after crash recovery
row_undo_mod_upd_exist_sec(): Tolerate a failure to build the index entry
for a DYNAMIC or COMPRESSED table during crash recovery.
can now view the content of InnoDB System Tables through following
information schema tables:
information_schema.INNODB_SYS_TABLES
information_schema.INNODB_SYS_INDEXES
information_schema.INNODB_SYS_COUMNS
information_schema.INNODB_SYS_FIELDS
information_schema.INNODB_SYS_FOREIGN
information_schema.INNODB_SYS_FOREIGN_COLS
information_schema.INNODB_SYS_TABLESTATS
rb://330 Approved by Marko
------------------------------------------------------------
revno: 3479
revision-id: marko.makela@oracle.com-20100524110439-fazi70rlmt07tzd9
parent: vasil.dimov@oracle.com-20100520133157-42uk5q3pp0vsinac
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Mon 2010-05-24 14:04:39 +0300
message:
Bug#53578: assert on invalid page access, in fil_io()
Store the max_space_id in the data dictionary header in order to avoid
space_id reuse.
DICT_HDR_MIX_ID: Renamed to DICT_HDR_MAX_SPACE_ID, DICT_HDR_MIX_ID_LOW.
dict_hdr_get_new_id(): Return table_id, index_id, space_id or a subset of them.
fil_system_t: Add ibool space_id_reuse_warned.
fil_create_new_single_table_tablespace(): Get the space_id from the caller.
fil_space_create(): Issue a warning if the fil_system->max_assigned_id
is exceeded.
fil_assign_new_space_id(): Return TRUE/FALSE and take a pointer to the
space_id as a parameter. Make the function public.
fil_init(): Initialize all fil_system fields by mem_zalloc(). Remove
explicit initializations of certain fields to 0 or NULL.
Post-merge fixes: Remove the MYSQL_VERSION_ID checks, because they only
apply to the InnoDB Plugin. Fix potential race condition accessing
trx->op_info and trx->detailed_error.
------------------------------------------------------------
revno: 3466
revision-id: marko.makela@oracle.com-20100514130815-ym7j7cfu88ro6km4
parent: marko.makela@oracle.com-20100514130228-n3n42nw7ht78k0wn
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-5.1-innodb2
timestamp: Fri 2010-05-14 16:08:15 +0300
message:
Make the InnoDB FOREIGN KEY parser understand multi-statements. (Bug #48024)
Also make InnoDB thinks that /*/ only starts a comment. (Bug #53644).
This fixes the bugs in the InnoDB Plugin.
ha_innodb.h: Use trx_query_string() instead of trx_query() when
available (MySQL 5.1.42 or later).
innobase_get_stmt(): New function, to retrieve the currently running
SQL statement.
struct trx_struct: Remove mysql_query_str. Use innobase_get_stmt() instead.
dict_strip_comments(): Add and observe the parameter sql_length. Treat
/*/ as the start of a comment.
dict_create_foreign_constraints(), row_table_add_foreign_constraints():
Add the parameter sql_length.
This is a followup to the fix of
Bug#51920 Innodb connections in row lock wait ignore KILL until lock wait
timeout
in that fix (rb://279) the behavior was changed to honor when a trx is
interrupted during lock wait, but the returned error code was still
"lock wait timeout" when it should be "interrupted".
This change fixes the non-deterministically failing test binlog.binlog_killed,
that failed like this:
binlog.binlog_killed 'stmt' [ fail ]
Test ended at 2010-05-12 11:39:08
CURRENT_TEST: binlog.binlog_killed
mysqltest: At line 208: query 'reap' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 0...
Approved by: Sunny Bains (rb://344)
in the code but they have nothing to do with the kernel mutex split code.
Some subsequent commits use the new functions. This patch has been tested
with: ./mtr --suite=innodb with UNIV_DEBUG and UNIV_SYNC_DEBUG enabled.
All tests were successful.
------------------------------------------------------------
revno: 3459
revision-id: marko.makela@oracle.com-20100511105308-grp2t3prh3tqivw0
parent: marko.makela@oracle.com-20100511105012-b2t7wvz6mu6bll74
parent: marko.makela@oracle.com-20100505123901-xjxu93h1xnbkfkq0
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-5.1-innodb
timestamp: Tue 2010-05-11 13:53:08 +0300
message:
Merge a patch from Facebook to fix Bug #53290
commit e759bc64eb5c5eed4f75677ad67246797d486460
Author: Ryan Mack
Date: 3 days ago
Bugfix for 53290, fast unique index creation fails on duplicate null values
Summary:
Bug in the fast index creation code incorrectly considers null
values to be duplicates during block merging. Innodb policy is that
multiple null values are allowed in a unique index. Null duplicates
were correctly ignored while sorting individual blocks and with slow
index creation.
Test Plan:
mtr, including new test, load dbs using deferred index creation
License:
Copyright (C) 2009-2010 Facebook, Inc. All Rights Reserved.
Dual licensed under BSD license and GPLv2.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY FACEBOOK, INC. ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
EVENT SHALL FACEBOOK, INC. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA
------------------------------------------------------------
revno: 3453.2.1
revision-id: marko.makela@oracle.com-20100505123901-xjxu93h1xnbkfkq0
parent: marko.makela@oracle.com-20100505120555-ukoq1gklpheslrxs
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-05-05 15:39:01 +0300
message:
Merge a contribution from Ryan Mack at Facebook:
Bugfix for 53290, fast unique index creation fails on duplicate null values
Summary:
Bug in the fast index creation code incorrectly considers null
values to be duplicates during block merging. Innodb policy is that
multiple null values are allowed in a unique index. Null duplicates
were correctly ignored while sorting individual blocks and with slow
index creation.
Test Plan:
mtr, including new test, load dbs using deferred index creation
DiffCamp Revision: 110840
Reviewed By: mcallaghan
CC: mcallaghan, mysql-devel@lists
Revert Plan:
OK
------------------------------------------------------------
revno: 3450
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-05-05 14:24:11 +0300
message:
row_merge_drop_temp_indexes(): Load the table via the dictionary cache.
Allow multiple indexes to be dropped. (Bug #53256)
1. BUG#49032 - auto_increment field does not initialize to last value
in InnoDB Storage Engine
2. Fix whitespace issues and fix tests and make read float/double arg const
Detailed revision comments:
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
branches/5.1: This is an interim fix, fix white space errors.
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
branches/5.1: This is an interim fix, fix whitepsace issues.