btr_record_not_null_field_in_rec(): Remove the parameter rec.
Use rec_offs_nth_sql_null() instead of rec_get_nth_field().
rb:788 approved by Jimmy Yang
hash index at shutdown
btr_search_disable(): Just drop the entire adaptive hash index,
without dropping every record separately.
buf_pool_clear_hash_index(): Renamed and simplified from
buf_pool_drop_hash_index(). Set block->index = NULL for every block in
the buffer pool. Do not release the btr_search_latch. The caller will
have to adjust other data structures.
Remove block->is_hashed. It is redundant, should be always equal to
block->index != NULL.
Remove btr_search_fully_disabled, btr_search_enabled_mutex, and
SYNC_SEARCH_SYS_CONF. We drop the AHI in one pass, without releasing
the btr_search_latch in between.
Replace void* with const rec_t* and add assertions on btr_search_latch
and btr_search_enabled to ha0ha.h, ha0ha.ic, ha0ha.c.
page_set_max_trx_id(): Ignore the adaptive hash index. I forgot to
push this in rb:750.
btr0sea.c: Always after acquiring btr_search_latch, check for
block->index==NULL or !btr_search_enabled. We can now set
block->index=NULL while only holding btr_search_latch in exclusive
mode. Always acquire btr_search_latch before reading block->index,
except in shortcuts when testing for block->index == NULL.
ha_clear(), ha_search(): Unused function, remove.
buf_page_peek_if_search_hashed(): Remove. This function may avoid
latching a page at the cost of doing a duplicate buf_pool->page_hash
lookup.
rb:775 approved by Inaam Rana
InnoDB acquires an x-latch on btr_search_latch for certain in-place updates
that do affect the adaptive hash index. These operations do not really need
to be protected by the btr_search_latch:
* updating DB_TRX_ID
* updating DB_ROLL_PTR
* updating PAGE_MAX_TRX_ID
* updating the delete-mark flag
rb:750 approved by Sunny Bains
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.
Because we will be redo logging the writes off the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().
btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.
btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.
btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).
btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or if this is a "fake allocation"
(init_mtr=NULL) by btr_mark_freed_leaves(nonfree=TRUE).
btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.
btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.
btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL (old behaviour in inserts) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.
fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.
fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.
fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".
mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.
row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.
row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.
There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.
rb:693 approved by Sunny Bains
discarded in buf_page_create()
This bug turned out to be a false alarm, a bug in the UNIV_SYNC_DEBUG
diagnostic code. Because of this, the patch was not backported to the
built-in InnoDB in MySQL 5.1. Furthermore, there is no test case for
InnoDB Plugin in MySQL 5.1, because the delete buffering in MySQL 5.5
makes triggering the failure much easier.
When a freed page for which there exist orphaned buffered changes is
allocated and reused for something else, buf_page_create() will discard
the buffered changes by invoking ibuf_merge_or_delete_for_page().
This would violate the InnoDB latching order.
Tweak the latching order as follows. Move SYNC_IBUF_MUTEX below
SYNC_FSP_PAGE, where it logically belongs, and assign new latching
levels for the ibuf->index->lock and the insert buffer B-tree pages:
#define SYNC_IBUF_MUTEX 370 /* ibuf_mutex */
#define SYNC_IBUF_INDEX_TREE 360
#define SYNC_IBUF_TREE_NODE_NEW 359
#define SYNC_IBUF_TREE_NODE 358
btr_block_get(), btr_page_get(): In UNIV_SYNC_DEBUG, add the parameter
"index" for determining the appropriate latching order
(SYNC_IBUF_TREE_NODE or SYNC_TREE_NODE).
btr_page_alloc_for_ibuf(), btr_create(): Use SYNC_IBUF_TREE_NODE_NEW
instead of SYNC_TREE_NODE_NEW for insert buffer pages.
btr_cur_search_to_nth_level(), btr_pcur_restore_position_func(): Use
SYNC_IBUF_TREE_NODE instead of SYNC_TREE_NODE for insert buffer pages.
btr_search_guess_on_hash(): Assert that the index is not an insert buffer tree.
dict_index_add_to_cache(): Use SYNC_IBUF_INDEX_TREE for the insert
buffer tree (ibuf->index->lock).
ibuf0ibuf.c: Use SYNC_IBUF_TREE_NODE or SYNC_IBUF_TREE_NODE_NEW for
all B-tree pages.
ibuf_merge_or_delete_for_page(): Assert that the user page is
BUF_IO_READ fixed. Only in this way it is OK to latch it as
SYNC_IBUF_TREE_NODE instead of the proper SYNC_TREE_NODE (which would
violate the changed latching order).
sync_thread_add_level(): Remove the special tweak for
SYNC_IBUF_MUTEX. Add rules for the added latching levels.
rb:591 approved by Jimmy Yang
approved by: Marko
rb://681
Coalescing of free buf_page_t descriptors can prove to be one severe
bottleneck in performance of compression. One such workload where it
hurts badly is DROP TABLE. This patch removes buf_page_t allocations
from buf_buddy and uses ut_malloc instead.
In order to further reduce overhead of colaescing we no longer attempt
to coalesce a block if the corresponding free_list is less than 16 in
size.
Replace UNIV_BLOB_NULL_DEBUG with UNIV_DEBUG||UNIV_BLOB_LIGHT_DEBUG.
Fix known bogus failures.
btr_cur_optimistic_update(): If rec_offs_any_null_extern(), assert that
the current transaction is an incomplete transaction that is being
rolled back in crash recovery.
row_build(): If rec_offs_any_null_extern(), assert that the transaction
that last updated the record was recovered during crash recovery
(and will soon be rolled back).
btr_cur_compress_if_useful(), btr_compress(): Add the parameter ibool
adjust. If adjust=TRUE, adjust the cursor position after compressing
the page.
btr_lift_page_up(): Return a pointer to the father page.
BTR_KEEP_POS_FLAG: A new flag for btr_cur_pessimistic_update().
btr_cur_pessimistic_update(): If *big_rec != NULL and flags &
BTR_KEEP_POS_FLAG, keep the cursor positioned on the updated record.
Also, do not release the index tree x-lock if *big_rec != NULL.
btr_cur_mtr_commit_and_start(): Commits and restarts a
mini-transaction so that it will retain an x-lock on index->lock and
the page of the cursor. This is invoked when
btr_cur_pessimistic_update() returns *big_rec != NULL.
In all callers of btr_cur_pessimistic_update() that do not pass
BTR_KEEP_POS_FLAG, assert that *big_rec == NULL.
btr_cur_compress(): Unused function [in the built-in MySQL 5.1], remove.
page_rec_get_nth(): Return the nth record on the page (an inverse
function of page_rec_get_n_recs_before()). Refactored from
page_get_middle_rec().
page_get_middle_rec(): Invoke page_rec_get_nth().
page_cur_insert_rec_zip_reorg(): Make use of the page directory
shortcuts in page_rec_get_nth() instead of scanning the whole list of
records.
row_ins_clust_index_entry_by_modify(): Pass BTR_KEEP_POS_FLAG to
btr_cur_pessimistic_update().
row_ins_index_entry_low(): If row_ins_clust_index_entry_by_modify()
returns a big_rec, invoke btr_cur_mtr_commit_and_start() in order to
commit and start the mini-transaction without releasing the x-locks on
index->lock and the cursor page, and write the big_rec. Releasing the
page latch in mtr_commit() caused a race condition.
row_upd_clust_rec(): Pass BTR_KEEP_POS_FLAG to
btr_cur_pessimistic_update(). If it returns a big_rec, invoke
btr_cur_mtr_commit_and_start() in order to commit and start the
mini-transaction without releasing the x-locks on index->lock and the
cursor page, and write the big_rec. Releasing the page latch in
mtr_commit() caused a race condition.
sync_thread_add_level(): Add the parameter ibool relock. When TRUE,
bypass the latching order rules.
rw_lock_add_debug_info(): For nested X-lock requests, pass relock=TRUE
to sync_thread_add_level().
rb:678 approved by Jimmy Yang
Some ut_a(!rec_offs_any_null_extern()) assertion failures are indicating
genuine BLOB bugs, others are bogus failures when rolling back incomplete
transactions at crash recovery. This needs more work, and until I get a
chance to work on it, other testing must not be disrupted by this.
If UNIV_DEBUG or UNIV_BLOB_LIGHT_DEBUG is enabled, add
!rec_offs_any_null_extern() assertions, ensuring that records do not
contain null pointers to externally stored columns in inappropriate
places.
btr_cur_optimistic_update(): Assert !rec_offs_any_null_extern().
Incomplete records must never be updated or deleted. This assertion
will cover also the pessimistic route.
row_build(): Assert !rec_offs_any_null_extern(). Search tuples must
never be built from incomplete index entries.
row_rec_to_index_entry(): Assert !rec_offs_any_null_extern() unless
ROW_COPY_DATA is requested. ROW_COPY_DATA is used for
multi-versioning, and therefore it might be valid to copy the most
recent (uncommitted) version while it contains a null pointer to
off-page columns.
row_vers_build_for_consistent_read(),
row_vers_build_for_semi_consistent_read(): Assert !rec_offs_any_null_extern()
on all versions except the most recent one.
trx_undo_prev_version_build(): Assert !rec_offs_any_null_extern() on
the previous version.
rb:682 approved by Sunny Bains
According to the zlib documentation, next_in and avail_in
must be initialized before invoking inflateInit or inflateInit2.
Furthermore, the zalloc function must clear the allocated memory.
btr_copy_zblob_prefix(): Replace the d_stream parameter with buf,len
and return the copied length.
page_zip_decompress(): Invoke inflateInit2 a little later.
page_zip_zalloc(): Rename from page_zip_alloc().
Invoke mem_heap_zalloc() instead of mem_heap_alloc().
rb:619 approved by Jimmy Yang
and compressed tables
buf_LRU_drop_page_hash_for_tablespace(): after releasing and
reacquiring the buffer pool mutex, do not dereference any block
descriptor pointer that is not known to be a pointer to an
uncompressed page frame (type buf_block_t; state ==
BUF_BLOCK_FILE_PAGE). Also, defer the acquisition of the block_mutex
until it is needed.
buf_page_get_gen(): Add mode == BUF_GET_IF_IN_POOL_PEEK for
buffer-fixing a block without making it young in the LRU list.
buf_page_get_gen(), buf_page_init(), buf_LRU_block_remove_hashed_page():
Set bpage->state = BUF_BLOCK_ZIP_FREE before buf_buddy_free(bpage),
so that similar race conditions might be detected a little easier.
btr_search_drop_page_hash_when_freed(): Use BUF_GET_IF_IN_POOL_PEEK
when dropping the hash indexes.
rb://528 approved by Jimmy Yang
This option is known to be broken when tablespaces contain off-page
columns after crash recovery. It has only been tested when creating
the data files from the scratch.
btr_blob_dbg_t: A map from page_no:heap_no:field_no to first_blob_page_no.
This map is instantiated for every clustered index in index->blobs.
It is protected by index->blobs_mutex.
btr_blob_dbg_msg_issue(): Issue a diagnostic message.
Invoked when btr_blob_dbg_msg is set.
btr_blob_dbg_rbt_insert(): Insert a btr_blob_dbg_t into index->blobs.
btr_blob_dbg_rbt_delete(): Remove a btr_blob_dbg_t from index->blobs.
btr_blob_dbg_cmp(): Comparator for btr_blob_dbg_t.
btr_blob_dbg_add_blob(): Add a BLOB reference to the map.
btr_blob_dbg_add_rec(): Add all BLOB references from a record to the map.
btr_blob_dbg_print(): Display the map of BLOB references in an index.
btr_blob_dbg_remove_rec(): Remove all BLOB references of a record from
the map.
btr_blob_dbg_is_empty(): Check that no BLOB references exist to or
from a page. Disowned references from delete-marked records are
tolerated.
btr_blob_dbg_op(): Perform an operation on all BLOB references on a
B-tree page.
btr_blob_dbg_add(): Add all BLOB references from a B-tree page to the
map.
btr_blob_dbg_remove(): Remove all BLOB references from a B-tree page
from the map.
btr_blob_dbg_restore(): Restore the BLOB references after a failed
page reorganize.
btr_blob_dbg_set_deleted_flag(): Modify the 'deleted' flag in the BLOB
references of a record.
btr_blob_dbg_owner(): Own or disown a BLOB reference.
btr_page_create(), btr_page_free_low(): Assert that no BLOB references exist.
btr_create(): Create index->blobs for clustered indexes.
btr_page_reorganize_low(): Invoke btr_blob_dbg_remove() before copying
the records. Invoke btr_blob_dbg_restore() if the operation fails.
btr_page_empty(), btr_lift_page_up(), btr_compress(), btr_discard_page():
Invoke btr_blob_dbg_remove().
btr_cur_del_mark_set_clust_rec(): Invoke btr_blob_dbg_set_deleted_flag().
Other cases of modifying the delete mark are either in the secondary
index or during crash recovery, which we do not promise to support.
btr_cur_set_ownership_of_extern_field(): Invoke btr_blob_dbg_owner().
btr_store_big_rec_extern_fields(): Invoke btr_blob_dbg_add_blob().
btr_free_externally_stored_field(): Invoke btr_blob_dbg_assert_empty()
on the first BLOB page.
page_cur_insert_rec_low(), page_cur_insert_rec_zip(),
page_copy_rec_list_end_to_created_page(): Invoke btr_blob_dbg_add_rec().
page_cur_insert_rec_zip_reorg(), page_copy_rec_list_end(),
page_copy_rec_list_start(): After failure, invoke
btr_blob_dbg_remove() and btr_blob_dbg_add().
page_cur_delete_rec(): Invoke btr_blob_dbg_remove_rec().
page_delete_rec_list_end(): Invoke btr_blob_dbg_op(btr_blob_dbg_remove_rec).
page_zip_reorganize(): Invoke btr_blob_dbg_remove() before copying the records.
page_zip_copy_recs(): Invoke btr_blob_dbg_add().
row_upd_rec_in_place(): Invoke btr_blob_dbg_rbt_delete() and
btr_blob_dbg_rbt_insert().
innobase_start_or_create_for_mysql(): Warn when UNIV_BLOB_DEBUG is enabled.
rb://550 approved by Jimmy Yang
btr_rec_get_field_ref_offs(), btr_rec_get_field_ref(): New functions.
Get the pointer to an externally stored field.
btr_cur_set_ownership_of_extern_field(): Assert that the BLOB has not
already been disowned.
btr_store_big_rec_extern_fields(): Rename to
btr_store_big_rec_extern_fields_func() and add the debug parameter
update_in_place. All pointers to externally stored columns in the
record must either be zero or they must be pointers to inherited
columns, owned by this record or an earlier record version. For any
BLOB that is stored, the BLOB pointer must previously have been
zero. When the function completes, all BLOB pointers must be nonzero
and owned by the record.
rb://549 approved by Jimmy Yang
Antelope files in btr_check_blob_fil_page_type(). Unfortunately, we
must keep the check in production builds, because InnoDB wrote
uninitialized garbage to FIL_PAGE_TYPE until fairly recently (5.1.x).
rb://546 approved by Jimmy Yang
trx rollback or purge
This patch does not relax the failing debug assertion during purge.
That will be revisited once we have managed to repeat the assertion failure.
row_upd_changes_ord_field_binary_func(): Renamed from
row_upd_changes_ord_field_binary(). Add the parameter que_thr_t* in
UNIV_DEBUG builds. When the off-page column cannot be retrieved,
assert that the current transaction is a recovered one and that it is
the one that is currently being rolled back.
row_upd_changes_ord_field_binary(): A wrapper macro for
row_upd_changes_ord_field_binary_func() that discards the que_thr_t*
parameter unless UNIV_DEBUG is defined.
row_purge_upd_exist_or_extern_func(): Renamed from
row_purge_upd_exist_or_extern(). Add the parameter que_thr_t* in
UNIV_DEBUG builds.
row_purge_upd_exist_or_extern(): A wrapper macro for
row_purge_upd_exist_or_extern_func() that discards the que_thr_t*
parameter unless UNIV_DEBUG is defined.
Make trx_roll_crash_recv_trx const. If there were a 'do not
dereference' attribute, it would be appropriate as well.
rb://588 approved by Jimmy Yang
buf_block_alloc(): ulint zip_size is always 0.
buf_LRU_get_free_block(): ulint zip_size is always 0.
buf_LRU_free_block(): ibool* buf_pool_mutex_released is always NULL.
Remove these parameters.
buf_LRU_get_free_block(): Simplify the initialization of block->page.zip
and release buf_pool_mutex() earlier.
"rows examined" estimates". This change implements "innodb_stats_method"
with options of "nulls_equal", "nulls_unequal" and "null_ignored".
rb://553 approved by Marko
This bug fix requires that Bug #58912 be fixed as well (bzr revision id
marko.makela@oracle.com-20101221093919-mcmmgd4zpse9567d). Otherwise,
another double BLOB free could occur when InnoDB would try to perform
an update-in-place as delete-and-insert-by-update-in-place.
row_upd_clust_rec_by_insert(): Do not disown the externally stored
columns from the old record (btr_cur_mark_extern_inherited_fields())
until after checking the foreign key constraints and successfully
inserting the updated record. If a lock wait timeout occurs between
the delete-marking of the old record and the insertion of the updated
record, mark the columns inherited before retrying the insert.
Distinguish the state UPD_NODE_INSERT_BLOB from
UPD_NODE_INSERT_CLUSTERED.
btr_cur_del_mark_set_clust_rec(): Replace the cursor with
block,rec,index,offsets so that the offsets need not be recalculated.
Assert that rec is on a clustered index leaf page.
btr_cur_disown_inherited_fields(): Renamed from
btr_cur_mark_extern_inherited_fields(). Use
upd_get_field_by_field_no(). Assert that there are externally stored
columns. Assert that a mini-transaction is passed. Remove the return
status. (The only caller, row_upd_clust_rec_by_insert(), will have
determined that some fields have changed ownership.)
btr_cur_mark_dtuple_inherited_extern(): Rename to
row_upd_clust_rec_by_insert_inherit_func() and declare as static. Add
the debug parameters rec, offsets. When rec is given, assert that the
off-page columns match those in the inesrt tuple and that the off-page
columns are owned by the record. Assert that the non-updated off-page
columns in the insert tuple are owned, and mark them inherited.
row_upd_clust_rec_by_insert_inherit(): A wrapper macro for
row_upd_clust_rec_by_insert_inherit_func().
row_undo_mod_upd_exist_sec(): Adjust a comment about
row_upd_clust_rec_by_insert().
rb:508 approved by Jimmy Yang
row_upd_changes_ord_field_binary(): Do not return TRUE if the update
vector changes a column that is covered by a prefix index, but does
not change the column prefix. Add the row_ext_t parameter for
determining whether the prefixes of externally stored columns match.
dfield_datas_are_binary_equal(): Add the parameter len, for comparing
column prefixes when len > 0.
innodb.test: Add a test case where the patch of Bug #55284 failed
without this fix.
rb:537 approved by Jimmy Yang
Replace the array of mutexes that used to protect
dict_index_t::stat_n_diff_key_vals[] with an array of rw locks that protects
all the stats related members in dict_table_t and all of its indexes.
Approved by: Jimmy (rb://503)
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
btr_cur_update_alloc_zip(): Make the function public.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
Fix compiler warning:
btr/btr0sea.c: In function 'btr_search_update_hash_on_delete':
btr/btr0sea.c:1498:9: error: variable 'found' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
btr/btr0pcur.c: In function 'btr_pcur_move_backward_from_page':
btr/btr0pcur.c:455:9: error: variable 'space' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
btr/btr0cur.c: In function 'btr_free_externally_stored_field':
btr/btr0cur.c:4281:16: error: variable 'rec_block' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
btr/btr0cur.c: In function 'btr_cur_optimistic_update':
btr/btr0cur.c:1839:10: error: variable 'orig_rec' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
btr/btr0btr.c: In function 'btr_compress':
btr/btr0btr.c:2564:9: error: variable 'level' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
btr/btr0btr.c: In function 'btr_page_split_and_insert':
btr/btr0btr.c:1898:11: error: variable 'insert_page' set but not used [-Werror=unused-but-set-variable]
Currently we do a full validation of AHI whenever check tables is
called on any table. This patch fixes this by only doing this full
check in debug versions.
bug#55716
rb://423
approved by: Marko
The bug is due to a double delete of a BLOB, once via:
rollback -> btr_cur_pessimistic_delete()
and the second time via purge.
The bug is in row_upd_clust_rec_by_insert(). There we relinquish ownership
of the non-updated BLOB columns in btr_cur_mark_extern_inherited_fields()
before building the row entry that will be inserted and whose contents will
be logged in the UNDO log. However, we don't set the BLOB column later to
INHERITED so that a possible rollback will not free the original row's
non-updated BLOB entries. This is because the condition that checks for
that is in :
if (node->upd_ext) {}.
node->upd_ext is non-NULL only if a BLOB column was updated and that column
is part of some key ordering (see row_upd_replace()). This results in the
non-update BLOB columns being deleted during a rollback and subsequently by
purge again.
rb://413
(READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED columns).
Records that lack incompletely written externally stored columns may
be accessed by READ UNCOMMITTED transaction even without involving a
crash during an INSERT or UPDATE operation. I verified this as follows.
(1) added a delay after the mini-transaction for writing the clustered
index 'stub' record was committed (patch attached)
(2) started mysqld in gdb, setting breakpoints to the where the
assertions about READ UNCOMMITTED were added in the bug fix
(3) invoked ibtest3 --create-options=key_block_size=2
to create BLOBs in a COMPRESSED table
(4) invoked the following:
yes 'set transaction isolation level read uncommitted;
checksum table blobt3;select sleep(1);'|mysql -uroot test
(5) noted that one of the breakpoints was triggered
(return(NULL) in btr_rec_copy_externally_stored_field())
=== modified file 'storage/innodb_plugin/row/row0ins.c'
--- storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
+++ storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
@@ -2120,6 +2120,7 @@ function_exit:
rec_t* rec;
ulint* offsets;
mtr_start(&mtr);
+ os_thread_sleep(5000000);
btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE,
BTR_MODIFY_TREE, &cursor, 0,
=== modified file 'storage/innodb_plugin/row/row0upd.c'
--- storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
+++ storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
@@ -1763,6 +1763,7 @@ row_upd_clust_rec(
rec_offs_init(offsets_);
mtr_start(mtr);
+ os_thread_sleep(5000000);
ut_a(btr_pcur_restore_position(BTR_MODIFY_TREE, pcur, mtr));
rec = btr_cur_get_rec(btr_cur);
columns
When the server crashes after a record stub has been inserted and
before all its off-page columns have been written, the record will
contain incomplete off-page columns after crash recovery. Such records
may only be accessed at the READ UNCOMMITTED isolation level or when
rolling back a recovered transaction in recv_recovery_rollback_active().
Skip these records at the READ UNCOMMITTED isolation level.
TODO: Add assertions for checking the above assumptions hold when an
incomplete BLOB is encountered.
btr_rec_copy_externally_stored_field(): Return NULL if the field is
incomplete.
row_prebuilt_t::templ_contains_blob: Clarify what "BLOB" means in this
context. Hint: MySQL BLOBs are not the same as InnoDB BLOBs.
row_sel_store_mysql_rec(): Return FALSE if not all columns could be
retrieved. Previously this function always returned TRUE. Assert that
the record is not delete-marked.
row_sel_push_cache_row_for_mysql(): Return FALSE if not all columns
could be retrieved.
row_search_for_mysql(): Skip records containing incomplete off-page
columns. Assert that the transaction isolation level is READ
UNCOMMITTED.
rb://380 approved by Jimmy Yang
lock_rec_unlock(): Cache first_lock and rewrite while() loops as for().
btr_cur_optimistic_update(): Use common error handling return.
row_create_prebuilt(): Add Valgrind instrumentation.
------------------------------------------------------------
revno: 3127
revision-id: vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1
parent: vasil.dimov@oracle.com-20100531105923-kpjwl4rbgfpfj13c
committer: Vasil Dimov <vasil.dimov@oracle.com>
branch nick: mysql-trunk-innodb
timestamp: Mon 2010-05-31 18:23:41 +0300
message:
Fix Bug #53947 InnoDB: Assertion failure in thread 4224 in file .\sync\sync0sync.c line 324
Destroy the rw-lock object before freeing the memory it is occupying.
If we do not do this, then the mutex that is contained in the rw-lock
object btr_search_latch_temp->mutex gets "freed" and subsequently
mutex_free() from sync_close() hits a mutex whose memory has been
freed and crashes.
Approved by: Heikki (via IRC)
Discussed with: Calvin
on same table
Protect dict_index_t::stat_n_diff_key_vals[] with an array of
mutexes.
Testing: tested all code paths under UNIV_SYNC_DEBUG
for the one in dict_print() one has to enable the InnoDB table monitor:
CREATE TABLE innodb_table_monitor (a int) ENGINE=INNODB;