Commit graph

9 commits

Author SHA1 Message Date
Marko Mäkelä
91b5e9352a Revert most of revno 3560.9.1 (Bug#12704861)
This was an attempt to address problems with the Bug#12612184 fix.
Even with this follow-up fix, crash recovery can be broken.
Let us fix the bug later.
2011-10-26 11:44:28 +03:00
unknown
40761a9a73 Merge from mysql-5.1.59-release 2011-09-15 18:48:54 +02:00
Marko Mäkelä
41f229cd9e Bug#12704861 Corruption after a crash during BLOB update
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.

Because we will be redo logging the writes off the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().

btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.

btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.

btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).

btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or if this is a "fake allocation"
(init_mtr=NULL) by btr_mark_freed_leaves(nonfree=TRUE).

btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.

btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.

btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL (old behaviour in inserts) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.

fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.

fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.

fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".

mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.

row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.

row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.

There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.

rb:693 approved by Sunny Bains
2011-08-29 11:16:42 +03:00
Marko Mäkelä
f2f4b19678 Bug#12626794 61240: UNUSED FUNCTIONS ... 2011-08-10 14:56:14 +03:00
Marko Mäkelä
896e0ba4e0 Bug#59486 Incorrect usage of UNIV_UNLIKELY() in mlog_parse_string()
mlog_parse_string(): Enclose the comparison in UNIV_UNLIKELY,
not the comparand.
2011-01-25 12:17:28 +02:00
Satya B
e011c02e2f Applying InnoDB Plugin 1.0.5 snapshot ,part 12
From r5995 to r6043

Detailed revision comments:

r5995 | marko | 2009-09-28 03:52:25 -0500 (Mon, 28 Sep 2009) | 17 lines
branches/zip: Do not write to PAGE_INDEX_ID after page creation,
not even when restoring an uncompressed page after a compression failure.

btr_page_reorganize_low(): On compression failure, do not restore
those page header fields that should not be affected by the
reorganization.  Instead, compare the fields.

page_zip_decompress(): Add the parameter ibool all, for copying all
page header fields.  Pass the parameter all=TRUE on block read
completion, redo log application, and page_zip_validate(); pass
all=FALSE in all other cases.

page_zip_reorganize(): Do not restore the uncompressed page on
failure.  It will be restored (to pre-modification state) by the
caller anyway.

rb://167, Issue #346
r5996 | marko | 2009-09-28 07:46:02 -0500 (Mon, 28 Sep 2009) | 4 lines
branches/zip: Address Issue #350 in comments.

lock_rec_queue_validate(), lock_rec_queue_validate(): Note that
this debug code may violate the latching order and cause deadlocks.
r5997 | marko | 2009-09-28 08:03:58 -0500 (Mon, 28 Sep 2009) | 12 lines
branches/zip: Remove an assertion failure when the InnoDB data dictionary
is inconsistent with the MySQL .frm file.

ha_innobase::index_read(): When the index cannot be found,
return an error.

ha_innobase::change_active_index(): When prebuilt->index == NULL,
set also prebuilt->index_usable = FALSE.  This is not needed for
correctness, because prebuilt->index_usable is only checked by
row_search_for_mysql(), which requires prebuilt->index != NULL.

This addresses Issue #349.  Approved by Heikki Tuuri over IM.
r6005 | vasil | 2009-09-29 03:09:52 -0500 (Tue, 29 Sep 2009) | 4 lines
branches/zip:

ChangeLog: wrap around 78th column, not earlier.

r6006 | vasil | 2009-09-29 05:15:25 -0500 (Tue, 29 Sep 2009) | 4 lines
branches/zip:

Add ChangeLog entry for the release of 1.0.4.

r6007 | vasil | 2009-09-29 08:19:59 -0500 (Tue, 29 Sep 2009) | 6 lines
branches/zip:

Fix the year, should be 2009.

Pointed by:	Calvin

r6026 | marko | 2009-09-30 02:18:24 -0500 (Wed, 30 Sep 2009) | 1 line
branches/zip: Add some debug assertions for checking FSEG_MAGIC_N.
r6028 | marko | 2009-09-30 08:55:23 -0500 (Wed, 30 Sep 2009) | 3 lines
branches/zip: recv_no_log_write: New debug flag for tracking down
Mantis Issue #347.  No modifications should be made to the database
while recv_apply_hashed_log_recs() is about to complete.
r6029 | calvin | 2009-09-30 15:32:02 -0500 (Wed, 30 Sep 2009) | 4 lines
branches/zip: non-functional changes

Fix typo.

r6031 | marko | 2009-10-01 06:24:33 -0500 (Thu, 01 Oct 2009) | 49 lines
branches/zip: Clean up after a crash during DROP INDEX.
When InnoDB crashes while dropping an index, ensure that
the index will be completely dropped during crash recovery.

row_merge_drop_index(): Before dropping an index, rename the index to
start with TEMP_INDEX_PREFIX_STR and commit the change, so that
row_merge_drop_temp_indexes() will drop the index after crash
recovery if the server crashes while dropping the index.

fseg_inode_try_get(): New function, forked from fseg_inode_get().
Return NULL if the file segment index node is free.

fseg_inode_get(): Assert that the file segment index node is not free.

fseg_free_step(): If the file segment index node is already free,
print a diagnostic message and return TRUE.

fsp_free_seg_inode(): Write a nonzero number to FSEG_MAGIC_N, so that
allocated-and-freed file segment index nodes can be better
distinguished from uninitialized ones.

This is rb://174, addressing Issue #348.

Tested by restarting mysqld upon the completion of the added
log_write_up_to() invocation below, during DROP INDEX.  The index was
dropped after crash recovery, and re-issuing the DROP INDEX did not
crash the server.

Index: btr/btr0btr.c
===================================================================
--- btr/btr0btr.c	(revision 6026)
+++ btr/btr0btr.c	(working copy)
@@ -42,6 +42,7 @@ Created 6/2/1994 Heikki Tuuri
 #include "ibuf0ibuf.h"
 #include "trx0trx.h"
+#include "log0log.h"
 
 /*
 Latching strategy of the InnoDB B-tree
 --------------------------------------
@@ -873,6 +874,8 @@ leaf_loop:
 
 		goto leaf_loop;
 	}
+
+	log_write_up_to(mtr.end_lsn, LOG_WAIT_ALL_GROUPS, TRUE);
 top_loop:
 	mtr_start(&mtr);
 
r6033 | calvin | 2009-10-01 15:19:46 -0500 (Thu, 01 Oct 2009) | 4 lines
branches/zip: fix a typo in error message

Reported as bug#47763.

r6043 | inaam | 2009-10-05 09:45:35 -0500 (Mon, 05 Oct 2009) | 12 lines
branches/zip  rb://176

Do not invalidate buffer pool while an LRU batch is active. Added
code to buf_pool_invalidate() to wait for the running batches to finish.

This patch also resets the state of buf_pool struct at invalidation. This
addresses the concern where buf_pool->freed_page_clock becomes non-zero
because we read in a system tablespace page for file format info at
startup.

Approved by: Marko
2009-10-09 19:43:15 +05:30
Satya B
d0b61c03bf Applying InnoDB Plugin 1.0.5 snapshot , part 5
From revision r5733 to r5747

Detailed revision comments:

r5733 | sunny | 2009-09-02 02:05:15 -0500 (Wed, 02 Sep 2009) | 6 lines
branches/zip: Fix a regression introduced by the fix for bug#26316. We check
whether a transaction holds any AUTOINC locks before we acquire the kernel
mutex and release those locks.

Fix for rb://153. Approved by Marko.

r5734 | sunny | 2009-09-02 02:08:45 -0500 (Wed, 02 Sep 2009) | 2 lines
branches/zip: Update ChangeLog with r5733 changes.

r5735 | marko | 2009-09-02 02:43:09 -0500 (Wed, 02 Sep 2009) | 2 lines
branches/zip: univ.i: Do not undefine PACKAGE or VERSION.
InnoDB source code does not refer to these macros.
r5736 | marko | 2009-09-02 02:53:19 -0500 (Wed, 02 Sep 2009) | 1 line
branches/zip: Enclose some timestamp functions in #ifndef UNIV_HOTBACKUP.
r5743 | marko | 2009-09-03 01:36:12 -0500 (Thu, 03 Sep 2009) | 3 lines
branches/zip: log_reserve_and_write_fast(): Remove the redundant
output parameter "success".
Success is also indicated by a nonzero return value.
r5744 | marko | 2009-09-03 03:28:35 -0500 (Thu, 03 Sep 2009) | 1 line
branches/zip: ut_align(): Make ptr const, like in ut_align_down().
r5745 | marko | 2009-09-03 03:38:22 -0500 (Thu, 03 Sep 2009) | 2 lines
branches/zip: log_check_log_recs(): Enclose in #ifdef UNIV_LOG_DEBUG.
Add const qualifiers.
r5746 | marko | 2009-09-03 03:55:36 -0500 (Thu, 03 Sep 2009) | 2 lines
branches/zip: log_reserve_and_write_fast(): Do not cache the log_sys pointer
in a local variable.
r5747 | marko | 2009-09-03 05:46:38 -0500 (Thu, 03 Sep 2009) | 2 lines
branches/zip: recv_scan_log_recs(): Replace while with do...while,
because the termination condition will always hold on the first iteration.
2009-10-08 17:48:19 +05:30
Sergey Vojtovich
bae6276d45 Update to innoplug-1.0.4. 2009-07-30 17:42:56 +05:00
Satya B
3945d5e554 Adding innodb_plugin-1.0.4 as storage/innodb_plugin. 2009-05-27 15:15:59 +05:30