/*****************************************************************************

Copyright (c) 1995, 2011, Oracle and/or its affiliates. All Rights Reserved.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; version 2 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA

*****************************************************************************/

/**************************************************//**
@file include/mtr0mtr.h
Mini-transaction buffer

Created 11/26/1995 Heikki Tuuri
*******************************************************/

#ifndef mtr0mtr_h
#define mtr0mtr_h

#include "univ.i"
#include "mem0mem.h"
#include "dyn0dyn.h"
#include "buf0types.h"
#include "sync0rw.h"
#include "ut0byte.h"
#include "mtr0types.h"
#include "page0types.h"

/* Logging modes for a mini-transaction */
#define MTR_LOG_ALL		21	/* default mode: log all operations
					modifying disk-based data */
#define MTR_LOG_NONE		22	/* log no operations */
/*#define MTR_LOG_SPACE	23 */		/* log only operations modifying
					file space page allocation data
					(operations in fsp0fsp.* ) */
#define MTR_LOG_SHORT_INSERTS	24	/* inserts are logged in a shorter
					form */

/* Types for the mlock objects to store in the mtr memo; NOTE that the
first 3 values must be RW_S_LATCH, RW_X_LATCH, RW_NO_LATCH */
#define MTR_MEMO_PAGE_S_FIX	RW_S_LATCH
#define MTR_MEMO_PAGE_X_FIX	RW_X_LATCH
#define MTR_MEMO_BUF_FIX	RW_NO_LATCH
#define MTR_MEMO_MODIFY		54
#define MTR_MEMO_S_LOCK		55
#define MTR_MEMO_X_LOCK		56
/** The mini-transaction freed a clustered index leaf page. */
Bug#12704861 Corruption after a crash during BLOB update
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.
Because we will be redo logging the writes of the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().
btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.
btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.
btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).
btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or if this is a "fake allocation"
(init_mtr=NULL) by btr_mark_freed_leaves(nonfree=TRUE).
btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.
btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.
btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL (old behaviour in inserts) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.
fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.
fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.
fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".
mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.
row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.
row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.
There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.
rb:693 approved by Sunny Bains
2011-08-29 11:16:42 +03:00
#define MTR_MEMO_FREE_CLUST_LEAF 57
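
The commit message above describes "hiding" freed clustered index leaf pages while BLOB pages are allocated from the same file segment. The following is only a toy model of that ordering, not the InnoDB implementation: the segment is a plain array of free flags, and every `sketch_` name is hypothetical. It shows why a BLOB allocation could otherwise pick a page that the same mini-transaction has just freed.

```c
#include <assert.h>

#define SKETCH_N_PAGES	8

/* Toy file segment: is_free[i] != 0 means page i is free. */
typedef struct {
	int	is_free[SKETCH_N_PAGES];
} sketch_seg_t;

/* Toy mini-transaction: stand-in for MTR_MEMO_FREE_CLUST_LEAF markers. */
typedef struct {
	int	freed_leaf[SKETCH_N_PAGES];
} sketch_mtr_t;

/* Free a leaf page and remember it in the mini-transaction. */
static void
sketch_page_free(sketch_seg_t* seg, sketch_mtr_t* mtr, int page_no)
{
	seg->is_free[page_no] = 1;
	mtr->freed_leaf[page_no] = 1;
}

/* nonfree != 0: "fake allocate" the freed leaves so they are hidden;
nonfree == 0: free them again before the commit. */
static void
sketch_mark_freed_leaves(sketch_seg_t* seg, const sketch_mtr_t* mtr,
			 int nonfree)
{
	int	i;

	for (i = 0; i < SKETCH_N_PAGES; i++) {
		if (mtr->freed_leaf[i]) {
			seg->is_free[i] = !nonfree;
		}
	}
}

/* Allocate the lowest-numbered free page, or return -1 if none is left. */
static int
sketch_alloc_page(sketch_seg_t* seg)
{
	int	i;

	for (i = 0; i < SKETCH_N_PAGES; i++) {
		if (seg->is_free[i]) {
			seg->is_free[i] = 0;
			return i;
		}
	}
	return -1;
}

/* Run one "update with BLOB" and return the page chosen for the BLOB. */
static int
sketch_blob_update_demo(int hide)
{
	sketch_seg_t	seg = {{0, 0, 0, 0, 0, 1, 1, 1}};
	sketch_mtr_t	mtr = {{0}};
	int		blob_page;

	sketch_page_free(&seg, &mtr, 3);	/* the update frees leaf 3 */
	if (hide) {
		sketch_mark_freed_leaves(&seg, &mtr, 1);
	}
	blob_page = sketch_alloc_page(&seg);
	if (hide) {
		sketch_mark_freed_leaves(&seg, &mtr, 0);
	}
	assert(blob_page >= 0);
	return blob_page;
}
```

In this model the BLOB lands on a page that was free both before and after the B-tree change when the leaves are hidden, and on the just-freed leaf page when they are not.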

/** @name Log item types
The log items are declared 'byte' so that the compiler can warn if val
and type parameters are switched in a call to mlog_write_ulint. NOTE!
For 1 - 8 bytes, the flag value must give the length also! @{ */
#define MLOG_SINGLE_REC_FLAG	128		/*!< if the mtr contains only
						one log record for one page,
						i.e., write_initial_log_record
						has been called only once,
						this flag is ORed to the type
						of that first log record */
#define MLOG_1BYTE		(1)		/*!< one byte is written */
#define MLOG_2BYTES		(2)		/*!< 2 bytes ... */
#define MLOG_4BYTES		(4)		/*!< 4 bytes ... */
#define MLOG_8BYTES		(8)		/*!< 8 bytes ... */
#define MLOG_REC_INSERT		((byte)9)	/*!< record insert */
#define MLOG_REC_CLUST_DELETE_MARK ((byte)10)	/*!< mark clustered index record
						deleted */
#define MLOG_REC_SEC_DELETE_MARK ((byte)11)	/*!< mark secondary index record
						deleted */
#define MLOG_REC_UPDATE_IN_PLACE ((byte)13)	/*!< update of a record,
						preserves record field sizes */
#define MLOG_REC_DELETE		((byte)14)	/*!< delete a record from a
						page */
#define MLOG_LIST_END_DELETE	((byte)15)	/*!< delete record list end on
						index page */
#define MLOG_LIST_START_DELETE	((byte)16)	/*!< delete record list start on
						index page */
#define MLOG_LIST_END_COPY_CREATED ((byte)17)	/*!< copy record list end to a
						new created index page */
#define MLOG_PAGE_REORGANIZE	((byte)18)	/*!< reorganize an
						index page in
						ROW_FORMAT=REDUNDANT */
#define MLOG_PAGE_CREATE	((byte)19)	/*!< create an index page */
#define MLOG_UNDO_INSERT	((byte)20)	/*!< insert entry in an undo
						log */
#define MLOG_UNDO_ERASE_END	((byte)21)	/*!< erase an undo log
						page end */
#define MLOG_UNDO_INIT		((byte)22)	/*!< initialize a page in an
						undo log */
#define MLOG_UNDO_HDR_DISCARD	((byte)23)	/*!< discard an update undo log
						header */
#define MLOG_UNDO_HDR_REUSE	((byte)24)	/*!< reuse an insert undo log
						header */
#define MLOG_UNDO_HDR_CREATE	((byte)25)	/*!< create an undo
						log header */
#define MLOG_REC_MIN_MARK	((byte)26)	/*!< mark an index
						record as the
						predefined minimum
						record */
#define MLOG_IBUF_BITMAP_INIT	((byte)27)	/*!< initialize an
						ibuf bitmap page */
/*#define MLOG_FULL_PAGE	((byte)28)	full contents of a page */
#ifdef UNIV_LOG_LSN_DEBUG
# define MLOG_LSN		((byte)28)	/* current LSN */
#endif
#define MLOG_INIT_FILE_PAGE	((byte)29)	/*!< this means that a
						file page is taken
						into use and the prior
						contents of the page
						should be ignored: in
						recovery we must not
						trust the lsn values
						stored to the file
						page */
#define MLOG_WRITE_STRING	((byte)30)	/*!< write a string to
						a page */
#define MLOG_MULTI_REC_END	((byte)31)	/*!< if a single mtr writes
						several log records,
						this log record ends the
						sequence of these records */
#define MLOG_DUMMY_RECORD	((byte)32)	/*!< dummy log record used to
						pad a log block full */
#define MLOG_FILE_CREATE	((byte)33)	/*!< log record about an .ibd
						file creation */
#define MLOG_FILE_RENAME	((byte)34)	/*!< log record about an .ibd
						file rename */
#define MLOG_FILE_DELETE	((byte)35)	/*!< log record about an .ibd
						file deletion */
#define MLOG_COMP_REC_MIN_MARK	((byte)36)	/*!< mark a compact
						index record as the
						predefined minimum
						record */
#define MLOG_COMP_PAGE_CREATE	((byte)37)	/*!< create a compact
						index page */
#define MLOG_COMP_REC_INSERT	((byte)38)	/*!< compact record insert */
#define MLOG_COMP_REC_CLUST_DELETE_MARK ((byte)39)
						/*!< mark compact
						clustered index record
						deleted */
#define MLOG_COMP_REC_SEC_DELETE_MARK ((byte)40)/*!< mark compact
						secondary index record
						deleted; this log
						record type is
						redundant, as
						MLOG_REC_SEC_DELETE_MARK
						is independent of the
						record format. */
#define MLOG_COMP_REC_UPDATE_IN_PLACE ((byte)41)/*!< update of a
						compact record,
						preserves record field
						sizes */
#define MLOG_COMP_REC_DELETE	((byte)42)	/*!< delete a compact record
						from a page */
#define MLOG_COMP_LIST_END_DELETE ((byte)43)	/*!< delete compact record list
						end on index page */
#define MLOG_COMP_LIST_START_DELETE ((byte)44)	/*!< delete compact record list
						start on index page */
#define MLOG_COMP_LIST_END_COPY_CREATED ((byte)45)
						/*!< copy compact
						record list end to a
						new created index
						page */
#define MLOG_COMP_PAGE_REORGANIZE ((byte)46)	/*!< reorganize an index page */
#define MLOG_FILE_CREATE2	((byte)47)	/*!< log record about creating
						an .ibd file, with format */
#define MLOG_ZIP_WRITE_NODE_PTR	((byte)48)	/*!< write the node pointer of
						a record on a compressed
						non-leaf B-tree page */
#define MLOG_ZIP_WRITE_BLOB_PTR	((byte)49)	/*!< write the BLOB pointer
						of an externally stored column
						on a compressed page */
#define MLOG_ZIP_WRITE_HEADER	((byte)50)	/*!< write to compressed page
						header */
#define MLOG_ZIP_PAGE_COMPRESS	((byte)51)	/*!< compress an index page */
#define MLOG_BIGGEST_TYPE	((byte)51)	/*!< biggest value (used in
						assertions) */
/* @} */
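
Because every log item type in the group above is at most MLOG_BIGGEST_TYPE (51), the value 128 of MLOG_SINGLE_REC_FLAG occupies a bit that no type uses, so the flag can be ORed into the type byte and masked out again. A minimal sketch of that bit handling follows; the `sketch_` helpers are hypothetical and are not functions of this header, and the constants are local copies made only so that the fragment compiles on its own.

```c
#include <assert.h>

typedef unsigned char	byte;

/* Local copies of constants from the listing above. */
#define MLOG_SINGLE_REC_FLAG	128
#define MLOG_REC_INSERT		((byte)9)
#define MLOG_BIGGEST_TYPE	((byte)51)

/* Hypothetical helper: tag a type byte as "the only record of this mtr". */
static byte
sketch_mark_single(byte type)
{
	assert(type <= MLOG_BIGGEST_TYPE);
	return (byte)(type | MLOG_SINGLE_REC_FLAG);
}

/* Hypothetical helper: recover the plain log item type. */
static byte
sketch_type_of(byte b)
{
	return (byte)(b & ~MLOG_SINGLE_REC_FLAG);
}

/* Hypothetical helper: test whether the flag is set. */
static int
sketch_is_single(byte b)
{
	return (b & MLOG_SINGLE_REC_FLAG) != 0;
}
```

With these definitions, `sketch_mark_single(MLOG_REC_INSERT)` yields 137, and `sketch_type_of(137)` yields 9 again.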

/** @name Flags for MLOG_FILE operations
(stored in the page number parameter, called log_flags in the
functions). The page number parameter was originally written as 0. @{ */
#define MLOG_FILE_FLAG_TEMP	1	/*!< identifies TEMPORARY TABLE in
					MLOG_FILE_CREATE, MLOG_FILE_CREATE2 */
/* @} */

Bug#11766305 - 59392: Remove thr0loc.c and ibuf_inside() [part 4 of 4]
ibuf_inside(), ibuf_enter(), ibuf_exit(): Add the parameter mtr. The
flag is no longer kept in the thread-local storage but in the
mini-transaction (mtr->inside_ibuf).
mtr_start(): Clean up the comment and remove the unused return value.
mtr_commit(): Assert !ibuf_inside(mtr) in debug builds.
ibuf_mtr_start(): Like mtr_start(), but sets the flag.
ibuf_mtr_commit(), ibuf_btr_pcur_commit_specify_mtr(): Wrappers that
assert ibuf_inside().
buf_page_get_zip(), buf_page_init_for_read(),
buf_read_ibuf_merge_pages(), fil_io(), ibuf_free_excess_pages(),
ibuf_contract_ext(): Remove assertions on ibuf_inside(), because a
mini-transaction is not available.
buf_read_ahead_linear(): Add the parameter inside_ibuf.
ibuf_restore_pos(): When this function returns FALSE, it commits mtr
and must therefore do ibuf_exit(mtr).
ibuf_delete_rec(): This function commits mtr and must therefore do
ibuf_exit(mtr).
ibuf_rec_get_page_no(), ibuf_rec_get_space(), ibuf_rec_get_info(),
ibuf_rec_get_op_type(), ibuf_build_entry_from_ibuf_rec(),
ibuf_rec_get_volume(), ibuf_get_merge_page_nos(),
ibuf_get_volume_buffered_count(), ibuf_get_entry_counter_low(): Add
the parameter mtr in debug builds, for asserting ibuf_inside(mtr).
rb:585 approved by Sunny Bains
2011-03-24 14:00:14 +02:00

/***************************************************************//**
Starts a mini-transaction. */
UNIV_INLINE
void
mtr_start(
/*======*/
	mtr_t*	mtr)	/*!< out: mini-transaction */
	__attribute__((nonnull));
/***************************************************************//**
Commits a mini-transaction. */
UNIV_INTERN
void
mtr_commit(
/*=======*/
	mtr_t*	mtr)	/*!< in/out: mini-transaction */
	__attribute__((nonnull));
/**********************************************************//**
Sets and returns a savepoint in mtr.
@return savepoint */
UNIV_INLINE
ulint
mtr_set_savepoint(
/*==============*/
	mtr_t*	mtr);	/*!< in: mtr */
#ifndef UNIV_HOTBACKUP
/**********************************************************//**
Releases the (index tree) s-latch stored in an mtr memo after a
savepoint. */
UNIV_INLINE
void
mtr_release_s_latch_at_savepoint(
/*=============================*/
	mtr_t*		mtr,		/*!< in: mtr */
	ulint		savepoint,	/*!< in: savepoint */
	rw_lock_t*	lock);		/*!< in: latch to release */
#else /* !UNIV_HOTBACKUP */
# define mtr_release_s_latch_at_savepoint(mtr,savepoint,lock) ((void) 0)
#endif /* !UNIV_HOTBACKUP */
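
The savepoint declarations above suggest a usage pattern: take a savepoint, acquire the index tree s-latch so that it becomes the next memo entry, and later release exactly that entry while the mini-transaction stays open. The sketch below models this with a fixed-size array. It is an illustration under assumptions, not the InnoDB code: all `sketch_` names and types are hypothetical, and the savepoint is modelled as a slot index.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MEMO_SLOTS	8
#define MTR_MEMO_S_LOCK		55	/* local copy of the memo type */

typedef struct { int s_holders; } sketch_lock_t; /* stand-in for rw_lock_t */
typedef struct { unsigned long type; void* object; } sketch_slot_t;
typedef struct {
	sketch_slot_t	slot[SKETCH_MEMO_SLOTS];
	size_t		n_slots;
} sketch_mtr_t;

/* The savepoint is the position where the next memo entry will go. */
static size_t
sketch_set_savepoint(const sketch_mtr_t* mtr)
{
	return mtr->n_slots;
}

/* S-lock a latch and record it in the memo. */
static void
sketch_s_lock(sketch_lock_t* lock, sketch_mtr_t* mtr)
{
	assert(mtr->n_slots < SKETCH_MEMO_SLOTS);
	lock->s_holders++;
	mtr->slot[mtr->n_slots].type = MTR_MEMO_S_LOCK;
	mtr->slot[mtr->n_slots].object = lock;
	mtr->n_slots++;
}

/* Release the latch stored at the savepoint; the slot is kept but emptied
so that a later commit would skip it. */
static void
sketch_release_s_latch_at_savepoint(sketch_mtr_t* mtr, size_t savepoint,
				    sketch_lock_t* lock)
{
	assert(savepoint < mtr->n_slots);
	assert(mtr->slot[savepoint].object == lock);
	lock->s_holders--;
	mtr->slot[savepoint].object = NULL;
}

/* Returns 1 if the latch was released and its slot emptied. */
static int
sketch_savepoint_demo(void)
{
	sketch_mtr_t	mtr;
	sketch_lock_t	tree = { 0 };
	size_t		savepoint;

	mtr.n_slots = 0;
	savepoint = sketch_set_savepoint(&mtr);
	sketch_s_lock(&tree, &mtr);
	/* ... descend the tree and latch the leaf page here ... */
	sketch_release_s_latch_at_savepoint(&mtr, savepoint, &tree);
	return tree.s_holders == 0
		&& mtr.slot[savepoint].object == NULL
		&& mtr.n_slots == 1;
}
```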
/***************************************************************//**
Gets the logging mode of a mini-transaction.
@return logging mode: MTR_LOG_NONE, ... */
UNIV_INLINE
ulint
mtr_get_log_mode(
/*=============*/
	mtr_t*	mtr);	/*!< in: mtr */
/***************************************************************//**
Changes the logging mode of a mini-transaction.
@return old mode */
UNIV_INLINE
ulint
mtr_set_log_mode(
/*=============*/
	mtr_t*	mtr,	/*!< in: mtr */
	ulint	mode);	/*!< in: logging mode: MTR_LOG_NONE, ... */
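
Since the setter above returns the old mode, a caller can switch logging off for a few operations and then restore whatever mode was active before. The following sketch shows that save-and-restore pattern on a stand-in structure. The `sketch_` names are hypothetical, and the real function may treat some mode transitions specially, which this model ignores.

```c
#include <assert.h>

/* Local copies of two logging modes from this header. */
#define MTR_LOG_ALL	21
#define MTR_LOG_NONE	22

/* Stand-in for mtr_t: only the fields this sketch needs. */
typedef struct {
	unsigned long	log_mode;
	unsigned long	n_logged;	/* how many operations were logged */
} sketch_mtr_t;

/* Change the mode and hand back the previous one. */
static unsigned long
sketch_set_log_mode(sketch_mtr_t* mtr, unsigned long mode)
{
	unsigned long	old_mode = mtr->log_mode;

	mtr->log_mode = mode;
	return old_mode;
}

/* Pretend to log one operation; nothing is counted in MTR_LOG_NONE. */
static void
sketch_log_write(sketch_mtr_t* mtr)
{
	if (mtr->log_mode != MTR_LOG_NONE) {
		mtr->n_logged++;
	}
}

/* Returns the number of operations that were actually logged. */
static unsigned long
sketch_log_mode_demo(void)
{
	sketch_mtr_t	mtr = { MTR_LOG_ALL, 0 };
	unsigned long	old_mode;

	sketch_log_write(&mtr);				/* logged */
	old_mode = sketch_set_log_mode(&mtr, MTR_LOG_NONE);
	sketch_log_write(&mtr);				/* suppressed */
	sketch_set_log_mode(&mtr, old_mode);		/* restore */
	sketch_log_write(&mtr);				/* logged */
	assert(mtr.log_mode == MTR_LOG_ALL);
	return mtr.n_logged;
}
```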
/********************************************************//**
Reads 1 - 4 bytes from a file page buffered in the buffer pool.
@return value read */
UNIV_INTERN
ulint
mtr_read_ulint(
/*===========*/
	const byte*	ptr,	/*!< in: pointer from where to read */
	ulint		type,	/*!< in: MLOG_1BYTE, MLOG_2BYTES, MLOG_4BYTES */
	mtr_t*		mtr);	/*!< in: mini-transaction handle */
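
The type argument of the read function above is one of MLOG_1BYTE, MLOG_2BYTES or MLOG_4BYTES, whose values equal the number of bytes to read. A possible shape of such a read is sketched below. The byte order is an assumption that this header does not state (most significant byte first), the function name `sketch_read_ulint` is hypothetical, and the latch checks that a mini-transaction handle would allow are left out.

```c
#include <assert.h>

typedef unsigned char	byte;

/* Local copies of the length-carrying type constants. */
#define MLOG_1BYTE	(1)
#define MLOG_2BYTES	(2)
#define MLOG_4BYTES	(4)

/* Read 1, 2 or 4 bytes, most significant byte first (assumed order). */
static unsigned long
sketch_read_ulint(const byte* ptr, unsigned long type)
{
	unsigned long	val = 0;
	unsigned long	i;

	assert(type == MLOG_1BYTE || type == MLOG_2BYTES
	       || type == MLOG_4BYTES);

	/* The type constant doubles as the byte count. */
	for (i = 0; i < type; i++) {
		val = (val << 8) | ptr[i];
	}
	return val;
}

/* Returns 1 if all three widths decode as expected. */
static int
sketch_read_ulint_demo(void)
{
	static const byte	frame[4] = { 0x12, 0x34, 0x56, 0x78 };

	return sketch_read_ulint(frame, MLOG_1BYTE) == 0x12UL
		&& sketch_read_ulint(frame, MLOG_2BYTES) == 0x1234UL
		&& sketch_read_ulint(frame, MLOG_4BYTES) == 0x12345678UL;
}
```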
#ifndef UNIV_HOTBACKUP
/*********************************************************************//**
This macro locks an rw-lock in s-mode. */
#define mtr_s_lock(B, MTR)	mtr_s_lock_func((B), __FILE__, __LINE__,\
						(MTR))
/*********************************************************************//**
This macro locks an rw-lock in x-mode. */
#define mtr_x_lock(B, MTR)	mtr_x_lock_func((B), __FILE__, __LINE__,\
						(MTR))
/*********************************************************************//**
NOTE! Use the macro above!
Locks a lock in s-mode. */
UNIV_INLINE
void
mtr_s_lock_func(
/*============*/
	rw_lock_t*	lock,	/*!< in: rw-lock */
	const char*	file,	/*!< in: file name */
	ulint		line,	/*!< in: line number */
	mtr_t*		mtr);	/*!< in: mtr */
/*********************************************************************//**
NOTE! Use the macro above!
Locks a lock in x-mode. */
UNIV_INLINE
void
mtr_x_lock_func(
/*============*/
	rw_lock_t*	lock,	/*!< in: rw-lock */
	const char*	file,	/*!< in: file name */
	ulint		line,	/*!< in: line number */
	mtr_t*		mtr);	/*!< in: mtr */
#endif /* !UNIV_HOTBACKUP */

/***************************************************//**
Releases an object in the memo stack. */
UNIV_INTERN
void
mtr_memo_release(
/*=============*/
	mtr_t*	mtr,	/*!< in: mtr */
	void*	object,	/*!< in: object */
	ulint	type);	/*!< in: object type: MTR_MEMO_S_LOCK, ... */
#ifdef UNIV_DEBUG
# ifndef UNIV_HOTBACKUP
/**********************************************************//**
Checks if memo contains the given item.
@return TRUE if contains */
UNIV_INLINE
ibool
mtr_memo_contains(
/*==============*/
	mtr_t*		mtr,	/*!< in: mtr */
	const void*	object,	/*!< in: object to search */
	ulint		type);	/*!< in: type of object */

/**********************************************************//**
Checks if memo contains the given page.
@return TRUE if contains */
UNIV_INTERN
ibool
mtr_memo_contains_page(
/*===================*/
	mtr_t*		mtr,	/*!< in: mtr */
	const byte*	ptr,	/*!< in: pointer to buffer frame */
	ulint		type);	/*!< in: type of object */
/*********************************************************//**
Prints info of an mtr handle. */
UNIV_INTERN
void
mtr_print(
/*======*/
	mtr_t*	mtr);	/*!< in: mtr */
# else /* !UNIV_HOTBACKUP */
# define mtr_memo_contains(mtr, object, type)		TRUE
# define mtr_memo_contains_page(mtr, ptr, type)	TRUE
# endif /* !UNIV_HOTBACKUP */
#endif /* UNIV_DEBUG */
/*######################################################################*/

#define MTR_BUF_MEMO_SIZE	200	/* number of slots in memo */

/***************************************************************//**
Returns the log object of a mini-transaction buffer.
@return log */
UNIV_INLINE
dyn_array_t*
mtr_get_log(
/*========*/
	mtr_t*	mtr);	/*!< in: mini-transaction */
/***************************************************//**
Pushes an object to an mtr memo stack. */
UNIV_INLINE
void
mtr_memo_push(
/*==========*/
	mtr_t*	mtr,	/*!< in: mtr */
	void*	object,	/*!< in: object */
	ulint	type);	/*!< in: object type: MTR_MEMO_S_LOCK, ... */


/* Type definition of a mini-transaction memo stack slot. */
typedef struct mtr_memo_slot_struct	mtr_memo_slot_t;
struct mtr_memo_slot_struct{
	ulint	type;	/*!< type of the stored object (MTR_MEMO_S_LOCK, ...) */
	void*	object;	/*!< pointer to the object */
};

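
The slot type above pairs an object pointer with a memo type, and the header declares a push and a release operation on a "memo stack". A minimal stack with the same slot layout is sketched below to illustrate how entries might be pushed, released early, and finally popped in reverse order. The fixed-size array, the `sketch_` names and the pop-all routine are assumptions for illustration; the real memo is a dyn_array_t and its commit logic is not shown in this header.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MEMO_SLOTS	8
#define MTR_MEMO_S_LOCK		55	/* local copies of two memo types */
#define MTR_MEMO_X_LOCK		56

typedef struct { unsigned long type; void* object; } sketch_slot_t;
typedef struct {
	sketch_slot_t	slot[SKETCH_MEMO_SLOTS];
	size_t		n_slots;
} sketch_memo_t;

/* Push an object with its type on top of the stack. */
static void
sketch_memo_push(sketch_memo_t* memo, void* object, unsigned long type)
{
	assert(object != NULL);
	assert(memo->n_slots < SKETCH_MEMO_SLOTS);
	memo->slot[memo->n_slots].type = type;
	memo->slot[memo->n_slots].object = object;
	memo->n_slots++;
}

/* Release one object early: search from the top and empty its slot.
Returns 1 if the object was found. */
static int
sketch_memo_release(sketch_memo_t* memo, void* object, unsigned long type)
{
	size_t	i = memo->n_slots;

	while (i-- > 0) {
		if (memo->slot[i].object == object
		    && memo->slot[i].type == type) {
			memo->slot[i].object = NULL;
			return 1;
		}
	}
	return 0;
}

/* Pop every slot from the top down, skipping emptied slots, and record
the release order. Returns the number of objects released. */
static size_t
sketch_memo_pop_all(sketch_memo_t* memo, void** released)
{
	size_t	n = 0;

	while (memo->n_slots > 0) {
		sketch_slot_t*	s = &memo->slot[--memo->n_slots];

		if (s->object != NULL) {
			released[n++] = s->object;
		}
	}
	return n;
}

/* Returns 1 if the remaining objects come off in reverse push order. */
static int
sketch_memo_demo(void)
{
	sketch_memo_t	memo;
	int		a, b, c;	/* stand-ins for latched objects */
	void*		released[SKETCH_MEMO_SLOTS];
	size_t		n;

	memo.n_slots = 0;
	sketch_memo_push(&memo, &a, MTR_MEMO_S_LOCK);
	sketch_memo_push(&memo, &b, MTR_MEMO_X_LOCK);
	sketch_memo_push(&memo, &c, MTR_MEMO_S_LOCK);
	if (!sketch_memo_release(&memo, &b, MTR_MEMO_X_LOCK)) {
		return 0;
	}
	n = sketch_memo_pop_all(&memo, released);
	return n == 2 && released[0] == &c && released[1] == &a;
}
```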
/* Mini-transaction handle and buffer */
struct mtr_struct{
#ifdef UNIV_DEBUG
	ulint		state;	/*!< MTR_ACTIVE, MTR_COMMITTING, MTR_COMMITTED */
#endif
	dyn_array_t	memo;	/*!< memo stack for locks etc. */
	dyn_array_t	log;	/*!< mini-transaction log */
	unsigned	inside_ibuf:1;
				/*!< TRUE if inside ibuf changes */
Bug#12704861 Corruption after a crash during BLOB update
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.
Because we will be redo logging the writes off the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().
btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.
btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.
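The hide-and-restore sequence described above can be illustrated with a toy model. This is only a sketch and not the InnoDB implementation: the types toy_segment and toy_mtr, the fixed segment size, the lowest-free-page allocation policy, and every toy_* function are hypothetical names made up for illustration. It shows only the ordering idea: leaf pages freed in the mini-transaction are "fake allocated" before the BLOB page is chosen and are freed again before commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_N_PAGES 8                   /* toy segment size (assumption) */
#define TOY_PAGE_NONE ((size_t) -1)     /* "out of space" marker */

/* Toy file segment: is_free[i] tells whether page i may be handed out. */
typedef struct {
    bool is_free[TOY_N_PAGES];
} toy_segment;

/* Toy mini-transaction: remembers which leaf pages it has freed. */
typedef struct {
    bool freed_leaf[TOY_N_PAGES];
} toy_mtr;

/* Free a leaf page and remember it in the mini-transaction. */
static void toy_page_free(toy_segment *seg, toy_mtr *mtr, size_t page)
{
    assert(page < TOY_N_PAGES);
    seg->is_free[page] = true;
    mtr->freed_leaf[page] = true;
}

/* Loosely models the role of btr_mark_freed_leaves(): with nonfree=true,
   hide every leaf page freed by mtr; with nonfree=false, free them again. */
static void toy_mark_freed_leaves(toy_segment *seg, const toy_mtr *mtr,
                                  bool nonfree)
{
    for (size_t i = 0; i < TOY_N_PAGES; i++) {
        if (mtr->freed_leaf[i]) {
            seg->is_free[i] = !nonfree;
        }
    }
}

/* Allocate the lowest-numbered free page, or return TOY_PAGE_NONE. */
static size_t toy_page_alloc(toy_segment *seg)
{
    for (size_t i = 0; i < TOY_N_PAGES; i++) {
        if (seg->is_free[i]) {
            seg->is_free[i] = false;
            return i;
        }
    }
    return TOY_PAGE_NONE;
}

/* Free one leaf page, then pick a BLOB page while the freed leaf is hidden.
   Returns the page number chosen for the BLOB. */
static size_t toy_update_with_blob(toy_segment *seg, toy_mtr *mtr,
                                   size_t freed_leaf)
{
    toy_page_free(seg, mtr, freed_leaf);
    toy_mark_freed_leaves(seg, mtr, true);   /* hide freed leaves */
    size_t blob_page = toy_page_alloc(seg);  /* cannot reuse a freed leaf */
    toy_mark_freed_leaves(seg, mtr, false);  /* restore before "commit" */
    return blob_page;
}
```

In this toy model, if pages 0..4 are in use and page 2 is freed during the update, the BLOB goes to page 5 rather than page 2, and page 2 is free again afterwards. Without the two toy_mark_freed_leaves() calls the allocator would hand out page 2, which is the overwrite that the commit message warns about.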
btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).
btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
by btr_mark_freed_leaves(nonfree=TRUE).
btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.
btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.
btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL in inserts (the old behaviour) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.
fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.
fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.
fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".
mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.
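A one-bit flag of this kind lets callers skip a memo scan when nothing of the given type was ever pushed. The following is a minimal sketch of that pattern, assuming a fixed-size array as the memo; the names flag_mtr, flag_memo_slot, FLAG_MEMO_* and the flag_* functions are hypothetical and do not mirror the real mtr_t layout or the memo API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_MEMO_MAX 16   /* toy memo capacity (assumption) */

/* Hypothetical memo entry types; only the second one sets the flag. */
enum flag_memo_type { FLAG_MEMO_PAGE_X_FIX, FLAG_MEMO_FREE_CLUST_LEAF };

typedef struct {
    enum flag_memo_type type;
    unsigned long page_no;
} flag_memo_slot;

typedef struct {
    flag_memo_slot memo[FLAG_MEMO_MAX];
    size_t n_memo;
    unsigned freed_clust_leaf:1;  /* set once a FREE_CLUST_LEAF is pushed */
} flag_mtr;

static void flag_mtr_init(flag_mtr *mtr)
{
    mtr->n_memo = 0;
    mtr->freed_clust_leaf = 0;
}

/* Push a memo entry; returns false if the toy memo is full. */
static bool flag_memo_push(flag_mtr *mtr, enum flag_memo_type type,
                           unsigned long page_no)
{
    if (mtr->n_memo == FLAG_MEMO_MAX) {
        return false;
    }
    mtr->memo[mtr->n_memo].type = type;
    mtr->memo[mtr->n_memo].page_no = page_no;
    mtr->n_memo++;
    if (type == FLAG_MEMO_FREE_CLUST_LEAF) {
        mtr->freed_clust_leaf = 1;
    }
    return true;
}

/* Count freed clustered index leaf entries. *scanned reports how many
   slots were visited, to show that the flag avoids the scan entirely. */
static size_t flag_count_freed_leaves(const flag_mtr *mtr, size_t *scanned)
{
    size_t count = 0;

    assert(mtr != NULL && scanned != NULL);
    *scanned = 0;
    if (!mtr->freed_clust_leaf) {
        return 0;  /* fast path: nothing of this type was posted */
    }
    for (size_t i = 0; i < mtr->n_memo; i++) {
        (*scanned)++;
        if (mtr->memo[i].type == FLAG_MEMO_FREE_CLUST_LEAF) {
            count++;
        }
    }
    return count;
}
```

The flag costs one bit and is only ever set, never cleared, within a toy mini-transaction, so the common case of "no freed leaves" is answered without touching the memo.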
row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.
row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.
buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.
There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.
rb:693 approved by Sunny Bains
2011-08-29 11:16:42 +03:00
|
|
|
unsigned modifications:1;
|
2011-08-29 11:22:43 +03:00
|
|
|
/*!< TRUE if the mini-transaction
|
|
|
|
modified buffer pool pages */
|
2011-08-29 11:16:42 +03:00
|
|
|
unsigned freed_clust_leaf:1;
|
2011-08-29 11:22:43 +03:00
|
|
|
/*!< TRUE if MTR_MEMO_FREE_CLUST_LEAF
|
2011-08-29 11:16:42 +03:00
|
|
|
was logged in the mini-transaction */
|
2009-05-27 15:15:59 +05:30
|
|
|
ulint n_log_recs;
|
|
|
|
/* count of how many page initial log records
|
|
|
|
have been written to the mtr log */
|
|
|
|
ulint log_mode; /* specifies which operations should be
|
|
|
|
logged; default value MTR_LOG_ALL */
|
|
|
|
ib_uint64_t start_lsn;/* start lsn of the possible log entry for
|
|
|
|
this mtr */
|
|
|
|
ib_uint64_t end_lsn;/* end lsn of the possible log entry for
|
|
|
|
this mtr */
|
|
|
|
#ifdef UNIV_DEBUG
|
|
|
|
ulint magic_n;
|
|
|
|
#endif /* UNIV_DEBUG */
|
|
|
|
};
|
|
|
|
|
|
|
|
#ifdef UNIV_DEBUG
|
|
|
|
# define MTR_MAGIC_N 54551
|
|
|
|
#endif /* UNIV_DEBUG */
|
|
|
|
|
|
|
|
#define MTR_ACTIVE 12231
|
|
|
|
#define MTR_COMMITTING 56456
|
|
|
|
#define MTR_COMMITTED 34676
|
|
|
|
|
|
|
|
#ifndef UNIV_NONINL
|
|
|
|
#include "mtr0mtr.ic"
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#endif
|