mariadb/storage/innobase/include/btr0btr.ic
Jan Lindström 765a43605a MDEV-12253: Buffer pool blocks are accessed after they have been freed
Problem was that bpage was referenced after it was already freed
from LRU. Fixed by adding a new variable encrypted that is
passed down to buf_page_check_corrupt() and used in
buf_page_get_gen() to stop processing page read.

This patch should also address following test failures and
bugs:

MDEV-12419: IMPORT should not look up tablespace in
PageConverter::validate(). This is now removed.

MDEV-10099: encryption.innodb_onlinealter_encryption fails
sporadically in buildbot

MDEV-11420: encryption.innodb_encryption-page-compression
failed in buildbot

MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8

Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
and replaced these with dict_table_t::file_unreadable. Table
ibd file is missing if fil_get_space(space_id) returns NULL
and encrypted if not. Removed dict_table_t::is_corrupted field.

Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(),
buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().

Added test cases when enrypted page could be read while doing
redo log crash recovery. Also added test case for row compressed
blobs.

btr_cur_open_at_index_side_func(),
btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
NULL.

buf_page_get_zip(): Issue error if page read fails.

buf_page_get_gen(): Use dberr_t for error detection and
do not reference bpage after we hare freed it.

buf_mark_space_corrupt(): remove bpage from LRU also when
it is encrypted.

buf_page_check_corrupt(): @return DB_SUCCESS if page has
been read and is not corrupted,
DB_PAGE_CORRUPTED if page based on checksum check is corrupted,
DB_DECRYPTION_FAILED if page post encryption checksum matches but
after decryption normal page checksum does not match. In read
case only DB_SUCCESS is possible.

buf_page_io_complete(): use dberr_t for error handling.

buf_flush_write_block_low(),
buf_read_ahead_random(),
buf_read_page_async(),
buf_read_ahead_linear(),
buf_read_ibuf_merge_pages(),
buf_read_recv_pages(),
fil_aio_wait():
        Issue error if page read fails.

btr_pcur_move_to_next_page(): Do not reference page if it is
NULL.

Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
that will return true if tablespace exists and pages read from
tablespace are not corrupted or page decryption failed.
Removed buf_page_t::key_version. After page decryption the
key version is not removed from page frame. For unencrypted
pages, old key_version is removed at buf_page_encrypt_before_write()

dict_stats_update_transient_for_index(),
dict_stats_update_transient()
        Do not continue if table decryption failed or table
        is corrupted.

dict0stats.cc: Introduced a dict_stats_report_error function
to avoid code duplication.

fil_parse_write_crypt_data():
        Check that key read from redo log entry is found from
        encryption plugin and if it is not, refuse to start.

PageConverter::validate(): Removed access to fil_space_t as
tablespace is not available during import.

Fixed error code on innodb.innodb test.

Merged test cased innodb-bad-key-change5 and innodb-bad-key-shutdown
to innodb-bad-key-change2.  Removed innodb-bad-key-change5 test.
Decreased unnecessary complexity on some long lasting tests.

Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
fil_get_first_space(), fil_get_next_space(),
fil_get_first_space_safe(), fil_get_next_space_safe()
functions.

fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
accessed from row compressed tables. Fixed out of page frame
bug for row compressed tables in
fil_space_verify_crypt_checksum() found using ASAN. Incorrect
function was called for compressed table.

Added new tests for discard, rename table and drop (we should allow them
even when page decryption fails). Alter table rename is not allowed.
Added test for restart with innodb-force-recovery=1 when page read on
redo-recovery cant be decrypted. Added test for corrupted table where
both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION is corrupted.

Adjusted the test case innodb_bug14147491 so that it does not anymore
expect crash. Instead table is just mostly not usable.

fil0fil.h: fil_space_acquire_low is not visible function
and fil_space_acquire and fil_space_acquire_silent are
inline functions. FilSpace class uses fil_space_acquire_low
directly.

recv_apply_hashed_log_recs() does not return anything.
2017-04-26 15:19:16 +03:00

335 lines
9.2 KiB
Text

/*****************************************************************************
Copyright (c) 1994, 2016, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2015, 2016, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA
*****************************************************************************/
/**************************************************//**
@file include/btr0btr.ic
The B-tree
Created 6/2/1994 Heikki Tuuri
*******************************************************/
#include "mach0data.h"
#ifndef UNIV_HOTBACKUP
#include "mtr0mtr.h"
#include "mtr0log.h"
#include "page0zip.h"
#define BTR_MAX_NODE_LEVEL 50 /*!< Maximum B-tree page level
(not really a hard limit).
Used in debug assertions
in btr_page_set_level and
btr_page_get_level_low */
/**************************************************************//**
Gets a buffer page and declares its latching order level. */
UNIV_INLINE
buf_block_t*
btr_block_get_func(
/*===============*/
ulint space, /*!< in: space id */
ulint zip_size, /*!< in: compressed page size in bytes
or 0 for uncompressed pages */
ulint page_no, /*!< in: page number */
ulint mode, /*!< in: latch mode */
const char* file, /*!< in: file name */
ulint line, /*!< in: line where called */
dict_index_t* index, /*!< in: index tree, may be NULL
if it is not an insert buffer tree */
mtr_t* mtr) /*!< in/out: mtr */
{
buf_block_t* block;
dberr_t err;
block = buf_page_get_gen(space, zip_size, page_no, mode,
NULL, BUF_GET, file, line, mtr, &err);
if (err == DB_DECRYPTION_FAILED) {
if (index && index->table) {
index->table->file_unreadable = true;
}
}
if (block) {
if (mode != RW_NO_LATCH) {
buf_block_dbg_add_level(
block, index != NULL && dict_index_is_ibuf(index)
? SYNC_IBUF_TREE_NODE : SYNC_TREE_NODE);
}
}
return(block);
}
/**************************************************************//**
Sets the index id field of a page. */
UNIV_INLINE
void
btr_page_set_index_id(
/*==================*/
page_t* page, /*!< in: page to be created */
page_zip_des_t* page_zip,/*!< in: compressed page whose uncompressed
part will be updated, or NULL */
index_id_t id, /*!< in: index id */
mtr_t* mtr) /*!< in: mtr */
{
if (page_zip) {
mach_write_to_8(page + (PAGE_HEADER + PAGE_INDEX_ID), id);
page_zip_write_header(page_zip,
page + (PAGE_HEADER + PAGE_INDEX_ID),
8, mtr);
} else {
mlog_write_ull(page + (PAGE_HEADER + PAGE_INDEX_ID), id, mtr);
}
}
/** Gets a buffer page and declares its latching order level.
@param space tablespace identifier
@param zip_size compressed page size in bytes or 0 for uncompressed pages
@param page_no page number
@param mode latch mode
@param idx index tree, may be NULL if not the insert buffer tree
@param mtr mini-transaction handle
@return the uncompressed page frame */
UNIV_INLINE
page_t*
btr_page_get(
/*=========*/
ulint space,
ulint zip_size,
ulint root_page_no,
ulint mode,
dict_index_t* index,
mtr_t* mtr)
{
buf_block_t* block=NULL;
buf_frame_t* frame=NULL;
block = btr_block_get(space, zip_size, root_page_no, mode, index, mtr);
if (block) {
frame = buf_block_get_frame(block);
}
return ((page_t*)frame);
}
#endif /* !UNIV_HOTBACKUP */
/**************************************************************//**
Gets the index id field of a page.
@return index id */
UNIV_INLINE
index_id_t
btr_page_get_index_id(
/*==================*/
const page_t* page) /*!< in: index page */
{
return(mach_read_from_8(page + PAGE_HEADER + PAGE_INDEX_ID));
}
#ifndef UNIV_HOTBACKUP
/********************************************************//**
Gets the node level field in an index page.
@return level, leaf level == 0 */
UNIV_INLINE
ulint
btr_page_get_level_low(
/*===================*/
const page_t* page) /*!< in: index page */
{
ulint level;
ut_ad(page);
level = mach_read_from_2(page + PAGE_HEADER + PAGE_LEVEL);
ut_ad(level <= BTR_MAX_NODE_LEVEL);
return(level);
}
/********************************************************//**
Sets the node level field in an index page. */
UNIV_INLINE
void
btr_page_set_level(
/*===============*/
page_t* page, /*!< in: index page */
page_zip_des_t* page_zip,/*!< in: compressed page whose uncompressed
part will be updated, or NULL */
ulint level, /*!< in: level, leaf level == 0 */
mtr_t* mtr) /*!< in: mini-transaction handle */
{
ut_ad(page && mtr);
ut_ad(level <= BTR_MAX_NODE_LEVEL);
if (page_zip) {
mach_write_to_2(page + (PAGE_HEADER + PAGE_LEVEL), level);
page_zip_write_header(page_zip,
page + (PAGE_HEADER + PAGE_LEVEL),
2, mtr);
} else {
mlog_write_ulint(page + (PAGE_HEADER + PAGE_LEVEL), level,
MLOG_2BYTES, mtr);
}
}
/********************************************************//**
Gets the next index page number.
@return next page number */
UNIV_INLINE
ulint
btr_page_get_next(
/*==============*/
const page_t* page, /*!< in: index page */
mtr_t* mtr MY_ATTRIBUTE((unused)))
/*!< in: mini-transaction handle */
{
ut_ad(page != NULL);
ut_ad(mtr != NULL);
#ifndef UNIV_INNOCHECKSUM
ut_ad(mtr_memo_contains_page(mtr, page, MTR_MEMO_PAGE_X_FIX)
|| mtr_memo_contains_page(mtr, page, MTR_MEMO_PAGE_S_FIX));
#endif /* UNIV_INNOCHECKSUM */
return(mach_read_from_4(page + FIL_PAGE_NEXT));
}
/********************************************************//**
Sets the next index page field. */
UNIV_INLINE
void
btr_page_set_next(
/*==============*/
page_t* page, /*!< in: index page */
page_zip_des_t* page_zip,/*!< in: compressed page whose uncompressed
part will be updated, or NULL */
ulint next, /*!< in: next page number */
mtr_t* mtr) /*!< in: mini-transaction handle */
{
ut_ad(page != NULL);
ut_ad(mtr != NULL);
if (page_zip) {
mach_write_to_4(page + FIL_PAGE_NEXT, next);
page_zip_write_header(page_zip, page + FIL_PAGE_NEXT, 4, mtr);
} else {
mlog_write_ulint(page + FIL_PAGE_NEXT, next, MLOG_4BYTES, mtr);
}
}
/********************************************************//**
Gets the previous index page number.
@return prev page number */
UNIV_INLINE
ulint
btr_page_get_prev(
/*==============*/
const page_t* page, /*!< in: index page */
mtr_t* mtr MY_ATTRIBUTE((unused))) /*!< in: mini-transaction handle */
{
ut_ad(page != NULL);
ut_ad(mtr != NULL);
return(mach_read_from_4(page + FIL_PAGE_PREV));
}
/********************************************************//**
Sets the previous index page field. */
UNIV_INLINE
void
btr_page_set_prev(
/*==============*/
page_t* page, /*!< in: index page */
page_zip_des_t* page_zip,/*!< in: compressed page whose uncompressed
part will be updated, or NULL */
ulint prev, /*!< in: previous page number */
mtr_t* mtr) /*!< in: mini-transaction handle */
{
ut_ad(page != NULL);
ut_ad(mtr != NULL);
if (page_zip) {
mach_write_to_4(page + FIL_PAGE_PREV, prev);
page_zip_write_header(page_zip, page + FIL_PAGE_PREV, 4, mtr);
} else {
mlog_write_ulint(page + FIL_PAGE_PREV, prev, MLOG_4BYTES, mtr);
}
}
/**************************************************************//**
Gets the child node file address in a node pointer.
NOTE: the offsets array must contain all offsets for the record since
we read the last field according to offsets and assume that it contains
the child page number. In other words offsets must have been retrieved
with rec_get_offsets(n_fields=ULINT_UNDEFINED).
@return child node address */
UNIV_INLINE
ulint
btr_node_ptr_get_child_page_no(
/*===========================*/
const rec_t* rec, /*!< in: node pointer record */
const ulint* offsets)/*!< in: array returned by rec_get_offsets() */
{
const byte* field;
ulint len;
ulint page_no;
ut_ad(!rec_offs_comp(offsets) || rec_get_node_ptr_flag(rec));
/* The child address is in the last field */
field = rec_get_nth_field(rec, offsets,
rec_offs_n_fields(offsets) - 1, &len);
ut_ad(len == 4);
page_no = mach_read_from_4(field);
if (page_no == 0) {
fprintf(stderr,
"InnoDB: a nonsensical page number 0"
" in a node ptr record at offset %lu\n",
(ulong) page_offset(rec));
buf_page_print(page_align(rec), 0, 0);
ut_ad(0);
}
return(page_no);
}
/**************************************************************//**
Releases the latches on a leaf page and bufferunfixes it. */
UNIV_INLINE
void
btr_leaf_page_release(
/*==================*/
buf_block_t* block, /*!< in: buffer block */
ulint latch_mode, /*!< in: BTR_SEARCH_LEAF or
BTR_MODIFY_LEAF */
mtr_t* mtr) /*!< in: mtr */
{
ut_ad(latch_mode == BTR_SEARCH_LEAF || latch_mode == BTR_MODIFY_LEAF);
ut_ad(!mtr_memo_contains(mtr, block, MTR_MEMO_MODIFY));
mtr_memo_release(mtr, block,
latch_mode == BTR_SEARCH_LEAF
? MTR_MEMO_PAGE_S_FIX
: MTR_MEMO_PAGE_X_FIX);
}
#endif /* !UNIV_HOTBACKUP */