mariadb/storage/innobase/include/trx0rseg.h
Marko Mäkelä 3c09f148f3 MDEV-12288 Reset DB_TRX_ID when the history is removed, to speed up MVCC
Let InnoDB purge reset DB_TRX_ID,DB_ROLL_PTR when the history is removed.

[TODO: It appears that the resetting is not taking place as often as
it could be. We should test that a simple INSERT should eventually
cause row_purge_reset_trx_id() to be invoked unless DROP TABLE is
invoked soon enough.]

The InnoDB clustered index record system columns DB_TRX_ID,DB_ROLL_PTR
are used by multi-versioning. After the history is no longer needed, these
columns can safely be reset to 0 and 1<<55 (to indicate a fresh insert).

When a reader sees 0 in the DB_TRX_ID column, it can instantly determine
that the record is present the read view. There is no need to acquire
the transaction system mutex to check if the transaction exists, because
writes can never be conducted by a transaction whose ID is 0.

The persistent InnoDB undo log used to be split into two parts:
insert_undo and update_undo. The insert_undo log was discarded at
transaction commit or rollback, and the update_undo log was processed
by the purge subsystem. As part of this change, we will only generate
a single undo log for new transactions, and the purge subsystem will
reset the DB_TRX_ID whenever a clustered index record is touched.
That is, all persistent undo log will be preserved at transaction commit
or rollback, to be removed by purge.

The InnoDB redo log format is changed in two ways:
We remove the redo log record type MLOG_UNDO_HDR_REUSE, and
we introduce the MLOG_ZIP_WRITE_TRX_ID record for updating the
DB_TRX_ID,DB_ROLL_PTR in a ROW_FORMAT=COMPRESSED table.

This is also changing the format of persistent InnoDB data files:
undo log and clustered index leaf page records. It will still be
possible via import and export to exchange data files with earlier
versions of MariaDB. The change to clustered index leaf page records
is simple: we allow DB_TRX_ID to be 0.

When it comes to the undo log, we must be able to upgrade from earlier
MariaDB versions after a clean shutdown (no redo log to apply).
While it would be nice to perform a slow shutdown (innodb_fast_shutdown=0)
before an upgrade, to empty the undo logs, we cannot assume that this
has been done. So, separate insert_undo log may exist for recovered
uncommitted transactions. These transactions may be automatically
rolled back, or they may be in XA PREPARE state, in which case InnoDB
will preserve the transaction until an explicit XA COMMIT or XA ROLLBACK.

Upgrade has been tested by starting up MariaDB 10.2 with
./mysql-test-run --manual-gdb innodb.read_only_recovery
and then starting up this patched server with
and without --innodb-read-only.

trx_undo_ptr_t::undo: Renamed from update_undo.

trx_undo_ptr_t::old_insert: Renamed from insert_undo.

trx_rseg_t::undo_list: Renamed from update_undo_list.

trx_rseg_t::undo_cached: Merged from update_undo_cached
and insert_undo_cached.

trx_rseg_t::old_insert_list: Renamed from insert_undo_list.

row_purge_reset_trx_id(): New function to reset the columns.
This will be called for all undo processing in purge
that does not remove the clustered index record.

trx_undo_update_rec_get_update(): Allow trx_id=0 when copying the
old DB_TRX_ID of the record to the undo log.

ReadView::changes_visible(): Allow id==0. (Return true for it.
This is what speeds up the MVCC.)

row_vers_impl_x_locked_low(), row_vers_build_for_semi_consistent_read():
Implement a fast path for DB_TRX_ID=0.

Always initialize the TRX_UNDO_PAGE_TYPE to 0. Remove undo->type.

MLOG_UNDO_HDR_REUSE: Remove. This changes the redo log format!

innobase_start_or_create_for_mysql(): Set srv_undo_sources before
starting any transactions.

The parsing of the MLOG_ZIP_WRITE_TRX_ID record was successfully
tested by running the following:
./mtr --parallel=auto --mysqld=--debug=d,ib_log innodb_zip.bug56680
grep MLOG_ZIP_WRITE_TRX_ID var/*/log/mysqld.1.err
2017-07-07 13:08:48 +03:00

238 lines
7.6 KiB
C++

/*****************************************************************************
Copyright (c) 1996, 2016, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2017, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA
*****************************************************************************/
/**************************************************//**
@file include/trx0rseg.h
Rollback segment
Created 3/26/1996 Heikki Tuuri
*******************************************************/
#ifndef trx0rseg_h
#define trx0rseg_h
#include "trx0types.h"
#include "trx0sys.h"
#include "fut0lst.h"
#include <vector>
/** Gets a rollback segment header.
@param[in] space space where placed
@param[in] page_no page number of the header
@param[in,out] mtr mini-transaction
@return rollback segment header, page x-latched */
UNIV_INLINE
trx_rsegf_t*
trx_rsegf_get(
ulint space,
ulint page_no,
mtr_t* mtr);
/** Gets a newly created rollback segment header.
@param[in] space space where placed
@param[in] page_no page number of the header
@param[in,out] mtr mini-transaction
@return rollback segment header, page x-latched */
UNIV_INLINE
trx_rsegf_t*
trx_rsegf_get_new(
ulint space,
ulint page_no,
mtr_t* mtr);
/***************************************************************//**
Gets the file page number of the nth undo log slot.
@return page number of the undo log segment */
UNIV_INLINE
ulint
trx_rsegf_get_nth_undo(
/*===================*/
trx_rsegf_t* rsegf, /*!< in: rollback segment header */
ulint n, /*!< in: index of slot */
mtr_t* mtr); /*!< in: mtr */
/***************************************************************//**
Sets the file page number of the nth undo log slot. */
UNIV_INLINE
void
trx_rsegf_set_nth_undo(
/*===================*/
trx_rsegf_t* rsegf, /*!< in: rollback segment header */
ulint n, /*!< in: index of slot */
ulint page_no,/*!< in: page number of the undo log segment */
mtr_t* mtr); /*!< in: mtr */
/****************************************************************//**
Looks for a free slot for an undo log segment.
@return slot index or ULINT_UNDEFINED if not found */
UNIV_INLINE
ulint
trx_rsegf_undo_find_free(
/*=====================*/
trx_rsegf_t* rsegf, /*!< in: rollback segment header */
mtr_t* mtr); /*!< in: mtr */
/** Creates a rollback segment header.
This function is called only when a new rollback segment is created in
the database.
@param[in] space space id
@param[in] max_size max size in pages
@param[in] rseg_slot_no rseg id == slot number in trx sys
@param[in,out] mtr mini-transaction
@return page number of the created segment, FIL_NULL if fail */
ulint
trx_rseg_header_create(
ulint space,
ulint max_size,
ulint rseg_slot_no,
mtr_t* mtr);
/** Initialize the rollback segments in memory at database startup. */
void
trx_rseg_array_init();
/** Free a rollback segment in memory. */
void
trx_rseg_mem_free(trx_rseg_t* rseg);
/** Create a persistent rollback segment.
@param[in] space_id system or undo tablespace id
@return pointer to new rollback segment
@retval NULL on failure */
trx_rseg_t*
trx_rseg_create(ulint space_id)
MY_ATTRIBUTE((warn_unused_result));
/** Create the temporary rollback segments. */
void
trx_temp_rseg_create();
/********************************************************************
Get the number of unique rollback tablespaces in use except space id 0.
The last space id will be the sentinel value ULINT_UNDEFINED. The array
will be sorted on space id. Note: space_ids should have have space for
TRX_SYS_N_RSEGS + 1 elements.
@return number of unique rollback tablespaces in use. */
ulint
trx_rseg_get_n_undo_tablespaces(
/*============================*/
ulint* space_ids); /*!< out: array of space ids of
UNDO tablespaces */
/* Number of undo log slots in a rollback segment file copy */
#define TRX_RSEG_N_SLOTS (UNIV_PAGE_SIZE / 16)
/* Maximum number of transactions supported by a single rollback segment */
#define TRX_RSEG_MAX_N_TRXS (TRX_RSEG_N_SLOTS / 2)
/** The rollback segment memory object */
struct trx_rseg_t {
/*--------------------------------------------------------*/
/** rollback segment id == the index of its slot in the trx
system file copy */
ulint id;
/** mutex protecting the fields in this struct except id,space,page_no
which are constant */
RsegMutex mutex;
/** space where the rollback segment header is placed */
ulint space;
/** page number of the rollback segment header */
ulint page_no;
/** maximum allowed size in pages */
ulint max_size;
/** current size in pages */
ulint curr_size;
/*--------------------------------------------------------*/
/* Fields for undo logs */
/** List of undo logs */
UT_LIST_BASE_NODE_T(trx_undo_t) undo_list;
/** List of undo log segments cached for fast reuse */
UT_LIST_BASE_NODE_T(trx_undo_t) undo_cached;
/** List of recovered old insert_undo logs of incomplete
transactions (to roll back or XA COMMIT & purge) */
UT_LIST_BASE_NODE_T(trx_undo_t) old_insert_list;
/*--------------------------------------------------------*/
/** Page number of the last not yet purged log header in the history
list; FIL_NULL if all list purged */
ulint last_page_no;
/** Byte offset of the last not yet purged log header */
ulint last_offset;
/** Transaction number of the last not yet purged log */
trx_id_t last_trx_no;
/** TRUE if the last not yet purged log needs purging */
ibool last_del_marks;
/** Reference counter to track rseg allocated transactions. */
ulint trx_ref_count;
/** If true, then skip allocating this rseg as it reside in
UNDO-tablespace marked for truncate. */
bool skip_allocation;
/** @return whether the rollback segment is persistent */
bool is_persistent() const
{
ut_ad(space == SRV_TMP_SPACE_ID
|| space <= TRX_SYS_MAX_UNDO_SPACES);
ut_ad(space == SRV_TMP_SPACE_ID
|| space <= srv_undo_tablespaces_active
|| !srv_was_started);
return(space != SRV_TMP_SPACE_ID);
}
};
/* Undo log segment slot in a rollback segment header */
/*-------------------------------------------------------------*/
#define TRX_RSEG_SLOT_PAGE_NO 0 /* Page number of the header page of
an undo log segment */
/*-------------------------------------------------------------*/
/* Slot size */
#define TRX_RSEG_SLOT_SIZE 4
/* The offset of the rollback segment header on its page */
#define TRX_RSEG FSEG_PAGE_DATA
/* Transaction rollback segment header */
/*-------------------------------------------------------------*/
#define TRX_RSEG_MAX_SIZE 0 /* Maximum allowed size for rollback
segment in pages */
#define TRX_RSEG_HISTORY_SIZE 4 /* Number of file pages occupied
by the logs in the history list */
#define TRX_RSEG_HISTORY 8 /* The update undo logs for committed
transactions */
#define TRX_RSEG_FSEG_HEADER (8 + FLST_BASE_NODE_SIZE)
/* Header for the file segment where
this page is placed */
#define TRX_RSEG_UNDO_SLOTS (8 + FLST_BASE_NODE_SIZE + FSEG_HEADER_SIZE)
/* Undo log segment slots */
/*-------------------------------------------------------------*/
#include "trx0rseg.ic"
#endif