mirror of
https://github.com/MariaDB/server.git
synced 2025-01-16 03:52:35 +01:00
cec8ac3e07
Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed).

WL#3072 Maria recovery
* replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys.
* replaying of REDO_RENAME_TABLE
Now, off to test Recovery in ha_maria :)

BitKeeper/deleted/.del-ma_least_recently_dirtied.c:
  Delete: storage/maria/ma_least_recently_dirtied.c
BitKeeper/deleted/.del-ma_least_recently_dirtied.h:
  Delete: storage/maria/ma_least_recently_dirtied.h
storage/maria/Makefile.am:
  compile Checkpoint module
storage/maria/ha_maria.cc:
  When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly.
storage/maria/ma_blockrec.c:
  * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor)
  * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase.
storage/maria/ma_check.c:
  All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair().
storage/maria/ma_checkpoint.c:
  Checkpoint module.
storage/maria/ma_checkpoint.h:
  optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages.
storage/maria/ma_create.c:
  * no need to init some vars as the initial bzero(share) takes care of this.
  * update to new function's name
  * even if we fail in my_sync() we have to my_close()
storage/maria/ma_extra.c:
  Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen().
storage/maria/ma_init.c:
  destroy checkpoint module when Maria shuts down.
storage/maria/ma_loghandler.c:
  * UNDO_ROW_PURGE gone (see ma_blockrec.c)
  * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed in the checkpoint record (Recovery wants to know the validity domain of an id->name mapping)
  * translog_get_horizon_no_lock() needed for Checkpoint
  * comment about failing assertion (Sanja knows)
  * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header".
  * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook.
  * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock.
  * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN.
storage/maria/ma_loghandler.h:
  prototype updates
storage/maria/ma_open.c:
  no need to initialize "res"
storage/maria/ma_pagecache.c:
  When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int.
storage/maria/ma_pagecache.h:
  new prototype
storage/maria/ma_recovery.c:
  * added replaying of REDO_RENAME_TABLE
  * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
  * Recovery from the last checkpoint record now possible
  * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id).
  * in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record).
  * parse_checkpoint_record() has to return a LSN, that's what caller expects
storage/maria/ma_rename.c:
  new function's name; log end zeroes of tables' names (ease recovery)
storage/maria/ma_test2.c:
  * equivalent of ma_test1's --test-undo added (named -u here).
  * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()).
storage/maria/ma_test_recovery.expected:
  Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state).
storage/maria/ma_test_recovery:
  Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT.
storage/maria/ma_write.c:
  comment
storage/maria/maria_chk.c:
  comment
storage/maria/maria_def.h:
  * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint.
  * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs).
storage/maria/trnman.c:
  * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
81 lines
2.9 KiB
C
/* Copyright (C) 2006,2007 MySQL AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */

/*
  WL#3071 Maria checkpoint
  First version written by Guilhem Bichot on 2006-04-27.
  Does not compile yet.
*/

/* This is the interface of this module. */

typedef enum enum_ma_checkpoint_level {
  CHECKPOINT_NONE= 0,
  /* just write dirty_pages, transactions table and sync files */
  CHECKPOINT_INDIRECT,
  /* also flush all dirty pages which were already dirty at prev checkpoint */
  CHECKPOINT_MEDIUM,
  /* also flush all dirty pages */
  CHECKPOINT_FULL
} CHECKPOINT_LEVEL;

C_MODE_START
int ma_checkpoint_init(my_bool create_background_thread);
void ma_checkpoint_end(void);
int ma_checkpoint_execute(CHECKPOINT_LEVEL level, my_bool no_wait);
C_MODE_END

/**
   @brief Reads an LSN without a mutex or an atomic load, detecting torn reads

   If a 64-bit variable transitions from both halves being zero to both
   halves being non-zero, and back, this function can be used to read
   it (without mutex, without atomic load) and always get a correct
   (though maybe slightly old) value (even on 32-bit CPUs). The value is at
   least as new as the latest mutex unlock done by the calling thread.
   The assumption is that the system sets both 4-byte halves either at the
   same time, or one after the other (in any order), but NOT some bytes of the
   first half then some bytes of the second half then the rest of bytes of the
   first half. With this assumption, the function can detect when it is
   seeing an inconsistent value.

   @param x the LSN variable to read (lsn_read_non_atomic_32() takes its
            address)

   @return LSN part (most significant byte always 0)
*/
#if ( SIZEOF_CHARP >= 8 )
/* 64-bit CPU, 64-bit reads are atomic */
#define lsn_read_non_atomic LSN_WITH_FLAGS_TO_LSN
#else
static inline LSN lsn_read_non_atomic_32(const volatile LSN *x)
{
  /*
    32-bit CPU, 64-bit reads may give a mix of old half and new half (old
    low bits and new high bits, or the contrary).
  */
  for (;;) /* loop until no atomicity problems */
  {
    /*
      Remove most significant byte in case this is a LSN_WITH_FLAGS object.
      Those flags in TRN::first_undo_lsn break the condition on transitions so
      they must be removed below.
    */
    LSN y= LSN_WITH_FLAGS_TO_LSN(*x);
    if (likely((y == LSN_IMPOSSIBLE) || LSN_VALID(y)))
      return y;
  }
}
/* parenthesize the argument so that any lvalue expression can be passed */
#define lsn_read_non_atomic(x) lsn_read_non_atomic_32(&(x))
#endif
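
The torn-read check described in the comment above can be illustrated in isolation. The following is a minimal sketch, not the Maria code: `sketch_lsn`, `SKETCH_FLAGS_MASK`, `sketch_strip_flags()`, `sketch_halves_consistent()` and `sketch_read()` are hypothetical stand-ins for `LSN`, `LSN_WITH_FLAGS_TO_LSN`, the `LSN_IMPOSSIBLE`/`LSN_VALID` test and `lsn_read_non_atomic_32()`. It assumes that a value counts as consistent when its two 32-bit halves are either both zero or both non-zero, which is the transition rule the comment states; the real macros may check more than that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LSN: 64 bits, top byte reserved for flags. */
typedef uint64_t sketch_lsn;

#define SKETCH_FLAGS_MASK (UINT64_C(0xFF) << 56)

/* Drop the flags byte (assumed layout, mirrors "most significant byte"). */
static sketch_lsn sketch_strip_flags(sketch_lsn v)
{
  return v & ~SKETCH_FLAGS_MASK;
}

/*
  A value is consistent if both 32-bit halves are zero or both are
  non-zero. Exactly one zero half means the read mixed an old half with a
  new half.
*/
static int sketch_halves_consistent(sketch_lsn v)
{
  uint32_t high= (uint32_t) (v >> 32);
  uint32_t low= (uint32_t) v;
  return (high == 0) == (low == 0);
}

/* Re-read until the two halves agree; flags are stripped before the test. */
static sketch_lsn sketch_read(const volatile sketch_lsn *x)
{
  assert(x != NULL);
  for (;;)
  {
    sketch_lsn y= sketch_strip_flags(*x);
    if (sketch_halves_consistent(y))
      return y;
  }
}
```

Stripping the flags first matters: a variable holding only a flag bit (for example `UINT64_C(0x80) << 56`) has a non-zero high half and a zero low half, so without the mask it would look torn forever and the loop would never return. The sketch only terminates if a writer eventually completes the update, which is the same assumption the original comment makes.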
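
The comments on `CHECKPOINT_LEVEL` describe levels that each add work to the previous one. A rough sketch of how a flush decision per dirty page might follow from them is given below. Everything here is hypothetical: `sketch_level`, `sketch_page` and `sketch_should_flush()` are not part of Maria, and the test "already dirty at prev checkpoint" is modelled as the page's `rec_lsn` being lower than the previous checkpoint's LSN, which is an assumption rather than something this header states.

```c
#include <stdint.h>

/* Hypothetical copy of the level ordering declared in this header. */
typedef enum
{
  SKETCH_CHECKPOINT_NONE= 0,
  SKETCH_CHECKPOINT_INDIRECT, /* write tables and sync files, flush no page */
  SKETCH_CHECKPOINT_MEDIUM,   /* also flush pages dirty since prev checkpoint */
  SKETCH_CHECKPOINT_FULL      /* also flush all dirty pages */
} sketch_level;

/* Hypothetical page descriptor: rec_lsn is when the page first got dirty. */
typedef struct
{
  uint64_t rec_lsn;
  int dirty;
} sketch_page;

/* Return 1 if a checkpoint of the given level would flush this page. */
static int sketch_should_flush(sketch_level level, const sketch_page *page,
                               uint64_t prev_checkpoint_lsn)
{
  if (!page->dirty)
    return 0;
  switch (level)
  {
  case SKETCH_CHECKPOINT_FULL:
    return 1;
  case SKETCH_CHECKPOINT_MEDIUM:
    /* assumption: "already dirty at prev checkpoint" means an older rec_lsn */
    return page->rec_lsn < prev_checkpoint_lsn;
  case SKETCH_CHECKPOINT_INDIRECT:
  case SKETCH_CHECKPOINT_NONE:
  default:
    return 0;
  }
}
```

Under these assumptions a page dirtied after the previous checkpoint survives one medium checkpoint unflushed and is picked up by the next one, while a full checkpoint leaves no dirty page behind.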