mirror of
https://github.com/MariaDB/server.git
synced 2025-01-16 12:02:42 +01:00
18bc7b695a
* to honour WAL we now force the whole log when flushing a bitmap page.
* ability to intentionally crash in various places for recovery testing
* bugfix (dirty pages list found in checkpoint record was ignored)
* smaller checkpoint record
* misc small cleanups and comments

mysql-test/include/maria_empty_logs.inc:
  maria-purge.test creates ~11 logs, remove them all
mysql-test/r/maria-recovery-bitmap.result:
  result is good; without the _ma_bitmap_get_log_address() call, we got
  check error
  Bitmap at 0 has pages reserved outside of data file length
mysql-test/r/maria-recovery.result:
  result update
mysql-test/t/maria-recovery-bitmap.test:
  enable test of "bitmap-flush should flush whole log otherwise
  corrupted data file (bitmap ahead of data pages)".
mysql-test/t/maria-recovery.test:
  test of checkpoint
sql/sql_table.cc:
  comment
storage/maria/ha_maria.cc:
  _ma_reenable_logging_for_table() now includes file->trn=0.
  At the end of repair() we don't need to re-enable logging, it is done
  already by caller (like copy_data_between_tables()); it sounds strange
  that this function could decide to re-enable, it should be up to caller
  who knows what other operations it plans.
  Removing this line led to assertion failure in
  maria_lock_database(F_UNLCK), fixed by removing the assertion:
  maria_lock_database() is here called in a context where F_UNLCK does
  not make the table visible to others so assertion is excessive,
  and external_lock() is already designed to honour the asserted condition.
  Ability to crash at the end of bulk insert when indices have been enabled.
storage/maria/ma_bitmap.c:
  Better use pagecache_file_init() than set pagecache callbacks directly;
  and a new function to set those callbacks for bitmap so that we can
  reuse it.
  _ma_bitmap_get_log_address() is a pagecache get_log_address callback
  which causes the whole log to be flushed when a bitmap page
  is flushed by the page cache. This was required by WAL.
storage/maria/ma_blockrec.c:
  get_log_address pagecache callback for data (non bitmap) pages:
  just reads the LSN from the page's content, like was hard-coded
  before in ma_pagecache.c.
storage/maria/ma_blockrec.h:
  functions which need to be exported
storage/maria/ma_check.c:
  create_new_data_handle() can be static. Ability to crash after
  rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already.
storage/maria/ma_checkpoint.c:
  As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(),
  we don't need to store kfile/dfile descriptors in checkpoint record,
  2-byte-id of the table plus one byte to say if this is data or index
  file is enough. So we go from 4+4 bytes per table down to 2+1.
storage/maria/ma_commit.c:
  removing duplicate functions (see _ma_tmp_disable_logging_for_table())
storage/maria/ma_extra.c:
  Monty fixed
storage/maria/ma_key_recover.c:
  comment
storage/maria/ma_locking.c:
  Sometimes other code does funny things with maria_lock_database(),
  like ha_maria::repair() calling it at start and end without going
  through ha_maria::external_lock(). So it happens that
  maria_lock_database() is called with now_transactional!=born_transactional.
storage/maria/ma_loghandler.c:
  update to new prototype
storage/maria/ma_open.c:
  set_data|index_pagecache_callbacks() need to be exported as they
  are now called when disabling/enabling transactionality.
storage/maria/ma_pagecache.c:
  Removing PAGE_LSN_OFFSET, as much of the code relies on it being
  0 anyway (let's not give impression we can just change this constant).
  When flushing a page to disk, call the get_log_address callback to
  know up to which LSN the log should be flushed.
  As we now can access MARIA_SHARE* we can know share->id and store
  it into the checkpoint record; we thus go from 4 bytes per dirty page
  to 2+1.
storage/maria/ma_pagecache.h:
  get_log_address callback
storage/maria/ma_panic.c:
  No reason to reset pagecache callbacks in HA_PANIC_READ:
  all we do is reopen files if they were closed; callbacks should
  be in place already as 'info' exists; we just want to modify
  the file descriptors, not the full PAGECACHE_FILE structure.
  If we open data file and it was closed, share->bitmap.file needs
  to be set.
  Note that the modified code is disabled anyway.
storage/maria/ma_recovery.c:
  Checkpoint record does not contain kfile/dfile descriptors anymore
  so code can be simplified. Hash key in all_dirty_pages is
  not made from file_descriptor & pageno anymore, but
  index_or_data & table-short-id & pageno.
  If a table's create_rename_lsn is higher than record's LSN,
  we skip the table and don't fail if it's corrupted (because the LSNs
  say that we don't have to look at this table).
  If a table is skipped (for example due to create_rename_lsn),
  its UNDOs still cause undo_lsn to advance; this is so that if later
  we notice the transaction has to rollback we fail (as table should
  not be skipped in this case).
  Fixing a bug: the dirty_pages list was never used, because
  the LSN below which it was used was the minimum rec_lsn of dirty pages!
  It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)).
  When we disable/reenable transactionality, we modify pagecache
  callbacks (needed for example for get_log_address: changing
  share->page_type is not enough anymore).
storage/maria/ma_write.c:
  'records' and 'checksum' are protected: they are updated under
  log's mutex in write-hooks when UNDO is written.
storage/maria/maria_chk.c:
  remove use of duplicate functions.
storage/maria/maria_def.h:
  set_data|index_pagecache_callbacks() need to be exported;
  _ma_reenable_logging_for_table() changes to a real function.
storage/maria/unittest/ma_pagecache_consist.c:
  new prototype
storage/maria/unittest/ma_pagecache_single.c:
  new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype
310 lines
14 KiB
C
/* Copyright (C) 2006 MySQL AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */

/* Page cache variable structures */

#ifndef _ma_pagecache_h
#define _ma_pagecache_h
C_MODE_START

#include "ma_loghandler_lsn.h"
#include <m_string.h>
#include <hash.h>

/* Type of the page */
enum pagecache_page_type
{
  /*
    Used only to control page type changes during debugging. This value
    should only be used when DBUG is enabled.
  */
  PAGECACHE_EMPTY_PAGE,
  /* the page does not contain an LSN */
  PAGECACHE_PLAIN_PAGE,
  /* the page contains an LSN (maria tablespace page) */
  PAGECACHE_LSN_PAGE,
  /* Page type used when scanning a file and we don't care about the type */
  PAGECACHE_READ_UNKNOWN_PAGE
};

/*
  This enum describes lock status changes. Every type of page cache will
  interpret WRITE/READ locks as it needs.
*/
enum pagecache_page_lock
{
  PAGECACHE_LOCK_LEFT_UNLOCKED,       /* free  -> free  */
  PAGECACHE_LOCK_LEFT_READLOCKED,     /* read  -> read  */
  PAGECACHE_LOCK_LEFT_WRITELOCKED,    /* write -> write */
  PAGECACHE_LOCK_READ,                /* free  -> read  */
  PAGECACHE_LOCK_WRITE,               /* free  -> write */
  PAGECACHE_LOCK_READ_UNLOCK,         /* read  -> free  */
  PAGECACHE_LOCK_WRITE_UNLOCK,        /* write -> free  */
  PAGECACHE_LOCK_WRITE_TO_READ        /* write -> read  */
};
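/*
  Illustration only, not part of the original header. The transition
  comments on enum pagecache_page_lock can be read as a small table of
  (before, after) lock states. The sketch below makes that table explicit.
  It repeats the enum so that it compiles on its own; lock_state,
  lock_transition, lock_transition_of() and lock_change_is_valid() are
  hypothetical names that do not exist in the page cache code.

```c
#include <assert.h>

// Local copy of the enum above, repeated so the sketch is self-contained.
enum pagecache_page_lock
{
  PAGECACHE_LOCK_LEFT_UNLOCKED,
  PAGECACHE_LOCK_LEFT_READLOCKED,
  PAGECACHE_LOCK_LEFT_WRITELOCKED,
  PAGECACHE_LOCK_READ,
  PAGECACHE_LOCK_WRITE,
  PAGECACHE_LOCK_READ_UNLOCK,
  PAGECACHE_LOCK_WRITE_UNLOCK,
  PAGECACHE_LOCK_WRITE_TO_READ
};

// Hypothetical: the three states a page lock can be in.
enum lock_state { LOCK_FREE, LOCK_READ, LOCK_WRITE };

// Hypothetical: one row of the transition table.
struct lock_transition
{
  enum lock_state before;
  enum lock_state after;
};

// Decode a lock change into its (before, after) pair, in enum order.
static struct lock_transition
lock_transition_of(enum pagecache_page_lock lock)
{
  static const struct lock_transition table[]=
  {
    { LOCK_FREE,  LOCK_FREE  },   // PAGECACHE_LOCK_LEFT_UNLOCKED
    { LOCK_READ,  LOCK_READ  },   // PAGECACHE_LOCK_LEFT_READLOCKED
    { LOCK_WRITE, LOCK_WRITE },   // PAGECACHE_LOCK_LEFT_WRITELOCKED
    { LOCK_FREE,  LOCK_READ  },   // PAGECACHE_LOCK_READ
    { LOCK_FREE,  LOCK_WRITE },   // PAGECACHE_LOCK_WRITE
    { LOCK_READ,  LOCK_FREE  },   // PAGECACHE_LOCK_READ_UNLOCK
    { LOCK_WRITE, LOCK_FREE  },   // PAGECACHE_LOCK_WRITE_UNLOCK
    { LOCK_WRITE, LOCK_READ  }    // PAGECACHE_LOCK_WRITE_TO_READ
  };
  assert((unsigned) lock < sizeof(table) / sizeof(table[0]));
  return table[lock];
}

// Returns 1 if `lock` may be applied to a page currently in state `current`.
static int lock_change_is_valid(enum lock_state current,
                                enum pagecache_page_lock lock)
{
  return lock_transition_of(lock).before == current;
}
```

  For example, lock_change_is_valid(LOCK_FREE, PAGECACHE_LOCK_READ) holds,
  while requesting PAGECACHE_LOCK_WRITE_UNLOCK on a read-locked page
  does not.
*/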
/*
  This enum describes pin status changes
*/
enum pagecache_page_pin
{
  PAGECACHE_PIN_LEFT_PINNED,    /* pinned   -> pinned   */
  PAGECACHE_PIN_LEFT_UNPINNED,  /* unpinned -> unpinned */
  PAGECACHE_PIN,                /* unpinned -> pinned   */
  PAGECACHE_UNPIN               /* pinned   -> unpinned */
};
/* How to write the page */
enum pagecache_write_mode
{
  /* do not write immediately, i.e. the page stays dirty in the cache */
  PAGECACHE_WRITE_DELAY,
  /* the page is already in the file (key cache insert analogue) */
  PAGECACHE_WRITE_DONE
};

/* page number for maria */
typedef uint32 pgcache_page_no_t;

/* file descriptor for Maria */
typedef struct st_pagecache_file
{
  File file;
  /** Cannot be NULL */
  my_bool (*read_callback)(uchar *page, pgcache_page_no_t offset,
                           uchar *data);
  /** Cannot be NULL */
  my_bool (*write_callback)(uchar *page, pgcache_page_no_t offset,
                            uchar *data);
  /** Can be NULL */
  TRANSLOG_ADDRESS (*get_log_address_callback)
    (uchar *page, pgcache_page_no_t offset, uchar *data);
  uchar *callback_data;
} PAGECACHE_FILE;

/* declare structures that are used by st_pagecache */

struct st_pagecache_block_link;
typedef struct st_pagecache_block_link PAGECACHE_BLOCK_LINK;
struct st_pagecache_page;
typedef struct st_pagecache_page PAGECACHE_PAGE;
struct st_pagecache_hash_link;
typedef struct st_pagecache_hash_link PAGECACHE_HASH_LINK;

#include <wqueue.h>

#define PAGECACHE_CHANGED_BLOCKS_HASH 128  /* must be power of 2 */
#define PAGECACHE_PRIORITY_LOW 0
#define PAGECACHE_PRIORITY_DEFAULT 3
#define PAGECACHE_PRIORITY_HIGH 6

/*
  The page cache structure
  It also contains read-only statistics parameters.
*/

typedef struct st_pagecache
{
  size_t mem_size;            /* specified size of the cache memory       */
  ulong min_warm_blocks;      /* min number of warm blocks                */
  ulong age_threshold;        /* age threshold for hot blocks             */
  ulonglong time;             /* total number of block link operations    */
  ulong hash_entries;         /* max number of entries in the hash table  */
  long hash_links;            /* max number of hash links                 */
  long hash_links_used;   /* number of hash links taken from free links pool */
  long disk_blocks;           /* max number of blocks in the cache        */
  ulong blocks_used;      /* maximum number of concurrently used blocks   */
  ulong blocks_unused;        /* number of currently unused blocks        */
  ulong blocks_changed;       /* number of currently dirty blocks         */
  ulong warm_blocks;          /* number of blocks in warm sub-chain       */
  ulong cnt_for_resize_op;    /* counter to block resize operation        */
  ulong blocks_available; /* number of blocks available in the LRU chain  */
  long blocks;                /* max number of blocks in the cache        */
  uint32 block_size;          /* size of the page buffer of a cache block */
  PAGECACHE_HASH_LINK **hash_root;/* arr. of entries into hash table buckets */
  PAGECACHE_HASH_LINK *hash_link_root;/* memory for hash table links      */
  PAGECACHE_HASH_LINK *free_hash_list;/* list of free hash links          */
  PAGECACHE_BLOCK_LINK *free_block_list;/* list of free blocks            */
  PAGECACHE_BLOCK_LINK *block_root;/* memory for block links              */
  uchar HUGE_PTR *block_mem;  /* memory for block buffers                 */
  PAGECACHE_BLOCK_LINK *used_last;/* ptr to the last block of the LRU chain */
  PAGECACHE_BLOCK_LINK *used_ins;/* ptr to the insertion block in LRU chain */
  pthread_mutex_t cache_lock; /* to lock access to the cache structure    */
  WQUEUE resize_queue;        /* threads waiting during resize operation  */
  WQUEUE waiting_for_hash_link;/* waiting for a free hash link            */
  WQUEUE waiting_for_block;   /* requests waiting for a free block        */
  /* hash for dirty file bl.*/
  PAGECACHE_BLOCK_LINK *changed_blocks[PAGECACHE_CHANGED_BLOCKS_HASH];
  /* hash for other file bl.*/
  PAGECACHE_BLOCK_LINK *file_blocks[PAGECACHE_CHANGED_BLOCKS_HASH];

  /*
    The following variables hold the parameters used for
    initializing the page cache.
  */

  ulonglong param_buff_size;  /* size of the memory allocated for the cache */
  ulong param_block_size;     /* size of the blocks in the page cache     */
  ulong param_division_limit; /* min. percentage of warm blocks           */
  ulong param_age_threshold;  /* determines when hot block is downgraded  */

  /* Statistics variables. These are reset in reset_pagecache_counters(). */
  ulong global_blocks_changed;      /* number of currently dirty blocks   */
  ulonglong global_cache_w_requests;/* number of write requests (write hits) */
  ulonglong global_cache_write;     /* number of writes from cache to files */
  ulonglong global_cache_r_requests;/* number of read requests (read hits) */
  ulonglong global_cache_read;      /* number of reads from files to cache */

  uint shift;                 /* block size = 2 ^ shift                   */
  myf readwrite_flags;        /* Flags to pread/pwrite()                  */
  myf org_readwrite_flags;    /* Flags to pread/pwrite() at init          */
  my_bool inited;
  my_bool resize_in_flush;    /* true during flush of resize operation    */
  my_bool can_be_used;        /* usage of cache for read/write is allowed */
  my_bool in_init;            /* Set to 1 in MySQL during init/resize     */
  HASH files_in_flush;        /**< files in flush_pagecache_blocks_int()  */
} PAGECACHE;

/** @brief Return values for PAGECACHE_FLUSH_FILTER */
enum pagecache_flush_filter_result
{
  FLUSH_FILTER_SKIP_TRY_NEXT= 0,/**< skip page and move on to next one */
  FLUSH_FILTER_OK,              /**< flush page and move on to next one */
  FLUSH_FILTER_SKIP_ALL         /**< skip page and all next ones */
};
/** @brief a filter function type for flush_pagecache_blocks_with_filter() */
typedef enum pagecache_flush_filter_result
(*PAGECACHE_FLUSH_FILTER)(enum pagecache_page_type type, pgcache_page_no_t page,
                          LSN rec_lsn, void *arg);

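/*
  Illustration only, not part of the original header. A minimal sketch of
  what a PAGECACHE_FLUSH_FILTER callback could look like, assuming the
  caller passes a pointer to an LSN limit through `arg`. The typedefs for
  LSN and pgcache_page_no_t are stand-ins so that the sketch compiles on
  its own, and example_filter_up_to_lsn() is a hypothetical name. The
  policy shown (flush only LSN pages whose rec_lsn does not exceed the
  limit) is an assumption chosen for the example, not the policy used by
  the server.

```c
#include <assert.h>
#include <stddef.h>

// Stand-ins for the real typedefs (assumptions made for this sketch).
typedef unsigned long long LSN;
typedef unsigned int pgcache_page_no_t;

// Local copies of the enums above, repeated so the sketch is self-contained.
enum pagecache_page_type
{
  PAGECACHE_EMPTY_PAGE,
  PAGECACHE_PLAIN_PAGE,
  PAGECACHE_LSN_PAGE,
  PAGECACHE_READ_UNKNOWN_PAGE
};

enum pagecache_flush_filter_result
{
  FLUSH_FILTER_SKIP_TRY_NEXT= 0,
  FLUSH_FILTER_OK,
  FLUSH_FILTER_SKIP_ALL
};

// Hypothetical filter: `arg` points to the highest rec_lsn we accept.
static enum pagecache_flush_filter_result
example_filter_up_to_lsn(enum pagecache_page_type type,
                         pgcache_page_no_t page, LSN rec_lsn, void *arg)
{
  const LSN *limit= (const LSN *) arg;
  (void) page;                        // the page number is not used here
  assert(limit != NULL);
  if (type != PAGECACHE_LSN_PAGE)
    return FLUSH_FILTER_SKIP_TRY_NEXT; // leave pages without an LSN alone
  if (rec_lsn <= *limit)
    return FLUSH_FILTER_OK;            // old enough: flush it
  return FLUSH_FILTER_SKIP_TRY_NEXT;   // too recent: skip, keep scanning
}
```

  Such a function would be passed as the `filter` argument of
  flush_pagecache_blocks_with_filter(), with the address of the limit
  as `filter_arg`.
*/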
/* The default page cache */
extern PAGECACHE dflt_pagecache_var, *dflt_pagecache;

extern ulong init_pagecache(PAGECACHE *pagecache, size_t use_mem,
                            uint division_limit, uint age_threshold,
                            uint block_size, myf my_read_flags);
extern ulong resize_pagecache(PAGECACHE *pagecache,
                              size_t use_mem, uint division_limit,
                              uint age_threshold);
extern void change_pagecache_param(PAGECACHE *pagecache, uint division_limit,
                                   uint age_threshold);

extern uchar *pagecache_read(PAGECACHE *pagecache,
                             PAGECACHE_FILE *file,
                             pgcache_page_no_t pageno,
                             uint level,
                             uchar *buff,
                             enum pagecache_page_type type,
                             enum pagecache_page_lock lock,
                             PAGECACHE_BLOCK_LINK **link);

#define pagecache_write(P,F,N,L,B,T,O,I,M,K,R) \
  pagecache_write_part(P,F,N,L,B,T,O,I,M,K,R,0,(P)->block_size)

#define pagecache_inject(P,F,N,L,B,T,O,I,K,R) \
  pagecache_write_part(P,F,N,L,B,T,O,I,PAGECACHE_WRITE_DONE, \
                       K,R,0,(P)->block_size)

extern my_bool pagecache_write_part(PAGECACHE *pagecache,
                                    PAGECACHE_FILE *file,
                                    pgcache_page_no_t pageno,
                                    uint level,
                                    uchar *buff,
                                    enum pagecache_page_type type,
                                    enum pagecache_page_lock lock,
                                    enum pagecache_page_pin pin,
                                    enum pagecache_write_mode write_mode,
                                    PAGECACHE_BLOCK_LINK **link,
                                    LSN first_REDO_LSN_for_page,
                                    uint offset,
                                    uint size);
extern void pagecache_unlock(PAGECACHE *pagecache,
                             PAGECACHE_FILE *file,
                             pgcache_page_no_t pageno,
                             enum pagecache_page_lock lock,
                             enum pagecache_page_pin pin,
                             LSN first_REDO_LSN_for_page,
                             LSN lsn, my_bool was_changed);
extern void pagecache_unlock_by_link(PAGECACHE *pagecache,
                                     PAGECACHE_BLOCK_LINK *block,
                                     enum pagecache_page_lock lock,
                                     enum pagecache_page_pin pin,
                                     LSN first_REDO_LSN_for_page,
                                     LSN lsn, my_bool was_changed);
extern void pagecache_unpin(PAGECACHE *pagecache,
                            PAGECACHE_FILE *file,
                            pgcache_page_no_t pageno,
                            LSN lsn);
extern void pagecache_unpin_by_link(PAGECACHE *pagecache,
                                    PAGECACHE_BLOCK_LINK *link,
                                    LSN lsn);


/* Results of flush operation (bit field in fact) */

/* The flush is done. */
#define PCFLUSH_OK 0
/* There were errors during the flush process. */
#define PCFLUSH_ERROR 1
/* Pinned blocks were encountered and skipped. */
#define PCFLUSH_PINNED 2
/* PCFLUSH_ERROR and PCFLUSH_PINNED. */
#define PCFLUSH_PINNED_AND_ERROR (PCFLUSH_ERROR|PCFLUSH_PINNED)

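/*
  Illustration only, not part of the original header. Because the flush
  result is a bit field, a caller would test individual bits rather than
  compare against each constant. A minimal sketch follows; the constants
  are repeated so that it compiles on its own, and flush_had_error() and
  flush_skipped_pinned() are hypothetical helper names.

```c
#include <assert.h>

// Local copies of the result bits above.
#define PCFLUSH_OK 0
#define PCFLUSH_ERROR 1
#define PCFLUSH_PINNED 2
#define PCFLUSH_PINNED_AND_ERROR (PCFLUSH_ERROR|PCFLUSH_PINNED)

// Returns 1 if the flush reported at least one error.
static int flush_had_error(int res)
{
  assert((res & ~PCFLUSH_PINNED_AND_ERROR) == 0);  // only known bits set
  return (res & PCFLUSH_ERROR) != 0;
}

// Returns 1 if the flush skipped at least one pinned block.
static int flush_skipped_pinned(int res)
{
  assert((res & ~PCFLUSH_PINNED_AND_ERROR) == 0);  // only known bits set
  return (res & PCFLUSH_PINNED) != 0;
}
```

  With these helpers, PCFLUSH_PINNED_AND_ERROR satisfies both tests and
  PCFLUSH_OK satisfies neither.
*/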
#define pagecache_file_init(F,RC,WC,GLC,D) \
  do{ \
    (F).read_callback= (RC); (F).write_callback= (WC); \
    (F).get_log_address_callback= (GLC); (F).callback_data= (uchar*)(D); \
  } while(0)

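/*
  Illustration only, not part of the original header. The do{ } while(0)
  wrapper lets pagecache_file_init() be used as a single statement, for
  example in an if/else without braces. The sketch below shows this with
  stand-in typedefs so that it compiles on its own. The example_*
  functions are hypothetical, and the assumption that the callbacks
  return 0 on success is ours, not something this header states.

```c
#include <assert.h>
#include <stddef.h>

// Stand-ins for the real typedefs (assumptions made for this sketch).
typedef unsigned char uchar;
typedef char my_bool;
typedef unsigned int pgcache_page_no_t;
typedef unsigned long long TRANSLOG_ADDRESS;
typedef int File;

// Local copy of the structure and macro above.
typedef struct st_pagecache_file
{
  File file;
  my_bool (*read_callback)(uchar *page, pgcache_page_no_t offset,
                           uchar *data);
  my_bool (*write_callback)(uchar *page, pgcache_page_no_t offset,
                            uchar *data);
  TRANSLOG_ADDRESS (*get_log_address_callback)
    (uchar *page, pgcache_page_no_t offset, uchar *data);
  uchar *callback_data;
} PAGECACHE_FILE;

#define pagecache_file_init(F,RC,WC,GLC,D) \
  do{ \
    (F).read_callback= (RC); (F).write_callback= (WC); \
    (F).get_log_address_callback= (GLC); (F).callback_data= (uchar*)(D); \
  } while(0)

// Hypothetical callbacks that do nothing and report success as 0.
static my_bool example_read(uchar *page, pgcache_page_no_t offset,
                            uchar *data)
{ (void) page; (void) offset; (void) data; return 0; }

static my_bool example_write(uchar *page, pgcache_page_no_t offset,
                             uchar *data)
{ (void) page; (void) offset; (void) data; return 0; }

static TRANSLOG_ADDRESS example_get_log_address(uchar *page,
                                                pgcache_page_no_t offset,
                                                uchar *data)
{ (void) page; (void) offset; (void) data; return 0; }

// Hypothetical setup helper: the macro is used as one statement per branch.
static void example_setup_file(PAGECACHE_FILE *file, File fd, void *share,
                               int has_lsn)
{
  assert(file != NULL);
  file->file= fd;
  if (has_lsn)
    pagecache_file_init(*file, example_read, example_write,
                        example_get_log_address, share);
  else
    pagecache_file_init(*file, example_read, example_write, NULL, share);
}
```

  Without the do{ } while(0) wrapper, the `else` branch above would not
  parse, because the macro would expand to several statements.
*/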
#define flush_pagecache_blocks(A,B,C) \
  flush_pagecache_blocks_with_filter(A,B,C,NULL,NULL)
extern int flush_pagecache_blocks_with_filter(PAGECACHE *keycache,
                                              PAGECACHE_FILE *file,
                                              enum flush_type type,
                                              PAGECACHE_FLUSH_FILTER filter,
                                              void *filter_arg);
extern my_bool pagecache_delete(PAGECACHE *pagecache,
                                PAGECACHE_FILE *file,
                                pgcache_page_no_t pageno,
                                enum pagecache_page_lock lock,
                                my_bool flush);
extern my_bool pagecache_delete_pages(PAGECACHE *pagecache,
                                      PAGECACHE_FILE *file,
                                      pgcache_page_no_t pageno,
                                      uint page_count,
                                      enum pagecache_page_lock lock,
                                      my_bool flush);
extern void end_pagecache(PAGECACHE *keycache, my_bool cleanup);
extern my_bool pagecache_collect_changed_blocks_with_lsn(PAGECACHE *pagecache,
                                                         LEX_STRING *str,
                                                         LSN *min_lsn);
extern int reset_pagecache_counters(const char *name, PAGECACHE *pagecache);
extern uchar *pagecache_block_link_to_buffer(PAGECACHE_BLOCK_LINK *block);


/* Functions to handle multiple page caches */
extern my_bool multi_pagecache_init(void);
extern void multi_pagecache_free(void);
extern PAGECACHE *multi_pagecache_search(uchar *key, uint length,
                                         PAGECACHE *def);
extern my_bool multi_pagecache_set(const uchar *key, uint length,
                                   PAGECACHE *pagecache);
extern void multi_pagecache_change(PAGECACHE *old_data,
                                   PAGECACHE *new_data);
extern int reset_pagecache_counters(const char *name,
                                    PAGECACHE *pagecache);

C_MODE_END
#endif /* _ma_pagecache_h */