mirror of
https://github.com/MariaDB/server.git
synced 2025-01-18 21:12:26 +01:00
18bc7b695a
* to honour WAL we now force the whole log when flushing a bitmap page.
* ability to intentionally crash in various places for recovery testing
* bugfix (dirty pages list found in checkpoint record was ignored)
* smaller checkpoint record
* misc small cleanups and comments

mysql-test/include/maria_empty_logs.inc:
  maria-purge.test creates ~11 logs, remove them all
mysql-test/r/maria-recovery-bitmap.result:
  result is good; without the _ma_bitmap_get_log_address() call, we got
  check error Bitmap at 0 has pages reserved outside of data file length
mysql-test/r/maria-recovery.result:
  result update
mysql-test/t/maria-recovery-bitmap.test:
  enable test of "bitmap-flush should flush whole log otherwise
  corrupted data file (bitmap ahead of data pages)".
mysql-test/t/maria-recovery.test:
  test of checkpoint
sql/sql_table.cc:
  comment
storage/maria/ha_maria.cc:
  _ma_reenable_logging_for_table() now includes file->trn=0.
  At the end of repair() we don't need to re-enable logging, it is done
  already by caller (like copy_data_between_tables()); it sounds
  strange that this function could decide to re-enable, it should be
  up to caller who knows what other operations it plans.
  Removing this line led to assertion failure in
  maria_lock_database(F_UNLCK), fixed by removing the assertion:
  maria_lock_database() is here called in a context where F_UNLCK does
  not make the table visible to others so assertion is excessive,
  and external_lock() is already designed to honour the asserted
  condition.
  Ability to crash at the end of bulk insert when indices have been
  enabled.
storage/maria/ma_bitmap.c:
  Better use pagecache_file_init() than set pagecache callbacks
  directly; and a new function to set those callbacks for bitmap so
  that we can reuse it.
  _ma_bitmap_get_log_address() is a pagecache get_log_address callback
  which causes the whole log to be flushed when a bitmap page is
  flushed by the page cache. This was required by WAL.
storage/maria/ma_blockrec.c:
  get_log_address pagecache callback for data (non bitmap) pages:
  just reads the LSN from the page's content, like was hard-coded
  before in ma_pagecache.c.
storage/maria/ma_blockrec.h:
  functions which need to be exported
storage/maria/ma_check.c:
  create_new_data_handle() can be static. Ability to crash after
  rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented
  already.
storage/maria/ma_checkpoint.c:
  As MARIA_SHARE* is now accessible to
  pagecache_collect_changed_blocks_LSN(), we don't need to store
  kfile/dfile descriptors in checkpoint record, 2-byte-id of the table
  plus one byte to say if this is data or index file is enough. So we
  go from 4+4 bytes per table down to 2+1.
storage/maria/ma_commit.c:
  removing duplicate functions (see
  _ma_tmp_disable_logging_for_table())
storage/maria/ma_extra.c:
  Monty fixed
storage/maria/ma_key_recover.c:
  comment
storage/maria/ma_locking.c:
  Sometimes other code does funny things with maria_lock_database(),
  like ha_maria::repair() calling it at start and end without going
  through ha_maria::external_lock(). So it happens that
  maria_lock_database() is called with
  now_transactional!=born_transactional.
storage/maria/ma_loghandler.c:
  update to new prototype
storage/maria/ma_open.c:
  set_data|index_pagecache_callbacks() need to be exported as they
  are now called when disabling/enabling transactionality.
storage/maria/ma_pagecache.c:
  Removing PAGE_LSN_OFFSET, as much of the code relies on it being
  0 anyway (let's not give impression we can just change this constant).
  When flushing a page to disk, call the get_log_address callback to
  know up to which LSN the log should be flushed.
  As we now can access MARIA_SHARE* we can know share->id and store
  it into the checkpoint record; we thus go from 4 bytes per dirty page
  to 2+1.
storage/maria/ma_pagecache.h:
  get_log_address callback
storage/maria/ma_panic.c:
  No reason to reset pagecache callbacks in HA_PANIC_READ: all we do
  is reopen files if they were closed; callbacks should be in place
  already as 'info' exists; we just want to modify the file
  descriptors, not the full PAGECACHE_FILE structure.
  If we open data file and it was closed, share->bitmap.file needs
  to be set.
  Note that the modified code is disabled anyway.
storage/maria/ma_recovery.c:
  Checkpoint record does not contain kfile/dfile descriptors anymore
  so code can be simplified. Hash key in all_dirty_pages is not made
  from file_descriptor & pageno anymore, but index_or_data &
  table-short-id & pageno.
  If a table's create_rename_lsn is higher than record's LSN, we skip
  the table and don't fail if it's corrupted (because the LSNs say
  that we don't have to look at this table).
  If a table is skipped (for example due to create_rename_lsn), its
  UNDOs still cause undo_lsn to advance; this is so that if later we
  notice the transaction has to rollback we fail (as table should not
  be skipped in this case).
  Fixing a bug: the dirty_pages list was never used, because the LSN
  below which it was used was the minimum rec_lsn of dirty pages! It
  is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)).
  When we disable/reenable transactionality, we modify pagecache
  callbacks (needed for example for get_log_address: changing
  share->page_type is not enough anymore).
storage/maria/ma_write.c:
  'records' and 'checksum' are protected: they are updated under log's
  mutex in write-hooks when UNDO is written.
storage/maria/maria_chk.c:
  remove use of duplicate functions.
storage/maria/maria_def.h:
  set_data|index_pagecache_callbacks() need to be exported;
  _ma_reenable_logging_for_table() changes to a real function.
storage/maria/unittest/ma_pagecache_consist.c:
  new prototype
storage/maria/unittest/ma_pagecache_single.c:
  new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype
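
The get_log_address callback described in the commit message can be illustrated with a small standalone sketch. Everything below is hypothetical and is not Maria's actual API: the demo_* names, the callback signature, and the assumption that a data page keeps a 7-byte big-endian LSN at offset 0 are chosen only for illustration. The idea shown is that the page cache asks a per-file callback how far the log must be flushed before a page may be written: a data page answers with the LSN stored in the page, a bitmap page answers with the current end of the log, which forces the whole log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_lsn;   /* hypothetical LSN type */

/* Hypothetical callback: up to which LSN must the log be flushed
   before this page may go to disk? */
typedef demo_lsn (*demo_get_log_address)(const unsigned char *page,
                                         void *arg);

/* Hypothetical per-file descriptor carrying the callback. */
typedef struct
{
  demo_get_log_address get_log_address;
  void *callback_arg;
} demo_file;

/* Data page: read the LSN from the start of the page
   (assumed encoding: 7 bytes, big-endian, at offset 0). */
demo_lsn demo_data_page_lsn(const unsigned char *page, void *arg)
{
  demo_lsn lsn= 0;
  size_t i;
  (void) arg;
  assert(page != NULL);
  for (i= 0; i < 7; i++)
    lsn= (lsn << 8) | page[i];
  return lsn;
}

/* Bitmap page: ignore the page content and return the current log
   horizon (passed through arg), so the whole log gets flushed. */
demo_lsn demo_bitmap_page_lsn(const unsigned char *page, void *arg)
{
  (void) page;
  assert(arg != NULL);
  return *(const demo_lsn *) arg;
}

/* What a page cache would call just before writing a page out. */
demo_lsn demo_flush_target(const demo_file *file,
                           const unsigned char *page)
{
  assert(file != NULL && file->get_log_address != NULL);
  return file->get_log_address(page, file->callback_arg);
}
```

A caller would build one demo_file per file type, for example { demo_data_page_lsn, NULL } for data files and { demo_bitmap_page_lsn, &horizon } for the bitmap, and flush the log up to the value demo_flush_target() returns before writing the page.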
118 lines
3.6 KiB
C
/* Copyright (C) 2007 MySQL AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */

#include "maria_def.h"
#include "trnman.h"

/**
  @brief writes a COMMIT record to log and commits transaction in memory

  @param  trn              transaction

  @return Operation status
    @retval 0      ok
    @retval 1      error (disk error or out of memory)
*/

int ma_commit(TRN *trn)
{
  int res;
  LSN commit_lsn;
  LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS];
  DBUG_ENTER("ma_commit");

  if (trn->undo_lsn == 0) /* no work done, rollback (cheaper than commit) */
    DBUG_RETURN(trnman_rollback_trn(trn));
  /*
    - if COMMIT record is written before trnman_commit_trn():
    if Checkpoint comes in the middle it will see trn is not committed,
    then if crash, Recovery might roll back trn (if min(rec_lsn) is after
    COMMIT record) and this is not an issue as
    * transaction's updates were not made visible to other transactions
    * "commit ok" was not sent to client
    Alternatively, Recovery might commit trn (if min(rec_lsn) is before COMMIT
    record), which is ok too. All in all it means that "trn committed" is not
    100% equal to "COMMIT record written".
    - if COMMIT record is written after trnman_commit_trn():
    if crash happens between the two, trn will be rolled back which is an
    issue (transaction's updates were made visible to other transactions).
    So we need to go the first way.
  */

  /*
    We do not store "thd->transaction.xid_state.xid" for now, it will be
    needed only when we support XA.
  */
  res= (translog_write_record(&commit_lsn, LOGREC_COMMIT,
                              trn, NULL, 0,
                              sizeof(log_array)/sizeof(log_array[0]),
                              log_array, NULL, NULL) ||
        translog_flush(commit_lsn) ||
        trnman_commit_trn(trn));
  /*
    Note: if trnman_commit_trn() fails above, we have already
    written the COMMIT record, so Checkpoint and Recovery will see the
    transaction as committed.
  */
  DBUG_RETURN(res);
}


/**
  @brief Writes a COMMIT record for a transaction associated with a file

  @param  info              Maria handler

  @return Operation status
    @retval 0      ok
    @retval #      error (disk error or out of memory)
*/

int maria_commit(MARIA_HA *info)
{
  return info->s->now_transactional ? ma_commit(info->trn) : 0;
}


/**
  @brief Starts a transaction on a file handle

  @param  info              Maria handler

  @return Operation status
    @retval 0      ok
    @retval #      Error code.
*/

int maria_begin(MARIA_HA *info)
{
  DBUG_ENTER("maria_begin");

  if (info->s->now_transactional)
  {
    TRN *trn;
    struct st_my_thread_var *mysys_var= my_thread_var;
    trn= trnman_new_trn(&mysys_var->mutex,
                        &mysys_var->suspend,
                        (char*) &mysys_var + STACK_DIRECTION *1024*128);
    if (unlikely(!trn))
      DBUG_RETURN(HA_ERR_OUT_OF_MEM);

    DBUG_PRINT("info", ("TRN set to 0x%lx", (ulong) trn));
    info->trn= trn;
  }
  DBUG_RETURN(0);
}
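
The ordering argued for in the ma_commit() comment (write the COMMIT record, flush the log, and only then commit in memory) can be tried out in isolation. The sketch below is not Maria code: demo_trn and the demo_* functions are hypothetical stand-ins for translog_write_record(), translog_flush() and trnman_commit_trn(), and the fail_at field exists only to simulate an error at a chosen step. It shows how the short-circuit `||` chain stops at the first failing step, so updates never become visible in memory unless the COMMIT record is already on disk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transaction state, for illustration only. */
typedef struct
{
  int record_written;    /* COMMIT record appended to the log */
  int log_flushed;       /* log forced to disk up to that record */
  int committed_in_mem;  /* updates visible to other transactions */
  int fail_at;           /* 0 = never fail, 1 = write, 2 = flush, 3 = commit */
} demo_trn;

/* Each step returns 0 on success and 1 on error, like the real calls. */
static int demo_write_commit_record(demo_trn *t)
{
  if (t->fail_at == 1)
    return 1;
  t->record_written= 1;
  return 0;
}

static int demo_flush_log(demo_trn *t)
{
  if (t->fail_at == 2)
    return 1;
  t->log_flushed= 1;
  return 0;
}

static int demo_commit_in_memory(demo_trn *t)
{
  /* Only reachable once the COMMIT record is durable. */
  assert(t->record_written && t->log_flushed);
  if (t->fail_at == 3)
    return 1;
  t->committed_in_mem= 1;
  return 0;
}

/* Same shape as the expression in ma_commit(): the first step that
   returns non-zero stops the chain and the result is 1. */
int demo_commit(demo_trn *t)
{
  assert(t != NULL);
  return (demo_write_commit_record(t) ||
          demo_flush_log(t) ||
          demo_commit_in_memory(t));
}
```

With fail_at set to 2, demo_commit() returns 1 with record_written set but committed_in_mem still 0, which matches the harmless case in the comment: nothing was made visible and no "commit ok" was sent.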