mariadb/storage/maria/ma_commit.c
unknown 18bc7b695a WL#3072 - Maria Recovery
* to honour WAL we now force the whole log when flushing a bitmap page.
* ability to intentionally crash in various places for recovery testing
* bugfix (dirty pages list found in checkpoint record was ignored)
* smaller checkpoint record
* misc small cleanups and comments


mysql-test/include/maria_empty_logs.inc:
  maria-purge.test creates ~11 logs, remove them all
mysql-test/r/maria-recovery-bitmap.result:
  result is good; without the _ma_bitmap_get_log_address() call,
  we got
  check   error   Bitmap at 0 has pages reserved outside of data file length
mysql-test/r/maria-recovery.result:
  result update
mysql-test/t/maria-recovery-bitmap.test:
  enable test of "bitmap-flush should flush whole log otherwise
  corrupted data file (bitmap ahead of data pages)".
mysql-test/t/maria-recovery.test:
  test of checkpoint
sql/sql_table.cc:
  comment
storage/maria/ha_maria.cc:
  _ma_reenable_logging_for_table() now includes file->trn=0.
  At the end of repair() we don't need to re-enable logging, it is
  done already by caller (like copy_data_between_tables()); it sounds
  strange that this function could decide to re-enable, it should be
  up to caller who knows what other operations it plans. Removing this
  line led to assertion failure in maria_lock_database(F_UNLCK), fixed
  by removing the assertion: maria_lock_database()
  is here called in a context where F_UNLCK does not make the
  table visible to others so assertion is excessive, and external_lock()
  is already designed to honour the asserted condition.
  Ability to crash at the end of bulk insert when indices
  have been enabled.
storage/maria/ma_bitmap.c:
  Better use pagecache_file_init() than set pagecache callbacks directly;
  and a new function to set those callbacks for bitmap so that we can
  reuse it.
  _ma_bitmap_get_log_address() is a pagecache get_log_address callback
  which causes the whole log to be flushed when a bitmap page
  is flushed by the page cache. This was required by WAL.
storage/maria/ma_blockrec.c:
  get_log_address pagecache callback for data (non bitmap) pages:
  just reads the LSN from the page's content, like was hard-coded
  before in ma_pagecache.c.
storage/maria/ma_blockrec.h:
  functions which need to be exported
storage/maria/ma_check.c:
  create_new_data_handle() can be static.
  Ability to crash after rebuilding the index in OPTIMIZE,
  in REPAIR. my_lock() implemented already.
storage/maria/ma_checkpoint.c:
  As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(),
  we don't need to store kfile/dfile descriptors in checkpoint record,
  2-byte-id of the table plus one byte to say if this is data or index
  file is enough. So we go from 4+4 bytes per table down to 2+1.
storage/maria/ma_commit.c:
  removing duplicate functions (see _ma_tmp_disable_logging_for_table())
storage/maria/ma_extra.c:
  Monty fixed
storage/maria/ma_key_recover.c:
  comment
storage/maria/ma_locking.c:
  Sometimes other code does funny things with maria_lock_database(),
  like ha_maria::repair() calling it at start and end without going
  through ha_maria::external_lock(). So it happens that maria_lock_database()
  is called with now_transactional!=born_transactional.
storage/maria/ma_loghandler.c:
  update to new prototype
storage/maria/ma_open.c:
  set_data|index_pagecache_callbacks() need to be exported as
  they are now called when disabling/enabling transactionality.
storage/maria/ma_pagecache.c:
  Removing PAGE_LSN_OFFSET, as much of the code relies on it being
  0 anyway (let's not give impression we can just change this constant).
  When flushing a page to disk, call the get_log_address callback to
  know up to which LSN the log should be flushed.
  As we now can access MARIA_SHARE* we can know share->id and store
  it into the checkpoint record; we thus go from 4 bytes per dirty page
  to 2+1.
storage/maria/ma_pagecache.h:
  get_log_address callback
storage/maria/ma_panic.c:
  No reason to reset pagecache callbacks in HA_PANIC_READ:
  all we do is reopen files if they were closed; callbacks should
  be in place already as 'info' exists; we just want to modify
  the file descriptors, not the full PAGECACHE_FILE structure.
  If we open data file and it was closed, share->bitmap.file needs
  to be set.
  Note that the modified code is disabled anyway.
storage/maria/ma_recovery.c:
  Checkpoint record does not contain kfile/dfile descriptors anymore
  so code can be simplified. Hash key in all_dirty_pages is 
  not made from file_descriptor & pageno anymore, but
  index_or_data & table-short-id & pageno.
  If a table's create_rename_lsn is higher than record's LSN,
  we skip the table and don't fail if it's corrupted (because the LSNs
  say that we don't have to look at this table).
  If a table is skipped (for example due to create_rename_lsn),
  its UNDOs still cause undo_lsn to advance; this is so that if later
  we notice the transaction has to rollback we fail (as table should
  not be skipped in this case).
  Fixing a bug: the dirty_pages list was never used, because
  the LSN below which it was used was the minimum rec_lsn of dirty pages!
  It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)).
  When we disable/reenable transactionality, we modify pagecache
  callbacks (needed for example for get_log_address: changing
  share->page_type is not enough anymore).
storage/maria/ma_write.c:
  'records' and 'checksum' are protected: they are updated under
  log's mutex in write-hooks when UNDO is written.
storage/maria/maria_chk.c:
  remove use of duplicate functions.
storage/maria/maria_def.h:
  set_data|index_pagecache_callbacks() need to be exported;
  _ma_reenable_logging_for_table() changes to a real function.
storage/maria/unittest/ma_pagecache_consist.c:
  new prototype
storage/maria/unittest/ma_pagecache_single.c:
  new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype
2007-12-30 21:32:07 +01:00

118 lines
3.6 KiB
C

/* Copyright (C) 2007 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include "maria_def.h"
#include "trnman.h"
/**
@brief writes a COMMIT record to log and commits transaction in memory
@param trn transaction
@return Operation status
@retval 0 ok
@retval 1 error (disk error or out of memory)
*/
int ma_commit(TRN *trn)
{
int res;
LSN commit_lsn;
LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS];
DBUG_ENTER("ma_commit");
if (trn->undo_lsn == 0) /* no work done, rollback (cheaper than commit) */
DBUG_RETURN(trnman_rollback_trn(trn));
/*
- if COMMIT record is written before trnman_commit_trn():
if Checkpoint comes in the middle it will see trn is not committed,
then if crash, Recovery might roll back trn (if min(rec_lsn) is after
COMMIT record) and this is not an issue as
* transaction's updates were not made visible to other transactions
* "commit ok" was not sent to client
Alternatively, Recovery might commit trn (if min(rec_lsn) is before COMMIT
record), which is ok too. All in all it means that "trn committed" is not
100% equal to "COMMIT record written".
- if COMMIT record is written after trnman_commit_trn():
if crash happens between the two, trn will be rolled back which is an
issue (transaction's updates were made visible to other transactions).
So we need to go the first way.
*/
/*
We do not store "thd->transaction.xid_state.xid" for now, it will be
needed only when we support XA.
*/
res= (translog_write_record(&commit_lsn, LOGREC_COMMIT,
trn, NULL, 0,
sizeof(log_array)/sizeof(log_array[0]),
log_array, NULL, NULL) ||
translog_flush(commit_lsn) ||
trnman_commit_trn(trn));
/*
Note: if trnman_commit_trn() fails above, we have already
written the COMMIT record, so Checkpoint and Recovery will see the
transaction as committed.
*/
DBUG_RETURN(res);
}
/**
@brief Writes a COMMIT record for a transaciton associated with a file
@param info Maria handler
@return Operation status
@retval 0 ok
@retval # error (disk error or out of memory)
*/
int maria_commit(MARIA_HA *info)
{
return info->s->now_transactional ? ma_commit(info->trn) : 0;
}
/**
@brief Starts a transaction on a file handle
@param info Maria handler
@return Operation status
@retval 0 ok
@retval # Error code.
*/
int maria_begin(MARIA_HA *info)
{
DBUG_ENTER("maria_begin");
if (info->s->now_transactional)
{
TRN *trn;
struct st_my_thread_var *mysys_var= my_thread_var;
trn= trnman_new_trn(&mysys_var->mutex,
&mysys_var->suspend,
(char*) &mysys_var + STACK_DIRECTION *1024*128);
if (unlikely(!trn))
DBUG_RETURN(HA_ERR_OUT_OF_MEM);
DBUG_PRINT("info", ("TRN set to 0x%lx", (ulong) trn));
info->trn= trn;
}
DBUG_RETURN(0);
}