mariadb/storage/maria/ma_close.c
Michael Widenius 6db663d614 Fixes for Aria storage engine:
- Fixed bug lp:624099 ma_close.c:75: maria_close: Assertion `share->in_trans == 0' failed on UNLOCK TABLES
- Fixed bug that caused table to be marked as not closed (crashed) during recovery testing.
- Use maria_delete_table_files() instead of maria_delete_table() to delete tempoary tables (faster and safer)
- Added checks to ensure that bitmap and internal mutex are always taken in right order.
- For transactional tables, only mark the table as changed before page for table is to be written to disk (and thus the log is flushed).
  This speeds up things a bit and fixes a problem where open_count was incremented on disk but there was no log entry to fix it during recovery -> table was crashed.
- Fixed a bug in repair() where table was not automaticly repaired.
- Ensure that state->global_changed, share->changed and share->state.open_count are set and reset properly.
- Added option --ignore-control-file to maria_chk to be able to run maria_chk even if the control file is locked.


mysql-test/suite/maria/r/maria-recover.result:
  Test result changed as we now force checkpoint before copying table, which results in pagecache getting flushed and we have more rows to recover.
mysql-test/suite/maria/r/maria.result:
  Added new tests
mysql-test/suite/maria/t/maria-recover.test:
  Force checkpoint before copying table.
  This is needed as now the open-count is increased first when first page is flushed.
mysql-test/suite/maria/t/maria.test:
  Added tests to verify fix for lp:624099
storage/maria/ha_maria.cc:
  Use table->in_use instead of current_thd (trivial optimization)
  Use maria_delete_table_files() instead of maria_delete_table() to delete tempoary tables (faster and safer)
  More DBUG_ASSERT()
  Reset locked tables count after locked tables have been moved to new transaction. This fixed lp:624099
storage/maria/ma_bitmap.c:
  Temporarly unlock bitmap mutex when calling _ma_mark_file_changed() and pagecache_delete_pages() to ensure right mutex lock order.
  Call _ma_bitmap_unpin_all() when bitmap->non_flusable is set to 0. This fixed a case when bitmap was not proparly unpinned.
  More comments
  Added DBUG_ASSERT() for detecting wrong share->bitmap usage
storage/maria/ma_blockrec.c:
  More DBUG_ASSERT()
  Moved code around in _ma_apply_redo_insert_row_head_or_tail() to make things safer on error conditions.
storage/maria/ma_check.c:
  Changed parameter for _ma_set_uuid()
  Corrected test for detecting if we lost many rows. This fixed some cases where auto-recovery failed.
  share->changed need to be set if state.open_count is changed.
  Removed setting of share->changed= 0 as called function sets it.
storage/maria/ma_close.c:
  - Added code to properly decrement open_count and have it written by _ma_state_info_write() for transactional tables.
    (This is more correct and also saves us one extra write by _ma_decrement_open_count() at close.
  - Added DBUG_ASSERT() to detect if open_count is wrong at maria_close().
storage/maria/ma_delete.c:
  Updated argument to _ma_mark_file_changed()
storage/maria/ma_delete_all.c:
  Updated argument to _ma_mark_file_changed()
  For transactional tables, call _ma_mark_file_changed() after log entry has been written (to allow recover to fix open_count)
  Reset more needed variables in _ma_reset_status()
storage/maria/ma_delete_table.c:
  Moved deletion of Aria files to maria_delete_table_files().
  Remove RAID usage (old not working code)
storage/maria/ma_extra.c:
  Set share->changed=1 when state needs to be updated on disk.
  Don't reset share->changed after call to _ma_state_info_write() as this calls sets share->changed.
  Set share->state.open_count to 1 to force table to be auto repaired if drop fails.
  Set share->global_changed before call to _ma_flush_table_files() to ensure that we don't try to mark the table changed during flush.
  Added DBUG_ENTER
storage/maria/ma_locking.c:
  Split _ma_mark_file_changed() into two functions to delay marking transactional tables as changed on disk until first disk write.
  Added argument to _ma_decrement_open_count() to tell if we should call ma_lock_database() or not.
  Don't decrement open count for transactional tables during _ma_decrement_open_count(). This will be done during close.
  Changed parameter for _ma_set_uuid()
storage/maria/ma_open.c:
  Set share->open_count_not_zero_on_open if state.open_count is not zero.
  This is needed for DBUG_ASSERT() in maria_close() that is there to enforce that open_count is always 0 at close.
  This test doesn't however work for tables that had open_count != 0 already on disk (ie, crashed tables).
  Enforce right mutex order for share->intern_lock and share->bitmap.bitmap_lock
  Don't set share->changed to 0 if share->state.open_count != 0, as state needs to be be written at close
storage/maria/ma_pagecache.c:
  Moved a bit of code in find_block() to avoid one if.
  More DBUG_ASSERT()
  (I left a comment in the code for Sanja to look at;  The code probably works but we need to check if it's optimal)
storage/maria/ma_pagecrc.c:
  For transactional tables, just before first write to disk, but after log is flushed, mark the file changed.
  This fixes some cases where recovery() did not detect that table was marked as changed and could thus not recover the marker.
storage/maria/ma_recovery.c:
  Set share->changed when share->global_changed is set.
storage/maria/ma_update.c:
  Updated parameter for _ma_mark_file_changed()
storage/maria/ma_write.c:
  Updated parameter for _ma_mark_file_changed()
storage/maria/maria_chk.c:
  Added option --ignore-control-file to be able to run maria_chk even if the control file is locked.
storage/maria/maria_def.h:
  Updated function prototypes.
  Added open_count_not_zero_on_open to MARIA_SHARE.
storage/myisam/ha_myisam.cc:
  current_thd -> table->in_use
2011-02-10 20:33:51 +02:00

233 lines
7.7 KiB
C

/* Copyright (C) 2006 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* close a isam-database */
/*
TODO:
We need to have a separate mutex on the closed file to allow other threads
to open other files during the time we flush the cache and close this file
*/
#include "maria_def.h"
int maria_close(register MARIA_HA *info)
{
int error=0,flag;
my_bool share_can_be_freed= FALSE;
MARIA_SHARE *share= info->s;
DBUG_ENTER("maria_close");
DBUG_PRINT("enter",("base: 0x%lx reopen: %u locks: %u",
(long) info, (uint) share->reopen,
(uint) share->tot_locks));
/* Check that we have unlocked key delete-links properly */
DBUG_ASSERT(info->key_del_used == 0);
pthread_mutex_lock(&THR_LOCK_maria);
if (info->lock_type == F_EXTRA_LCK)
info->lock_type=F_UNLCK; /* HA_EXTRA_NO_USER_CHANGE */
if (info->lock_type != F_UNLCK)
{
if (maria_lock_database(info,F_UNLCK))
error=my_errno;
}
pthread_mutex_lock(&share->close_lock);
pthread_mutex_lock(&share->intern_lock);
if (share->options & HA_OPTION_READ_ONLY_DATA)
{
share->r_locks--;
share->tot_locks--;
}
if (info->opt_flag & (READ_CACHE_USED | WRITE_CACHE_USED))
{
if (end_io_cache(&info->rec_cache))
error=my_errno;
info->opt_flag&= ~(READ_CACHE_USED | WRITE_CACHE_USED);
}
flag= !--share->reopen;
maria_open_list=list_delete(maria_open_list,&info->open_list);
my_free(info->rec_buff, MYF(MY_ALLOW_ZERO_PTR));
(*share->end)(info);
if (flag)
{
/* Last close of file; Flush everything */
/* Check that we don't have any dangling pointers from the transaction */
DBUG_ASSERT(share->in_trans == 0);
if (share->kfile.file >= 0)
{
my_bool save_global_changed= share->global_changed;
/* Avoid _ma_mark_file_changed() when flushing pages */
share->global_changed= 1;
if ((*share->once_end)(share))
error= my_errno;
if (flush_pagecache_blocks(share->pagecache, &share->kfile,
((share->temporary || share->deleting) ?
FLUSH_IGNORE_CHANGED :
FLUSH_RELEASE)))
error= my_errno;
#ifdef HAVE_MMAP
if (share->file_map)
_ma_unmap_file(info);
#endif
/*
If we are crashed, we can safely flush the current state as it will
not change the crashed state.
We can NOT write the state in other cases as other threads
may be using the file at this point
IF using --external-locking, which does not apply to Maria.
*/
if (((share->changed && share->base.born_transactional) ||
maria_is_crashed(info)))
{
if (save_global_changed)
{
/*
Reset effect of _ma_mark_file_changed(). Better to do it
here than in _ma_decrement_open_count(), as
_ma_state_info_write() will write the open_count.
*/
save_global_changed= 0;
share->state.open_count--;
}
/*
State must be written to file as it was not done at table's
unlocking.
*/
if (_ma_state_info_write(share, MA_STATE_INFO_WRITE_DONT_MOVE_OFFSET))
error= my_errno;
}
DBUG_ASSERT(maria_is_crashed(info) || !share->base.born_transactional ||
share->state.open_count == 0 ||
share->open_count_not_zero_on_open);
/* Ensure that open_count is zero on close */
share->global_changed= save_global_changed;
_ma_decrement_open_count(info, 0);
/* Ensure that open_count really is zero */
DBUG_ASSERT(maria_is_crashed(info) || share->temporary ||
share->state.open_count == 0 ||
share->open_count_not_zero_on_open);
/*
File must be synced as it is going out of the maria_open_list and so
becoming unknown to future Checkpoints.
*/
if (share->now_transactional && my_sync(share->kfile.file, MYF(MY_WME)))
error= my_errno;
if (my_close(share->kfile.file, MYF(0)))
error= my_errno;
}
#ifdef THREAD
thr_lock_delete(&share->lock);
(void) pthread_mutex_destroy(&share->key_del_lock);
{
int i,keys;
keys = share->state.header.keys;
VOID(rwlock_destroy(&share->mmap_lock));
for(i=0; i<keys; i++) {
VOID(rwlock_destroy(&share->keyinfo[i].root_lock));
}
}
#endif
DBUG_ASSERT(share->now_transactional == share->base.born_transactional);
/*
We assign -1 because checkpoint does not need to flush (in case we
have concurrent checkpoint if no then we do not need it here also)
*/
share->kfile.file= -1;
/*
Remember share->history for future opens
We have to unlock share->intern_lock then lock it after
LOCK_trn_list (trnman_lock()) to avoid dead locks.
*/
pthread_mutex_unlock(&share->intern_lock);
_ma_remove_not_visible_states_with_lock(share, TRUE);
pthread_mutex_lock(&share->intern_lock);
if (share->in_checkpoint & MARIA_CHECKPOINT_LOOKS_AT_ME)
{
/* we cannot my_free() the share, Checkpoint would see a bad pointer */
share->in_checkpoint|= MARIA_CHECKPOINT_SHOULD_FREE_ME;
}
else
share_can_be_freed= TRUE;
if (share->state_history)
{
MARIA_STATE_HISTORY_CLOSED *history;
/*
Here we ignore the unlikely case that we don't have memory to
store the state. In the worst case what happens is that any transaction
that tries to access this table will get a wrong status information.
*/
if ((history= (MARIA_STATE_HISTORY_CLOSED *)
my_malloc(sizeof(*history), MYF(MY_WME))))
{
history->create_rename_lsn= share->state.create_rename_lsn;
history->state_history= share->state_history;
if (my_hash_insert(&maria_stored_state, (uchar*) history))
my_free(history, MYF(0));
}
/* Marker for concurrent checkpoint */
share->state_history= 0;
}
}
pthread_mutex_unlock(&THR_LOCK_maria);
pthread_mutex_unlock(&share->intern_lock);
pthread_mutex_unlock(&share->close_lock);
if (share_can_be_freed)
{
(void) pthread_mutex_destroy(&share->intern_lock);
(void) pthread_mutex_destroy(&share->close_lock);
(void) pthread_cond_destroy(&share->key_del_cond);
my_free((uchar *)share, MYF(0));
/*
If share cannot be freed, it's because checkpoint has previously
recorded to include this share in the checkpoint and so is soon going to
look at some of its content (share->in_checkpoint/id/last_version).
*/
}
my_free(info->ftparser_param, MYF(MY_ALLOW_ZERO_PTR));
if (info->dfile.file >= 0)
{
/*
This is outside of mutex so would confuse a concurrent
Checkpoint. Fortunately in BLOCK_RECORD we close earlier under mutex.
*/
if (my_close(info->dfile.file, MYF(0)))
error= my_errno;
}
delete_dynamic(&info->pinned_pages);
my_free(info, MYF(0));
if (error)
{
DBUG_PRINT("error", ("Got error on close: %d", my_errno));
DBUG_RETURN(my_errno= error);
}
DBUG_RETURN(0);
} /* maria_close */