mariadb/storage/maria/ma_locking.c
Michael Widenius 52cb0c24a6 Added versioning of Maria index
Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file.
Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned.
Changed info->lastkey to type MARIA_KEY. Removed info->lastkey_length as this is now part of info->lastkey
Renamed old info->lastkey to info->lastkey_buff
Use exact key lengths for keys, not USE_WHOLE_KEY
For partial key searches, use SEARCH_PART_KEY
When searching to insert new key on page, use SEARCH_INSERT to mark that key has rowid

Changes done in a lot of files:
- Modified functions to use MARIA_KEY instead of key pointer and key length
- Use keyinfo->root_lock instead of share->key_root_lock[keynr]
- Simplify code by using local variable keyinfo instead of share->keyinfo[i]
- Added #ifdef EXTERNAL_LOCKING around removed state elements
- HA_MAX_KEY_BUFF -> MARIA_MAX_KEY_BUFF (to reserve space for transid)
- Changed type of 'nextflag' to uint32 to ensure all SEARCH_xxx flags fit into it

.bzrignore:
  Added missing temporary directory
extra/Makefile.am:
  comp_err is now deleted on make distclean
include/maria.h:
  Added structure MARIA_KEY, which is used for internal key objects in Maria.
  Changed functions to take MARIA_KEY as an argument instead of pointer to packed key.
  Changed some functions that always return true or false to my_bool.
  Added virtual function make_key() to avoid if in _ma_make_key()
  Moved rw_lock_t for locking trees from share->key_root_lock to MARIA_KEYDEF. This makes usage of the locks simpler and faster
include/my_base.h:
  Added HA_RTREE_INDEX flag to mark rtree index. Used for easier checks in ma_check()
  Added SEARCH_INSERT to be used when inserting new keys
  Added SEARCH_PART_KEY for partial searches
  Added SEARCH_USER_KEY_HAS_TRANSID to be used when key we use for searching in btree has a TRANSID
  Added SEARCH_PAGE_KEY_HAS_TRANSID to be used when key we found in btree has a transid
include/my_handler.h:
  Make next_flag 32 bit to make sure we can handle all SEARCH_ bits
mysql-test/include/maria_empty_logs.inc:
  Read and restore current database; Don't assume we are using mysqltest.
  Don't log use databasename to log. Using this include should not cause any result changes.
mysql-test/r/maria-gis-rtree-dynamic.result:
  Updated results after adding some check table commands to help pinpoint errors
mysql-test/r/maria-mvcc.result:
  New tests
mysql-test/r/maria-purge.result:
  New result after adding removal of logs
mysql-test/r/maria-recovery-big.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria-recovery-bitmap.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria-recovery-rtree-ft.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria-recovery.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria.result:
  New tests
mysql-test/r/variables-big.result:
  Don't log id as it's not predictable
mysql-test/suite/rpl_ndb/r/rpl_truncate_7ndb_2.result:
  Updated results to new binlog results. (Test has not been run in a long time as it requires --big)
mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2-master.opt:
  Moved file to ndb replication test directory
mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2.test:
  Fixed wrong path to included tests
mysql-test/t/maria-gis-rtree-dynamic.test:
  Added some check table commands to help pinpoint errors
mysql-test/t/maria-mvcc.test:
  New tests
mysql-test/t/maria-purge.test:
  Remove logs to make test results predictable
mysql-test/t/maria.test:
  New tests for some possible problems
mysql-test/t/variables-big.test:
  Don't log id as it's not predictable
mysys/my_handler.c:
  Updated function comment to reflect old code
  Changed nextflag to be uint32 to ensure we can have flags > 16 bit
  Changed checking if we are in insert with NULL keys as next_flag can now include additional bits that have to be ignored.
  Added SEARCH_INSERT flag to be used when inserting new keys in btree. This flag tells us that the keys include row position and it's thus safe to remove SEARCH_FIND
  Added comparison of transid. This is only done if the keys actually have a transid, which is indicated by nextflag
mysys/my_lock.c:
  Fixed wrong test (Found by Guilhem)
scripts/Makefile.am:
  Ensure that test programs are deleted by make clean
sql/rpl_rli.cc:
  Moved assignment order to fix compiler warning
storage/heap/hp_write.c:
  Add SEARCH_INSERT to signal ha_key_cmp that we should also compare rowid for keys
storage/maria/Makefile.am:
  Remove also maria log files when doing make distclean
storage/maria/ha_maria.cc:
  Use 'file->start_state' as default state for transactional tables without versioning
  At table unlock, set file->state to point to live state. (Needed for information schema to pick up right number of rows)
  In ha_maria::implicit_commit() move all locked (ie open) tables to new transaction. This is needed to ensure ha_maria->info doesn't point to a deleted history event.
  Disable concurrent inserts for insert ... select and table changes with subqueries if statement based replication as this would cause wrong results on slave
storage/maria/ma_blockrec.c:
  Updated comment
storage/maria/ma_check.c:
  Compact key pages (removes transid) when doing --zerofill
  Check that 'page_flag' on key pages contains KEYPAGE_FLAG_HAS_TRANSID if there is a single key on the page with a transid
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use new interface to _ma_rec_pos(), _ma_dpointer(), _ma_ft_del(), ma_update_state_lsn()
  Removed not needed argument from get_record_for_key()
  Fixed check so that it doesn't give errors for RTREE; we now treat these like SPATIAL
  Remove some SPATIAL specific code where the virtual functions can handle this in a general manner
  Use info->lastkey_buff instead of info->lastkey
  _ma_dpos() -> _ma_row_pos_from_key()
  _ma_make_key() -> keyinfo->make_key()
  _ma_print_key() -> _ma_print_keydata()
  _ma_move_key() -> ma_copy_copy()
  Add SEARCH_INSERT to signal ha_key_cmp that we should also compare rowid for keys
  Ensure that data on page doesn't overwrite page checksum position
  Use DBUG_DUMP_KEY instead of DBUG_DUMP
  Use exact key lengths instead of USE_WHOLE_KEY to ha_key_cmp()
  Fixed check if rowid points outside of BLOCK_RECORD data file
  Use info->lastkey_buff instead of key on stack in some safe places
  Added #ifdef EXTERNAL_LOCKING around removed state elements
storage/maria/ma_close.c:
  Use keyinfo->root_lock instead of share->key_root_lock[keynr]
storage/maria/ma_create.c:
  Removed assert that is already checked in maria_init()
  Force transactional tables to be of type BLOCK_RECORD
  Fixed wrong usage of HA_PACK_RECORD (should be HA_OPTION_PACK_RECORD)
  Mark keys that use HA_KEY_ALG_RTREE with HA_RTREE_INDEX for easier handling of these in ma_check
  Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file.
storage/maria/ma_dbug.c:
  Changed _ma_print_key() to use MARIA_KEY
storage/maria/ma_delete.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  info->lastkey2-> info->lastkey_buff2
  Added SEARCH_INSERT to signal ha_key_cmp that we should also compare rowid for keys
  Use new interface for get_key(), _ma_get_last_key() and others
  _ma_dpos() -> ma_row_pos_from_key()
  Simplify setting of prev_key in del()
  Ensure that KEYPAGE_FLAG_HAS_TRANSID is set in page_flag if key page has transid
  Treat key pages that may have a transid as if keys would be of variable length
storage/maria/ma_delete_all.c:
  Reset history state if maria_delete_all_rows() is called
  Update parameters to _ma_update_state_lsns() call
storage/maria/ma_extra.c:
  Store and restore info->lastkey
storage/maria/ma_ft_boolean_search.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_ft_nlq_search.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use lastkey_buff2 instead of info->lastkey+info->s->base.max_key_length (same thing)
storage/maria/ma_ft_update.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_ftdefs.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_fulltext.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_init.c:
  Check if blocksize is legal
  (Moved test here from ma_open())
storage/maria/ma_key.c:
  Added functions for storing/reading of transid 
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Moved _ma_sp_make_key() out of _ma_make_key() as we now use keyinfo->make_key to create keys
  Add transid to keys if table is versioned
  Added _ma_copy_key()
storage/maria/ma_key_recover.c:
  Add logging of page_flag (holds information if there are keys with transid on page)
  Changed DBUG_PRINT("info" -> DBUG_PRINT("redo" as the redo logging can be quite extensive
  Added lots of DBUG_PRINT()
  Added support for index page operations: KEY_OP_SET_PAGEFLAG and KEY_OP_COMPACT_PAGE
storage/maria/ma_key_recover.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_locking.c:
  Added new arguments to _ma_update_state_lsns_sub()
storage/maria/ma_loghandler.c:
  Fixed all logging of LSN to look similar in DBUG log
  Changed if (left != 0) to if (left) as the latter form is also used later in the code
storage/maria/ma_loghandler.h:
  Added new index page operations
storage/maria/ma_open.c:
  Removed allocated "state_dummy" and instead use share->state.common for transactional tables that are not versioned
  This is needed to not get double increments of state.records (one in ma_write.c and one when log is written)
  Changed info->lastkey to MARIA_KEY type
  Removed resetting of MARIA_HA variables that have 0 as default value (as info is zerofilled)
  Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned.
  Check on open that state.create_trid is correct
  Extend share->base.max_key_length in case of transactional table so that it can hold transid
  Removed 4.0 compatible fulltext key mode as this is not relevant for Maria
  Removed old and wrong #ifdef ENABLE_WHEN_WE_HAVE_TRANS_ROW_ID code block
  Initialize all new virtual function pointers
  Removed storing of state->unique, state->process and store state->create_trid instead
storage/maria/ma_page.c:
  Added comment to describe key page structure
  Added functions to compact key page and log the compact operation
storage/maria/ma_range.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use SEARCH_PART_KEY indicator instead of USE_WHOLE_KEY to detect if we are doing a part key search
  Added handling of pages with transid
storage/maria/ma_recovery.c:
  Don't assert if the table we opened is not transactional. This may be a table which has been changed from transactional to not transactional
  Added new arguments to _ma_update_state_lsns()
storage/maria/ma_rename.c:
  Added new arguments to _ma_update_state_lsns()
storage/maria/ma_rkey.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Don't use USE_WHOLE_KEY, use real length of key
  Use share->row_is_visible() to test if row is visible
  Moved search_flag == HA_READ_KEY_EXACT out of 'read-next-row' loop as this only need to be tested once
  Removed test if last_used_keyseg != 0 as this is always true
storage/maria/ma_rnext.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  Use share->row_is_visible() to test if row is visible
storage/maria/ma_rnext_same.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  lastkey2 -> lastkey_buff2
storage/maria/ma_rprev.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  Use share->row_is_visible() to test if row is visible
storage/maria/ma_rsame.c:
  Updated comment
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rsamepos.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rt_index.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use better variable names
  Removed not needed casts
  _ma_dpos() -> _ma_row_pos_from_key()
  Use info->last_rtree_keypos to save position to key instead of info->int_keypos
  Simplify err: condition
  Changed return type for maria_rtree_insert() to my_bool as we are only interested in ok/fail from this function
storage/maria/ma_rt_index.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rt_key.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify maria_rtree_add_key by combining identical code and removing added_len
storage/maria/ma_rt_key.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rt_mbr.c:
  Changed type of 'nextflag' to uint32
  Added 'to' argument to RT_PAGE_MBR_XXX functions to more clearly see which variables change value
storage/maria/ma_rt_mbr.h:
  Changed type of 'nextflag' to uint32
storage/maria/ma_rt_split.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  key_length -> key_data_length to catch possible errors
storage/maria/ma_rt_test.c:
  Fixed wrong comment
  Reset recinfo to avoid valgrind warnings
  Fixed wrong argument to create_record() that caused test to fail
storage/maria/ma_search.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Added support of keys with optional trid
  Test for SEARCH_PART_KEY instead of USE_WHOLE_KEY to detect part key reads
  _ma_dpos() -> _ma_row_pos_from_key()
  If there may be keys with transid on the page, have _ma_bin_search() call _ma_seq_search()
  Add _ma_skip_xxx() functions to quickly step over keys (faster than calling get_key() in most cases as we don't have to copy key data)
  Combine similar code at end of _ma_get_binary_pack_key()
  Removed not used function _ma_move_key()
  In _ma_search_next() don't call _ma_search() if we aren't on a nod page.
  Update info->cur_row.trid with trid for found key
  Removed some not needed casts
  Added _ma_trid_from_key()
  Use MARIA_SHARE instead of MARIA_HA as arguments to _ma_rec_pos(), _ma_dpointer() and _ma_xxx_keypos_to_recpos() to make functions faster and smaller
storage/maria/ma_sort.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_sp_defs.h:
  _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value
storage/maria/ma_sp_key.c:
  _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value
  Don't test sizeof(double), test against 8 as we are using float8store()
  Use mi_float8store() instead of doing swap of value (same thing but faster)
storage/maria/ma_state.c:
  maria_versioning() now only calls _ma_block_get_status() if table supports versioning
  Added _ma_row_visible_xxx() functions for different occasions
  When emptying history, set info->state to point to the first history event.
storage/maria/ma_state.h:
  Added _ma_row_visible_xxx() prototypes
storage/maria/ma_static.c:
  Indentation changes
storage/maria/ma_statrec.c:
  Fixed arguments to _ma_dpointer() and _ma_rec_pos()
storage/maria/ma_test1.c:
  Call init_thr_lock() if we have versioning
storage/maria/ma_test2.c:
  Call init_thr_lock() if we have versioning
storage/maria/ma_unique.c:
  Modified functions to use MARIA_KEY
storage/maria/ma_update.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_write.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  In _ma_enlarge_root(), mark in page_flag if new key has transid
  _ma_dpos() -> _ma_row_pos_from_key()
  Changed return type of _ma_ck_write_tree() to my_bool as we are only testing if result is true or not
  Moved 'reversed' to outside block as area was used later
storage/maria/maria_chk.c:
  Added error if trying to sort with HA_BINARY_PACK_KEY
  Use new interface to get_key() and _ma_dpointer()
  _ma_dpos() -> _ma_row_pos_from_key()
storage/maria/maria_def.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Added 'common' to MARIA_SHARE->state for storing state for transactional tables without versioning
  Added create_trid to MARIA_SHARE
  Removed not used state variables 'process' and 'unique'
  Added defines for handling TRID's in index pages
  Changed to use MARIA_SHARE instead of MARIA_HA for some functions
  Added 'have_versioning' flag if table supports versioning
  Moved key_root_lock from MARIA_SHARE to MARIA_KEYDEF
  Changed last_key to be of type MARIA_KEY. Removed lastkey_length
  lastkey -> lastkey_buff, lastkey2 -> lastkey_buff2
  Added _ma_get_used_and_nod_with_flag() for faster access to page data when page_flag is read
  Added DBUG_DUMP_KEY for easier DBUG_DUMP of a key
  Changed 'nextflag' and associated variables to uint32
storage/maria/maria_ftdump.c:
  lastkey -> lastkey_buff
storage/maria/trnman.c:
  Fixed wrong initialization of min_read_from and max_commit_trid
  Added trnman_get_min_safe_trid()
storage/maria/unittest/ma_test_all-t:
  Added --start-from
storage/myisam/mi_check.c:
  Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparison for inserting key on page in rowid order
storage/myisam/mi_delete.c:
  Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparison for inserting key on page in rowid order
storage/myisam/mi_range.c:
  Updated comment
storage/myisam/mi_write.c:
  Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparison for inserting key on page in rowid order
storage/myisam/rt_index.c:
  Fixed wrong parameter to rtree_get_req() which could cause crash
2008-06-26 08:18:28 +03:00


/* Copyright (C) 2006 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
Locking of Maria-tables.
Must be first request before doing any further calls to any Maria function.
Is used to allow many processes to use the same non-transactional Maria table
*/
#include "ma_ftdefs.h"
/* lock table by F_UNLCK, F_RDLCK or F_WRLCK */
int maria_lock_database(MARIA_HA *info, int lock_type)
{
int error;
uint count;
MARIA_SHARE *share= info->s;
DBUG_ENTER("maria_lock_database");
DBUG_PRINT("enter",("lock_type: %d old lock %d r_locks: %u w_locks: %u "
"global_changed: %d open_count: %u name: '%s'",
lock_type, info->lock_type, share->r_locks,
share->w_locks,
share->global_changed, share->state.open_count,
share->index_file_name));
if (share->options & HA_OPTION_READ_ONLY_DATA ||
info->lock_type == lock_type)
DBUG_RETURN(0);
if (lock_type == F_EXTRA_LCK) /* Used by TMP tables */
{
++share->w_locks;
++share->tot_locks;
info->lock_type= lock_type;
DBUG_RETURN(0);
}
error=0;
pthread_mutex_lock(&share->intern_lock);
if (share->kfile.file >= 0) /* May only be false on windows */
{
switch (lock_type) {
case F_UNLCK:
maria_ftparser_call_deinitializer(info);
if (info->lock_type == F_RDLCK)
{
count= --share->r_locks;
if (share->lock_restore_status)
(*share->lock_restore_status)(info);
}
else
{
count= --share->w_locks;
if (share->lock.update_status)
(*share->lock.update_status)(info);
}
--share->tot_locks;
if (info->lock_type == F_WRLCK && !share->w_locks)
{
/* pages of transactional tables get flushed at Checkpoint */
if (!share->base.born_transactional && !share->temporary &&
_ma_flush_table_files(info,
share->delay_key_write ? MARIA_FLUSH_DATA :
MARIA_FLUSH_DATA | MARIA_FLUSH_INDEX,
FLUSH_KEEP, FLUSH_KEEP))
error= my_errno;
}
if (info->opt_flag & (READ_CACHE_USED | WRITE_CACHE_USED))
{
if (end_io_cache(&info->rec_cache))
{
error=my_errno;
maria_print_error(info->s, HA_ERR_CRASHED);
maria_mark_crashed(info);
}
}
if (!count)
{
DBUG_PRINT("info",("changed: %u w_locks: %u",
(uint) share->changed, share->w_locks));
if (share->changed && !share->w_locks)
{
#ifdef HAVE_MMAP
if ((share->mmaped_length !=
share->state.state.data_file_length) &&
(share->nonmmaped_inserts > MAX_NONMAPPED_INSERTS))
{
if (share->lock_key_trees)
rw_wrlock(&share->mmap_lock);
_ma_remap_file(info, share->state.state.data_file_length);
share->nonmmaped_inserts= 0;
if (share->lock_key_trees)
rw_unlock(&share->mmap_lock);
}
#endif
#ifdef EXTERNAL_LOCKING
share->state.process= share->last_process=share->this_process;
share->state.unique= info->last_unique= info->this_unique;
share->state.update_count= info->last_loop= ++info->this_loop;
#endif
/* transactional tables rather flush their state at Checkpoint */
if (!share->base.born_transactional)
{
if (_ma_state_info_write_sub(share->kfile.file, &share->state, 1))
error= my_errno;
else
{
/* A value of 0 below means "state flushed" */
share->changed= 0;
}
}
if (maria_flush)
{
if (_ma_sync_table_files(info))
error= my_errno;
}
else
share->not_flushed=1;
if (error)
{
maria_print_error(info->s, HA_ERR_CRASHED);
maria_mark_crashed(info);
}
}
}
info->opt_flag&= ~(READ_CACHE_USED | WRITE_CACHE_USED);
info->lock_type= F_UNLCK;
break;
case F_RDLCK:
if (info->lock_type == F_WRLCK)
{
/*
Change RW to READONLY
mysqld does not turn write locks to read locks,
so we're never here in mysqld.
*/
share->w_locks--;
share->r_locks++;
info->lock_type=lock_type;
break;
}
#ifdef MARIA_EXTERNAL_LOCKING
if (!share->r_locks && !share->w_locks)
{
/* note that a transactional table should not do this */
if (_ma_state_info_read_dsk(share->kfile.file, &share->state))
{
error=my_errno;
break;
}
}
#endif
VOID(_ma_test_if_changed(info));
share->r_locks++;
share->tot_locks++;
info->lock_type=lock_type;
break;
case F_WRLCK:
if (info->lock_type == F_RDLCK)
{ /* Change READONLY to RW */
if (share->r_locks == 1)
{
share->r_locks--;
share->w_locks++;
info->lock_type=lock_type;
break;
}
}
#ifdef MARIA_EXTERNAL_LOCKING
if (!(share->options & HA_OPTION_READ_ONLY_DATA))
{
if (!share->w_locks)
{
if (!share->r_locks)
{
/*
Note that transactional tables should not do this.
If we enabled this code, we should make sure to skip it if
born_transactional is true. We should not test
now_transactional to decide if we can call
_ma_state_info_read_dsk(), because it can temporarily be 0
(TRUNCATE on a partitioned table) and thus it would make a state
modification below without mutex, confusing a concurrent
checkpoint running.
Even if this code was enabled only for non-transactional tables:
in scenario LOCK TABLE t1 WRITE; INSERT INTO t1; DELETE FROM t1;
state on disk read by DELETE is obsolete as it was not flushed
at the end of INSERT. MyISAM same. It however causes no issue as
maria_delete_all_rows() calls _ma_reset_status() thus is not
influenced by the obsolete read values.
*/
if (_ma_state_info_read_dsk(share->kfile.file, &share->state))
{
error=my_errno;
break;
}
}
}
}
#endif /* defined(MARIA_EXTERNAL_LOCKING) */
VOID(_ma_test_if_changed(info));
info->lock_type=lock_type;
info->invalidator=share->invalidator;
share->w_locks++;
share->tot_locks++;
break;
default:
DBUG_ASSERT(0);
break; /* Impossible */
}
}
#ifdef __WIN__
else
{
/*
Check for bad file descriptors if this table is part
of a merge union. Failing to capture this may cause
a crash on windows if the table is renamed and
later on referenced by the merge table.
*/
if( info->owned_by_merge && (info->s)->kfile.file < 0 )
{
error = HA_ERR_NO_SUCH_TABLE;
}
}
#endif
pthread_mutex_unlock(&share->intern_lock);
DBUG_RETURN(error);
} /* maria_lock_database */
/****************************************************************************
** functions to read / write the state
****************************************************************************/
int _ma_readinfo(register MARIA_HA *info __attribute__ ((unused)),
int lock_type __attribute__ ((unused)),
int check_keybuffer __attribute__ ((unused)))
{
#ifdef MARIA_EXTERNAL_LOCKING
DBUG_ENTER("_ma_readinfo");
if (info->lock_type == F_UNLCK)
{
MARIA_SHARE *share= info->s;
if (!share->tot_locks)
{
/* should not be done for transactional tables */
if (_ma_state_info_read_dsk(share->kfile.file, &share->state))
{
if (!my_errno)
my_errno= HA_ERR_FILE_TOO_SHORT;
DBUG_RETURN(1);
}
}
if (check_keybuffer)
VOID(_ma_test_if_changed(info));
info->invalidator=share->invalidator;
}
else if (lock_type == F_WRLCK && info->lock_type == F_RDLCK)
{
my_errno=EACCES; /* Not allowed to change */
DBUG_RETURN(-1); /* when have read_lock() */
}
DBUG_RETURN(0);
#else
return 0;
#endif /* defined(MARIA_EXTERNAL_LOCKING) */
} /* _ma_readinfo */
/*
Every isam-function that updates the isam-database MUST end with this
request
NOTES
my_errno is not changed if this succeeds!
*/
int _ma_writeinfo(register MARIA_HA *info, uint operation)
{
int error,olderror;
MARIA_SHARE *share= info->s;
DBUG_ENTER("_ma_writeinfo");
DBUG_PRINT("info",("operation: %u tot_locks: %u", operation,
share->tot_locks));
error=0;
if (share->tot_locks == 0 && !share->base.born_transactional)
{
/* transactional tables flush their state at Checkpoint */
if (operation)
{ /* Two threads can't be here */
olderror= my_errno; /* Remember last error */
#ifdef EXTERNAL_LOCKING
/*
The following only makes sense if we want to allow two different
processes to access the same table at the same time
*/
share->state.process= share->last_process= share->this_process;
share->state.unique= info->last_unique= info->this_unique;
share->state.update_count= info->last_loop= ++info->this_loop;
#endif
if ((error= _ma_state_info_write_sub(share->kfile.file,
&share->state, 1)))
olderror=my_errno;
#ifdef __WIN__
if (maria_flush)
{
_commit(share->kfile.file);
_commit(info->dfile.file);
}
#endif
my_errno=olderror;
}
}
else if (operation)
share->changed= 1; /* Mark keyfile changed */
DBUG_RETURN(error);
} /* _ma_writeinfo */
/*
Test if an external process has changed the database
(Should be called after readinfo)
*/
int _ma_test_if_changed(register MARIA_HA *info)
{
#ifdef EXTERNAL_LOCKING
MARIA_SHARE *share= info->s;
if (share->state.process != share->last_process ||
share->state.unique != info->last_unique ||
share->state.update_count != info->last_loop)
{ /* Keyfile has changed */
DBUG_PRINT("info",("index file changed"));
if (share->state.process != share->this_process)
VOID(flush_pagecache_blocks(share->pagecache, &share->kfile,
FLUSH_RELEASE));
share->last_process=share->state.process;
info->last_unique= share->state.unique;
info->last_loop= share->state.update_count;
info->update|= HA_STATE_WRITTEN; /* Must use file on next */
info->data_changed= 1; /* For maria_is_changed */
return 1;
}
#endif
return (!(info->update & HA_STATE_AKTIV) ||
(info->update & (HA_STATE_WRITTEN | HA_STATE_DELETED |
HA_STATE_KEY_CHANGED)));
} /* _ma_test_if_changed */
/*
Put a mark in the .MAI file that someone is updating the table
DOCUMENTATION
state.open_count in the .MAI file is used the following way:
- For the first change of the .MAI file in this process open_count is
incremented by _ma_mark_file_changed(). (We have a write lock on the file
when this happens)
- In maria_close() it's decremented by _ma_decrement_open_count() if it
was incremented in the same process.
This means that if we are the only process using the file, the open_count
tells us if the MARIA file wasn't properly closed. (This is true if
my_disable_locking is set).
open_count is not maintained on disk for temporary tables.
*/
int _ma_mark_file_changed(MARIA_HA *info)
{
uchar buff[3];
register MARIA_SHARE *share= info->s;
DBUG_ENTER("_ma_mark_file_changed");
if (!(share->state.changed & STATE_CHANGED) || ! share->global_changed)
{
share->state.changed|=(STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_OPTIMIZED_KEYS);
if (!share->global_changed)
{
share->global_changed=1;
share->state.open_count++;
}
/*
Temp tables don't need an open_count as they are removed on crash.
In theory transactional tables are fixed by log-based recovery, so don't
need an open_count either, but if recovery has failed and logs have been
removed (by maria-force-start-after-recovery-failures), we still need to
detect dubious tables.
If we didn't maintain open_count on disk for a table, after a crash
we wouldn't know if it was closed at crash time (thus does not need a
check) or not. So we would have to check all tables: overkill.
*/
if (!share->temporary)
{
mi_int2store(buff,share->state.open_count);
buff[2]=1; /* Mark that it's changed */
if (my_pwrite(share->kfile.file, buff, sizeof(buff),
sizeof(share->state.header) +
MARIA_FILE_OPEN_COUNT_OFFSET,
MYF(MY_NABP)))
DBUG_RETURN(1);
}
/* Set uuid of file if not yet set (zerofilled file) */
if (share->base.born_transactional &&
!(share->state.changed & STATE_NOT_MOVABLE))
{
/* Lock table to current installation */
if (_ma_set_uuid(info, 0) ||
(share->state.create_rename_lsn == LSN_REPAIRED_BY_MARIA_CHK &&
_ma_update_state_lsns_sub(share, translog_get_horizon(),
trnman_get_min_trid(),
TRUE, TRUE)))
DBUG_RETURN(1);
share->state.changed|= STATE_NOT_MOVABLE;
}
}
DBUG_RETURN(0);
}
/*
Check that a region is all zero
SYNOPSIS
_ma_check_if_zero()
pos Start of memory to check
length length of memory region
NOTES
Used mainly to detect rows with wrong extent information
*/
my_bool _ma_check_if_zero(uchar *pos, size_t length)
{
uchar *end;
for (end= pos+ length; pos != end ; pos++)
if (pos[0] != 0)
return 1;
return 0;
}
/*
This is only called by close or by extra(HA_FLUSH) if the OS has the pwrite()
call. In these contexts the following code should be safe!
*/
int _ma_decrement_open_count(MARIA_HA *info)
{
uchar buff[2];
register MARIA_SHARE *share= info->s;
int lock_error=0,write_error=0;
if (share->global_changed)
{
uint old_lock=info->lock_type;
share->global_changed=0;
lock_error=maria_lock_database(info,F_WRLCK);
/* It's not fatal even if we couldn't get the lock ! */
if (share->state.open_count > 0)
{
share->state.open_count--;
share->changed= 1; /* We have to update state */
if (!share->temporary)
{
mi_int2store(buff,share->state.open_count);
write_error= (int) my_pwrite(share->kfile.file, buff, sizeof(buff),
sizeof(share->state.header) +
MARIA_FILE_OPEN_COUNT_OFFSET,
MYF(MY_NABP));
}
}
if (!lock_error)
lock_error=maria_lock_database(info,old_lock);
}
return test(lock_error || write_error);
}
/** @brief mark file as crashed */
void _ma_mark_file_crashed(MARIA_SHARE *share)
{
uchar buff[2];
DBUG_ENTER("_ma_mark_file_crashed");
share->state.changed|= STATE_CRASHED;
mi_int2store(buff, share->state.changed);
/*
We can ignore the errors, as if the mark failed, there isn't anything
else we can do; The user should already have got an error that the
table was crashed.
*/
(void) my_pwrite(share->kfile.file, buff, sizeof(buff),
sizeof(share->state.header) +
MARIA_FILE_CHANGED_OFFSET,
MYF(MY_NABP));
DBUG_VOID_RETURN;
}
/**
@brief Set uuid for a Maria file
@fn _ma_set_uuid()
@param info Maria handler
@param reset_uuid Instead of setting file to maria_uuid, set it to
0 to mark it as movable
*/
my_bool _ma_set_uuid(MARIA_HA *info, my_bool reset_uuid)
{
uchar buff[MY_UUID_SIZE], *uuid;
uuid= maria_uuid;
if (reset_uuid)
{
bzero(buff, sizeof(buff));
uuid= buff;
}
return (my_bool) my_pwrite(info->s->kfile.file, uuid, MY_UUID_SIZE,
mi_uint2korr(info->s->state.header.base_pos),
MYF(MY_NABP));
}