WL#3138: Maria - fast "SELECT COUNT(*) FROM t;" and "CHECKSUM TABLE t"

Added argument to maria_end_bulk_insert() to know if the table will be deleted after the operation
Fixed wrong call to strmake
Don't call bulk insert in case of inserting only one row (speed optimization as starting/stopping bulk insert involves a lot of if's)
Allow storing year 2155 in year field
When running with purify/valgrind avoid copying structures over themselves
Added hook 'trnman_end_trans_hook' that is called when transaction ends
Added trn->used_tables that is used to hold an entry for all tables used by transaction
Fixed that ndb doesn't crash on duplicate key error when start_bulk_insert/end_bulk_insert are not called
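
The new end_bulk_insert() argument can be pictured with a minimal sketch. This is not the Maria implementation: bulk_buffer, bulk_start(), bulk_add(), bulk_end() and MAX_PENDING are hypothetical names, and the "write" is only a counter. It assumes the idea described in this commit, that buffered keys are simply dropped when the table is about to be deleted.

```c
#include <stddef.h>

#define MAX_PENDING 16          /* hypothetical buffer size */

/* Hypothetical stand-in for the keys buffered during a bulk insert */
typedef struct
{
  int    pending[MAX_PENDING];  /* keys waiting to be written */
  size_t n_pending;             /* number of buffered keys */
  size_t n_written;             /* keys "written" so far (counter only) */
} bulk_buffer;

void bulk_start(bulk_buffer *b)
{
  b->n_pending= 0;
  b->n_written= 0;
}

/* Buffer one key; returns 0 on success, 1 if the buffer is full */
int bulk_add(bulk_buffer *b, int key)
{
  if (b->n_pending == MAX_PENDING)
    return 1;
  b->pending[b->n_pending++]= key;
  return 0;
}

/*
  End the bulk insert. If abort_insert is set (the table will be deleted
  anyway), the buffered keys are discarded without being written.
*/
int bulk_end(bulk_buffer *b, int abort_insert)
{
  if (!abort_insert)
    b->n_written+= b->n_pending;   /* stand-in for writing the keys */
  b->n_pending= 0;
  return 0;
}
```

Calling bulk_end(&b, 1) after a failed operation leaves n_written untouched, which is the saving this change is after.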


include/maria.h:
  Added argument to maria_end_bulk_insert() to know if the table will be deleted after the operation
include/my_tree.h:
  Added macro 'reset_free_element()' to be able to ignore calls to the external free function.
  Is used to optimize end-bulk-insert in case of failures, in which case we don't want to write the remaining keys in the tree
mysql-test/install_test_db.sh:
  Upgrade to new mysql_install_db options
mysql-test/r/maria-mvcc.result:
  New tests
mysql-test/r/maria.result:
  New tests
mysql-test/suite/ndb/r/ndb_auto_increment.result:
  Updated error message, as bulk insert is no longer always called
mysql-test/suite/ndb/t/ndb_auto_increment.test:
  Updated error message, as bulk insert is no longer always called
mysql-test/t/maria-mvcc.test:
  Added testing of versioning of count(*)
mysql-test/t/maria-page-checksum.test:
  Added comment
mysql-test/t/maria.test:
  More tests
mysys/hash.c:
  Code style change
sql/field.cc:
  Allow storing year 2155 in year field
sql/ha_ndbcluster.cc:
  Added new argument to end_bulk_insert() to signal if the bulk insert should be ignored
sql/ha_ndbcluster.h:
  Added new argument to end_bulk_insert() to signal if the bulk insert should be ignored
sql/ha_partition.cc:
  Added new argument to end_bulk_insert() to signal if the bulk insert should be ignored
sql/ha_partition.h:
  Added new argument to end_bulk_insert() to signal if the bulk insert should be ignored
sql/handler.cc:
  Don't call get_dup_key() if there is no table object. This can happen if the handler generates a duplicate key error on commit
sql/handler.h:
  Added new argument to end_bulk_insert() to signal if the bulk insert should be ignored (i.e., the table will be deleted)
sql/item.cc:
  Style fix
  Removed compiler warning
sql/log_event.cc:
  Added new argument to ha_end_bulk_insert()
sql/log_event_old.cc:
  Added new argument to ha_end_bulk_insert()
sql/mysqld.cc:
  Removed compiler warning
sql/protocol.cc:
  Added DBUG
sql/sql_class.cc:
  Added DBUG
  Fixed wrong call to strmake
sql/sql_insert.cc:
  Don't call bulk insert in case of inserting only one row (speed optimization as starting/stopping bulk insert involves a lot of if's)
  Added new argument to ha_end_bulk_insert()
sql/sql_load.cc:
  Added new argument to ha_end_bulk_insert()
sql/sql_parse.cc:
  Style fixes
  Avoid goto in common scenario
sql/sql_select.cc:
  When running with purify/valgrind avoid copying structures over themselves.  This is not a real bug in itself, but it's a waste of cycles and causes valgrind warnings
sql/sql_select.h:
  Avoid copying structures over themselves.  This is not a real bug in itself, but it's a waste of cycles and causes valgrind warnings
sql/sql_table.cc:
  Call HA_EXTRA_PREPARE_FOR_DROP if table created by ALTER TABLE is going to be dropped
  Added new argument to ha_end_bulk_insert()
storage/archive/ha_archive.cc:
  Added new argument to end_bulk_insert()
storage/archive/ha_archive.h:
  Added new argument to end_bulk_insert()
storage/federated/ha_federated.cc:
  Added new argument to end_bulk_insert()
storage/federated/ha_federated.h:
  Added new argument to end_bulk_insert()
storage/maria/Makefile.am:
  Added ma_state.c and ma_state.h
storage/maria/ha_maria.cc:
  Versioning of count(*) and checksum
  - share->state.state is now assumed to be correct, not handler->state
  - Call _ma_setup_live_state() in external lock to get count(*)/checksum versioning. In case of
    not versioned and not concurrent insertable table, file->s->state.state contains the correct state information
  
  Other things:
  - file->s -> share
  - Added DBUG_ASSERT() for unlikely case
  - Optimized end_bulk_insert() to not write anything if table is going to be deleted (as in failed alter table)
  - Indentation changes in external_lock because of removed 'goto' caused a big conflict even if very little was changed
storage/maria/ha_maria.h:
  New argument to end_bulk_insert()
storage/maria/ma_blockrec.c:
  Update for versioning of count(*) and checksum
  Keep share->state.state.data_file_length up to date (not info->state->data_file_length)
  Moved _ma_block_xxxx_status() and maria_versioning() functions to ma_state.c
storage/maria/ma_check.c:
  Update and use share->state.state instead of info->state
  info->s to share
  Update info->state at end of repair
  Call _ma_reset_state() to update share->state_history at end of repair
storage/maria/ma_checkpoint.c:
  Call _ma_remove_not_visible_states() on checkpoint to clean up not visible state history from tables
storage/maria/ma_close.c:
  Remember state history for running transaction even if table is closed
storage/maria/ma_commit.c:
  Ensure we always call trnman_commit_trn() even if other calls fail. If we don't do that, the translog and state structures will not be freed
storage/maria/ma_delete.c:
  Versioning of count(*) and checksum:
  - Always update info->state->checksum and info->state->records
storage/maria/ma_delete_all.c:
  Versioning of count(*) and checksum:
  - Ensure that share->state.state is updated, as here is where we store the primary information
storage/maria/ma_dynrec.c:
  Use lock_key_trees instead of concurrent_insert to check if trees should be locked.
  This allows us to lock trees both for concurrent_insert and for index versioning.
storage/maria/ma_extra.c:
  Versioning of count(*) and checksum:
  - Use share->state.state instead of info->state
  - share->concurrent_insert -> share->non_transactional_concurrent_insert
  - Don't update share->state.state from info->state if transactional table
  
  Optimization:
  - Don't flush io_cache or bitmap if we are using FLUSH_IGNORE_CHANGED
storage/maria/ma_info.c:
  Get most state information from current state
storage/maria/ma_init.c:
  Add hash table and free function to store states for closed tables
  Install hook for transaction commit/rollback to update history state
storage/maria/ma_key_recover.c:
  Versioning of count(*) and checksum:
  - Use share->state.state instead of info->state
storage/maria/ma_locking.c:
  Versioning of count(*) and checksum:
  - Call virtual functions (if exists) to restore/update status
  - Move _ma_xxx_status() functions to ma_state.c
  
  info->s -> share
storage/maria/ma_open.c:
  Versioning of count(*) and checksum:
  - For not transactional tables, set info->state to point to newly allocated state structure.
  - Initialize new info->state_start variable that points to state at start of transaction
  - Copy old history states from hash table (maria_stored_states) first time the table is opened
  - Split flag share->concurrent_insert to non_transactional_concurrent_insert & lock_key_trees
  - For now, only enable versioning of tables without keys (to be fixed soon!)
  - Added new virtual function to restore status in maria_lock_database()
  
  More DBUG
storage/maria/ma_page.c:
  Versioning of count(*) and checksum:
  - Use share->state.state instead of info->state
  - Modify share->state.state.key_file_length under share->intern_lock
storage/maria/ma_range.c:
  Versioning of count(*) and checksum:
  - Lock trees based on share->lock_key_trees
  
  info->s -> share
storage/maria/ma_recovery.c:
  Versioning of count(*) and checksum:
  - Use share->state.state instead of info->state
  - Update state information on close and when reenabling logging
storage/maria/ma_rkey.c:
  Versioning of count(*) and checksum:
  - Lock trees based on share->lock_key_trees
storage/maria/ma_rnext.c:
  Versioning of count(*) and checksum:
  - Lock trees based on share->lock_key_trees
storage/maria/ma_rnext_same.c:
  Versioning of count(*) and checksum:
  - Lock trees based on share->lock_key_trees
  - Only skip rows based on file length if non_transactional_concurrent_insert is set
storage/maria/ma_rprev.c:
  Versioning of count(*) and checksum:
  - Lock trees based on share->lock_key_trees
storage/maria/ma_rsame.c:
  Versioning of count(*) and checksum:
  - Lock trees based on share->lock_key_trees
storage/maria/ma_sort.c:
  Use share->state.state instead of info->state
  Fixed indentation
storage/maria/ma_static.c:
  Added maria_stored_state
storage/maria/ma_update.c:
  Versioning of count(*) and checksum:
  - Always update info->state->checksum and info->state->records
  - Remove optimization for index file update as it doesn't work for transactional tables
storage/maria/ma_write.c:
  Versioning of count(*) and checksum:
  - Always update info->state->checksum and info->state->records
storage/maria/maria_def.h:
  Move MARIA_STATUS_INFO to ma_state.h
  
  Changes to MARIA_SHARE:
  - Added state_history to store count(*)/checksum states
  - Added in_trans as counter if table is used by running transactions
  - Split concurrent_insert into lock_key_trees and non_transactional_concurrent_insert.
  - Added virtual function lock_restore_status
  
  Changes to MARIA_HA:
  - save_state -> state_save
  - Added state_start to store state at start of transaction
storage/maria/maria_pack.c:
  Versioning of count(*) and checksum:
  - Use share->state.state instead of info->state
  
  Indentation fixes
storage/maria/trnman.c:
  Added hook 'trnman_end_trans_hook' that is called when transaction ends
  Added trn->used_tables that is used to hold an entry for all tables used by transaction
  More DBUG
  Changed return type of trnman_end_trn() to my_bool
  Added trnman_get_min_trid() to get minimum trid in use.
  Added trnman_exists_active_transactions() to check if there exists a running transaction started between two commit ids
storage/maria/trnman.h:
  Added 'used_tables'
  Moved all pointers into same groups to get better memory alignment
storage/maria/trnman_public.h:
  Added prototypes for new functions and variables
  Changed return type of trnman_end_trn() to my_bool
storage/myisam/ha_myisam.cc:
  Added argument to end_bulk_insert() if operation should be aborted
storage/myisam/ha_myisam.h:
  Added argument to end_bulk_insert() if operation should be aborted
storage/maria/ma_state.c:
  Functions to handle state of count(*) and checksum
storage/maria/ma_state.h:
  Structures and declarations to handle state of count(*) and checksum
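
The end-of-transaction hook listed for trnman.c follows a common C pattern: a global function pointer that starts out pointing at a no-op default and can be replaced at init time. A minimal sketch of that pattern, assuming reduced stand-ins for the real types; end_trans_hook, counting_end_trans_hook(), commits_seen, end_trn() and the one-field TRN are hypothetical and only mirror the shape of the code in this commit.

```c
#include <stddef.h>

typedef char my_bool;                       /* stand-in for the MySQL typedef */
typedef struct st_trn { int id; } TRN;      /* hypothetical, reduced TRN */

/* Default hook: ignores its arguments and reports success */
static my_bool default_end_trans_hook(TRN *trn, my_bool commit,
                                      my_bool active_transactions)
{
  (void) trn; (void) commit; (void) active_transactions;
  return 0;
}

/* Global hook pointer; an engine may overwrite it at init time */
my_bool (*end_trans_hook)(TRN *, my_bool, my_bool)= default_end_trans_hook;

int commits_seen= 0;                        /* hypothetical engine-side state */

/* Hypothetical engine hook: counts commits, fails on a NULL transaction */
my_bool counting_end_trans_hook(TRN *trn, my_bool commit,
                                my_bool active_transactions)
{
  (void) active_transactions;
  if (trn == NULL)
    return 1;
  if (commit)
    commits_seen++;
  return 0;
}

/* Reduced analogue of trnman_end_trn(): 0 on success, 1 on error */
my_bool end_trn(TRN *trn, my_bool commit)
{
  my_bool res= 0;
  if ((*end_trans_hook)(trn, commit, 0))
    res= 1;
  return res;
}
```

With the default in place end_trn() never fails because of the hook. After `end_trans_hook= counting_end_trans_hook;` every commit passes through the engine's function, and a non-zero return from the hook is reported to the caller.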
unknown 2008-05-29 18:33:33 +03:00
commit 5099033c26
72 changed files with 1405 additions and 659 deletions

@ -51,6 +51,10 @@ static TRN **short_trid_to_active_trn;
/* locks for short_trid_to_active_trn and pool */
static my_atomic_rwlock_t LOCK_short_trid_to_trn, LOCK_pool;
static my_bool default_trnman_end_trans_hook(TRN *, my_bool, my_bool);
my_bool (*trnman_end_trans_hook)(TRN *, my_bool, my_bool)=
default_trnman_end_trans_hook;
/*
Simple interface functions
@ -78,6 +82,16 @@ void trnman_reset_locked_tables(TRN *trn, uint locked_tables)
}
static my_bool
default_trnman_end_trans_hook(TRN *trn __attribute__ ((unused)),
my_bool commit __attribute__ ((unused)),
my_bool active_transactions
__attribute__ ((unused)))
{
return 0;
}
/*
NOTE
Just as short_id doubles as loid, this function doubles as
@ -325,6 +339,7 @@ TRN *trnman_new_trn(pthread_mutex_t *mutex, pthread_cond_t *cond,
trn->commit_trid= 0;
trn->rec_lsn= trn->undo_lsn= trn->first_undo_lsn= 0;
trn->used_tables= 0;
trn->locks.mutex= mutex;
trn->locks.cond= cond;
@ -342,6 +357,9 @@ TRN *trnman_new_trn(pthread_mutex_t *mutex, pthread_cond_t *cond,
*/
set_short_trid(trn);
DBUG_PRINT("exit", ("trn: x%lx trid: 0x%lu",
(ulong) trn, (ulong) trn->trid));
DBUG_RETURN(trn);
}
@ -362,7 +380,7 @@ TRN *trnman_new_trn(pthread_mutex_t *mutex, pthread_cond_t *cond,
0 ok
1 error
*/
int trnman_end_trn(TRN *trn, my_bool commit)
my_bool trnman_end_trn(TRN *trn, my_bool commit)
{
int res= 1;
TRN *free_me= 0;
@ -435,8 +453,7 @@ int trnman_end_trn(TRN *trn, my_bool commit)
if (res)
{
/*
res == 1 means the condition in the if() above
was false.
res == 1 means the condition in the if() above was false.
res == -1 means lf_hash_insert failed
*/
trn->next= free_me;
@ -446,8 +463,10 @@ int trnman_end_trn(TRN *trn, my_bool commit)
{
committed_list_max.prev= trn->prev->next= trn;
}
if ((*trnman_end_trans_hook)(trn, commit,
active_list_min.next != &active_list_max))
res= -1;
trnman_active_transactions--;
DBUG_PRINT("info", ("pthread_mutex_unlock LOCK_trn_list"));
pthread_mutex_unlock(&LOCK_trn_list);
/* the rest is done outside of a critical section */
@ -763,9 +782,30 @@ TRN *trnman_get_any_trn()
}
/**
Returns the minimum existing transaction id.
*/
TrID trnman_get_min_trid()
{
TrID min_read_from;
if (short_trid_to_active_trn == NULL)
{
/* Transaction manager not initialized; probably called from maria_chk */
return ~(TrID) 0;
}
pthread_mutex_lock(&LOCK_trn_list);
min_read_from= active_list_min.next->min_read_from;
pthread_mutex_unlock(&LOCK_trn_list);
return min_read_from;
}
/**
Returns maximum transaction id given to a transaction so far.
*/
TrID trnman_get_max_trid()
{
TrID id;
@ -776,3 +816,39 @@ TrID trnman_get_max_trid()
pthread_mutex_unlock(&LOCK_trn_list);
return id;
}
/**
Check if there exists an active transaction between two commit ids
@todo
Improve speed of this.
- Store transactions in tree or skip list
- Have function to copying all active transaction id's to b-tree
and use b-tree for checking states. This could be a big win
for checkpoint that will call this function for a lot of objects.
@return
0 No transaction exists
1 There is at least one active transaction in the given range
*/
my_bool trnman_exists_active_transactions(TrID min_id, TrID max_id,
my_bool trnman_is_locked)
{
TRN *trn;
my_bool ret= 0;
if (!trnman_is_locked)
pthread_mutex_lock(&LOCK_trn_list);
for (trn= active_list_min.next; trn != &active_list_max; trn= trn->next)
{
if (trn->trid > min_id && trn->trid < max_id)
{
ret= 1;
break;
}
}
if (!trnman_is_locked)
pthread_mutex_unlock(&LOCK_trn_list);
return ret;
}
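
The range check in trnman_exists_active_transactions() is exclusive at both ends: a transaction whose trid equals min_id or max_id does not count. A minimal, lock-free sketch of the same scan, assuming a singly linked list between two sentinel nodes; exists_active_between() and the reduced TRN are hypothetical stand-ins, and the real function additionally takes LOCK_trn_list unless the caller already holds it.

```c
typedef unsigned long long TrID;            /* stand-in for the Maria typedef */

/* Hypothetical, reduced TRN: only the fields the scan needs */
typedef struct st_trn
{
  TrID trid;
  struct st_trn *next;
} TRN;

/*
  Walk the list between the two sentinels and report whether any
  transaction id lies strictly between min_id and max_id.
  Returns 1 if such a transaction exists, 0 otherwise.
*/
int exists_active_between(const TRN *min_sentinel, const TRN *max_sentinel,
                          TrID min_id, TrID max_id)
{
  const TRN *trn;
  for (trn= min_sentinel->next; trn != max_sentinel; trn= trn->next)
  {
    if (trn->trid > min_id && trn->trid < max_id)
      return 1;
  }
  return 0;
}
```

The scan is linear in the number of active transactions, which is the cost the @todo above proposes to reduce with a tree or skip list.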