mariadb/storage/myisammrg/ha_myisammrg.cc
Konstantin Osipov eff3780dd8 Initial import of WL#3726 "DDL locking for all metadata objects".
Backport of:
------------------------------------------------------------
revno: 2630.4.1
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Fri 2008-05-23 17:54:03 +0400
message:
  WL#3726 "DDL locking for all metadata objects".

  After review fixes in progress.
------------------------------------------------------------

This is the first patch in the series. It transforms the metadata
locking subsystem to use a dedicated module (mdl.h, mdl.cc). There
are no significant changes in the locking protocol.
The import passes the test suite with the exception of 
deprecated/removed 6.0 features, and MERGE tables. The latter
are subject to a fix by WL#4144.
Unfortunately, the original changeset comments got lost in a merge,
thus this import has its own (largely insufficient) comments.

This patch fixes Bug#25144 "replication / binlog with view breaks".
Warning: this patch introduces an incompatible change:
Under LOCK TABLES, it's no longer possible to FLUSH a table that 
was not locked for WRITE.
Under LOCK TABLES, it's no longer possible to DROP a table or
VIEW that was not locked for WRITE.

******
Backport of:
------------------------------------------------------------
revno: 2630.4.2
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sat 2008-05-24 14:03:45 +0400
message:
  WL#3726 "DDL locking for all metadata objects".

  After review fixes in progress.

******
Backport of:
------------------------------------------------------------
revno: 2630.4.3
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sat 2008-05-24 14:08:51 +0400
message:
  WL#3726 "DDL locking for all metadata objects"

  Fixed failing Windows builds by adding mdl.cc to the lists
  of files needed to build server/libmysqld on Windows.

******
Backport of:
------------------------------------------------------------
revno: 2630.4.4
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sat 2008-05-24 21:57:58 +0400
message:
  WL#3726 "DDL locking for all metadata objects".

  Fix for assert failures in kill.test which occurred when one
  tried to kill an ALTER TABLE statement on a MERGE table while it
  was waiting in wait_while_table_is_used() for other connections
  to close this table.

  These assert failures stemmed from the fact that the cleanup code
  in this case assumed that the temporary table representing the new
  version of the table had been opened and added to the
  THD::temporary_tables list, while the code which opened this
  temporary table didn't always do so.

  This patch changes the code that opens the new version of the table
  to always do this linking in. It also streamlines the cleanup
  process for cases when an error occurs while we have the new
  version of the table open.

******
WL#3726 "DDL locking for all metadata objects"
Add libmysqld/mdl.cc to .bzrignore.
******
Backport of:
------------------------------------------------------------
revno: 2630.4.6
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-6.0-3726-w
timestamp: Sun 2008-05-25 00:33:22 +0400
message:
  WL#3726 "DDL locking for all metadata objects".

  Addition to the fix of assert failures in kill.test caused by
  changes for this worklog.


Make sure we close the new table only once.

.bzrignore:
  Add libmysqld/mdl.cc
libmysqld/CMakeLists.txt:
  Added mdl.cc to the list of files needed for building of libmysqld.
libmysqld/Makefile.am:
  Added files implementing new meta-data locking subsystem to the server.
mysql-test/include/handler.inc:
  Use separate connection for waiting while threads performing DDL
  operations conflicting with open HANDLER tables reach blocked
  state. This is required because now we check and close tables open
  by HANDLER statements in this connection conflicting with DDL in
  another each time open_tables() is called and thus select from I_S
  which is used for waiting will unblock DDL operations if issued
  from connection with open HANDLERs.
mysql-test/r/create.result:
  Adjusted test case after a change in the implementation of CREATE
  TABLE ... SELECT. We no longer have a special check in open_table()
  which catches the case when we select from the table being created.
  Instead we rely on the unique_table() call which happens after
  opening and locking all tables.
mysql-test/r/flush.result:
  FLUSH TABLES WITH READ LOCK can no longer happen under LOCK
  TABLES.  Updated test accordingly.
mysql-test/r/flush_table.result:
  Under LOCK TABLES we no longer allow FLUSH TABLES for tables
  locked for read. Updated test accordingly.
mysql-test/r/handler_innodb.result:
  Use separate connection for waiting while threads performing DDL
  operations conflicting with open HANDLER tables reach blocked
  state. This is required because now we check and close tables open
  by HANDLER statements in this connection conflicting with DDL in
  another each time open_tables() is called and thus select from I_S
  which is used for waiting will unblock DDL operations if issued
  from connection with open HANDLERs.
mysql-test/r/handler_myisam.result:
  Use separate connection for waiting while threads performing DDL
  operations conflicting with open HANDLER tables reach blocked
  state. This is required because now we check and close tables open
  by HANDLER statements in this connection conflicting with DDL in
  another each time open_tables() is called and thus select from I_S
  which is used for waiting will unblock DDL operations if issued
  from connection with open HANDLERs.
mysql-test/r/information_schema.result:
  Additional test for WL#3726 "DDL locking for all metadata
  objects".  Check that we use high-priority metadata lock requests
  when filling I_S tables.
  
  Rearrange tests to match 6.0 better (fewer merge conflicts).
mysql-test/r/kill.result:
  Added tests checking that DDL and DML statements waiting for
  metadata locks can be interrupted by KILL command.
mysql-test/r/lock.result:
  One is no longer allowed to do DROP VIEW under LOCK TABLES even if
  this view is locked by LOCK TABLES. The problem is that in such a
  situation write locks on the view are not mutually exclusive, so
  upgrading the metadata lock, which is required for dropping the
  view, would lead to a deadlock.
mysql-test/r/partition_column_prune.result:
  Update results (same results in 6.0), WL#3726
mysql-test/r/partition_pruning.result:
  Update results (same results in 6.0), WL#3726
mysql-test/r/ps_ddl.result:
  We no longer invalidate a prepared CREATE TABLE ... SELECT
  statement if the target table changes. This is OK since it is not
  strictly necessary.

  The first change is wrong; it is caused by FLUSH TABLE
  now flushing all unused tables. This is a regression that
  Dmitri fixed in 6.0 in a follow-up patch.
mysql-test/r/sp.result:
  Under LOCK TABLES we no longer allow accessing views which were
  not explicitly locked. To access a view we need to obtain a
  metadata lock on it, and doing this under LOCK TABLES may lead to
  deadlocks.
mysql-test/r/view.result:
  One is no longer allowed to do DROP VIEW under LOCK TABLES even if
  this view is locked by LOCK TABLES. The problem is that in such a
  situation even "write locks" on the view are not mutually
  exclusive, so upgrading the metadata lock, which is required for
  dropping the view, would lead to a deadlock.
mysql-test/r/view_grant.result:
  The ALTER VIEW implementation was changed to open a view only after
  checking that the user doing the ALTER has appropriate privileges
  on it. This means that when the user's privileges are insufficient
  we won't check that the new view definer is the same as the
  original one, or that the user performing the ALTER has the SUPER
  privilege. Adjusted test case accordingly.
mysql-test/r/view_multi.result:
  Added test case for bug#25144 "replication / binlog with view
  breaks".
mysql-test/suite/rpl/t/disabled.def:
  Disable test for deprecated features (they don't work with new MDL).
mysql-test/t/create.test:
  Adjusted test case after a change in the implementation of CREATE
  TABLE ... SELECT. We no longer have a special check in open_table()
  which catches the case when we select from the table being created.
  Instead we rely on the unique_table() call which happens after
  opening and locking all tables.
mysql-test/t/disabled.def:
  Disable merge.test, subject of WL#4144
mysql-test/t/flush.test:
  
  FLUSH TABLES WITH READ LOCK can no longer happen under LOCK
  TABLES.  Updated test accordingly.
mysql-test/t/flush_table.test:
  Under LOCK TABLES we no longer allow FLUSH TABLES for tables
  locked for read. Updated test accordingly.
mysql-test/t/information_schema.test:
  Additional test for WL#3726 "DDL locking for all metadata
  objects".  Check that we use high-priority metadata lock requests
  when filling I_S tables.
  
  Rearrange the results for easier merges with 6.0.
mysql-test/t/kill.test:
  Added tests checking that DDL and DML statements waiting for
  metadata locks can be interrupted by KILL command.
mysql-test/t/lock.test:
  One is no longer allowed to do DROP VIEW under LOCK TABLES even if
  this view is locked by LOCK TABLES. The problem is that in such a
  situation write locks on the view are not mutually exclusive, so
  upgrading the metadata lock, which is required for dropping the
  view, would lead to a deadlock.
mysql-test/t/lock_multi.test:
  Adjusted the test case to the changed states in various places,
  caused by the change in the implementation of FLUSH TABLES WITH
  READ LOCK, which now takes a global metadata lock before flushing
  tables and therefore waits at these places.
mysql-test/t/ps_ddl.test:
  We no longer invalidate a prepared CREATE TABLE ... SELECT
  statement if the target table changes. This is OK since it is not
  strictly necessary.

  The first change is wrong; it is caused by FLUSH TABLE
  now flushing all unused tables. This is a regression that
  Dmitri fixed in 6.0 in a follow-up patch.
mysql-test/t/sp.test:
  Under LOCK TABLES we no longer allow accessing views which were
  not explicitly locked. To access a view we need to obtain a
  metadata lock on it, and doing this under LOCK TABLES may lead to
  deadlocks.
mysql-test/t/trigger_notembedded.test:
  Adjusted the test case to the changed states in various places,
  caused by the change in the implementation of FLUSH TABLES WITH
  READ LOCK, which now takes a global metadata lock before flushing
  tables and therefore waits at these places.
mysql-test/t/view.test:
  One is no longer allowed to do DROP VIEW under LOCK TABLES even if
  this view is locked by LOCK TABLES. The problem is that in such a
  situation even "write locks" on the view are not mutually
  exclusive, so upgrading the metadata lock, which is required for
  dropping the view, would lead to a deadlock.
mysql-test/t/view_grant.test:
  The ALTER VIEW implementation was changed to open a view only after
  checking that the user doing the ALTER has appropriate privileges
  on it. This means that when the user's privileges are insufficient
  we won't check that the new view definer is the same as the
  original one, or that the user performing the ALTER has the SUPER
  privilege. Adjusted test case accordingly.
mysql-test/t/view_multi.test:
  Added test case for bug#25144 "replication / binlog with view
  breaks".
sql/CMakeLists.txt:
  Added mdl.cc to the list of files needed for building of server.
sql/Makefile.am:
  Added files implementing new meta-data locking subsystem to the
  server.
sql/event_db_repository.cc:
  
  Allocate metadata lock request objects (MDL_LOCK) on the execution
  memory root in cases when the TABLE_LIST objects are also allocated
  there or on the stack.
sql/ha_ndbcluster.cc:
  Adjusted code to work nicely with new metadata locking subsystem.
  close_cached_tables() no longer has wait_for_placeholder argument.
  Instead of relying on this parameter and related behavior FLUSH
  TABLES WITH READ LOCK now takes global shared metadata lock.
sql/ha_ndbcluster_binlog.cc:
  Adjusted code to work with new metadata locking subsystem.
  close_cached_tables() no longer has wait_for_placeholder argument.
  Instead of relying on this parameter and related behavior FLUSH
  TABLES WITH READ LOCK now takes global shared metadata lock.
sql/handler.cc:
  update_frm_version():
    Directly update TABLE_SHARE::mysql_version member instead of
    going through all TABLE instances for this table (old code was a
    legacy from pre-table-definition-cache days).
sql/lock.cc:
  Use the new metadata locking subsystem. Threw away most of the
  functions related to name locking, as one is now supposed to use
  the metadata locking API instead. In lock_global_read_lock() and
  unlock_global_read_lock() we also have to take the global shared
  metadata lock, in order to avoid the global read lock sneaking in
  at the moment when we perform FLUSH TABLES or ALTER TABLE under
  LOCK TABLES and the tables being reopened are protected only by
  metadata locks.
sql/log_event.cc:
  Adjusted code to work with the new metadata locking subsystem. For
  tables opened by the slave thread for applying RBR events, allocate
  memory for the lock request object in the same chunk of memory as
  the TABLE_LIST objects for them. In order to ensure that we keep
  these objects around for as long as the tables are open, always
  close the tables before calling
  Relay_log_info::clear_tables_to_lock(). Use the new auxiliary
  Relay_log_info::slave_close_thread_tables() method to enforce this.
sql/log_event_old.cc:
  Adjusted code to work with the new metadata locking subsystem.
  Since for tables opened by the slave thread for applying RBR events
  the memory for the lock request object is allocated in the same
  chunk of memory as the TABLE_LIST objects for them, we have to
  ensure that we keep these objects around for as long as the tables
  are open. To ensure this we always close the tables before calling
  Relay_log_info::clear_tables_to_lock(). To enforce this we use the
  new auxiliary Relay_log_info::slave_close_thread_tables() method.
sql/mdl.cc:
  Implemented new metadata locking subsystem and API described in
  WL3726 "DDL locking for all metadata objects".
sql/mdl.h:
  Implemented new metadata locking subsystem and API described in
  WL3726 "DDL locking for all metadata objects".
sql/mysql_priv.h:
  - close_thread_tables()/close_tables_for_reopen() now has one more
    argument which indicates that metadata locks should be released
    but not removed from the context in order to be used later in
    mdl_wait_for_locks() and tdc_wait_for_old_version().
  - close_cached_table() routine is no longer public.
  - Thread waiting in wait_while_table_is_used() can be now killed
    so this function returns boolean to make caller aware of such
    situation.
  - We no longer have the table cache as a separate entity; instead,
    used and unused TABLE instances are linked to TABLE_SHARE objects
    in the table definition cache.
  - Now third argument of open_table() is also used for requesting
    table repair or auto-discovery of table's new definition. So its
    type was changed from bool to enum.
  - Added tdc_open_view() function for opening view by getting its
    definition from disk (and table cache in future).
  - reopen_name_locked_table() no longer needs "link_in" argument as
    now we have exclusive metadata locks instead of dummy TABLE
    instances when this function is called.
  - find_locked_table() now takes head of list of TABLE instances
    instead of always scanning through THD::open_tables list. Also
    added find_write_locked_table() auxiliary.
  - reopen_tables(), close_cached_tables() no longer have the
    mark_share_as_old and wait_for_placeholder arguments. Instead of
    relying on these parameters and the related behavior, FLUSH
    TABLES WITH READ LOCK now takes a global shared metadata lock.
  - We no longer need drop_locked_tables() and
    abort_locked_tables().
  - mysql_ha_rm_tables() now always assumes that LOCK_open is not
    acquired by the caller.
  - Added notify_thread_having_shared_lock() callback invoked by
    metadata locking subsystem when acquiring an exclusive lock, for
    each thread that has a conflicting shared metadata lock.
  - Introduced expel_table_from_cache() as replacement for
    remove_table_from_cache() (the main difference is that this new
    function assumes that caller follows metadata locking protocol
    and never waits).
  - Threw away most of functions related to name locking. One should
    use new metadata locking subsystem and API instead.
sql/mysqld.cc:
  Got rid of the call initializing/deinitializing the table cache
  since it is now embedded into the table definition cache. Added
  calls for initializing/deinitializing the metadata locking
  subsystem.
sql/rpl_rli.cc:
  Introduced auxiliary Relay_log_info::slave_close_thread_tables()
  method which is used for enforcing that we always close tables
  open for RBR before deallocating TABLE_LIST elements and MDL_LOCK
  objects for them.
sql/rpl_rli.h:
  Introduced auxiliary Relay_log_info::slave_close_thread_tables()
  method which is used for enforcing that we always close tables
  open for RBR before deallocating TABLE_LIST elements and MDL_LOCK
  objects for them.
sql/set_var.cc:
  close_cached_tables() no longer has wait_for_placeholder argument.
  Instead of relying on this parameter and related behavior FLUSH
  TABLES WITH READ LOCK now takes global shared metadata lock.
sql/sp_head.cc:
  For tables added to the statement's table list by the prelocking
  algorithm we allocate metadata lock request objects either in the
  same memory as the corresponding table list elements or on
  THD::locked_tables_root (if we are building the table list for
  LOCK TABLES).
sql/sql_acl.cc:
  Allocate metadata lock request objects (MDL_LOCK) on the execution
  memory root in cases when we use stack TABLE_LIST objects to open
  tables. Got rid of redundant code by using the
  unlock_locked_tables() function.
sql/sql_base.cc:
  Changed code to use new MDL subsystem. Got rid of separate table
  cache.  Now used and unused TABLE instances are linked to the
  TABLE_SHAREs in table definition cache.
  
  check_unused():
    Adjusted code to the fact that we no longer have separate table
    cache.  Removed dead code.
  table_def_free():
    Free TABLE instances referenced from TABLE_SHARE objects before
    destroying table definition cache.
  get_table_share():
    Added an assert which ensures that no one will be able to access
    a table (and its share) without acquiring some kind of metadata
    lock first.
  close_handle_and_leave_table_as_lock():
    Adjusted code to the fact that TABLE instances now are linked to
    list in TABLE_SHARE.
  list_open_tables():
    Changed this function to use table definition cache instead of
    table cache.
  free_cache_entry():
    Unlink freed TABLE elements from the list of all TABLE instances
    for the table in TABLE_SHARE.
  kill_delayed_thread_for_table():
    Added auxiliary for killing delayed insert threads for
    particular table.
  close_cached_tables():
    Got rid of wait_for_refresh argument as we now rely on global
    shared metadata lock to prevent FLUSH WITH READ LOCK sneaking in
    when we are reopening tables. Heavily reworked this function to
    use new MDL code and not to rely on separate table cache entity.
  close_open_tables():
    We no longer have separate table cache.
  close_thread_tables():
    Release metadata locks after closing all tables. Added skip_mdl
    argument which allows us not to remove metadata lock requests
    from the context in case we are going to use these requests
    later in mdl_wait_for_locks() and tdc_wait_for_old_versions().
  close_thread_table()/close_table_for_reopen():
    Since we no longer have separate table cache and all TABLE
    instances are linked to TABLE_SHARE objects in table definition
    cache we have to link/unlink TABLE object to/from appropriate
    lists in the share.
  name_lock_locked_table():
   Moved redundant code to find_write_locked_table() function and
    adjusted code to the fact that wait_while_table_is_used() can
    now return with an error if our thread is killed.
  reopen_table_entry():
    We no longer need "link_in" argument as with MDL we no longer
    call this function with dummy TABLE object pre-allocated and
    added to the THD::open_tables. Also now we add newly-open TABLE
    instance to the list of share's used TABLE instances.
  table_cache_insert_placeholder():
    Got rid of name-locking legacy.
  lock_table_name_if_not_cached():
    Moved to sql_table.cc the only place where it is used. It was
    also reimplemented using new MDL API.
  open_table():
    - Reworked this function to use new MDL subsystem.
    - Changed code to deal with table definition cache directly
      instead of going through separate table cache.
    - Now third argument is also used for requesting table repair
      or auto-discovery of table's new definition. So its type was
      changed from bool to enum.
  find_locked_table()/find_write_locked_table():
    Accept head of list of TABLE objects as first argument and use
    this list instead of always searching in THD::open_tables list.
    Also added an auxiliary for finding write-locked tables.
  reopen_table():
    Adjusted function to work with the new MDL subsystem and to
    properly manipulate the lists of used/unused TABLE instances in
    TABLE_SHARE.
  reopen_tables():
    Removed mark_share_as_old parameter. Instead of relying on it
    and related behavior FLUSH TABLES WITH READ LOCK now takes
    global shared metadata lock. Changed code after removing
    separate table cache.
  drop_locked_tables()/abort_locked_tables():
    Got rid of functions which are no longer needed.
  unlock_locked_tables():
    Moved this function from sql_parse.cc and changed it to release
    the memory which was used for allocating metadata lock requests
    for tables open and locked by LOCK TABLES.
  tdc_open_view():
    Introduced function for opening a view by getting its definition
    from disk (and the table cache in the future).
  reopen_table_entry():
    Introduced function for opening a table definition while holding
    an exclusive metadata lock on it.
  open_unireg_entry():
    Got rid of this function. Most of its functionality is relocated
    to the open_table() and open_table_fini() functions, and some of
    it to reopen_table_entry() and tdc_open_view(). Also the code
    responsible for auto-repair and auto-discovery of tables was
    moved to a separate function.
  open_table_entry_fini():
    Introduced function which contains common actions which finalize
    process of TABLE object creation.
  auto_repair_table():
    Moved code responsible for auto-repair of table being opened
    here.
  handle_failed_open_table_attempt()
    Moved code responsible for handling failing attempt to open
    table to one place (retry due to lock conflict/old version,
    auto-discovery and repair).
  open_tables():
    - Flush open HANDLER tables if they have an old version or if
      there is a conflicting metadata lock against them (before this
      moment we had this code in open_table()).
    - When we open view which should be processed via derived table
      on the second execution of prepared statement or stored
      routine we still should call open_table() for it in order to
      obtain metadata lock on it and prepare its security context.
    - In cases when we discover that some special handling of
      failure to open table is needed call
      handle_failed_open_table_attempt() which handles all such
      scenarios.
  open_ltable():
    Handling of various special scenarios of failure to open a table
    was moved to separate handle_failed_open_table_attempt()
    function.
  remove_db_from_cache():
    Removed this function as it is no longer used.
  notify_thread_having_shared_lock():
    Added callback which is invoked by MDL subsystem when acquiring
    an exclusive lock, for each thread that has a conflicting shared
    metadata lock.
  expel_table_from_cache():
    Introduced function for removing unused TABLE instances. Unlike
    remove_table_from_cache() it relies on caller following MDL
    protocol and having appropriate locks when calling it and thus
    does not do any waiting if table is still in use.
  tdc_wait_for_old_version():
    Added function which allows open_tables() to wait in cases when
    we discover that we should back-off due to presence of old
    version of table.
  abort_and_upgrade_lock():
    Use new MDL calls.
  mysql_wait_completed_table():
    Got rid of unused function.
  open_system_tables_for_read/for_update()/performance_schema_table():
    Allocate MDL_LOCK objects on the execution memory root in cases
    when the TABLE_LIST objects for the corresponding tables are
    allocated on the stack.
  close_performance_schema_table():
    Release metadata locks after closing tables.
  ******
  Use I_P_List for free/used tables list in the table share.
sql/sql_binlog.cc:
  Use Relay_log_info::slave_close_thread_tables() method to enforce
  that we always close tables open for RBR before deallocating
  TABLE_LIST elements and MDL_LOCK objects for them.
sql/sql_class.cc:
  Added meta-data locking contexts as part of the Open_tables_state
  context. Also introduced the THD::locked_tables_root memory root
  which is to be used for allocating MDL_LOCK objects for tables in a
  LOCK TABLES statement (the end of lifetime for such objects is
  UNLOCK TABLES, so we can't use the statement or execution root for
  them).
sql/sql_class.h:
  Added meta-data locking contexts as part of the Open_tables_state
  context. Also introduced the THD::locked_tables_root memory root
  which is to be used for allocating MDL_LOCK objects for tables in a
  LOCK TABLES statement (the end of lifetime for such objects is
  UNLOCK TABLES, so we can't use the statement or execution root for
  them).
  
  Note: handler_mdl_context and locked_tables_root and
  mdl_el_root will be removed by subsequent patches.
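  A minimal sketch of the lifetime argument above, using the existing
  MEM_ROOT API from my_sys.h; the request type below is a placeholder
  (not the real MDL_LOCK), only the allocation pattern is the point:

    #include "my_global.h"
    #include "my_sys.h"   /* MEM_ROOT, init_alloc_root, alloc_root, free_root */

    /* Placeholder for the real per-table lock request object. */
    struct mdl_request_stub { char opaque[64]; };

    static void locked_tables_root_sketch()
    {
      MEM_ROOT locked_tables_root;
      init_alloc_root(&locked_tables_root, 1024, 0);   /* at LOCK TABLES   */

      /* One request per locked table; it lives until the root is freed,
         i.e. independently of any statement or execution memory root. */
      mdl_request_stub *req=
        (mdl_request_stub*) alloc_root(&locked_tables_root, sizeof(*req));
      (void) req;

      free_root(&locked_tables_root, MYF(0));          /* at UNLOCK TABLES */
    }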
sql/sql_db.cc:
  mysql_rm_db() does not really need to call remove_db_from_cache()
  as it drops each table in the database using
  mysql_rm_table_part2(), which performs all necessary operations on
  table (definition) cache.
sql/sql_delete.cc:
  Use the new metadata locking API for TRUNCATE.
sql/sql_handler.cc:
  Changed HANDLER implementation to use new metadata locking
  subsystem.  Note that MDL_LOCK objects for HANDLER tables are
  allocated in the same chunk of heap memory as TABLE_LIST object
  for those tables.
sql/sql_insert.cc:
  mysql_insert():
    find_locked_table() now takes the head of a list of TABLE objects
    as its argument instead of always scanning through the
    THD::open_tables list.
  handle_delayed_insert():
    Allocate the metadata lock request object for the table opened by
    the delayed insert thread on the execution memory root.
  create_table_from_items():
    We no longer allocate dummy TABLE objects for tables being
    created if they don't exist. As a consequence,
    reopen_name_locked_table() no longer has a link_in argument.
    open_table() now has one more argument which is not relevant for
    temporary tables.
sql/sql_parse.cc:
  - Moved the unlock_locked_tables() routine to sql_base.cc and made
    it available in other files. Got rid of some redundant code by
    using this function.
  - Replaced the boolean TABLE_LIST::create member with an enum
    open_table_type member.
  - Use a special memory root for allocating MDL_LOCK objects for
    tables open and locked by LOCK TABLES (these objects should live
    till UNLOCK TABLES, so we can't allocate them on the statement or
    execution memory root). Also properly set the metadata lock
    upgradability attribute for those tables.
  - Under LOCK TABLES it is no longer allowed to flush tables which
    are not write-locked, as this breaks the metadata locking
    protocol and thus potentially might lead to a deadlock.
  - Added auxiliary adjust_mdl_locks_upgradability() function.
sql/sql_partition.cc:
  Adjusted code to the fact that reopen_tables() no longer has
  "mark_share_as_old" argument. Got rid of comments which are no
  longer true.
sql/sql_plist.h:
  Added the I_P_List template class for parametrized intrusive doubly
  linked lists, and I_P_List_iterator as the corresponding iterator.
  Unlike with I_List<>, elements of such a list can participate in
  several lists. Unlike List<>, such lists are doubly-linked and
  intrusive.
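  For illustration only, a much simplified sketch of the intrusive
  doubly-linked list idea (this is not the actual I_P_List interface;
  all names below are illustrative):

    #include <cstddef>

    /* The links live inside the element itself, so one object can be a
       member of several lists at once (one pair of links per list). */
    struct Elem
    {
      int    value;
      Elem  *next_in_list;   /* forward link                            */
      Elem **prev_in_list;   /* points at the previous element's 'next' */
    };

    struct Intrusive_list
    {
      Elem *first;
      Intrusive_list() : first(NULL) {}

      void push_front(Elem *e)
      {
        e->next_in_list= first;
        e->prev_in_list= &first;
        if (first)
          first->prev_in_list= &e->next_in_list;
        first= e;
      }

      /* O(1) removal without traversal -- the main benefit of the
         intrusive, doubly-linked layout. */
      void remove(Elem *e)
      {
        *e->prev_in_list= e->next_in_list;
        if (e->next_in_list)
          e->next_in_list->prev_in_list= e->prev_in_list;
      }
    };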
sql/sql_plugin.cc:
  Allocate metadata lock request objects (MDL_LOCK) on the execution
  memory root in cases when we use stack TABLE_LIST objects to open
  tables.
sql/sql_prepare.cc:
  Replaced the boolean TABLE_LIST::create member with an enum
  open_table_type member. This makes it easy to handle the situation
  in which, instead of opening the table, we only want to take an
  exclusive metadata lock on it.
sql/sql_rename.cc:
  Use new metadata locking subsystem in implementation of RENAME
  TABLE.
sql/sql_servers.cc:
  Allocate metadata lock request objects (MDL_LOCK) on the execution
  memory root in cases when we use stack TABLE_LIST objects to open
  tables. Got rid of redundant code by using the
  unlock_locked_tables() function.
sql/sql_show.cc:
  Acquire a shared metadata lock when we are getting information for
  an I_S table directly from the TABLE_SHARE without doing a
  full-blown table open. We use a high-priority lock request in this
  situation in order to avoid deadlocks.
  Also allocate metadata lock request objects (MDL_LOCK) on the
  execution memory root in cases when the TABLE_LIST objects are also
  allocated there.
sql/sql_table.cc:
  mysql_rm_table():
    Removed comment which is no longer relevant.
  mysql_rm_table_part2():
    Now caller of mysql_ha_rm_tables() should not own LOCK_open.
    Adjusted code to use new metadata locking subsystem instead of
    name-locks.
  lock_table_name_if_not_cached():
    Moved this function from sql_base.cc to this file and
    reimplemented it using metadata locking API.
  mysql_create_table():
    Adjusted code to use new MDL API.
  wait_while_table_is_used():
    Changed function to use new MDL subsystem. Made thread waiting
    in it killable (this also led to introduction of return value so
    caller can distinguish successful executions from situations
    when waiting was aborted).
  close_cached_tables():
    Thread waiting in this function is killable now. As a result it
    has a return value for distinguishing between success and
    failure. Got rid of the redundant broadcast_refresh() call.
  prepare_for_repair():
    Use MDL subsystem instead of name-locks.
  mysql_admin_table():
    mysql_ha_rm_tables() now always assumes that caller doesn't own
    LOCK_open.
  mysql_repair_table():
    We should mark all elements of table list as requiring
    upgradable metadata locks.
  mysql_create_table_like():
    Use new MDL subsystem instead of name-locks.
  create_temporary_tables():
    We don't need to obtain metadata locks when creating temporary
    table.
  mysql_fast_or_online_alter_table():
    Thread waiting in wait_while_table_is_used() is now killable.
  mysql_alter_table():
    Adjusted code to work with the new MDL subsystem and to the fact
    that threads waiting in wait_while_table_is_used() and
    close_cached_table() are now killable.
sql/sql_test.cc:
  We no longer have separate table cache. TABLE instances are now
  associated with/linked to TABLE_SHARE objects in table definition
  cache.
sql/sql_trigger.cc:
  Adjusted code to work with new metadata locking subsystem.  Also
  reopen_tables() no longer has mark_share_as_old argument (Instead
  of relying on this parameter and related behavior FLUSH TABLES
  WITH READ LOCK now takes global shared metadata lock).
sql/sql_udf.cc:
  Allocate metadata lock request objects (MDL_LOCK) on the execution
  memory root in cases when we use stack TABLE_LIST objects to open
  tables.
sql/sql_update.cc:
  Adjusted code to work with new meta-data locking subsystem.
sql/sql_view.cc:
  Added proper meta-data locking to the implementations of the
  CREATE/ALTER/DROP VIEW statements. Now we obtain an exclusive
  meta-data lock on a view before creating/changing/dropping it.
  This ensures that all concurrent statements that use this view
  will finish before our statement proceeds, and therefore we get
  the correct order of statements in the binary log.
  Also ensure that the TABLE_LIST::mdl_upgradable attribute is
  properly propagated for the underlying tables of a view.
sql/table.cc:
  Added auxiliary alloc_mdl_locks() function for allocating metadata
  lock request objects for all elements of table list.
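  A rough sketch of the shape such a helper could have (the request
  type and the per-element field are placeholders; the real patch
  stores its own object in TABLE_LIST): walk the list via next_global
  and allocate one request per element on the supplied memory root.

    #define MYSQL_SERVER 1
    #include "mysql_priv.h"   /* TABLE_LIST, MEM_ROOT, alloc_root */

    /* Placeholder for the real metadata lock request type. */
    struct Mdl_request_stub { char opaque[64]; };

    static bool alloc_mdl_locks_sketch(TABLE_LIST *table_list, MEM_ROOT *root)
    {
      for (TABLE_LIST *tl= table_list; tl; tl= tl->next_global)
      {
        Mdl_request_stub *req=
          (Mdl_request_stub*) alloc_root(root, sizeof(Mdl_request_stub));
        if (!req)
          return true;                      /* out of memory */
        /* tl->mdl_lock= req;   <- field name is hypothetical */
      }
      return false;
    }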
sql/table.h:
  TABLE_SHARE:
    Got rid of unused members. Introduced members for storing lists
    of used and unused TABLE objects for this share.
  TABLE:
    Added members for linking TABLE objects into per-share lists of
    used and unused TABLE instances. Added member for holding
    pointer to metadata lock for this table.
  TABLE_LIST:
    Replaced the boolean TABLE_LIST::create member with an enum
    open_table_type member. This makes it easy to handle the
    situation in which, instead of opening the table, we only want
    to take an exclusive meta-data lock on it (we need this in order
    to handle ALTER VIEW and CREATE VIEW statements).
    Introduced a new mdl_upgradable member for marking elements of a
    table list for which we need to take an upgradable shared
    metadata lock instead of a plain shared metadata lock. Added a
    member holding a pointer to the MDL_LOCK for the table.
  Added auxiliary alloc_mdl_locks() function for allocating metadata
  lock request objects for all elements of a table list. Added
  auxiliary set_all_mdl_upgradable() function for marking all
  elements in a table list as requiring upgradable metadata locks.
storage/myisammrg/ha_myisammrg.cc:
  Allocate MDL_LOCK objects for the underlying tables of a MERGE
  table. To be reworked once Ingo pushes his patch for WL#4144.

/* Copyright (C) 2000-2006 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
MyISAM MERGE tables
A MyISAM MERGE table is kind of a union of zero or more MyISAM tables.
Besides the normal form file (.frm) a MERGE table has a meta file
(.MRG) with a list of tables. These are paths to the MyISAM table
files. The last two components of the path contain the database name
and the table name respectively.
When a MERGE table is open, there exists a TABLE object for the MERGE
table itself and a TABLE object for each of the MyISAM tables. For
abbreviated writing, I call the MERGE table object "parent" and the
MyISAM table objects "children".
A MERGE table is almost always opened through open_and_lock_tables()
and hence through open_tables(). When the parent appears in the list
of tables to open, the initial open of the handler does nothing but
read the meta file and collect a list of TABLE_LIST objects for the
children. This list is attached to the parent TABLE object as
TABLE::child_l. The end of the children list is saved in
TABLE::child_last_l.
Back in open_tables(), add_merge_table_list() is called. It updates
each list member with the lock type and a back pointer to the parent
TABLE_LIST object TABLE_LIST::parent_l. The list is then inserted in
the list of tables to open, right behind the parent. Consequently,
open_tables() opens the children, one after the other. The TABLE
references of the TABLE_LIST objects are implicitly set to the open
tables. The children are opened as independent MyISAM tables, right as
if they are used by the SQL statement.
TABLE_LIST::parent_l is required to find the parent 1. when the last
child has been opened and children are to be attached, and 2. when an
error happens during child open and the child list must be removed
from the query list. In these cases the current child does not have
TABLE::parent set or does not have a TABLE at all respectively.
When the last child is open, attach_merge_children() is called. It
removes the list of children from the open list. Then the children are
"attached" to the parent. All required references between parent and
children are set up.
The MERGE storage engine sets up an array with references to the
low-level MyISAM table objects (MI_INFO). It remembers the state of
the table in MYRG_INFO::children_attached.
Every child TABLE::parent references the parent TABLE object. That way
TABLE objects belonging to a MERGE table can be identified.
TABLE::parent is required because the parent and child TABLE objects
can live longer than the parent TABLE_LIST object. So the path
child->pos_in_table_list->parent_l->table can be broken.
If necessary, the compatibility of parent and children is checked.
This check is necessary when any of the objects are reopened. This is
detected by comparing the current table def version against the
remembered child def version. On parent open, the list members are
initialized to an "impossible"/"undefined" version value. So the check
is always executed on the first attach.
The version check is done in myisammrg_attach_children_callback(),
which is called for every child. ha_myisammrg::attach_children()
initializes 'need_compat_check' to FALSE and
myisammrg_attach_children_callback() sets it to TRUE if a table
def version mismatches the remembered child def version.
Finally the parent TABLE::children_attached is set.
---
On parent open the storage engine structures are allocated and initialized.
They stay with the open table until its final close.
*/
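/*
  Illustrative example (not from the original sources): a MERGE table
  created in database 'test' as

    CREATE TABLE t_all (a INT NOT NULL) ENGINE=MRG_MyISAM UNION=(t1,t2);

  typically gets a meta file t_all.MRG whose lines are relative paths
  to the MyISAM children (plus an optional #INSERT_METHOD line), e.g.

    ./test/t1
    ./test/t2

  which is why the last two path components give the database name and
  the table name.
*/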
#ifdef USE_PRAGMA_IMPLEMENTATION
#pragma implementation // gcc: Class implementation
#endif
#define MYSQL_SERVER 1
#include "mysql_priv.h"
#include "probes_mysql.h"
#include <mysql/plugin.h>
#include <m_ctype.h>
#include "../myisam/ha_myisam.h"
#include "ha_myisammrg.h"
#include "myrg_def.h"
static handler *myisammrg_create_handler(handlerton *hton,
TABLE_SHARE *table,
MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisammrg(hton, table);
}
/**
@brief Constructor
*/
ha_myisammrg::ha_myisammrg(handlerton *hton, TABLE_SHARE *table_arg)
:handler(hton, table_arg), file(0), is_cloned(0)
{}
/**
@brief Destructor
*/
ha_myisammrg::~ha_myisammrg(void)
{}
static const char *ha_myisammrg_exts[] = {
".MRG",
NullS
};
extern int table2myisam(TABLE *table_arg, MI_KEYDEF **keydef_out,
MI_COLUMNDEF **recinfo_out, uint *records_out);
extern int check_definition(MI_KEYDEF *t1_keyinfo, MI_COLUMNDEF *t1_recinfo,
uint t1_keys, uint t1_recs,
MI_KEYDEF *t2_keyinfo, MI_COLUMNDEF *t2_recinfo,
uint t2_keys, uint t2_recs, bool strict,
TABLE *table_arg);
static void split_file_name(const char *file_name,
LEX_STRING *db, LEX_STRING *name);
extern "C" void myrg_print_wrong_table(const char *table_name)
{
LEX_STRING db= {NULL, 0}, name;
char buf[FN_REFLEN];
split_file_name(table_name, &db, &name);
memcpy(buf, db.str, db.length);
buf[db.length]= '.';
memcpy(buf + db.length + 1, name.str, name.length);
buf[db.length + name.length + 1]= 0;
push_warning_printf(current_thd, MYSQL_ERROR::WARN_LEVEL_WARN,
ER_ADMIN_WRONG_MRG_TABLE, ER(ER_ADMIN_WRONG_MRG_TABLE),
buf);
}
const char **ha_myisammrg::bas_ext() const
{
return ha_myisammrg_exts;
}
const char *ha_myisammrg::index_type(uint key_number)
{
return ((table->key_info[key_number].flags & HA_FULLTEXT) ?
"FULLTEXT" :
(table->key_info[key_number].flags & HA_SPATIAL) ?
"SPATIAL" :
(table->key_info[key_number].algorithm == HA_KEY_ALG_RTREE) ?
"RTREE" :
"BTREE");
}
/**
@brief Callback function for open of a MERGE parent table.
@detail This function adds a TABLE_LIST object for a MERGE child table
to a list of tables of the parent TABLE object. It is called for
each child table.
The list of child TABLE_LIST objects is kept in the TABLE object of
the parent for the whole life time of the MERGE table. It is
inserted in the statement list behind the MERGE parent TABLE_LIST
object when the MERGE table is opened. It is removed from the
statement list after the last child is opened.
All memory used for the child TABLE_LIST objects and the strings
referred by it are taken from the parent TABLE::mem_root. Thus they
are all freed implicitly at the final close of the table.
TABLE::child_l -> TABLE_LIST::next_global -> TABLE_LIST::next_global
 #                #              ^           #              ^
 #                #              |           #              |
 #                #              +---------- TABLE_LIST::prev_global
 #                #                                         |
 #                |<--- TABLE_LIST::prev_global             |
 #                                                          |
TABLE::child_last_l ----------------------------------------+
@param[in] callback_param data pointer as given to myrg_parent_open()
@param[in] filename file name of MyISAM table
without extension.
@return status
@retval 0 OK
@retval != 0 Error
*/
static int myisammrg_parent_open_callback(void *callback_param,
const char *filename)
{
ha_myisammrg *ha_myrg;
TABLE *parent;
TABLE_LIST *child_l;
const char *db;
const char *table_name;
size_t dirlen;
char dir_path[FN_REFLEN];
DBUG_ENTER("myisammrg_parent_open_callback");
/* Extract child table name and database name from filename. */
dirlen= dirname_length(filename);
if (dirlen >= FN_REFLEN)
{
/* purecov: begin inspected */
DBUG_PRINT("error", ("name too long: '%.64s'", filename));
my_errno= ENAMETOOLONG;
DBUG_RETURN(1);
/* purecov: end */
}
table_name= filename + dirlen;
dirlen--; /* Strip off trailing '/'. */
memcpy(dir_path, filename, dirlen);
dir_path[dirlen]= '\0';
db= base_name(dir_path);
dirlen-= db - dir_path; /* This is now the length of 'db'. */
DBUG_PRINT("myrg", ("open: '%s'.'%s'", db, table_name));
ha_myrg= (ha_myisammrg*) callback_param;
parent= ha_myrg->table_ptr();
/* Get a TABLE_LIST object. */
if (!(child_l= (TABLE_LIST*) alloc_root(&parent->mem_root,
sizeof(TABLE_LIST))))
{
/* purecov: begin inspected */
DBUG_PRINT("error", ("my_malloc error: %d", my_errno));
DBUG_RETURN(1);
/* purecov: end */
}
bzero((char*) child_l, sizeof(TABLE_LIST));
/* Set database (schema) name. */
child_l->db_length= dirlen;
child_l->db= strmake_root(&parent->mem_root, db, dirlen);
/* Set table name. */
child_l->table_name_length= strlen(table_name);
child_l->table_name= strmake_root(&parent->mem_root, table_name,
child_l->table_name_length);
/* Convert to lowercase if required. */
if (lower_case_table_names && child_l->table_name_length)
child_l->table_name_length= my_casedn_str(files_charset_info,
child_l->table_name);
/* Set alias. */
child_l->alias= child_l->table_name;
/*
FIXME: Actually we should use some other mem-root here.
To be fixed once Ingo pushes his patch for WL4144.
*/
alloc_mdl_locks(child_l, &parent->mem_root);
/* Initialize table map to 'undefined'. */
child_l->init_child_def_version();
/* Link TABLE_LIST object into the parent list. */
if (!parent->child_last_l)
{
/* Initialize parent->child_last_l when handling first child. */
parent->child_last_l= &parent->child_l;
}
*parent->child_last_l= child_l;
child_l->prev_global= parent->child_last_l;
parent->child_last_l= &child_l->next_global;
DBUG_RETURN(0);
}
/**
@brief Callback function for attaching a MERGE child table.
@detail This function retrieves the MyISAM table handle from the
next child table. It is called for each child table.
@param[in] callback_param data pointer as given to
myrg_attach_children()
@return pointer to open MyISAM table structure
@retval !=NULL OK, returning pointer
@retval NULL, my_errno == 0 Ok, no more child tables
@retval NULL, my_errno != 0 error
*/
static MI_INFO *myisammrg_attach_children_callback(void *callback_param)
{
ha_myisammrg *ha_myrg;
TABLE *parent;
TABLE *child;
TABLE_LIST *child_l;
MI_INFO *myisam;
DBUG_ENTER("myisammrg_attach_children_callback");
my_errno= 0;
ha_myrg= (ha_myisammrg*) callback_param;
parent= ha_myrg->table_ptr();
/* Get child list item. */
child_l= ha_myrg->next_child_attach;
if (!child_l)
{
DBUG_PRINT("myrg", ("No more children to attach"));
DBUG_RETURN(NULL);
}
child= child_l->table;
DBUG_PRINT("myrg", ("child table: '%s'.'%s' 0x%lx", child->s->db.str,
child->s->table_name.str, (long) child));
/*
Prepare for next child. Used as child_l in next call to this function.
We cannot rely on a NULL-terminated chain.
*/
if (&child_l->next_global == parent->child_last_l)
{
DBUG_PRINT("myrg", ("attaching last child"));
ha_myrg->next_child_attach= NULL;
}
else
ha_myrg->next_child_attach= child_l->next_global;
/* Set parent reference. */
child->parent= parent;
/*
Do a quick compatibility check. The table def version is set when
the table share is created. The child def version is copied
from the table def version after a successful compatibility check.
We need to repeat the compatibility check only if a child is opened
from a different share than last time it was used with this MERGE
table.
*/
DBUG_PRINT("myrg", ("table_def_version last: %lu current: %lu",
(ulong) child_l->get_child_def_version(),
(ulong) child->s->get_table_def_version()));
if (child_l->get_child_def_version() != child->s->get_table_def_version())
ha_myrg->need_compat_check= TRUE;
/*
If parent is temporary, children must be temporary too and vice
versa. This check must be done for every child on every open because
the table def version can overlap between temporary and
non-temporary tables. We need to detect the case where a
non-temporary table has been replaced with a temporary table of the
same version. Or vice versa. A very unlikely case, but it could
happen.
*/
if (child->s->tmp_table != parent->s->tmp_table)
{
DBUG_PRINT("error", ("temporary table mismatch parent: %d child: %d",
parent->s->tmp_table, child->s->tmp_table));
my_errno= HA_ERR_WRONG_MRG_TABLE_DEF;
goto err;
}
/* Extract the MyISAM table structure pointer from the handler object. */
if ((child->file->ht->db_type != DB_TYPE_MYISAM) ||
!(myisam= ((ha_myisam*) child->file)->file_ptr()))
{
DBUG_PRINT("error", ("no MyISAM handle for child table: '%s'.'%s' 0x%lx",
child->s->db.str, child->s->table_name.str,
(long) child));
my_errno= HA_ERR_WRONG_MRG_TABLE_DEF;
}
DBUG_PRINT("myrg", ("MyISAM handle: 0x%lx my_errno: %d",
(long) myisam, my_errno));
err:
DBUG_RETURN(my_errno ? NULL : myisam);
}
/**
@brief Open a MERGE parent table, not its children.
@detail This function initializes the MERGE storage engine structures
and adds a child list of TABLE_LIST to the parent TABLE.
@param[in] name MERGE table path name
@param[in] mode read/write mode, unused
@param[in] test_if_locked open flags
@return status
@retval 0 OK
@retval -1 Error, my_errno gives reason
*/
int ha_myisammrg::open(const char *name, int mode __attribute__((unused)),
uint test_if_locked)
{
DBUG_ENTER("ha_myisammrg::open");
DBUG_PRINT("myrg", ("name: '%s' table: 0x%lx", name, (long) table));
DBUG_PRINT("myrg", ("test_if_locked: %u", test_if_locked));
/* Save for later use. */
this->test_if_locked= test_if_locked;
/* retrieve children table list. */
my_errno= 0;
if (is_cloned)
{
/*
Open and attach the MyISAM tables that are under the MERGE table
parent, on the MyISAM storage engine interface directly within the
MERGE engine. The new MyISAM table instances, as well as the MERGE
clone itself, are not visible in the table cache. This is not a
problem because all locking is handled by the original MERGE table
from which this one was cloned.
*/
if (!(file= myrg_open(table->s->normalized_path.str, table->db_stat,
HA_OPEN_IGNORE_IF_LOCKED)))
{
DBUG_PRINT("error", ("my_errno %d", my_errno));
DBUG_RETURN(my_errno ? my_errno : -1);
}
file->children_attached= TRUE;
info(HA_STATUS_NO_LOCK | HA_STATUS_VARIABLE | HA_STATUS_CONST);
}
else if (!(file= myrg_parent_open(name, myisammrg_parent_open_callback, this)))
{
DBUG_PRINT("error", ("my_errno %d", my_errno));
DBUG_RETURN(my_errno ? my_errno : -1);
}
DBUG_PRINT("myrg", ("MYRG_INFO: 0x%lx", (long) file));
DBUG_RETURN(0);
}
/**
Returns a cloned instance of the current handler.
@return A cloned handler instance.
*/
handler *ha_myisammrg::clone(MEM_ROOT *mem_root)
{
MYRG_TABLE *u_table,*newu_table;
ha_myisammrg *new_handler=
(ha_myisammrg*) get_new_handler(table->s, mem_root, table->s->db_type());
if (!new_handler)
return NULL;
/* Inform ha_myisammrg::open() that it is a cloned handler */
new_handler->is_cloned= TRUE;
/*
Allocate handler->ref here because otherwise ha_open will allocate it
on this->table->mem_root and we will not be able to reclaim that memory
when the clone handler object is destroyed.
*/
if (!(new_handler->ref= (uchar*) alloc_root(mem_root, ALIGN_SIZE(ref_length)*2)))
{
delete new_handler;
return NULL;
}
if (new_handler->ha_open(table, table->s->normalized_path.str, table->db_stat,
HA_OPEN_IGNORE_IF_LOCKED))
{
delete new_handler;
return NULL;
}
/*
Iterate through the original child tables and
copy the state into the cloned child tables.
We need to do this because all the child tables
can be involved in delete.
*/
newu_table= new_handler->file->open_tables;
for (u_table= file->open_tables; u_table < file->end_table; u_table++)
{
newu_table->table->state= u_table->table->state;
newu_table++;
}
return new_handler;
}
/**
@brief Attach children to a MERGE table.
@detail Let the storage engine attach its children through a callback
function. Check table definitions for consistency.
@note Special thd->open_options may be in effect. We can make use of
them in attach. I.e. we use HA_OPEN_FOR_REPAIR to report the names
of mismatching child tables. We cannot transport these options in
ha_myisammrg::test_if_locked because they may change after the
parent is opened. The parent is kept open in the table cache over
multiple statements and can be used by other threads. Open options
can change over time.
@return status
@retval 0 OK
@retval != 0 Error, my_errno gives reason
*/
int ha_myisammrg::attach_children(void)
{
MYRG_TABLE *u_table;
MI_COLUMNDEF *recinfo;
MI_KEYDEF *keyinfo;
uint recs;
uint keys= table->s->keys;
int error;
DBUG_ENTER("ha_myisammrg::attach_children");
DBUG_PRINT("myrg", ("table: '%s'.'%s' 0x%lx", table->s->db.str,
table->s->table_name.str, (long) table));
DBUG_PRINT("myrg", ("test_if_locked: %u", this->test_if_locked));
DBUG_ASSERT(!this->file->children_attached);
/*
Initialize variables that are used, modified, and/or set by
myisammrg_attach_children_callback().
'next_child_attach' traverses the chain of TABLE_LIST objects
that has been compiled during myrg_parent_open(). Every call
to myisammrg_attach_children_callback() moves the pointer to
the next object.
'need_compat_check' is set by myisammrg_attach_children_callback()
if a child fails the table def version check.
'my_errno' is set by myisammrg_attach_children_callback() in
case of an error.
*/
next_child_attach= table->child_l;
need_compat_check= FALSE;
my_errno= 0;
if (myrg_attach_children(this->file, this->test_if_locked |
current_thd->open_options,
myisammrg_attach_children_callback, this,
(my_bool *) &need_compat_check))
{
DBUG_PRINT("error", ("my_errno %d", my_errno));
DBUG_RETURN(my_errno ? my_errno : -1);
}
DBUG_PRINT("myrg", ("calling myrg_extrafunc"));
myrg_extrafunc(file, query_cache_invalidate_by_MyISAM_filename_ref);
if (!(test_if_locked == HA_OPEN_WAIT_IF_LOCKED ||
test_if_locked == HA_OPEN_ABORT_IF_LOCKED))
myrg_extra(file,HA_EXTRA_NO_WAIT_LOCK,0);
info(HA_STATUS_NO_LOCK | HA_STATUS_VARIABLE | HA_STATUS_CONST);
if (!(test_if_locked & HA_OPEN_WAIT_IF_LOCKED))
myrg_extra(file,HA_EXTRA_WAIT_LOCK,0);
/*
The compatibility check is required only if one or more children do
not match their table def version from the last check. This will
always happen at the first attach because the reference child def
version is initialized to 'undefined' at open.
*/
DBUG_PRINT("myrg", ("need_compat_check: %d", need_compat_check));
if (need_compat_check)
{
TABLE_LIST *child_l;
if (table->s->reclength != stats.mean_rec_length && stats.mean_rec_length)
{
DBUG_PRINT("error",("reclength: %lu mean_rec_length: %lu",
table->s->reclength, stats.mean_rec_length));
if (test_if_locked & HA_OPEN_FOR_REPAIR)
myrg_print_wrong_table(file->open_tables->table->filename);
error= HA_ERR_WRONG_MRG_TABLE_DEF;
goto err;
}
/*
Both recinfo and keyinfo are allocated by my_multi_malloc(), thus
only recinfo must be freed.
*/
if ((error= table2myisam(table, &keyinfo, &recinfo, &recs)))
{
/* purecov: begin inspected */
DBUG_PRINT("error", ("failed to convert TABLE object to MyISAM "
"key and column definition"));
goto err;
/* purecov: end */
}
for (u_table= file->open_tables; u_table < file->end_table; u_table++)
{
if (check_definition(keyinfo, recinfo, keys, recs,
u_table->table->s->keyinfo, u_table->table->s->rec,
u_table->table->s->base.keys,
u_table->table->s->base.fields, false, NULL))
{
DBUG_PRINT("error", ("table definition mismatch: '%s'",
u_table->table->filename));
error= HA_ERR_WRONG_MRG_TABLE_DEF;
if (!(this->test_if_locked & HA_OPEN_FOR_REPAIR))
{
my_free((uchar*) recinfo, MYF(0));
goto err;
}
myrg_print_wrong_table(u_table->table->filename);
}
}
my_free((uchar*) recinfo, MYF(0));
if (error == HA_ERR_WRONG_MRG_TABLE_DEF)
goto err;
/* All checks passed so far. Now update child def version. */
for (child_l= table->child_l; ; child_l= child_l->next_global)
{
child_l->set_child_def_version(
child_l->table->s->get_table_def_version());
if (&child_l->next_global == table->child_last_l)
break;
}
}
#if !defined(BIG_TABLES) || SIZEOF_OFF_T == 4
/* Merge table has more than 2G rows */
if (table->s->crashed)
{
DBUG_PRINT("error", ("MERGE table marked crashed"));
error= HA_ERR_WRONG_MRG_TABLE_DEF;
goto err;
}
#endif
DBUG_RETURN(0);
err:
myrg_detach_children(file);
DBUG_RETURN(my_errno= error);
}
/**
@brief Detach all children from a MERGE table.
@note Detach must not touch the children in any way.
They may have been closed at this point already.
All references to the children should be removed.
@return status
@retval 0 OK
@retval != 0 Error, my_errno gives reason
*/
int ha_myisammrg::detach_children(void)
{
DBUG_ENTER("ha_myisammrg::detach_children");
DBUG_ASSERT(this->file && this->file->children_attached);
if (myrg_detach_children(this->file))
{
/* purecov: begin inspected */
DBUG_PRINT("error", ("my_errno %d", my_errno));
DBUG_RETURN(my_errno ? my_errno : -1);
/* purecov: end */
}
DBUG_RETURN(0);
}
/**
@brief Close a MERGE parent table, not its children.
@note The children are expected to be closed separately by the caller.
@return status
@retval 0 OK
@retval != 0 Error, my_errno gives reason
*/
int ha_myisammrg::close(void)
{
int rc;
DBUG_ENTER("ha_myisammrg::close");
/*
Children must not be attached here, unless the MERGE table has no
children or the handler instance has been cloned. In these cases
children_attached is always true.
*/
DBUG_ASSERT(!this->file->children_attached || !this->file->tables || this->is_cloned);
rc= myrg_close(file);
file= 0;
DBUG_RETURN(rc);
}
int ha_myisammrg::write_row(uchar * buf)
{
DBUG_ENTER("ha_myisammrg::write_row");
DBUG_ASSERT(this->file->children_attached);
ha_statistic_increment(&SSV::ha_write_count);
if (file->merge_insert_method == MERGE_INSERT_DISABLED || !file->tables)
DBUG_RETURN(HA_ERR_TABLE_READONLY);
if (table->timestamp_field_type & TIMESTAMP_AUTO_SET_ON_INSERT)
table->timestamp_field->set_time();
if (table->next_number_field && buf == table->record[0])
{
int error;
if ((error= update_auto_increment()))
DBUG_RETURN(error); /* purecov: inspected */
}
DBUG_RETURN(myrg_write(file,buf));
}
int ha_myisammrg::update_row(const uchar * old_data, uchar * new_data)
{
DBUG_ASSERT(this->file->children_attached);
ha_statistic_increment(&SSV::ha_update_count);
if (table->timestamp_field_type & TIMESTAMP_AUTO_SET_ON_UPDATE)
table->timestamp_field->set_time();
return myrg_update(file,old_data,new_data);
}
int ha_myisammrg::delete_row(const uchar * buf)
{
DBUG_ASSERT(this->file->children_attached);
ha_statistic_increment(&SSV::ha_delete_count);
return myrg_delete(file,buf);
}
int ha_myisammrg::index_read_map(uchar * buf, const uchar * key,
key_part_map keypart_map,
enum ha_rkey_function find_flag)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_key_count);
int error=myrg_rkey(file,buf,active_index, key, keypart_map, find_flag);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::index_read_idx_map(uchar * buf, uint index, const uchar * key,
key_part_map keypart_map,
enum ha_rkey_function find_flag)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_key_count);
int error=myrg_rkey(file,buf,index, key, keypart_map, find_flag);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::index_read_last_map(uchar *buf, const uchar *key,
key_part_map keypart_map)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_key_count);
int error=myrg_rkey(file,buf,active_index, key, keypart_map,
HA_READ_PREFIX_LAST);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::index_next(uchar * buf)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_next_count);
int error=myrg_rnext(file,buf,active_index);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::index_prev(uchar * buf)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_prev_count);
int error=myrg_rprev(file,buf, active_index);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::index_first(uchar * buf)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_first_count);
int error=myrg_rfirst(file, buf, active_index);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::index_last(uchar * buf)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_last_count);
int error=myrg_rlast(file, buf, active_index);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::index_next_same(uchar * buf,
const uchar *key __attribute__((unused)),
uint length __attribute__((unused)))
{
int error;
DBUG_ASSERT(this->file->children_attached);
MYSQL_INDEX_READ_ROW_START(table_share->db.str, table_share->table_name.str);
ha_statistic_increment(&SSV::ha_read_next_count);
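  /*
    myrg_rnext_same() may return HA_ERR_RECORD_DELETED for rows that a
    child table has marked as deleted; skip those and continue with the
    next matching row.
  */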
do
{
error= myrg_rnext_same(file,buf);
} while (error == HA_ERR_RECORD_DELETED);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_INDEX_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::rnd_init(bool scan)
{
DBUG_ASSERT(this->file->children_attached);
return myrg_reset(file);
}
int ha_myisammrg::rnd_next(uchar *buf)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_READ_ROW_START(table_share->db.str, table_share->table_name.str,
TRUE);
ha_statistic_increment(&SSV::ha_read_rnd_next_count);
int error=myrg_rrnd(file, buf, HA_OFFSET_ERROR);
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_READ_ROW_DONE(error);
return error;
}
int ha_myisammrg::rnd_pos(uchar * buf, uchar *pos)
{
DBUG_ASSERT(this->file->children_attached);
MYSQL_READ_ROW_START(table_share->db.str, table_share->table_name.str,
TRUE);
ha_statistic_increment(&SSV::ha_read_rnd_count);
int error=myrg_rrnd(file, buf, my_get_ptr(pos,ref_length));
table->status=error ? STATUS_NOT_FOUND: 0;
MYSQL_READ_ROW_DONE(error);
return error;
}
void ha_myisammrg::position(const uchar *record)
{
DBUG_ASSERT(this->file->children_attached);
ulonglong row_position= myrg_position(file);
my_store_ptr(ref, ref_length, (my_off_t) row_position);
}
ha_rows ha_myisammrg::records_in_range(uint inx, key_range *min_key,
key_range *max_key)
{
DBUG_ASSERT(this->file->children_attached);
return (ha_rows) myrg_records_in_range(file, (int) inx, min_key, max_key);
}
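/**
  @brief Return statistics for the MERGE table.

  The statistics are aggregated over all attached children via
  myrg_status(). If the aggregated row counts no longer fit into 32
  bits and the build cannot represent them (no BIG_TABLES, or a 4-byte
  off_t), the table is marked crashed, see the check below.
*/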
int ha_myisammrg::info(uint flag)
{
MYMERGE_INFO mrg_info;
DBUG_ASSERT(this->file->children_attached);
(void) myrg_status(file,&mrg_info,flag);
/*
The following fails if one has not compiled MySQL with -DBIG_TABLES
and one has more than 2^32 rows in the merge tables.
*/
stats.records = (ha_rows) mrg_info.records;
stats.deleted = (ha_rows) mrg_info.deleted;
#if !defined(BIG_TABLES) || SIZEOF_OFF_T == 4
if ((mrg_info.records >= (ulonglong) 1 << 32) ||
(mrg_info.deleted >= (ulonglong) 1 << 32))
table->s->crashed= 1;
#endif
stats.data_file_length= mrg_info.data_file_length;
if (mrg_info.errkey >= (int) table_share->keys)
{
/*
If the value of errkey is higher than the number of keys
in the table, set errkey to MAX_KEY. This will be treated
as the unknown-key case, so the error message generator
won't try to locate the key and cause a segmentation fault.
*/
mrg_info.errkey= MAX_KEY;
}
table->s->keys_in_use.set_prefix(table->s->keys);
stats.mean_rec_length= mrg_info.reclength;
/*
The handler::block_size is used all over the code in index scan cost
calculations. It is used to get number of disk seeks required to
retrieve a number of index tuples.
If the merge table has N underlying tables, then (assuming underlying
tables have equal size, the only "simple" approach we can use)
retrieving X index records from a merge table will require N times more
disk seeks compared to doing the same on a MyISAM table with equal
number of records.
In the edge case (file->tables > myisam_block_size) we'll get
block_size==0, and index calculation code will act as if we need one
disk seek to retrieve one index tuple.
TODO: In 5.2 index scan cost calculation will be factored out into a
virtual function in class handler and we'll be able to remove this hack.
*/
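/*
  Illustrative example (hypothetical numbers): with
  myisam_block_size == 1024 and 4 attached children, block_size
  becomes 1024 / 4 == 256, i.e. the cost model assumes four times as
  many disk seeks per index tuple as for a single MyISAM table.
*/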
stats.block_size= 0;
if (file->tables)
stats.block_size= myisam_block_size / file->tables;
stats.update_time= 0;
#if SIZEOF_OFF_T > 4
ref_length=6; // Should be big enough
#else
ref_length=4; // Can't be > than my_off_t
#endif
if (flag & HA_STATUS_CONST)
{
if (table->s->key_parts && mrg_info.rec_per_key)
{
#ifdef HAVE_purify
/*
valgrind may be unhappy about this, because the optimizer may access
values between file->keys and table->s->key_parts that are
uninitialized. It's safe though: even if the optimizer decides to use
a key with such a number, it will result in an error later anyway.
*/
bzero((char*) table->key_info[0].rec_per_key,
sizeof(table->key_info[0].rec_per_key[0]) * table->s->key_parts);
#endif
memcpy((char*) table->key_info[0].rec_per_key,
(char*) mrg_info.rec_per_key,
sizeof(table->key_info[0].rec_per_key[0]) *
min(file->keys, table->s->key_parts));
}
}
if (flag & HA_STATUS_ERRKEY)
{
errkey= mrg_info.errkey;
my_store_ptr(dup_ref, ref_length, mrg_info.dupp_key_pos);
}
return 0;
}
int ha_myisammrg::extra(enum ha_extra_function operation)
{
if (operation == HA_EXTRA_ATTACH_CHILDREN)
{
int rc= attach_children();
if (!rc)
(void) extra(HA_EXTRA_NO_READCHECK); // Not needed in SQL
return(rc);
}
else if (operation == HA_EXTRA_DETACH_CHILDREN)
{
/*
Note that detach must not touch the children in any way.
They may have been closed at this point already.
*/
int rc= detach_children();
return(rc);
}
/* As this is just a mapping, we don't have to force the underlying
tables to be closed */
if (operation == HA_EXTRA_FORCE_REOPEN ||
operation == HA_EXTRA_PREPARE_FOR_DROP)
return 0;
return myrg_extra(file,operation,0);
}
int ha_myisammrg::reset(void)
{
return myrg_reset(file);
}
/* To be used with WRITE_CACHE, EXTRA_CACHE and BULK_INSERT_BEGIN */
int ha_myisammrg::extra_opt(enum ha_extra_function operation, ulong cache_size)
{
DBUG_ASSERT(this->file->children_attached);
if ((specialflag & SPECIAL_SAFE_MODE) && operation == HA_EXTRA_WRITE_CACHE)
return 0;
return myrg_extra(file, operation, (void*) &cache_size);
}
int ha_myisammrg::external_lock(THD *thd, int lock_type)
{
DBUG_ASSERT(this->file->children_attached);
return myrg_lock_database(file,lock_type);
}
uint ha_myisammrg::lock_count(void) const
{
/*
Return the real lock count even if the children are not attached.
This method is used for allocating memory. If we returned 0
to another thread (e.g. one doing FLUSH TABLE), and then attached the
children before that thread calls store_lock(), we would return
more locks in store_lock() than we claimed in lock_count(). The
other thread would overrun its memory.
*/
return file->tables;
}
THR_LOCK_DATA **ha_myisammrg::store_lock(THD *thd,
THR_LOCK_DATA **to,
enum thr_lock_type lock_type)
{
MYRG_TABLE *open_table;
/*
This method can be called while another thread is attaching the
children. If the processor reorders instructions or writes to memory,
'children_attached' could be set before 'open_tables' has all the
pointers to the children. Use of a mutex here and in
myrg_attach_children() forces consistent data.
*/
pthread_mutex_lock(&this->file->mutex);
/*
When the MERGE table is open, but not yet attached, other threads
could flush it, which means calling mysql_lock_abort_for_thread()
on this thread's TABLE. 'children_attached' is FALSE in this
situation. Since the table is not locked, return no lock data.
*/
if (!this->file->children_attached)
goto end; /* purecov: tested */
for (open_table=file->open_tables ;
open_table != file->end_table ;
open_table++)
{
*(to++)= &open_table->table->lock;
if (lock_type != TL_IGNORE && open_table->table->lock.type == TL_UNLOCK)
open_table->table->lock.type=lock_type;
}
end:
pthread_mutex_unlock(&this->file->mutex);
return to;
}
/* Find out database name and table name from a filename */
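/*
  For example (illustrative values only): given the child file name
  "./db1/t1.MYI", this yields db = "db1" and name = "t1".
*/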
static void split_file_name(const char *file_name,
LEX_STRING *db, LEX_STRING *name)
{
size_t dir_length, prefix_length;
char buff[FN_REFLEN];
db->length= 0;
strmake(buff, file_name, sizeof(buff)-1);
dir_length= dirname_length(buff);
if (dir_length > 1)
{
/* Get database */
buff[dir_length-1]= 0; // Remove end '/'
prefix_length= dirname_length(buff);
db->str= (char*) file_name+ prefix_length;
db->length= dir_length - prefix_length -1;
}
name->str= (char*) file_name+ dir_length;
name->length= (uint) (fn_ext(name->str) - name->str);
}
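/**
  @brief Fill in the UNION list and insert method for this MERGE table.

  Used e.g. by SHOW CREATE TABLE and ALTER TABLE: when the statement
  did not specify UNION or INSERT_METHOD explicitly, rebuild them from
  the children currently attached to the MERGE handler.
*/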
void ha_myisammrg::update_create_info(HA_CREATE_INFO *create_info)
{
DBUG_ENTER("ha_myisammrg::update_create_info");
if (!(create_info->used_fields & HA_CREATE_USED_UNION))
{
MYRG_TABLE *open_table;
THD *thd=current_thd;
create_info->merge_list.next= &create_info->merge_list.first;
create_info->merge_list.elements=0;
for (open_table=file->open_tables ;
open_table != file->end_table ;
open_table++)
{
TABLE_LIST *ptr;
LEX_STRING db, name;
LINT_INIT(db.str);
if (!(ptr = (TABLE_LIST *) thd->calloc(sizeof(TABLE_LIST))))
goto err;
split_file_name(open_table->table->filename, &db, &name);
if (!(ptr->table_name= thd->strmake(name.str, name.length)))
goto err;
if (db.length && !(ptr->db= thd->strmake(db.str, db.length)))
goto err;
create_info->merge_list.elements++;
(*create_info->merge_list.next) = (uchar*) ptr;
create_info->merge_list.next= (uchar**) &ptr->next_local;
}
*create_info->merge_list.next=0;
}
if (!(create_info->used_fields & HA_CREATE_USED_INSERT_METHOD))
{
create_info->merge_insert_method = file->merge_insert_method;
}
DBUG_VOID_RETURN;
err:
create_info->merge_list.elements=0;
create_info->merge_list.first=0;
DBUG_VOID_RETURN;
}
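/**
  @brief Create the .MRG meta file for a new MERGE table.

  Only a meta file listing the child table names is written; a MERGE
  table has no data files of its own. A statement such as (illustrative
  example):

    CREATE TABLE m1 (a INT) ENGINE=MRG_MyISAM UNION=(t1,t2)
      INSERT_METHOD=LAST;

  ends up here with the UNION list in create_info->merge_list.
*/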
int ha_myisammrg::create(const char *name, register TABLE *form,
HA_CREATE_INFO *create_info)
{
char buff[FN_REFLEN];
const char **table_names, **pos;
TABLE_LIST *tables= (TABLE_LIST*) create_info->merge_list.first;
THD *thd= current_thd;
size_t dirlgt= dirname_length(name);
DBUG_ENTER("ha_myisammrg::create");
/* Allocate a table_names array in thread mem_root. */
if (!(table_names= (const char**)
thd->alloc((create_info->merge_list.elements+1) * sizeof(char*))))
DBUG_RETURN(HA_ERR_OUT_OF_MEM);
/* Create child path names. */
for (pos= table_names; tables; tables= tables->next_local)
{
const char *table_name;
/*
Construct the path to the MyISAM table. Try to meet two conditions:
1.) allow including MyISAM tables from different databases, and
2.) allow for moving DATADIR around in the file system.
The first means that we need paths in the .MRG file. The second
means that we should not have absolute paths in the .MRG file.
The best we can do is to use 'mysql_data_home', which is '.'
in mysqld and may be an absolute path in an embedded server.
This means that it might not be possible to move the DATADIR of
an embedded server without changing the paths in the .MRG file.
Do the same even for temporary tables. MERGE children are now
opened through the table cache. They are opened by db.table_name,
not by their path name.
*/
uint length= build_table_filename(buff, sizeof(buff),
tables->db, tables->table_name, "", 0);
/*
If a MyISAM table is in the same directory as the MERGE table,
we use the table name without a path. This means that the
DATADIR can easily be moved even for an embedded server as long
as the MyISAM tables are from the same database as the MERGE table.
*/
if ((dirname_length(buff) == dirlgt) && ! memcmp(buff, name, dirlgt))
table_name= tables->table_name;
else
if (! (table_name= thd->strmake(buff, length)))
DBUG_RETURN(HA_ERR_OUT_OF_MEM); /* purecov: inspected */
*pos++= table_name;
}
*pos=0;
/* Create a MERGE meta file from the table_names array. */
DBUG_RETURN(myrg_create(fn_format(buff,name,"","",
MY_RESOLVE_SYMLINKS|
MY_UNPACK_FILENAME|MY_APPEND_EXT),
table_names,
create_info->merge_insert_method,
(my_bool) 0));
}
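/**
  @brief Append MERGE-specific clauses to SHOW CREATE TABLE output.

  Adds the INSERT_METHOD and UNION clauses, for example (illustrative):
  " INSERT_METHOD=LAST UNION=(`t1`,`db2`.`t2`)". Children from the
  current database are printed without a database qualifier.
*/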
void ha_myisammrg::append_create_info(String *packet)
{
const char *current_db;
size_t db_length;
THD *thd= current_thd;
MYRG_TABLE *open_table, *first;
if (file->merge_insert_method != MERGE_INSERT_DISABLED)
{
packet->append(STRING_WITH_LEN(" INSERT_METHOD="));
packet->append(get_type(&merge_insert_method,file->merge_insert_method-1));
}
/*
There is no sense in adding a UNION clause if no underlying
tables are specified.
*/
if (file->open_tables == file->end_table)
return;
packet->append(STRING_WITH_LEN(" UNION=("));
current_db= table->s->db.str;
db_length= table->s->db.length;
for (first=open_table=file->open_tables ;
open_table != file->end_table ;
open_table++)
{
LEX_STRING db, name;
LINT_INIT(db.str);
split_file_name(open_table->table->filename, &db, &name);
if (open_table != first)
packet->append(',');
/* Report database for mapped table if it isn't in current database */
if (db.length &&
(db_length != db.length ||
strncmp(current_db, db.str, db.length)))
{
append_identifier(thd, packet, db.str, db.length);
packet->append('.');
}
append_identifier(thd, packet, name.str, name.length);
}
packet->append(')');
}
bool ha_myisammrg::check_if_incompatible_data(HA_CREATE_INFO *info,
uint table_changes)
{
/*
For myisammrg, we should always re-generate the mapping file as this
is trivial to do
*/
return COMPATIBLE_DATA_NO;
}
int ha_myisammrg::check(THD* thd, HA_CHECK_OPT* check_opt)
{
return HA_ADMIN_OK;
}
ha_rows ha_myisammrg::records()
{
return myrg_records(file);
}
extern int myrg_panic(enum ha_panic_function flag);
int myisammrg_panic(handlerton *hton, ha_panic_function flag)
{
return myrg_panic(flag);
}
static int myisammrg_init(void *p)
{
handlerton *myisammrg_hton;
myisammrg_hton= (handlerton *)p;
myisammrg_hton->db_type= DB_TYPE_MRG_MYISAM;
myisammrg_hton->create= myisammrg_create_handler;
myisammrg_hton->panic= myisammrg_panic;
myisammrg_hton->flags= HTON_NO_PARTITION;
return 0;
}
struct st_mysql_storage_engine myisammrg_storage_engine=
{ MYSQL_HANDLERTON_INTERFACE_VERSION };
mysql_declare_plugin(myisammrg)
{
MYSQL_STORAGE_ENGINE_PLUGIN,
&myisammrg_storage_engine,
"MRG_MYISAM",
"MySQL AB",
"Collection of identical MyISAM tables",
PLUGIN_LICENSE_GPL,
myisammrg_init, /* Plugin Init */
NULL, /* Plugin Deinit */
0x0100, /* 1.0 */
NULL, /* status variables */
NULL, /* system variables */
NULL /* config options */
}
mysql_declare_plugin_end;