mariadb/sql/transaction.cc
Dmitry Lenev afd15c43a9 Implement new type-of-operation-aware metadata locks.
Add a wait-for graph based deadlock detector to the
MDL subsystem.

Fixes bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock" and
bug #37346 "innodb does not detect deadlock between update and
alter table".

The first bug manifested itself as an unwarranted abort of a
transaction with ER_LOCK_DEADLOCK error by a concurrent ALTER
statement, when this transaction tried to repeat use of a
table, which it has already used in a similar fashion before
ALTER started.

The second bug showed up as a deadlock between table-level
locks and InnoDB row locks, which was "detected" only after
innodb_lock_wait_timeout timeout.

A transaction would start using the table and modify a few
rows.
Then ALTER TABLE would come in, and start copying rows
into a temporary table. Eventually it would stumble on
the modified records and get blocked on a row lock.
The first transaction would try to do more updates, and get
blocked on thr_lock.c lock.
This situation of circular wait would only get resolved
by a timeout.

Both these bugs stemmed from inadequate solutions to the
problem of deadlocks occurring between different
locking subsystems.

In the first case we tried to avoid deadlocks between metadata
locking and table-level locking subsystems, when upgrading shared
metadata lock to exclusive one.
Transactions holding the shared lock on the table and waiting for
some table-level lock used to be aborted too aggressively.

We also allowed ALTER TABLE to start in presence of transactions
that modify the subject table. ALTER TABLE acquires
TL_WRITE_ALLOW_READ lock at start, and that block all writes
against the table (naturally, we don't want any writes to be lost
when switching the old and the new table). TL_WRITE_ALLOW_READ
lock, in turn, would block the started transaction on thr_lock.c
lock, should they do more updates. This, again, lead to the need
to abort such transactions.

The second bug occurred simply because we didn't have any
mechanism to detect deadlocks between the table-level locks
in thr_lock.c and row-level locks in InnoDB, other than
innodb_lock_wait_timeout.

This patch solves both these problems by moving lock conflicts
which are causing these deadlocks into the metadata locking
subsystem, thus making it possible to avoid or detect such
deadlocks inside MDL.

To do this we introduce new type-of-operation-aware metadata
locks, which allow MDL subsystem to know not only the fact that
transaction has used or is going to use some object but also what
kind of operation it has carried out or going to carry out on the
object.

This, along with the addition of a special kind of upgradable
metadata lock, allows ALTER TABLE to wait until all
transactions which has updated the table to go away.
This solves the second issue.
Another special type of upgradable metadata lock is acquired
by LOCK TABLE WRITE. This second lock type allows to solve the
first issue, since abortion of table-level locks in event of
DDL under LOCK TABLES becomes also unnecessary.

Below follows the list of incompatible changes introduced by
this patch:

- From now on, ALTER TABLE and CREATE/DROP TRIGGER SQL (i.e. those
  statements that acquire TL_WRITE_ALLOW_READ lock)
  wait for all transactions which has *updated* the table to
  complete.

- From now on, LOCK TABLES ... WRITE, REPAIR/OPTIMIZE TABLE
  (i.e. all statements which acquire TL_WRITE table-level lock) wait
  for all transaction which *updated or read* from the table
  to complete.
  As a consequence, innodb_table_locks=0 option no longer applies
  to LOCK TABLES ... WRITE.

- DROP DATABASE, DROP TABLE, RENAME TABLE no longer abort
  statements or transactions which use tables being dropped or
  renamed, and instead wait for these transactions to complete.

- Since LOCK TABLES WRITE now takes a special metadata lock,
  not compatible with with reads or writes against the subject table
  and transaction-wide, thr_lock.c deadlock avoidance algorithm
  that used to ensure absence of deadlocks between LOCK TABLES
  WRITE and other statements is no longer sufficient, even for
  MyISAM. The wait-for graph based deadlock detector of MDL
  subsystem may sometimes be necessary and is involved. This may
  lead to ER_LOCK_DEADLOCK error produced for multi-statement
  transactions even if these only use MyISAM:

  session 1:         session 2:
  begin;

  update t1 ...      lock table t2 write, t1 write;
                     -- gets a lock on t2, blocks on t1

  update t2 ...
  (ER_LOCK_DEADLOCK)

- Finally,  support of LOW_PRIORITY option for LOCK TABLES ... WRITE
  was abandoned.
  LOCK TABLE ... LOW_PRIORITY WRITE from now on has the same
  priority as the usual LOCK TABLE ... WRITE.
  SELECT HIGH PRIORITY no longer trumps LOCK TABLE ... WRITE  in
  the wait queue.

- We do not take upgradable metadata locks on implicitly
  locked tables. So if one has, say, a view v1 that uses
  table t1, and issues:
  LOCK TABLE v1 WRITE;
  FLUSH TABLE t1; -- (or just 'FLUSH TABLES'),
  an error is produced.
  In order to be able to perform DDL on a table under LOCK TABLES,
  the table must be locked explicitly in the LOCK TABLES list.
2010-02-01 14:43:06 +03:00

670 lines
16 KiB
C++

/* Copyright (C) 2008 Sun/MySQL
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#ifdef USE_PRAGMA_IMPLEMENTATION
#pragma implementation // gcc: Class implementation
#endif
#include "mysql_priv.h"
#include "transaction.h"
#include "rpl_handler.h"
/* Conditions under which the transaction state must not change. */
static bool trans_check(THD *thd)
{
enum xa_states xa_state= thd->transaction.xid_state.xa_state;
DBUG_ENTER("trans_check");
if (unlikely(thd->in_sub_stmt))
my_error(ER_COMMIT_NOT_ALLOWED_IN_SF_OR_TRG, MYF(0));
if (xa_state != XA_NOTR)
my_error(ER_XAER_RMFAIL, MYF(0), xa_state_names[xa_state]);
else
DBUG_RETURN(FALSE);
DBUG_RETURN(TRUE);
}
/**
Mark a XA transaction as rollback-only if the RM unilaterally
rolled back the transaction branch.
@note If a rollback was requested by the RM, this function sets
the appropriate rollback error code and transits the state
to XA_ROLLBACK_ONLY.
@return TRUE if transaction was rolled back or if the transaction
state is XA_ROLLBACK_ONLY. FALSE otherwise.
*/
static bool xa_trans_rolled_back(XID_STATE *xid_state)
{
if (xid_state->rm_error)
{
switch (xid_state->rm_error) {
case ER_LOCK_WAIT_TIMEOUT:
my_error(ER_XA_RBTIMEOUT, MYF(0));
break;
case ER_LOCK_DEADLOCK:
my_error(ER_XA_RBDEADLOCK, MYF(0));
break;
default:
my_error(ER_XA_RBROLLBACK, MYF(0));
}
xid_state->xa_state= XA_ROLLBACK_ONLY;
}
return (xid_state->xa_state == XA_ROLLBACK_ONLY);
}
/**
Begin a new transaction.
@note Beginning a transaction implicitly commits any current
transaction and releases existing locks.
@param thd Current thread
@param flags Transaction flags
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_begin(THD *thd, uint flags)
{
int res= FALSE;
DBUG_ENTER("trans_begin");
if (trans_check(thd))
DBUG_RETURN(TRUE);
thd->locked_tables_list.unlock_locked_tables(thd);
DBUG_ASSERT(!thd->locked_tables_mode);
if (trans_commit_implicit(thd))
DBUG_RETURN(TRUE);
/*
Release transactional metadata locks only after the
transaction has been committed.
*/
thd->mdl_context.release_transactional_locks();
thd->options|= OPTION_BEGIN;
thd->server_status|= SERVER_STATUS_IN_TRANS;
if (flags & MYSQL_START_TRANS_OPT_WITH_CONS_SNAPSHOT)
res= ha_start_consistent_snapshot(thd);
DBUG_RETURN(test(res));
}
/**
Commit the current transaction, making its changes permanent.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_commit(THD *thd)
{
int res;
DBUG_ENTER("trans_commit");
if (trans_check(thd))
DBUG_RETURN(TRUE);
thd->server_status&= ~SERVER_STATUS_IN_TRANS;
res= ha_commit_trans(thd, TRUE);
if (res)
/*
if res is non-zero, then ha_commit_trans has rolled back the
transaction, so the hooks for rollback will be called.
*/
RUN_HOOK(transaction, after_rollback, (thd, FALSE));
else
RUN_HOOK(transaction, after_commit, (thd, FALSE));
thd->options&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
thd->transaction.all.modified_non_trans_table= FALSE;
thd->lex->start_transaction_opt= 0;
DBUG_RETURN(test(res));
}
/**
Implicitly commit the current transaction.
@note A implicit commit does not releases existing table locks.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_commit_implicit(THD *thd)
{
bool res= FALSE;
DBUG_ENTER("trans_commit_implicit");
if (trans_check(thd))
DBUG_RETURN(TRUE);
if (thd->in_multi_stmt_transaction() || (thd->options & OPTION_TABLE_LOCK))
{
/* Safety if one did "drop table" on locked tables */
if (!thd->locked_tables_mode)
thd->options&= ~OPTION_TABLE_LOCK;
thd->server_status&= ~SERVER_STATUS_IN_TRANS;
res= test(ha_commit_trans(thd, TRUE));
}
thd->options&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
thd->transaction.all.modified_non_trans_table= FALSE;
DBUG_RETURN(res);
}
/**
Rollback the current transaction, canceling its changes.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_rollback(THD *thd)
{
int res;
DBUG_ENTER("trans_rollback");
if (trans_check(thd))
DBUG_RETURN(TRUE);
thd->server_status&= ~SERVER_STATUS_IN_TRANS;
res= ha_rollback_trans(thd, TRUE);
RUN_HOOK(transaction, after_rollback, (thd, FALSE));
thd->options&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
thd->transaction.all.modified_non_trans_table= FALSE;
thd->lex->start_transaction_opt= 0;
DBUG_RETURN(test(res));
}
/**
Commit the single statement transaction.
@note Note that if the autocommit is on, then the following call
inside InnoDB will commit or rollback the whole transaction
(= the statement). The autocommit mechanism built into InnoDB
is based on counting locks, but if the user has used LOCK
TABLES then that mechanism does not know to do the commit.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_commit_stmt(THD *thd)
{
DBUG_ENTER("trans_commit_stmt");
int res= FALSE;
if (thd->transaction.stmt.ha_list)
res= ha_commit_trans(thd, FALSE);
if (res)
/*
if res is non-zero, then ha_commit_trans has rolled back the
transaction, so the hooks for rollback will be called.
*/
RUN_HOOK(transaction, after_rollback, (thd, FALSE));
else
RUN_HOOK(transaction, after_commit, (thd, FALSE));
DBUG_RETURN(test(res));
}
/**
Rollback the single statement transaction.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_rollback_stmt(THD *thd)
{
DBUG_ENTER("trans_rollback_stmt");
if (thd->transaction.stmt.ha_list)
{
ha_rollback_trans(thd, FALSE);
if (thd->transaction_rollback_request && !thd->in_sub_stmt)
ha_rollback_trans(thd, TRUE);
}
RUN_HOOK(transaction, after_rollback, (thd, FALSE));
DBUG_RETURN(FALSE);
}
/* Find a named savepoint in the current transaction. */
static SAVEPOINT **
find_savepoint(THD *thd, LEX_STRING name)
{
SAVEPOINT **sv= &thd->transaction.savepoints;
while (*sv)
{
if (my_strnncoll(system_charset_info, (uchar *) name.str, name.length,
(uchar *) (*sv)->name, (*sv)->length) == 0)
break;
sv= &(*sv)->prev;
}
return sv;
}
/**
Set a named transaction savepoint.
@param thd Current thread
@param name Savepoint name
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_savepoint(THD *thd, LEX_STRING name)
{
SAVEPOINT **sv, *newsv;
DBUG_ENTER("trans_savepoint");
if (!(thd->in_multi_stmt_transaction() || thd->in_sub_stmt) ||
!opt_using_transactions)
DBUG_RETURN(FALSE);
sv= find_savepoint(thd, name);
if (*sv) /* old savepoint of the same name exists */
{
newsv= *sv;
ha_release_savepoint(thd, *sv);
*sv= (*sv)->prev;
}
else if ((newsv= (SAVEPOINT *) alloc_root(&thd->transaction.mem_root,
savepoint_alloc_size)) == NULL)
{
my_error(ER_OUT_OF_RESOURCES, MYF(0));
DBUG_RETURN(TRUE);
}
newsv->name= strmake_root(&thd->transaction.mem_root, name.str, name.length);
newsv->length= name.length;
/*
if we'll get an error here, don't add new savepoint to the list.
we'll lose a little bit of memory in transaction mem_root, but it'll
be free'd when transaction ends anyway
*/
if (ha_savepoint(thd, newsv))
DBUG_RETURN(TRUE);
newsv->prev= thd->transaction.savepoints;
thd->transaction.savepoints= newsv;
/*
Remember the last acquired lock before the savepoint was set.
This is used as a marker to only release locks acquired after
the setting of this savepoint.
Note: this works just fine if we're under LOCK TABLES,
since mdl_savepoint() is guaranteed to be beyond
the last locked table. This allows to release some
locks acquired during LOCK TABLES.
*/
newsv->mdl_savepoint = thd->mdl_context.mdl_savepoint();
DBUG_RETURN(FALSE);
}
/**
Rollback a transaction to the named savepoint.
@note Modifications that the current transaction made to
rows after the savepoint was set are undone in the
rollback.
@note Savepoints that were set at a later time than the
named savepoint are deleted.
@param thd Current thread
@param name Savepoint name
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_rollback_to_savepoint(THD *thd, LEX_STRING name)
{
int res= FALSE;
SAVEPOINT *sv= *find_savepoint(thd, name);
DBUG_ENTER("trans_rollback_to_savepoint");
if (sv == NULL)
{
my_error(ER_SP_DOES_NOT_EXIST, MYF(0), "SAVEPOINT", name.str);
DBUG_RETURN(TRUE);
}
if (ha_rollback_to_savepoint(thd, sv))
res= TRUE;
else if (((thd->options & OPTION_KEEP_LOG) ||
thd->transaction.all.modified_non_trans_table) &&
!thd->slave_thread)
push_warning(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
ER_WARNING_NOT_COMPLETE_ROLLBACK,
ER(ER_WARNING_NOT_COMPLETE_ROLLBACK));
thd->transaction.savepoints= sv;
/*
Release metadata locks that were acquired during this savepoint unit.
*/
if (!res)
thd->mdl_context.rollback_to_savepoint(sv->mdl_savepoint);
DBUG_RETURN(test(res));
}
/**
Remove the named savepoint from the set of savepoints of
the current transaction.
@note No commit or rollback occurs. It is an error if the
savepoint does not exist.
@param thd Current thread
@param name Savepoint name
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_release_savepoint(THD *thd, LEX_STRING name)
{
int res= FALSE;
SAVEPOINT *sv= *find_savepoint(thd, name);
DBUG_ENTER("trans_release_savepoint");
if (sv == NULL)
{
my_error(ER_SP_DOES_NOT_EXIST, MYF(0), "SAVEPOINT", name.str);
DBUG_RETURN(TRUE);
}
if (ha_release_savepoint(thd, sv))
res= TRUE;
thd->transaction.savepoints= sv->prev;
DBUG_RETURN(test(res));
}
/**
Starts an XA transaction with the given xid value.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_xa_start(THD *thd)
{
enum xa_states xa_state= thd->transaction.xid_state.xa_state;
DBUG_ENTER("trans_xa_start");
if (xa_state == XA_IDLE && thd->lex->xa_opt == XA_RESUME)
{
bool not_equal= !thd->transaction.xid_state.xid.eq(thd->lex->xid);
if (not_equal)
my_error(ER_XAER_NOTA, MYF(0));
else
thd->transaction.xid_state.xa_state= XA_ACTIVE;
DBUG_RETURN(not_equal);
}
/* TODO: JOIN is not supported yet. */
if (thd->lex->xa_opt != XA_NONE)
my_error(ER_XAER_INVAL, MYF(0));
else if (xa_state != XA_NOTR)
my_error(ER_XAER_RMFAIL, MYF(0), xa_state_names[xa_state]);
else if (thd->locked_tables_mode || thd->active_transaction())
my_error(ER_XAER_OUTSIDE, MYF(0));
else if (xid_cache_search(thd->lex->xid))
my_error(ER_XAER_DUPID, MYF(0));
else if (!trans_begin(thd))
{
DBUG_ASSERT(thd->transaction.xid_state.xid.is_null());
thd->transaction.xid_state.xa_state= XA_ACTIVE;
thd->transaction.xid_state.rm_error= 0;
thd->transaction.xid_state.xid.set(thd->lex->xid);
xid_cache_insert(&thd->transaction.xid_state);
DBUG_RETURN(FALSE);
}
DBUG_RETURN(TRUE);
}
/**
Put a XA transaction in the IDLE state.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_xa_end(THD *thd)
{
DBUG_ENTER("trans_xa_end");
/* TODO: SUSPEND and FOR MIGRATE are not supported yet. */
if (thd->lex->xa_opt != XA_NONE)
my_error(ER_XAER_INVAL, MYF(0));
else if (thd->transaction.xid_state.xa_state != XA_ACTIVE)
my_error(ER_XAER_RMFAIL, MYF(0),
xa_state_names[thd->transaction.xid_state.xa_state]);
else if (!thd->transaction.xid_state.xid.eq(thd->lex->xid))
my_error(ER_XAER_NOTA, MYF(0));
else if (!xa_trans_rolled_back(&thd->transaction.xid_state))
thd->transaction.xid_state.xa_state= XA_IDLE;
DBUG_RETURN(thd->transaction.xid_state.xa_state != XA_IDLE);
}
/**
Put a XA transaction in the PREPARED state.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_xa_prepare(THD *thd)
{
DBUG_ENTER("trans_xa_prepare");
if (thd->transaction.xid_state.xa_state != XA_IDLE)
my_error(ER_XAER_RMFAIL, MYF(0),
xa_state_names[thd->transaction.xid_state.xa_state]);
else if (!thd->transaction.xid_state.xid.eq(thd->lex->xid))
my_error(ER_XAER_NOTA, MYF(0));
else if (ha_prepare(thd))
{
xid_cache_delete(&thd->transaction.xid_state);
thd->transaction.xid_state.xa_state= XA_NOTR;
my_error(ER_XA_RBROLLBACK, MYF(0));
}
else
thd->transaction.xid_state.xa_state= XA_PREPARED;
DBUG_RETURN(thd->transaction.xid_state.xa_state != XA_PREPARED);
}
/**
Commit and terminate the a XA transaction.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_xa_commit(THD *thd)
{
bool res= TRUE;
enum xa_states xa_state= thd->transaction.xid_state.xa_state;
DBUG_ENTER("trans_xa_commit");
if (!thd->transaction.xid_state.xid.eq(thd->lex->xid))
{
XID_STATE *xs= xid_cache_search(thd->lex->xid);
res= !xs || xs->in_thd;
if (res)
my_error(ER_XAER_NOTA, MYF(0));
else
{
res= xa_trans_rolled_back(xs);
ha_commit_or_rollback_by_xid(thd->lex->xid, !res);
xid_cache_delete(xs);
}
DBUG_RETURN(res);
}
if (xa_trans_rolled_back(&thd->transaction.xid_state))
{
if ((res= test(ha_rollback_trans(thd, TRUE))))
my_error(ER_XAER_RMERR, MYF(0));
}
else if (xa_state == XA_IDLE && thd->lex->xa_opt == XA_ONE_PHASE)
{
int r= ha_commit_trans(thd, TRUE);
if ((res= test(r)))
my_error(r == 1 ? ER_XA_RBROLLBACK : ER_XAER_RMERR, MYF(0));
}
else if (xa_state == XA_PREPARED && thd->lex->xa_opt == XA_NONE)
{
if (thd->global_read_lock.wait_if_global_read_lock(thd, FALSE, FALSE))
{
ha_rollback_trans(thd, TRUE);
my_error(ER_XAER_RMERR, MYF(0));
}
else
{
res= test(ha_commit_one_phase(thd, 1));
if (res)
my_error(ER_XAER_RMERR, MYF(0));
thd->global_read_lock.start_waiting_global_read_lock(thd);
}
}
else
{
my_error(ER_XAER_RMFAIL, MYF(0), xa_state_names[xa_state]);
DBUG_RETURN(TRUE);
}
thd->options&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
thd->transaction.all.modified_non_trans_table= FALSE;
thd->server_status&= ~SERVER_STATUS_IN_TRANS;
xid_cache_delete(&thd->transaction.xid_state);
thd->transaction.xid_state.xa_state= XA_NOTR;
DBUG_RETURN(res);
}
/**
Roll back and terminate a XA transaction.
@param thd Current thread
@retval FALSE Success
@retval TRUE Failure
*/
bool trans_xa_rollback(THD *thd)
{
bool res= TRUE;
enum xa_states xa_state= thd->transaction.xid_state.xa_state;
DBUG_ENTER("trans_xa_rollback");
if (!thd->transaction.xid_state.xid.eq(thd->lex->xid))
{
XID_STATE *xs= xid_cache_search(thd->lex->xid);
if (!xs || xs->in_thd)
my_error(ER_XAER_NOTA, MYF(0));
else
{
xa_trans_rolled_back(xs);
ha_commit_or_rollback_by_xid(thd->lex->xid, 0);
xid_cache_delete(xs);
}
DBUG_RETURN(thd->stmt_da->is_error());
}
if (xa_state != XA_IDLE && xa_state != XA_PREPARED && xa_state != XA_ROLLBACK_ONLY)
{
my_error(ER_XAER_RMFAIL, MYF(0), xa_state_names[xa_state]);
DBUG_RETURN(TRUE);
}
/*
Resource Manager error is meaningless at this point, as we perform
explicit rollback request by user. We must reset rm_error before
calling ha_rollback(), so thd->transaction.xid structure gets reset
by ha_rollback()/THD::transaction::cleanup().
*/
thd->transaction.xid_state.rm_error= 0;
if ((res= test(ha_rollback_trans(thd, TRUE))))
my_error(ER_XAER_RMERR, MYF(0));
thd->options&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
thd->transaction.all.modified_non_trans_table= FALSE;
thd->server_status&= ~SERVER_STATUS_IN_TRANS;
xid_cache_delete(&thd->transaction.xid_state);
thd->transaction.xid_state.xa_state= XA_NOTR;
DBUG_RETURN(res);
}