mariadb/sql/sql_delete.cc
Dmitry Lenev afd15c43a9 Implement new type-of-operation-aware metadata locks.
Add a wait-for graph based deadlock detector to the
MDL subsystem.

Fixes bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock" and
bug #37346 "innodb does not detect deadlock between update and
alter table".

The first bug manifested itself as an unwarranted abort of a
transaction with ER_LOCK_DEADLOCK error by a concurrent ALTER
statement, when this transaction tried to reuse a table that it
had already used in a similar fashion before ALTER started.

The second bug showed up as a deadlock between table-level
locks and InnoDB row locks, which was "detected" only after
innodb_lock_wait_timeout had expired.

A transaction would start using the table and modify a few
rows.
Then ALTER TABLE would come in, and start copying rows
into a temporary table. Eventually it would stumble on
the modified records and get blocked on a row lock.
The first transaction would try to do more updates, and get
blocked on thr_lock.c lock.
This situation of circular wait would only get resolved
by a timeout.

Both these bugs stemmed from inadequate solutions to the
problem of deadlocks occurring between different
locking subsystems.

In the first case we tried to avoid deadlocks between metadata
locking and table-level locking subsystems, when upgrading shared
metadata lock to exclusive one.
Transactions holding the shared lock on the table and waiting for
some table-level lock used to be aborted too aggressively.

We also allowed ALTER TABLE to start in presence of transactions
that modify the subject table. ALTER TABLE acquires a
TL_WRITE_ALLOW_READ lock at start, and that blocks all writes
against the table (naturally, we don't want any writes to be lost
when switching the old and the new table). The TL_WRITE_ALLOW_READ
lock, in turn, would block the already started transactions on a
thr_lock.c lock, should they do more updates. This, again, led to
the need to abort such transactions.

The second bug occurred simply because we didn't have any
mechanism to detect deadlocks between the table-level locks
in thr_lock.c and row-level locks in InnoDB, other than
innodb_lock_wait_timeout.

This patch solves both these problems by moving lock conflicts
which are causing these deadlocks into the metadata locking
subsystem, thus making it possible to avoid or detect such
deadlocks inside MDL.

To do this we introduce new type-of-operation-aware metadata
locks, which allow the MDL subsystem to know not only the fact that
a transaction has used or is going to use some object but also what
kind of operation it has carried out or is going to carry out on
the object.

This, along with the addition of a special kind of upgradable
metadata lock, allows ALTER TABLE to wait until all
transactions which have updated the table have completed.
This solves the second issue.
Another special type of upgradable metadata lock is acquired
by LOCK TABLE WRITE. This second lock type solves the
first issue, since aborting table-level locks in the event of
DDL under LOCK TABLES also becomes unnecessary.
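
As a rough illustration of what "type-of-operation-aware" means, the
conflict rules described above can be sketched as a small
compatibility function. This is only a sketch under stated
assumptions: the OpKind names and must_wait() are hypothetical and
are not the lock types or API of the MDL subsystem, and the rule that
two ALTER-like locks conflict with each other is an assumption that
this message does not state.

```cpp
#include <cassert>

// Hypothetical operation kinds, for illustration only. These are NOT the
// lock types defined by the MDL subsystem.
enum class OpKind {
  Read,            // the transaction has read from the table
  Write,           // the transaction has updated the table
  AlterLike,       // upgradable lock: tolerates readers, blocks writers
  LockTablesWrite  // upgradable lock: blocks both readers and writers
};

// Returns true if a request of kind `requested` has to wait while another
// transaction holds a lock of kind `held` on the same table.
bool must_wait(OpKind requested, OpKind held) {
  // LOCK TABLES ... WRITE is incompatible with every other use of the table.
  if (requested == OpKind::LockTablesWrite || held == OpKind::LockTablesWrite)
    return true;
  // An ALTER-like statement waits for transactions that updated the table.
  // Assumption: it also excludes a second ALTER-like statement.
  if (requested == OpKind::AlterLike)
    return held == OpKind::Write || held == OpKind::AlterLike;
  // Writers wait while an ALTER-like lock is held; readers do not.
  if (held == OpKind::AlterLike)
    return requested == OpKind::Write;
  // Plain reads and writes do not conflict at the metadata level.
  return false;
}

// Small self-check mirroring the two rules stated in the text.
void check_examples() {
  assert(must_wait(OpKind::AlterLike, OpKind::Write));
  assert(must_wait(OpKind::LockTablesWrite, OpKind::Read));
}
```

Because the kind of operation is part of the lock, the conflict is
visible when the lock is requested, so the requester can simply wait
instead of aborting the other transaction later.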

Below follows the list of incompatible changes introduced by
this patch:

- From now on, ALTER TABLE and CREATE/DROP TRIGGER SQL (i.e. those
  statements that acquire TL_WRITE_ALLOW_READ lock)
  wait for all transactions which have *updated* the table to
  complete.

- From now on, LOCK TABLES ... WRITE, REPAIR/OPTIMIZE TABLE
  (i.e. all statements which acquire TL_WRITE table-level lock) wait
  for all transactions which have *updated or read* from the table
  to complete.
  As a consequence, innodb_table_locks=0 option no longer applies
  to LOCK TABLES ... WRITE.

- DROP DATABASE, DROP TABLE, RENAME TABLE no longer abort
  statements or transactions which use tables being dropped or
  renamed, and instead wait for these transactions to complete.

- Since LOCK TABLES WRITE now takes a special metadata lock,
  not compatible with reads or writes against the subject table
  and transaction-wide, thr_lock.c deadlock avoidance algorithm
  that used to ensure absence of deadlocks between LOCK TABLES
  WRITE and other statements is no longer sufficient, even for
  MyISAM. The wait-for graph based deadlock detector of the MDL
  subsystem is now involved and may sometimes be needed to
  resolve such deadlocks. This may lead to an ER_LOCK_DEADLOCK
  error being produced for multi-statement transactions even if
  these only use MyISAM:

  session 1:         session 2:
  begin;

  update t1 ...      lock table t2 write, t1 write;
                     -- gets a lock on t2, blocks on t1

  update t2 ...
  (ER_LOCK_DEADLOCK)

- Finally, support of LOW_PRIORITY option for LOCK TABLES ... WRITE
  was abandoned.
  LOCK TABLE ... LOW_PRIORITY WRITE from now on has the same
  priority as the usual LOCK TABLE ... WRITE.
  SELECT HIGH_PRIORITY no longer trumps LOCK TABLE ... WRITE in
  the wait queue.

- We do not take upgradable metadata locks on implicitly
  locked tables. So if one has, say, a view v1 that uses
  table t1, and issues:
  LOCK TABLE v1 WRITE;
  FLUSH TABLE t1; -- (or just 'FLUSH TABLES'),
  an error is produced.
  In order to be able to perform DDL on a table under LOCK TABLES,
  the table must be locked explicitly in the LOCK TABLES list.
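
The two-session MyISAM example in the list above can be modelled with
a minimal wait-for graph. The sketch below is an illustration only,
assuming one node per session and one edge per "waits for" relation;
WaitForGraph and its methods are hypothetical names and do not
reflect how the MDL deadlock detector is implemented.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified wait-for graph. An edge A -> B means
// "session A waits for a lock held by session B".
class WaitForGraph {
public:
  // Record that `waiter` is blocked on a lock held by `holder`.
  void add_wait(const std::string &waiter, const std::string &holder) {
    edges_[waiter].insert(holder);
  }

  // True if `start` can reach itself by following wait edges, i.e. it
  // takes part in a circular wait.
  bool in_deadlock(const std::string &start) const {
    std::set<std::string> visited;
    return reaches(start, start, visited);
  }

private:
  // Depth-first search from `from`, looking for an edge back to `target`.
  bool reaches(const std::string &from, const std::string &target,
               std::set<std::string> &visited) const {
    auto it = edges_.find(from);
    if (it == edges_.end())
      return false;  // `from` is not waiting for anybody
    for (const std::string &next : it->second) {
      if (next == target)
        return true;
      // insert().second is true only the first time a node is visited.
      if (visited.insert(next).second && reaches(next, target, visited))
        return true;
    }
    return false;
  }

  std::map<std::string, std::set<std::string>> edges_;
};

// Replays the scenario from the list above and reports whether session 1
// ends up in a circular wait.
bool replay_myisam_example() {
  WaitForGraph g;
  // session 2: LOCK TABLE t2 WRITE, t1 WRITE; gets t2, blocks on t1,
  // which session 1 has updated.
  g.add_wait("session 2", "session 1");
  assert(!g.in_deadlock("session 2"));
  // session 1: UPDATE t2 ...; blocks on t2, which session 2 has locked.
  g.add_wait("session 1", "session 2");
  return g.in_deadlock("session 1");
}
```

In this model the second edge closes the cycle, which corresponds to
the point where session 1 receives ER_LOCK_DEADLOCK instead of
waiting for a timeout.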
2010-02-01 14:43:06 +03:00


/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
Delete of records and truncate of tables.
Multi-table deletes were introduced by Monty and Sinisa
*/
#include "mysql_priv.h"
#include "sql_select.h"
#include "sp_head.h"
#include "sql_trigger.h"
#include "transaction.h"
/**
Implement DELETE SQL word.
@note Like implementations of other DDL/DML in MySQL, this function
relies on the caller to close the thread tables. This is done at the
end of dispatch_command().
*/
bool mysql_delete(THD *thd, TABLE_LIST *table_list, COND *conds,
SQL_LIST *order, ha_rows limit, ulonglong options,
bool reset_auto_increment)
{
bool will_batch;
int error, loc_error;
TABLE *table;
SQL_SELECT *select=0;
READ_RECORD info;
bool using_limit=limit != HA_POS_ERROR;
bool transactional_table, safe_update, const_cond;
bool const_cond_result;
ha_rows deleted= 0;
bool triggers_applicable;
uint usable_index= MAX_KEY;
SELECT_LEX *select_lex= &thd->lex->select_lex;
THD::killed_state killed_status= THD::NOT_KILLED;
DBUG_ENTER("mysql_delete");
THD::enum_binlog_query_type query_type=
thd->lex->sql_command == SQLCOM_TRUNCATE ?
THD::STMT_QUERY_TYPE :
THD::ROW_QUERY_TYPE;
if (open_and_lock_tables(thd, table_list))
DBUG_RETURN(TRUE);
if (!(table= table_list->table))
{
my_error(ER_VIEW_DELETE_MERGE_VIEW, MYF(0),
table_list->view_db.str, table_list->view_name.str);
DBUG_RETURN(TRUE);
}
thd_proc_info(thd, "init");
table->map=1;
if (mysql_prepare_delete(thd, table_list, &conds))
DBUG_RETURN(TRUE);
/* check ORDER BY even if it can be ignored */
if (order && order->elements)
{
TABLE_LIST tables;
List<Item> fields;
List<Item> all_fields;
bzero((char*) &tables,sizeof(tables));
tables.table = table;
tables.alias = table_list->alias;
if (select_lex->setup_ref_array(thd, order->elements) ||
setup_order(thd, select_lex->ref_pointer_array, &tables,
fields, all_fields, (ORDER*) order->first))
{
delete select;
free_underlaid_joins(thd, &thd->lex->select_lex);
DBUG_RETURN(TRUE);
}
}
const_cond= (!conds || conds->const_item());
safe_update=test(thd->options & OPTION_SAFE_UPDATES);
if (safe_update && const_cond)
{
my_message(ER_UPDATE_WITHOUT_KEY_IN_SAFE_MODE,
ER(ER_UPDATE_WITHOUT_KEY_IN_SAFE_MODE), MYF(0));
DBUG_RETURN(TRUE);
}
select_lex->no_error= thd->lex->ignore;
const_cond_result= const_cond && (!conds || conds->val_int());
if (thd->is_error())
{
/* Error evaluating val_int(). */
DBUG_RETURN(TRUE);
}
/*
Test if the user wants to delete all rows and deletion doesn't have
any side-effects (because of triggers), so we can use optimized
handler::delete_all_rows() method.
We implement fast TRUNCATE for InnoDB even if triggers are
present. TRUNCATE ignores triggers.
We can use delete_all_rows() if and only if:
- We allow new functions (not using option --skip-new), and are
not in safe mode (not using option --safe-mode)
- There is no limit clause
- The condition is constant
- If there is a condition, then it produces a non-zero value
- If the current command is DELETE FROM with no where clause
(i.e., not TRUNCATE) then:
- We should not be binlogging this statement row-based, and
- there should be no delete triggers associated with the table.
*/
if (!using_limit && const_cond_result &&
!(specialflag & (SPECIAL_NO_NEW_FUNC | SPECIAL_SAFE_MODE)) &&
(thd->lex->sql_command == SQLCOM_TRUNCATE ||
(!thd->current_stmt_binlog_row_based &&
!(table->triggers && table->triggers->has_delete_triggers()))))
{
/* Update the table->file->stats.records number */
table->file->info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK);
ha_rows const maybe_deleted= table->file->stats.records;
DBUG_PRINT("debug", ("Trying to use delete_all_rows()"));
if (!(error=table->file->ha_delete_all_rows()))
{
/*
If delete_all_rows() is used, it is not possible to log the
query in row format, so we have to log it in statement format.
*/
query_type= THD::STMT_QUERY_TYPE;
error= -1; // ok
deleted= maybe_deleted;
goto cleanup;
}
if (error != HA_ERR_WRONG_COMMAND)
{
table->file->print_error(error,MYF(0));
error=0;
goto cleanup;
}
/* Handler didn't support fast delete; Delete rows one by one */
}
if (conds)
{
Item::cond_result result;
conds= remove_eq_conds(thd, conds, &result);
if (result == Item::COND_FALSE) // Impossible where
limit= 0;
}
#ifdef WITH_PARTITION_STORAGE_ENGINE
if (prune_partitions(thd, table, conds))
{
free_underlaid_joins(thd, select_lex);
thd->row_count_func= 0;
my_ok(thd, (ha_rows) thd->row_count_func); // No matching records
DBUG_RETURN(0);
}
#endif
/* Update the table->file->stats.records number */
table->file->info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK);
table->covering_keys.clear_all();
table->quick_keys.clear_all(); // Can't use 'only index'
select=make_select(table, 0, 0, conds, 0, &error);
if (error)
DBUG_RETURN(TRUE);
if ((select && select->check_quick(thd, safe_update, limit)) || !limit)
{
delete select;
free_underlaid_joins(thd, select_lex);
thd->row_count_func= 0;
/*
Error was already created by quick select evaluation (check_quick()).
TODO: Add error code output parameter to Item::val_xxx() methods.
Currently they rely on the user checking DA for
errors when unwinding the stack after calling Item::val_xxx().
*/
if (thd->is_error())
DBUG_RETURN(TRUE);
my_ok(thd, (ha_rows) thd->row_count_func);
/*
We don't need to call reset_auto_increment in this case, because
mysql_truncate always gives a NULL conds argument, hence we never
get here.
*/
DBUG_RETURN(0); // Nothing to delete
}
/* If running in safe sql mode, don't allow updates without keys */
if (table->quick_keys.is_clear_all())
{
thd->server_status|=SERVER_QUERY_NO_INDEX_USED;
if (safe_update && !using_limit)
{
delete select;
free_underlaid_joins(thd, select_lex);
my_message(ER_UPDATE_WITHOUT_KEY_IN_SAFE_MODE,
ER(ER_UPDATE_WITHOUT_KEY_IN_SAFE_MODE), MYF(0));
DBUG_RETURN(TRUE);
}
}
if (options & OPTION_QUICK)
(void) table->file->extra(HA_EXTRA_QUICK);
if (order && order->elements)
{
uint length= 0;
SORT_FIELD *sortorder;
ha_rows examined_rows;
if ((!select || table->quick_keys.is_clear_all()) && limit != HA_POS_ERROR)
usable_index= get_index_for_order(table, (ORDER*)(order->first), limit);
if (usable_index == MAX_KEY)
{
table->sort.io_cache= (IO_CACHE *) my_malloc(sizeof(IO_CACHE),
MYF(MY_FAE | MY_ZEROFILL));
if (!(sortorder= make_unireg_sortorder((ORDER*) order->first,
&length, NULL)) ||
(table->sort.found_records = filesort(thd, table, sortorder, length,
select, HA_POS_ERROR, 1,
&examined_rows))
== HA_POS_ERROR)
{
delete select;
free_underlaid_joins(thd, &thd->lex->select_lex);
DBUG_RETURN(TRUE);
}
/*
Filesort has already found and selected the rows we want to delete,
so we don't need the where clause
*/
delete select;
free_underlaid_joins(thd, select_lex);
select= 0;
}
}
/* If quick select is used, initialize it before retrieving rows. */
if (select && select->quick && select->quick->reset())
{
delete select;
free_underlaid_joins(thd, select_lex);
DBUG_RETURN(TRUE);
}
if (usable_index==MAX_KEY)
init_read_record(&info, thd, table, select, 1, 1, FALSE);
else
init_read_record_idx(&info, thd, table, 1, usable_index);
init_ftfuncs(thd, select_lex, 1);
thd_proc_info(thd, "updating");
/* NOTE: TRUNCATE must not invoke triggers. */
triggers_applicable= table->triggers &&
thd->lex->sql_command != SQLCOM_TRUNCATE;
if (triggers_applicable &&
table->triggers->has_triggers(TRG_EVENT_DELETE,
TRG_ACTION_AFTER))
{
/*
The table has AFTER DELETE triggers that might access the subject table
and therefore might need the delete to be done immediately. So we turn off
the batching.
*/
(void) table->file->extra(HA_EXTRA_DELETE_CANNOT_BATCH);
will_batch= FALSE;
}
else
will_batch= !table->file->start_bulk_delete();
table->mark_columns_needed_for_delete();
while (!(error=info.read_record(&info)) && !thd->killed &&
! thd->is_error())
{
// thd->is_error() is tested to disallow delete row on error
if (!(select && select->skip_record())&& ! thd->is_error() )
{
if (triggers_applicable &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_BEFORE, FALSE))
{
error= 1;
break;
}
if (!(error= table->file->ha_delete_row(table->record[0])))
{
deleted++;
if (triggers_applicable &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_AFTER, FALSE))
{
error= 1;
break;
}
if (!--limit && using_limit)
{
error= -1;
break;
}
}
else
{
table->file->print_error(error,MYF(0));
/*
In < 4.0.14 we set the error number to 0 here, but that
was not sensible, because then MySQL would not roll back the
failed DELETE, and also wrote it to the binlog. For MyISAM
tables a DELETE probably never should fail (?), but for
InnoDB it can fail in a FOREIGN KEY error or an
out-of-tablespace error.
*/
error= 1;
break;
}
}
else
table->file->unlock_row(); // Row failed selection, release lock on it
}
killed_status= thd->killed;
if (killed_status != THD::NOT_KILLED || thd->is_error())
error= 1; // Aborted
if (will_batch && (loc_error= table->file->end_bulk_delete()))
{
if (error != 1)
table->file->print_error(loc_error,MYF(0));
error=1;
}
thd_proc_info(thd, "end");
end_read_record(&info);
if (options & OPTION_QUICK)
(void) table->file->extra(HA_EXTRA_NORMAL);
if (reset_auto_increment && (error < 0))
{
/*
We're really doing a truncate and need to reset the table's
auto-increment counter.
*/
int error2= table->file->ha_reset_auto_increment(0);
if (error2 && (error2 != HA_ERR_WRONG_COMMAND))
{
table->file->print_error(error2, MYF(0));
error= 1;
}
}
cleanup:
/*
Invalidate the table in the query cache if something changed. This must
be before binlog writing and ha_autocommit_...
*/
if (deleted)
{
query_cache_invalidate3(thd, table_list, 1);
}
delete select;
transactional_table= table->file->has_transactions();
if (!transactional_table && deleted > 0)
thd->transaction.stmt.modified_non_trans_table= TRUE;
/* See similar binlogging code in sql_update.cc, for comments */
if ((error < 0) || thd->transaction.stmt.modified_non_trans_table)
{
if (mysql_bin_log.is_open())
{
bool const is_trans=
thd->lex->sql_command == SQLCOM_TRUNCATE ?
FALSE :
transactional_table;
int errcode= 0;
if (error < 0)
thd->clear_error();
else
errcode= query_error_code(thd, killed_status == THD::NOT_KILLED);
/*
[binlog]: If 'handler::delete_all_rows()' was called and the
storage engine does not inject the rows itself, we replicate
statement-based; otherwise, 'ha_delete_row()' was used to
delete specific rows which we might log row-based.
Note that TRUNCATE TABLE is not transactional and should
therefore be treated as a DDL.
*/
int log_result= thd->binlog_query(query_type,
thd->query(), thd->query_length(),
is_trans, FALSE, errcode);
if (log_result)
{
error=1;
}
}
if (thd->transaction.stmt.modified_non_trans_table)
thd->transaction.all.modified_non_trans_table= TRUE;
}
DBUG_ASSERT(transactional_table || !deleted || thd->transaction.stmt.modified_non_trans_table);
free_underlaid_joins(thd, select_lex);
if (error < 0 || (thd->lex->ignore && !thd->is_fatal_error))
{
/*
If a TRUNCATE TABLE was issued, the number of rows should be reported as
zero since the exact number is unknown.
*/
thd->row_count_func= reset_auto_increment ? 0 : deleted;
my_ok(thd, (ha_rows) thd->row_count_func);
DBUG_PRINT("info",("%ld records deleted",(long) deleted));
}
DBUG_RETURN(error >= 0 || thd->is_error());
}
/*
Prepare items in DELETE statement
SYNOPSIS
mysql_prepare_delete()
thd - thread handler
table_list - global/local table list
conds - conditions
RETURN VALUE
FALSE OK
TRUE error
*/
int mysql_prepare_delete(THD *thd, TABLE_LIST *table_list, Item **conds)
{
Item *fake_conds= 0;
SELECT_LEX *select_lex= &thd->lex->select_lex;
DBUG_ENTER("mysql_prepare_delete");
List<Item> all_fields;
/*
Statement-based replication of DELETE ... LIMIT is not safe as order of
rows is not defined, so in mixed mode we go to row-based.
Note that we may consider a statement as safe if ORDER BY primary_key
is present. However it may confuse users to see very similar statements
replicated differently.
*/
if (thd->lex->current_select->select_limit)
{
thd->lex->set_stmt_unsafe();
thd->set_current_stmt_binlog_row_based_if_mixed();
}
thd->lex->allow_sum_func= 0;
if (setup_tables_and_check_access(thd, &thd->lex->select_lex.context,
&thd->lex->select_lex.top_join_list,
table_list,
&select_lex->leaf_tables, FALSE,
DELETE_ACL, SELECT_ACL) ||
setup_conds(thd, table_list, select_lex->leaf_tables, conds) ||
setup_ftfuncs(select_lex))
DBUG_RETURN(TRUE);
if (!table_list->updatable || check_key_in_view(thd, table_list))
{
my_error(ER_NON_UPDATABLE_TABLE, MYF(0), table_list->alias, "DELETE");
DBUG_RETURN(TRUE);
}
{
TABLE_LIST *duplicate;
if ((duplicate= unique_table(thd, table_list, table_list->next_global, 0)))
{
update_non_unique_table_error(table_list, "DELETE", duplicate);
DBUG_RETURN(TRUE);
}
}
if (select_lex->inner_refs_list.elements &&
fix_inner_refs(thd, all_fields, select_lex, select_lex->ref_pointer_array))
DBUG_RETURN(TRUE);
select_lex->fix_prepare_information(thd, conds, &fake_conds);
DBUG_RETURN(FALSE);
}
/***************************************************************************
Delete multiple tables from join
***************************************************************************/
#define MEM_STRIP_BUF_SIZE current_thd->variables.sortbuff_size
extern "C" int refpos_order_cmp(void* arg, const void *a,const void *b)
{
handler *file= (handler*)arg;
return file->cmp_ref((const uchar*)a, (const uchar*)b);
}
/*
make delete specific preparation and checks after opening tables
SYNOPSIS
mysql_multi_delete_prepare()
thd thread handler
RETURN
FALSE OK
TRUE Error
*/
int mysql_multi_delete_prepare(THD *thd)
{
LEX *lex= thd->lex;
TABLE_LIST *aux_tables= (TABLE_LIST *)lex->auxiliary_table_list.first;
TABLE_LIST *target_tbl;
DBUG_ENTER("mysql_multi_delete_prepare");
/*
setup_tables() is needed for VIEWs. JOIN::prepare() will not do it a
second time.
lex->query_tables also points to the local list of the DELETE SELECT_LEX
*/
if (setup_tables_and_check_access(thd, &thd->lex->select_lex.context,
&thd->lex->select_lex.top_join_list,
lex->query_tables,
&lex->select_lex.leaf_tables, FALSE,
DELETE_ACL, SELECT_ACL))
DBUG_RETURN(TRUE);
/*
Multi-delete can't be constructed over-union => we always have
single SELECT on top and have to check underlying SELECTs of it
*/
lex->select_lex.exclude_from_table_unique_test= TRUE;
/* Fix tables-to-be-deleted-from list to point at opened tables */
for (target_tbl= (TABLE_LIST*) aux_tables;
target_tbl;
target_tbl= target_tbl->next_local)
{
if (!(target_tbl->table= target_tbl->correspondent_table->table))
{
DBUG_ASSERT(target_tbl->correspondent_table->view &&
target_tbl->correspondent_table->merge_underlying_list &&
target_tbl->correspondent_table->merge_underlying_list->
next_local);
my_error(ER_VIEW_DELETE_MERGE_VIEW, MYF(0),
target_tbl->correspondent_table->view_db.str,
target_tbl->correspondent_table->view_name.str);
DBUG_RETURN(TRUE);
}
if (!target_tbl->correspondent_table->updatable ||
check_key_in_view(thd, target_tbl->correspondent_table))
{
my_error(ER_NON_UPDATABLE_TABLE, MYF(0),
target_tbl->table_name, "DELETE");
DBUG_RETURN(TRUE);
}
/*
Check that table from which we delete is not used somewhere
inside subqueries/view.
*/
{
TABLE_LIST *duplicate;
if ((duplicate= unique_table(thd, target_tbl->correspondent_table,
lex->query_tables, 0)))
{
update_non_unique_table_error(target_tbl->correspondent_table,
"DELETE", duplicate);
DBUG_RETURN(TRUE);
}
}
}
/*
Reset the exclude flag to false so it doesn't interfere
with further calls to unique_table
*/
lex->select_lex.exclude_from_table_unique_test= FALSE;
DBUG_RETURN(FALSE);
}
multi_delete::multi_delete(TABLE_LIST *dt, uint num_of_tables_arg)
: delete_tables(dt), deleted(0), found(0),
num_of_tables(num_of_tables_arg), error(0),
do_delete(0), transactional_tables(0), normal_tables(0), error_handled(0)
{
tempfiles= (Unique **) sql_calloc(sizeof(Unique *) * num_of_tables);
}
int
multi_delete::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
{
DBUG_ENTER("multi_delete::prepare");
unit= u;
do_delete= 1;
thd_proc_info(thd, "deleting from main table");
DBUG_RETURN(0);
}
bool
multi_delete::initialize_tables(JOIN *join)
{
TABLE_LIST *walk;
Unique **tempfiles_ptr;
DBUG_ENTER("initialize_tables");
if ((thd->options & OPTION_SAFE_UPDATES) && error_if_full_join(join))
DBUG_RETURN(1);
table_map tables_to_delete_from=0;
delete_while_scanning= 1;
for (walk= delete_tables; walk; walk= walk->next_local)
{
tables_to_delete_from|= walk->table->map;
if (delete_while_scanning &&
unique_table(thd, walk, join->tables_list, false))
{
/*
If the table we are going to delete from appears
in the join, we need to defer the delete, so that the delete
doesn't interfere with the scanning of results.
*/
delete_while_scanning= 0;
}
}
walk= delete_tables;
for (JOIN_TAB *tab=join->join_tab, *end=join->join_tab+join->tables;
tab < end;
tab++)
{
if (tab->table->map & tables_to_delete_from)
{
/* We are going to delete from this table */
TABLE *tbl=walk->table=tab->table;
walk= walk->next_local;
/* Don't use KEYREAD optimization on this table */
tbl->no_keyread=1;
/* Don't use record cache */
tbl->no_cache= 1;
tbl->covering_keys.clear_all();
if (tbl->file->has_transactions())
transactional_tables= 1;
else
normal_tables= 1;
if (tbl->triggers &&
tbl->triggers->has_triggers(TRG_EVENT_DELETE,
TRG_ACTION_AFTER))
{
/*
The table has AFTER DELETE triggers that might access the subject
table and therefore might need the delete to be done immediately.
So we turn off the batching.
*/
(void) tbl->file->extra(HA_EXTRA_DELETE_CANNOT_BATCH);
}
tbl->prepare_for_position();
tbl->mark_columns_needed_for_delete();
}
else if ((tab->type != JT_SYSTEM && tab->type != JT_CONST) &&
walk == delete_tables)
{
/*
We are not deleting from the table we are scanning. In this
case send_data() shouldn't delete any rows as we may touch
the rows in the deleted table many times
*/
delete_while_scanning= 0;
}
}
walk= delete_tables;
tempfiles_ptr= tempfiles;
if (delete_while_scanning)
{
table_being_deleted= delete_tables;
walk= walk->next_local;
}
for (;walk ;walk= walk->next_local)
{
TABLE *table=walk->table;
*tempfiles_ptr++= new Unique (refpos_order_cmp,
(void *) table->file,
table->file->ref_length,
MEM_STRIP_BUF_SIZE);
}
init_ftfuncs(thd, thd->lex->current_select, 1);
DBUG_RETURN(thd->is_fatal_error != 0);
}
multi_delete::~multi_delete()
{
for (table_being_deleted= delete_tables;
table_being_deleted;
table_being_deleted= table_being_deleted->next_local)
{
TABLE *table= table_being_deleted->table;
table->no_keyread=0;
}
for (uint counter= 0; counter < num_of_tables; counter++)
{
if (tempfiles[counter])
delete tempfiles[counter];
}
}
bool multi_delete::send_data(List<Item> &values)
{
int secure_counter= delete_while_scanning ? -1 : 0;
TABLE_LIST *del_table;
DBUG_ENTER("multi_delete::send_data");
bool ignore= thd->lex->current_select->no_error;
for (del_table= delete_tables;
del_table;
del_table= del_table->next_local, secure_counter++)
{
TABLE *table= del_table->table;
/* Check if we are using outer join and we didn't find the row */
if (table->status & (STATUS_NULL_ROW | STATUS_DELETED))
continue;
table->file->position(table->record[0]);
found++;
if (secure_counter < 0)
{
/* We are scanning the current table */
DBUG_ASSERT(del_table == table_being_deleted);
if (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_BEFORE, FALSE))
DBUG_RETURN(1);
table->status|= STATUS_DELETED;
if (!(error=table->file->ha_delete_row(table->record[0])))
{
deleted++;
if (!table->file->has_transactions())
thd->transaction.stmt.modified_non_trans_table= TRUE;
if (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_AFTER, FALSE))
DBUG_RETURN(1);
}
else if (!ignore)
{
/*
If the IGNORE option is used errors caused by ha_delete_row don't
have to stop the iteration.
*/
table->file->print_error(error,MYF(0));
DBUG_RETURN(1);
}
}
else
{
error=tempfiles[secure_counter]->unique_add((char*) table->file->ref);
if (error)
{
error= 1; // Fatal error
DBUG_RETURN(1);
}
}
}
DBUG_RETURN(0);
}
void multi_delete::send_error(uint errcode,const char *err)
{
DBUG_ENTER("multi_delete::send_error");
/* First send error whatever it is ... */
my_message(errcode, err, MYF(0));
DBUG_VOID_RETURN;
}
void multi_delete::abort()
{
DBUG_ENTER("multi_delete::abort");
/* the error was handled or nothing deleted and no side effects return */
if (error_handled ||
(!thd->transaction.stmt.modified_non_trans_table && !deleted))
DBUG_VOID_RETURN;
/* Something already deleted so we have to invalidate cache */
if (deleted)
query_cache_invalidate3(thd, delete_tables, 1);
/*
If only rows from the first table have been deleted and it is
transactional, just do rollback.
The same if all tables are transactional, regardless of where we are.
In all other cases do attempt deletes ...
*/
if (do_delete && normal_tables &&
(table_being_deleted != delete_tables ||
!table_being_deleted->table->file->has_transactions()))
{
/*
We have to execute the recorded do_deletes() and write info into the
error log
*/
error= 1;
send_eof();
DBUG_ASSERT(error_handled);
DBUG_VOID_RETURN;
}
if (thd->transaction.stmt.modified_non_trans_table)
{
/*
There are only side effects; write them to the binlog with the error code
*/
if (mysql_bin_log.is_open())
{
int errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
thd->binlog_query(THD::ROW_QUERY_TYPE,
thd->query(), thd->query_length(),
transactional_tables, FALSE, errcode);
}
thd->transaction.all.modified_non_trans_table= true;
}
DBUG_VOID_RETURN;
}
/**
Do delete from other tables.
@retval 0 ok
@retval 1 error
@todo Is there any reason not to use the normal nested-loops join? If not, and
there is no documentation supporting it, this method and callee should be
removed and there should be hooks within normal execution.
*/
int multi_delete::do_deletes()
{
DBUG_ENTER("do_deletes");
DBUG_ASSERT(do_delete);
do_delete= 0; // Mark called
if (!found)
DBUG_RETURN(0);
table_being_deleted= (delete_while_scanning ? delete_tables->next_local :
delete_tables);
for (uint counter= 0; table_being_deleted;
table_being_deleted= table_being_deleted->next_local, counter++)
{
TABLE *table = table_being_deleted->table;
if (tempfiles[counter]->get(table))
DBUG_RETURN(1);
int local_error=
do_table_deletes(table, thd->lex->current_select->no_error);
if (thd->killed && !local_error)
DBUG_RETURN(1);
if (local_error == -1) // End of file
local_error = 0;
if (local_error)
DBUG_RETURN(local_error);
}
DBUG_RETURN(0);
}
/**
Implements the inner loop of nested-loops join within multi-DELETE
execution.
@param table The table from which to delete.
@param ignore If used, all non fatal errors will be translated
to warnings and we should not break the row-by-row iteration.
@return Status code
@retval 0 All ok.
@retval 1 Triggers or handler reported error.
@retval -1 End of file from handler.
*/
int multi_delete::do_table_deletes(TABLE *table, bool ignore)
{
int local_error= 0;
READ_RECORD info;
ha_rows last_deleted= deleted;
DBUG_ENTER("do_deletes_for_table");
init_read_record(&info, thd, table, NULL, 0, 1, FALSE);
/*
Ignore any rows not found in reference tables as they may already have
been deleted by foreign key handling
*/
info.ignore_not_found_rows= 1;
bool will_batch= !table->file->start_bulk_delete();
while (!(local_error= info.read_record(&info)) && !thd->killed)
{
if (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_BEFORE, FALSE))
{
local_error= 1;
break;
}
local_error= table->file->ha_delete_row(table->record[0]);
if (local_error && !ignore)
{
table->file->print_error(local_error, MYF(0));
break;
}
/*
Increase the reported number of deleted rows only if no error occurred
during ha_delete_row.
Also, don't execute the AFTER trigger if the row operation failed.
*/
if (!local_error)
{
deleted++;
if (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_AFTER, FALSE))
{
local_error= 1;
break;
}
}
}
if (will_batch)
{
int tmp_error= table->file->end_bulk_delete();
if (tmp_error && !local_error)
{
local_error= tmp_error;
table->file->print_error(local_error, MYF(0));
}
}
if (last_deleted != deleted && !table->file->has_transactions())
thd->transaction.stmt.modified_non_trans_table= TRUE;
end_read_record(&info);
DBUG_RETURN(local_error);
}
/*
Send ok to the client
return: 0 success
1 error
*/
bool multi_delete::send_eof()
{
THD::killed_state killed_status= THD::NOT_KILLED;
thd_proc_info(thd, "deleting from reference tables");
/* Does deletes for the last n - 1 tables, returns 0 if ok */
int local_error= do_deletes(); // returns 0 if success
/* compute a total error to know if something failed */
local_error= local_error || error;
killed_status= (local_error == 0)? THD::NOT_KILLED : thd->killed;
/* reset used flags */
thd_proc_info(thd, "end");
/*
We must invalidate the query cache before binlog writing and
ha_autocommit_...
*/
if (deleted)
{
query_cache_invalidate3(thd, delete_tables, 1);
}
if ((local_error == 0) || thd->transaction.stmt.modified_non_trans_table)
{
if (mysql_bin_log.is_open())
{
int errcode= 0;
if (local_error == 0)
thd->clear_error();
else
errcode= query_error_code(thd, killed_status == THD::NOT_KILLED);
if (thd->binlog_query(THD::ROW_QUERY_TYPE,
thd->query(), thd->query_length(),
transactional_tables, FALSE, errcode) &&
!normal_tables)
{
local_error=1; // Log write failed: roll back the SQL statement
}
}
if (thd->transaction.stmt.modified_non_trans_table)
thd->transaction.all.modified_non_trans_table= TRUE;
}
if (local_error != 0)
error_handled= TRUE; // to force early leave from ::send_error()
if (!local_error)
{
thd->row_count_func= deleted;
::my_ok(thd, (ha_rows) thd->row_count_func);
}
return 0;
}
/***************************************************************************
TRUNCATE TABLE
****************************************************************************/
/*
Row-by-row truncation if the engine does not support table recreation.
Probably an InnoDB table.
*/
static bool mysql_truncate_by_delete(THD *thd, TABLE_LIST *table_list)
{
bool error, save_binlog_row_based= thd->current_stmt_binlog_row_based;
DBUG_ENTER("mysql_truncate_by_delete");
table_list->lock_type= TL_WRITE;
mysql_init_select(thd->lex);
thd->clear_current_stmt_binlog_row_based();
/* Delete all rows from table */
error= mysql_delete(thd, table_list, NULL, NULL, HA_POS_ERROR, LL(0), TRUE);
/*
All effects of a TRUNCATE TABLE operation are rolled back if a row-by-row
deletion fails. Otherwise, the operation is automatically committed at the end.
*/
if (error)
{
DBUG_ASSERT(thd->stmt_da->is_error());
trans_rollback_stmt(thd);
trans_rollback(thd);
}
thd->current_stmt_binlog_row_based= save_binlog_row_based;
DBUG_RETURN(error);
}
/*
Optimize delete of all rows by doing a full regeneration of the table.
This will work even if the .ISM and .ISD files are destroyed.
dont_send_ok should be set if:
- We always want to regenerate the table (even if the table type
normally can't safely do this).
- We don't want an ok to be sent to the end user.
- We don't want to log the truncate command
- If we want to keep exclusive metadata lock on the table (obtained by
caller) on exit without errors.
*/
bool mysql_truncate(THD *thd, TABLE_LIST *table_list, bool dont_send_ok)
{
HA_CREATE_INFO create_info;
char path[FN_REFLEN + 1];
TABLE *table;
bool error= TRUE;
uint path_length;
/*
Is set if we're under LOCK TABLES, and used
to downgrade the exclusive lock after the
table was truncated.
*/
MDL_ticket *mdl_ticket= NULL;
bool has_mdl_lock= FALSE;
DBUG_ENTER("mysql_truncate");
bzero((char*) &create_info,sizeof(create_info));
/* Remove tables from the HANDLER's hash. */
mysql_ha_rm_tables(thd, table_list);
/* If it is a temporary table, close and regenerate it */
if (!dont_send_ok && (table= find_temporary_table(thd, table_list)))
{
handlerton *table_type= table->s->db_type();
TABLE_SHARE *share= table->s;
/* Note that a temporary table cannot be partitioned */
if (!ha_check_storage_engine_flag(table_type, HTON_CAN_RECREATE))
goto trunc_by_del;
table->file->info(HA_STATUS_AUTO | HA_STATUS_NO_LOCK);
close_temporary_table(thd, table, 0, 0); // Don't free share
ha_create_table(thd, share->normalized_path.str,
share->db.str, share->table_name.str, &create_info, 1);
// We don't need to call invalidate() because this table is not in cache
if ((error= (int) !(open_temporary_table(thd, share->path.str,
share->db.str,
share->table_name.str, 1))))
(void) rm_temporary_table(table_type, share->path.str); // 'path' is not yet initialized in this branch
else
thd->thread_specific_used= TRUE;
free_table_share(share);
my_free((char*) table,MYF(0));
/*
If we return here we will not have logged the truncation to the bin log
and we will not my_ok() to the client.
*/
goto end;
}
path_length= build_table_filename(path, sizeof(path) - 1, table_list->db,
table_list->table_name, reg_ext, 0);
if (!dont_send_ok)
{
enum legacy_db_type table_type;
/*
FIXME: Code of TRUNCATE breaks the meta-data
locking protocol since it tries to find out the table storage
engine and therefore accesses table in some way without holding
any kind of meta-data lock.
*/
mysql_frm_type(thd, path, &table_type);
if (table_type == DB_TYPE_UNKNOWN)
{
my_error(ER_NO_SUCH_TABLE, MYF(0),
table_list->db, table_list->table_name);
DBUG_RETURN(TRUE);
}
#ifdef WITH_PARTITION_STORAGE_ENGINE
/*
TODO: Add support for TRUNCATE PARTITION for NDB and other engines
supporting native partitioning
*/
if (table_type != DB_TYPE_PARTITION_DB &&
thd->lex->alter_info.flags & ALTER_ADMIN_PARTITION)
{
my_error(ER_PARTITION_MGMT_ON_NONPARTITIONED, MYF(0));
DBUG_RETURN(TRUE);
}
#endif
if (!ha_check_storage_engine_flag(ha_resolve_by_legacy_type(thd,
table_type),
HTON_CAN_RECREATE) ||
thd->lex->alter_info.flags & ALTER_ADMIN_PARTITION)
goto trunc_by_del;
if (thd->locked_tables_mode)
{
if (!(table= find_table_for_mdl_upgrade(thd->open_tables, table_list->db,
table_list->table_name, FALSE)))
DBUG_RETURN(TRUE);
mdl_ticket= table->mdl_ticket;
if (wait_while_table_is_used(thd, table, HA_EXTRA_FORCE_REOPEN))
goto end;
close_all_tables_for_name(thd, table->s, FALSE);
}
else
{
MDL_request mdl_global_request, mdl_request;
MDL_request_list mdl_requests;
/*
Even though we could use the previous execution branch
here just as well, we must not try to open the table:
MySQL manual documents that TRUNCATE can be used to
repair a damaged table, i.e. a table that can not be
fully "opened". In particular MySQL manual says:
As long as the table format file tbl_name.frm is valid,
the table can be re-created as an empty table with TRUNCATE
TABLE, even if the data or index files have become corrupted.
*/
mdl_global_request.init(MDL_key::GLOBAL, "", "", MDL_INTENTION_EXCLUSIVE);
mdl_request.init(MDL_key::TABLE, table_list->db, table_list->table_name,
MDL_EXCLUSIVE);
mdl_requests.push_front(&mdl_request);
mdl_requests.push_front(&mdl_global_request);
if (thd->mdl_context.acquire_locks(&mdl_requests))
DBUG_RETURN(TRUE);
has_mdl_lock= TRUE;
pthread_mutex_lock(&LOCK_open);
tdc_remove_table(thd, TDC_RT_REMOVE_ALL, table_list->db,
table_list->table_name);
pthread_mutex_unlock(&LOCK_open);
}
}
/*
Remove the .frm extension.
AIX 5.2 64-bit compiler bug (BUG#16155): the statement
*(path + path_length - reg_ext_length)= '\0';
crashes, while the replacement below works.
*/
path[path_length - reg_ext_length] = 0;
pthread_mutex_lock(&LOCK_open);
error= ha_create_table(thd, path, table_list->db, table_list->table_name,
&create_info, 1);
pthread_mutex_unlock(&LOCK_open);
query_cache_invalidate3(thd, table_list, 0);
end:
if (!dont_send_ok)
{
if (thd->locked_tables_mode && thd->locked_tables_list.reopen_tables(thd))
thd->locked_tables_list.unlink_all_closed_tables(thd, NULL, 0);
/*
Even if we failed to reopen some tables,
the operation itself succeeded, write the binlog.
*/
if (!error)
{
/*
TRUNCATE must always be statement-based binlogged (not row-based) so
we don't test current_stmt_binlog_row_based.
*/
write_bin_log(thd, TRUE, thd->query(), thd->query_length());
my_ok(thd); // This should return record count
}
if (has_mdl_lock)
thd->mdl_context.release_transactional_locks();
if (mdl_ticket)
mdl_ticket->downgrade_exclusive_lock(MDL_SHARED_NO_READ_WRITE);
}
DBUG_PRINT("exit", ("error: %d", error));
DBUG_RETURN(error);
trunc_by_del:
error= mysql_truncate_by_delete(thd, table_list);
DBUG_RETURN(error);
}