BUG#36197: flush tables (or little table cache) can cause crash on slave

When flushing tables, there was a slight chance that the flush occurred
between the processing of two table map events. Since the tables were opened
one by one, this could leave the already-opened tables invalid, and the
subsequent locking of the tables would then cause the slave to crash.

The problem is solved by opening and locking all tables at once using
simple_open_n_lock_tables(). Also, the patch contains a change to open_tables()
so that pre-locking only takes place when the trg_event_map is not zero, which
was not the case before (this caused the lock to be placed in thd->locked_tables
instead of thd->lock since the assumption was that triggers would be called
later and therefore the tables should be pre-locked).
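
The race described above can be illustrated with a toy model. This is only a
sketch under stated assumptions, not the server code: ToyTableCache, ToyTable,
open_one(), open_all() and all_valid() are hypothetical names invented for the
illustration, and a single version counter stands in for the table cache state
that a flush invalidates.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the table cache: a flush bumps the version,
// which invalidates every handle opened before it.
struct ToyTableCache {
  unsigned long version = 1;
  void flush() { ++version; }
};

// Hypothetical table handle that remembers the cache version it was opened at.
struct ToyTable {
  std::string name;
  unsigned long opened_at_version;
};

// Opens a single table, stamping it with the current cache version.
ToyTable open_one(const ToyTableCache &cache, const std::string &name) {
  return ToyTable{name, cache.version};
}

// Opens every table in one step, so no flush can land between two opens.
std::vector<ToyTable> open_all(const ToyTableCache &cache,
                               const std::vector<std::string> &names) {
  assert(!names.empty());
  std::vector<ToyTable> tables;
  for (const auto &n : names) tables.push_back(open_one(cache, n));
  return tables;
}

// Locking is only safe when every handle matches the current cache version.
bool all_valid(const ToyTableCache &cache, const std::vector<ToyTable> &tables) {
  for (const auto &t : tables)
    if (t.opened_at_version != cache.version) return false;
  return true;
}

// Old scheme: one open per table map event, with a flush in between.
bool demo_one_by_one() {
  ToyTableCache cache;
  std::vector<ToyTable> tables;
  tables.push_back(open_one(cache, "t1"));
  cache.flush();  // the flush lands between two table map events
  tables.push_back(open_one(cache, "t2"));
  return all_valid(cache, tables);  // t1 is stale at this point
}

// New scheme: the flush can only happen before or after the combined open.
bool demo_all_at_once() {
  ToyTableCache cache;
  cache.flush();
  std::vector<ToyTable> tables = open_all(cache, {"t1", "t2"});
  return all_valid(cache, tables);
}
```

In this model demo_one_by_one() ends up with a stale handle for t1, while
demo_all_at_once() cannot, because all handles are stamped in the same step.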


mysql-test/suite/rpl/r/rpl_found_rows.result:
  Result change
mysql-test/suite/rpl/r/rpl_row_inexist_tbl.result:
  Result change
mysql-test/suite/rpl/t/rpl_found_rows.test:
  Adding drop of table that was created in test.
mysql-test/suite/rpl/t/rpl_slave_status.test:
  Adding waits for slave start and stop to ensure that test works.
sql/log_event.cc:
  Replacing table-by-table open and lock with a single call
  to simple_open_n_lock_tables(), which in turn required some
  changes to other code.
sql/log_event_old.cc:
  Replacing table-by-table open and lock with a single call
  to simple_open_n_lock_tables(), which in turn required some
  changes to other code.
sql/sql_base.cc:
  Extending check inside open_tables() so that pre-locking is only done if
  tables->trg_event_map is non-zero.
mysql-test/include/wait_for_slave_sql_to_start.inc:
  New BitKeeper file ``mysql-test/include/wait_for_slave_sql_to_start.inc''
unknown 2008-05-12 19:50:53 +02:00
commit dac6ffb958
8 changed files with 112 additions and 211 deletions

@@ -53,81 +53,46 @@ Old_rows_log_event::do_apply_event(Old_rows_log_event *ev, const Relay_log_info
*/
if (!thd->lock)
{
bool need_reopen= 1; /* To execute the first lap of the loop below */
/*
lock_tables() reads the contents of thd->lex, so they must be
initialized. Contrary to in
Table_map_log_event::do_apply_event() we don't call
mysql_init_query() as that may reset the binlog format.
Lock_tables() reads the contents of thd->lex, so they must be
initialized.
We also call the mysql_reset_thd_for_next_command(), since this
is the logical start of the next "statement". Note that this
call might reset the value of current_stmt_binlog_row_based, so
we need to do any changes to that value after this function.
*/
lex_start(thd);
mysql_reset_thd_for_next_command(thd);
while ((error= lock_tables(thd, rli->tables_to_lock,
rli->tables_to_lock_count, &need_reopen)))
/*
Check if the slave is set to use SBR. If so, it should switch
to using RBR until the end of the "statement", i.e., next
STMT_END_F or next error.
*/
if (!thd->current_stmt_binlog_row_based &&
mysql_bin_log.is_open() && (thd->options & OPTION_BIN_LOG))
{
if (!need_reopen)
thd->set_current_stmt_binlog_row_based();
}
if (simple_open_n_lock_tables(thd, rli->tables_to_lock))
{
uint actual_error= thd->main_da.sql_errno();
if (thd->is_slave_error || thd->is_fatal_error)
{
if (thd->is_slave_error || thd->is_fatal_error)
{
/*
Error reporting borrowed from Query_log_event with many excessive
simplifications (we don't honour --slave-skip-errors)
*/
uint actual_error= thd->main_da.sql_errno();
rli->report(ERROR_LEVEL, actual_error,
"Error '%s' in %s event: when locking tables",
(actual_error ? thd->main_da.message() :
"unexpected success or fatal error"),
ev->get_type_str());
thd->is_fatal_error= 1;
}
else
{
rli->report(ERROR_LEVEL, error,
"Error in %s event: when locking tables",
ev->get_type_str());
}
const_cast<Relay_log_info*>(rli)->clear_tables_to_lock();
DBUG_RETURN(error);
}
/*
So we need to reopen the tables.
We need to flush the pending RBR event, since it keeps a
pointer to an open table.
ALTERNATIVE SOLUTION (not implemented): Extract a pointer to
the pending RBR event and reset the table pointer after the
tables has been reopened.
NOTE: For this new scheme there should be no pending event:
need to add code to assert that is the case.
*/
thd->binlog_flush_pending_rows_event(false);
TABLE_LIST *tables= rli->tables_to_lock;
close_tables_for_reopen(thd, &tables);
uint tables_count= rli->tables_to_lock_count;
if ((error= open_tables(thd, &tables, &tables_count, 0)))
{
if (thd->is_slave_error || thd->is_fatal_error)
{
/*
Error reporting borrowed from Query_log_event with many excessive
simplifications (we don't honour --slave-skip-errors)
*/
uint actual_error= thd->main_da.sql_errno();
rli->report(ERROR_LEVEL, actual_error,
"Error '%s' on reopening tables",
(actual_error ? thd->main_da.message() :
"unexpected success or fatal error"));
thd->is_slave_error= 1;
}
const_cast<Relay_log_info*>(rli)->clear_tables_to_lock();
DBUG_RETURN(error);
/*
Error reporting borrowed from Query_log_event with many excessive
simplifications (we don't honour --slave-skip-errors)
*/
rli->report(ERROR_LEVEL, actual_error,
"Error '%s' on opening tables",
(actual_error ? thd->main_da.message() :
"unexpected success or fatal error"));
thd->is_slave_error= 1;
}
const_cast<Relay_log_info*>(rli)->clear_tables_to_lock();
DBUG_RETURN(actual_error);
}
/*