mariadb/sql/rpl_rli.cc

1318 lines
46 KiB
C++
Raw Normal View History

2012-02-16 10:48:16 +01:00
/* Copyright (c) 2006, 2012, Oracle and/or its affiliates. All rights reserved.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
2011-06-30 17:46:53 +02:00
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
#include "sql_priv.h"
#include "unireg.h" // HAVE_*
#include "rpl_mi.h"
#include "rpl_rli.h"
#include "sql_base.h" // close_thread_tables
#include <my_dir.h> // For MY_STAT
#include "sql_repl.h" // For check_binlog_magic
#include "log_event.h" // Format_description_log_event, Log_event,
// FORMAT_DESCRIPTION_LOG_EVENT, ROTATE_EVENT,
// PREFIX_SQL_LOAD
#include "rpl_utility.h"
#include "transaction.h"
#include "sql_parse.h" // end_trans, ROLLBACK
#include <mysql/plugin.h>
#include <mysql/service_thd_wait.h>
static int count_relay_log_space(Relay_log_info* rli);
// Defined in slave.cc
int init_intvar_from_file(int* var, IO_CACHE* f, int default_val);
int init_strvar_from_file(char *var, int max_size, IO_CACHE *f,
const char *default_val);
BUG#40337 Fsyncing master and relay log to disk after every event is too slow NOTE: Backporting the patch to next-mr. The fix proposed in BUG#35542 and BUG#31665 introduces a performance issue when fsyncing the master.info, relay.info and relay-log.bin* after #th events. Although such solution has been proposed to reduce the probability of corrupted files due to a slave-crash, the performance penalty introduced by it has made the approach impractical for highly intensive workloads. In a nutshell, the option --syn-relay-log proposed in BUG#35542 and BUG#31665 simultaneously fsyncs master.info, relay-log.info and relay-log.bin* and this is the main source of performance issues. This patch introduces new options that give more control to the user on what should be fsynced and how often: 1) (--sync-master-info, integer) which syncs the master.info after #th event; 2) (--sync-relay-log, integer) which syncs the relay-log.bin* after #th events. 3) (--sync-relay-log-info, integer) which syncs the relay.info after #th transactions. To provide both performance and increased reliability, we recommend the following setup: 1) --sync-master-info = 0 eventually the operating system will fsync it; 2) --sync-relay-log = 0 eventually the operating system will fsync it; 3) --sync-relay-log-info = 1 fsyncs it after every transaction; Notice, that the previous setup does not reduce the probability of corrupted master.info and relay-log.bin*. To overcome the issue, this patch also introduces a recovery mechanism that right after restart throws away relay-log.bin* retrieved from a master and updates the master.info based on the relay.info: 4) (--relay-log-recovery, boolean) which enables a recovery mechanism that throws away relay-log.bin* after a crash. However, it can only recover the incorrect binlog file and position in master.info, if other informations (host, port password, etc) are corrupted or incorrect, then this recovery mechanism will fail to work.
2009-09-29 16:40:52 +02:00
Relay_log_info::Relay_log_info(bool is_slave_recovery)
:Slave_reporting_capability("SQL"),
no_storage(FALSE), replicate_same_server_id(::replicate_same_server_id),
info_fd(-1), cur_log_fd(-1), relay_log(&sync_relaylog_period),
BUG#40337 Fsyncing master and relay log to disk after every event is too slow NOTE: Backporting the patch to next-mr. The fix proposed in BUG#35542 and BUG#31665 introduces a performance issue when fsyncing the master.info, relay.info and relay-log.bin* after #th events. Although such solution has been proposed to reduce the probability of corrupted files due to a slave-crash, the performance penalty introduced by it has made the approach impractical for highly intensive workloads. In a nutshell, the option --syn-relay-log proposed in BUG#35542 and BUG#31665 simultaneously fsyncs master.info, relay-log.info and relay-log.bin* and this is the main source of performance issues. This patch introduces new options that give more control to the user on what should be fsynced and how often: 1) (--sync-master-info, integer) which syncs the master.info after #th event; 2) (--sync-relay-log, integer) which syncs the relay-log.bin* after #th events. 3) (--sync-relay-log-info, integer) which syncs the relay.info after #th transactions. To provide both performance and increased reliability, we recommend the following setup: 1) --sync-master-info = 0 eventually the operating system will fsync it; 2) --sync-relay-log = 0 eventually the operating system will fsync it; 3) --sync-relay-log-info = 1 fsyncs it after every transaction; Notice, that the previous setup does not reduce the probability of corrupted master.info and relay-log.bin*. To overcome the issue, this patch also introduces a recovery mechanism that right after restart throws away relay-log.bin* retrieved from a master and updates the master.info based on the relay.info: 4) (--relay-log-recovery, boolean) which enables a recovery mechanism that throws away relay-log.bin* after a crash. However, it can only recover the incorrect binlog file and position in master.info, if other informations (host, port password, etc) are corrupted or incorrect, then this recovery mechanism will fail to work.
2009-09-29 16:40:52 +02:00
sync_counter(0), is_relay_log_recovery(is_slave_recovery),
save_temporary_tables(0), cur_log_old_open_count(0), group_relay_log_pos(0),
event_relay_log_pos(0),
#if HAVE_purify
is_fake(FALSE),
#endif
group_master_log_pos(0), log_space_total(0), ignore_log_space_limit(0),
last_master_timestamp(0), slave_skip_counter(0),
abort_pos_wait(0), slave_run_id(0), sql_thd(0),
inited(0), abort_slave(0), slave_running(0), until_condition(UNTIL_NONE),
until_log_pos(0), retried_trans(0),
tables_to_lock(0), tables_to_lock_count(0),
last_event_start_time(0), deferred_events(NULL),m_flags(0),
row_stmt_start_timestamp(0), long_find_row_note_printed(false)
{
DBUG_ENTER("Relay_log_info::Relay_log_info");
#ifdef HAVE_PSI_INTERFACE
relay_log.set_psi_keys(key_RELAYLOG_LOCK_index,
key_RELAYLOG_update_cond,
key_file_relaylog,
key_file_relaylog_index);
#endif
group_relay_log_name[0]= event_relay_log_name[0]=
group_master_log_name[0]= 0;
until_log_name[0]= ign_master_log_name_end[0]= 0;
bzero((char*) &info_file, sizeof(info_file));
bzero((char*) &cache_buf, sizeof(cache_buf));
cached_charset_invalidate();
mysql_mutex_init(key_relay_log_info_run_lock, &run_lock, MY_MUTEX_INIT_FAST);
mysql_mutex_init(key_relay_log_info_data_lock,
&data_lock, MY_MUTEX_INIT_FAST);
mysql_mutex_init(key_relay_log_info_log_space_lock,
&log_space_lock, MY_MUTEX_INIT_FAST);
mysql_mutex_init(key_relay_log_info_sleep_lock, &sleep_lock, MY_MUTEX_INIT_FAST);
mysql_cond_init(key_relay_log_info_data_cond, &data_cond, NULL);
mysql_cond_init(key_relay_log_info_start_cond, &start_cond, NULL);
mysql_cond_init(key_relay_log_info_stop_cond, &stop_cond, NULL);
mysql_cond_init(key_relay_log_info_log_space_cond, &log_space_cond, NULL);
mysql_cond_init(key_relay_log_info_sleep_cond, &sleep_cond, NULL);
relay_log.init_pthread_objects();
DBUG_VOID_RETURN;
}
Relay_log_info::~Relay_log_info()
{
DBUG_ENTER("Relay_log_info::~Relay_log_info");
mysql_mutex_destroy(&run_lock);
mysql_mutex_destroy(&data_lock);
mysql_mutex_destroy(&log_space_lock);
mysql_mutex_destroy(&sleep_lock);
mysql_cond_destroy(&data_cond);
mysql_cond_destroy(&start_cond);
mysql_cond_destroy(&stop_cond);
mysql_cond_destroy(&log_space_cond);
mysql_cond_destroy(&sleep_cond);
relay_log.cleanup();
DBUG_VOID_RETURN;
}
int init_relay_log_info(Relay_log_info* rli,
const char* info_fname)
{
char fname[FN_REFLEN+128];
int info_fd;
const char* msg = 0;
int error = 0;
DBUG_ENTER("init_relay_log_info");
DBUG_ASSERT(!rli->no_storage); // Don't init if there is no storage
if (rli->inited) // Set if this function called
DBUG_RETURN(0);
fn_format(fname, info_fname, mysql_data_home, "", 4+32);
mysql_mutex_lock(&rli->data_lock);
info_fd = rli->info_fd;
rli->cur_log_fd = -1;
rli->slave_skip_counter=0;
rli->abort_pos_wait=0;
rli->log_space_limit= relay_log_space_limit;
rli->log_space_total= 0;
rli->tables_to_lock= 0;
rli->tables_to_lock_count= 0;
char pattern[FN_REFLEN];
(void) my_realpath(pattern, slave_load_tmpdir, 0);
if (fn_format(pattern, PREFIX_SQL_LOAD, pattern, "",
MY_SAFE_PATH | MY_RETURN_REAL_PATH) == NullS)
{
mysql_mutex_unlock(&rli->data_lock);
sql_print_error("Unable to use slave's temporary directory %s",
slave_load_tmpdir);
DBUG_RETURN(1);
}
unpack_filename(rli->slave_patternload_file, pattern);
rli->slave_patternload_file_size= strlen(rli->slave_patternload_file);
/*
The relay log will now be opened, as a SEQ_READ_APPEND IO_CACHE.
Note that the I/O thread flushes it to disk after writing every
event, in flush_master_info(mi, 1, ?).
*/
/*
For the maximum log size, we choose max_relay_log_size if it is
non-zero, max_binlog_size otherwise. If later the user does SET
GLOBAL on one of these variables, fix_max_binlog_size and
fix_max_relay_log_size will reconsider the choice (for example
if the user changes max_relay_log_size to zero, we have to
switch to using max_binlog_size for the relay log) and update
rli->relay_log.max_size (and mysql_bin_log.max_size).
*/
{
/* Reports an error and returns, if the --relay-log's path
is a directory.*/
if (opt_relay_logname &&
opt_relay_logname[strlen(opt_relay_logname) - 1] == FN_LIBCHAR)
{
mysql_mutex_unlock(&rli->data_lock);
sql_print_error("Path '%s' is a directory name, please specify \
a file name for --relay-log option", opt_relay_logname);
DBUG_RETURN(1);
}
/* Reports an error and returns, if the --relay-log-index's path
is a directory.*/
if (opt_relaylog_index_name &&
opt_relaylog_index_name[strlen(opt_relaylog_index_name) - 1]
== FN_LIBCHAR)
{
mysql_mutex_unlock(&rli->data_lock);
sql_print_error("Path '%s' is a directory name, please specify \
a file name for --relay-log-index option", opt_relaylog_index_name);
DBUG_RETURN(1);
}
char buf[FN_REFLEN];
const char *ln;
static bool name_warning_sent= 0;
ln= rli->relay_log.generate_name(opt_relay_logname, "-relay-bin",
1, buf);
/* We send the warning only at startup, not after every RESET SLAVE */
if (!opt_relay_logname && !opt_relaylog_index_name && !name_warning_sent)
{
/*
User didn't give us info to name the relay log index file.
Picking `hostname`-relay-bin.index like we do, causes replication to
fail if this slave's hostname is changed later. So, we would like to
instead require a name. But as we don't want to break many existing
setups, we only give warning, not error.
*/
sql_print_warning("Neither --relay-log nor --relay-log-index were used;"
" so replication "
"may break when this MySQL server acts as a "
"slave and has his hostname changed!! Please "
"use '--relay-log=%s' to avoid this problem.", ln);
name_warning_sent= 1;
}
/*
note, that if open() fails, we'll still have index file open
but a destructor will take care of that
*/
BUG#45292 orphan binary log created when starting server twice This patch fixes three bugs as follows. First, aborting the server while purging binary logs might generate orphan files due to how the purge operation was implemented: (purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs) 1 - register the files to be removed in a temporary buffer. 2 - update the log-bin.index. 3 - flush the log-bin.index. 4 - erase the files whose names where register in the temporary buffer in step 1. Thus a failure while executing step 4 would generate an orphan file. Second, a similar issue might happen while creating a new binary as follows: (create routine - sql/log.cc - MYSQL_BIN_LOG::open) 1 - open the new log-bin. 2 - update the log-bin.index. Thus a failure while executing step 1 would generate an orphan file. To fix these issues, we record the files to be purged or created before really removing or adding them. So if a failure happens such records can be used to automatically remove dangling files. The new steps might be outlined as follows: (purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs) 1 - register the files to be removed in the log-bin.~rec~ placed in the data directory. 2 - update the log-bin.index. 3 - flush the log-bin.index. 4 - delete the log-bin.~rec~. (create routine - sql/log.cc - MYSQL_BIN_LOG::open) 1 - register the file to be created in the log-bin.~rec~ placed in the data directory. 2 - open the new log-bin. 3 - update the log-bin.index. 4 - delete the log-bin.~rec~. (recovery routine - sql/log.cc - MYSQL_BIN_LOG::open_index_file) 1 - open the log-bin.index. 2 - open the log-bin.~rec~. 3 - for each file in log-bin.~rec~. 3.1 Check if the file is in the log-bin.index and if so ignore it. 3.2 Otherwise, delete it. The third issue can be described as follows. The purge operation was allowing to remove a file in use thus leading to the loss of data and possible inconsistencies between the master and slave. Roughly, the routine was only taking into account the dump threads and so if a slave was not connect the file might be delete even though it was in use.
2009-12-04 15:40:42 +01:00
if (rli->relay_log.open_index_file(opt_relaylog_index_name, ln, TRUE) ||
rli->relay_log.open(ln, LOG_BIN, 0, SEQ_READ_APPEND, 0,
(max_relay_log_size ? max_relay_log_size :
BUG#45292 orphan binary log created when starting server twice This patch fixes three bugs as follows. First, aborting the server while purging binary logs might generate orphan files due to how the purge operation was implemented: (purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs) 1 - register the files to be removed in a temporary buffer. 2 - update the log-bin.index. 3 - flush the log-bin.index. 4 - erase the files whose names where register in the temporary buffer in step 1. Thus a failure while executing step 4 would generate an orphan file. Second, a similar issue might happen while creating a new binary as follows: (create routine - sql/log.cc - MYSQL_BIN_LOG::open) 1 - open the new log-bin. 2 - update the log-bin.index. Thus a failure while executing step 1 would generate an orphan file. To fix these issues, we record the files to be purged or created before really removing or adding them. So if a failure happens such records can be used to automatically remove dangling files. The new steps might be outlined as follows: (purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs) 1 - register the files to be removed in the log-bin.~rec~ placed in the data directory. 2 - update the log-bin.index. 3 - flush the log-bin.index. 4 - delete the log-bin.~rec~. (create routine - sql/log.cc - MYSQL_BIN_LOG::open) 1 - register the file to be created in the log-bin.~rec~ placed in the data directory. 2 - open the new log-bin. 3 - update the log-bin.index. 4 - delete the log-bin.~rec~. (recovery routine - sql/log.cc - MYSQL_BIN_LOG::open_index_file) 1 - open the log-bin.index. 2 - open the log-bin.~rec~. 3 - for each file in log-bin.~rec~. 3.1 Check if the file is in the log-bin.index and if so ignore it. 3.2 Otherwise, delete it. The third issue can be described as follows. The purge operation was allowing to remove a file in use thus leading to the loss of data and possible inconsistencies between the master and slave. Roughly, the routine was only taking into account the dump threads and so if a slave was not connect the file might be delete even though it was in use.
2009-12-04 15:40:42 +01:00
max_binlog_size), 1, TRUE))
{
mysql_mutex_unlock(&rli->data_lock);
sql_print_error("Failed in open_log() called from init_relay_log_info()");
DBUG_RETURN(1);
}
rli->relay_log.is_relay_log= TRUE;
}
/* if file does not exist */
if (access(fname,F_OK))
{
/*
If someone removed the file from underneath our feet, just close
the old descriptor and re-create the old file
*/
if (info_fd >= 0)
mysql_file_close(info_fd, MYF(MY_WME));
if ((info_fd= mysql_file_open(key_file_relay_log_info,
fname, O_CREAT|O_RDWR|O_BINARY, MYF(MY_WME))) < 0)
{
sql_print_error("Failed to create a new relay log info file (\
file '%s', errno %d)", fname, my_errno);
msg= current_thd->stmt_da->message();
goto err;
}
if (init_io_cache(&rli->info_file, info_fd, IO_SIZE*2, READ_CACHE, 0L,0,
MYF(MY_WME)))
{
sql_print_error("Failed to create a cache on relay log info file '%s'",
fname);
msg= current_thd->stmt_da->message();
goto err;
}
/* Init relay log with first entry in the relay index file */
if (init_relay_log_pos(rli,NullS,BIN_LOG_HEADER_SIZE,0 /* no data lock */,
&msg, 0))
{
sql_print_error("Failed to open the relay log 'FIRST' (relay_log_pos 4)");
goto err;
}
rli->group_master_log_name[0]= 0;
rli->group_master_log_pos= 0;
rli->info_fd= info_fd;
}
else // file exists
{
if (info_fd >= 0)
reinit_io_cache(&rli->info_file, READ_CACHE, 0L,0,0);
else
{
int error=0;
if ((info_fd= mysql_file_open(key_file_relay_log_info,
fname, O_RDWR|O_BINARY, MYF(MY_WME))) < 0)
{
sql_print_error("\
Failed to open the existing relay log info file '%s' (errno %d)",
fname, my_errno);
error= 1;
}
else if (init_io_cache(&rli->info_file, info_fd,
IO_SIZE*2, READ_CACHE, 0L, 0, MYF(MY_WME)))
{
sql_print_error("Failed to create a cache on relay log info file '%s'",
fname);
error= 1;
}
if (error)
{
if (info_fd >= 0)
mysql_file_close(info_fd, MYF(0));
rli->info_fd= -1;
rli->relay_log.close(LOG_CLOSE_INDEX | LOG_CLOSE_STOP_EVENT);
mysql_mutex_unlock(&rli->data_lock);
DBUG_RETURN(1);
}
}
rli->info_fd = info_fd;
int relay_log_pos, master_log_pos;
if (init_strvar_from_file(rli->group_relay_log_name,
sizeof(rli->group_relay_log_name),
&rli->info_file, "") ||
init_intvar_from_file(&relay_log_pos,
&rli->info_file, BIN_LOG_HEADER_SIZE) ||
init_strvar_from_file(rli->group_master_log_name,
sizeof(rli->group_master_log_name),
&rli->info_file, "") ||
init_intvar_from_file(&master_log_pos, &rli->info_file, 0))
{
msg="Error reading slave log configuration";
goto err;
}
strmake(rli->event_relay_log_name,rli->group_relay_log_name,
sizeof(rli->event_relay_log_name)-1);
rli->group_relay_log_pos= rli->event_relay_log_pos= relay_log_pos;
rli->group_master_log_pos= master_log_pos;
if (rli->is_relay_log_recovery && init_recovery(rli->mi, &msg))
goto err;
if (init_relay_log_pos(rli,
rli->group_relay_log_name,
rli->group_relay_log_pos,
0 /* no data lock*/,
&msg, 0))
{
char llbuf[22];
sql_print_error("Failed to open the relay log '%s' (relay_log_pos %s)",
rli->group_relay_log_name,
llstr(rli->group_relay_log_pos, llbuf));
goto err;
}
}
#ifndef DBUG_OFF
{
char llbuf1[22], llbuf2[22];
DBUG_PRINT("info", ("my_b_tell(rli->cur_log)=%s rli->event_relay_log_pos=%s",
llstr(my_b_tell(rli->cur_log),llbuf1),
llstr(rli->event_relay_log_pos,llbuf2)));
DBUG_ASSERT(rli->event_relay_log_pos >= BIN_LOG_HEADER_SIZE);
DBUG_ASSERT(my_b_tell(rli->cur_log) == rli->event_relay_log_pos);
}
#endif
/*
Now change the cache from READ to WRITE - must do this
before flush_relay_log_info
*/
reinit_io_cache(&rli->info_file, WRITE_CACHE,0L,0,1);
if ((error= flush_relay_log_info(rli)))
{
msg= "Failed to flush relay log info file";
goto err;
}
if (count_relay_log_space(rli))
{
msg="Error counting relay log space";
goto err;
}
rli->inited= 1;
mysql_mutex_unlock(&rli->data_lock);
DBUG_RETURN(error);
err:
sql_print_error("%s", msg);
end_io_cache(&rli->info_file);
if (info_fd >= 0)
mysql_file_close(info_fd, MYF(0));
rli->info_fd= -1;
rli->relay_log.close(LOG_CLOSE_INDEX | LOG_CLOSE_STOP_EVENT);
mysql_mutex_unlock(&rli->data_lock);
DBUG_RETURN(1);
}
static inline int add_relay_log(Relay_log_info* rli,LOG_INFO* linfo)
{
MY_STAT s;
DBUG_ENTER("add_relay_log");
if (!mysql_file_stat(key_file_relaylog,
linfo->log_file_name, &s, MYF(0)))
{
sql_print_error("log %s listed in the index, but failed to stat",
linfo->log_file_name);
DBUG_RETURN(1);
}
rli->log_space_total += s.st_size;
#ifndef DBUG_OFF
char buf[22];
DBUG_PRINT("info",("log_space_total: %s", llstr(rli->log_space_total,buf)));
#endif
DBUG_RETURN(0);
}
static int count_relay_log_space(Relay_log_info* rli)
{
LOG_INFO linfo;
DBUG_ENTER("count_relay_log_space");
rli->log_space_total= 0;
if (rli->relay_log.find_log_pos(&linfo, NullS, 1))
{
sql_print_error("Could not find first log while counting relay log space");
DBUG_RETURN(1);
}
do
{
if (add_relay_log(rli,&linfo))
DBUG_RETURN(1);
} while (!rli->relay_log.find_next_log(&linfo, 1));
/*
As we have counted everything, including what may have written in a
preceding write, we must reset bytes_written, or we may count some space
twice.
*/
rli->relay_log.reset_bytes_written();
DBUG_RETURN(0);
}
/*
Reset UNTIL condition for Relay_log_info
SYNOPSYS
clear_until_condition()
rli - Relay_log_info structure where UNTIL condition should be reset
*/
void Relay_log_info::clear_until_condition()
{
DBUG_ENTER("clear_until_condition");
until_condition= Relay_log_info::UNTIL_NONE;
until_log_name[0]= 0;
until_log_pos= 0;
DBUG_VOID_RETURN;
}
/*
Open the given relay log
SYNOPSIS
init_relay_log_pos()
rli Relay information (will be initialized)
log Name of relay log file to read from. NULL = First log
pos Position in relay log file
need_data_lock Set to 1 if this functions should do mutex locks
errmsg Store pointer to error message here
look_for_description_event
1 if we should look for such an event. We only need
this when the SQL thread starts and opens an existing
relay log and has to execute it (possibly from an
offset >4); then we need to read the first event of
the relay log to be able to parse the events we have
to execute.
DESCRIPTION
- Close old open relay log files.
- If we are using the same relay log as the running IO-thread, then set
rli->cur_log to point to the same IO_CACHE entry.
- If not, open the 'log' binary file.
TODO
- check proper initialization of group_master_log_name/group_master_log_pos
RETURN VALUES
0 ok
1 error. errmsg is set to point to the error message
*/
int init_relay_log_pos(Relay_log_info* rli,const char* log,
ulonglong pos, bool need_data_lock,
const char** errmsg,
bool look_for_description_event)
{
DBUG_ENTER("init_relay_log_pos");
DBUG_PRINT("info", ("pos: %lu", (ulong) pos));
*errmsg=0;
mysql_mutex_t *log_lock= rli->relay_log.get_log_lock();
if (need_data_lock)
mysql_mutex_lock(&rli->data_lock);
/*
Slave threads are not the only users of init_relay_log_pos(). CHANGE MASTER
is, too, and init_slave() too; these 2 functions allocate a description
event in init_relay_log_pos, which is not freed by the terminating SQL slave
thread as that thread is not started by these functions. So we have to free
the description_event here, in case, so that there is no memory leak in
running, say, CHANGE MASTER.
*/
delete rli->relay_log.description_event_for_exec;
/*
By default the relay log is in binlog format 3 (4.0).
Even if format is 4, this will work enough to read the first event
(Format_desc) (remember that format 4 is just lenghtened compared to format
3; format 3 is a prefix of format 4).
*/
rli->relay_log.description_event_for_exec= new
Format_description_log_event(3);
mysql_mutex_lock(log_lock);
/* Close log file and free buffers if it's already open */
if (rli->cur_log_fd >= 0)
{
end_io_cache(&rli->cache_buf);
mysql_file_close(rli->cur_log_fd, MYF(MY_WME));
rli->cur_log_fd = -1;
}
rli->group_relay_log_pos = rli->event_relay_log_pos = pos;
/*
Test to see if the previous run was with the skip of purging
If yes, we do not purge when we restart
*/
if (rli->relay_log.find_log_pos(&rli->linfo, NullS, 1))
{
*errmsg="Could not find first log during relay log initialization";
goto err;
}
if (log && rli->relay_log.find_log_pos(&rli->linfo, log, 1))
{
*errmsg="Could not find target log during relay log initialization";
goto err;
}
strmake(rli->group_relay_log_name,rli->linfo.log_file_name,
sizeof(rli->group_relay_log_name)-1);
strmake(rli->event_relay_log_name,rli->linfo.log_file_name,
sizeof(rli->event_relay_log_name)-1);
if (rli->relay_log.is_active(rli->linfo.log_file_name))
{
/*
The IO thread is using this log file.
In this case, we will use the same IO_CACHE pointer to
read data as the IO thread is using to write data.
*/
my_b_seek((rli->cur_log=rli->relay_log.get_log_file()), (off_t)0);
if (check_binlog_magic(rli->cur_log,errmsg))
goto err;
rli->cur_log_old_open_count=rli->relay_log.get_open_count();
}
else
{
/*
Open the relay log and set rli->cur_log to point at this one
*/
if ((rli->cur_log_fd=open_binlog(&rli->cache_buf,
rli->linfo.log_file_name,errmsg)) < 0)
goto err;
rli->cur_log = &rli->cache_buf;
}
/*
In all cases, check_binlog_magic() has been called so we're at offset 4 for
sure.
*/
if (pos > BIN_LOG_HEADER_SIZE) /* If pos<=4, we stay at 4 */
{
Log_event* ev;
while (look_for_description_event)
{
/*
Read the possible Format_description_log_event; if position
was 4, no need, it will be read naturally.
*/
DBUG_PRINT("info",("looking for a Format_description_log_event"));
if (my_b_tell(rli->cur_log) >= pos)
break;
/*
Because of we have rli->data_lock and log_lock, we can safely read an
event
*/
if (!(ev=Log_event::read_log_event(rli->cur_log,0,
rli->relay_log.description_event_for_exec)))
{
DBUG_PRINT("info",("could not read event, rli->cur_log->error=%d",
rli->cur_log->error));
if (rli->cur_log->error) /* not EOF */
{
*errmsg= "I/O error reading event at position 4";
goto err;
}
break;
}
else if (ev->get_type_code() == FORMAT_DESCRIPTION_EVENT)
{
DBUG_PRINT("info",("found Format_description_log_event"));
delete rli->relay_log.description_event_for_exec;
rli->relay_log.description_event_for_exec= (Format_description_log_event*) ev;
/*
As ev was returned by read_log_event, it has passed is_valid(), so
my_malloc() in ctor worked, no need to check again.
*/
/*
Ok, we found a Format_description event. But it is not sure that this
describes the whole relay log; indeed, one can have this sequence
(starting from position 4):
Format_desc (of slave)
Rotate (of master)
Format_desc (of master)
So the Format_desc which really describes the rest of the relay log
is the 3rd event (it can't be further than that, because we rotate
the relay log when we queue a Rotate event from the master).
But what describes the Rotate is the first Format_desc.
So what we do is:
go on searching for Format_description events, until you exceed the
position (argument 'pos') or until you find another event than Rotate
or Format_desc.
*/
}
else
{
DBUG_PRINT("info",("found event of another type=%d",
ev->get_type_code()));
look_for_description_event= (ev->get_type_code() == ROTATE_EVENT);
delete ev;
}
}
my_b_seek(rli->cur_log,(off_t)pos);
#ifndef DBUG_OFF
{
char llbuf1[22], llbuf2[22];
DBUG_PRINT("info", ("my_b_tell(rli->cur_log)=%s rli->event_relay_log_pos=%s",
llstr(my_b_tell(rli->cur_log),llbuf1),
llstr(rli->event_relay_log_pos,llbuf2)));
}
#endif
}
err:
/*
If we don't purge, we can't honour relay_log_space_limit ;
silently discard it
*/
if (!relay_log_purge)
rli->log_space_limit= 0;
mysql_cond_broadcast(&rli->data_cond);
mysql_mutex_unlock(log_lock);
if (need_data_lock)
mysql_mutex_unlock(&rli->data_lock);
if (!rli->relay_log.description_event_for_exec->is_valid() && !*errmsg)
*errmsg= "Invalid Format_description log event; could be out of memory";
DBUG_RETURN ((*errmsg) ? 1 : 0);
}
/*
Waits until the SQL thread reaches (has executed up to) the
log/position or timed out.
SYNOPSIS
wait_for_pos()
thd client thread that sent SELECT MASTER_POS_WAIT
log_name log name to wait for
log_pos position to wait for
timeout timeout in seconds before giving up waiting
NOTES
timeout is longlong whereas it should be ulong ; but this is
to catch if the user submitted a negative timeout.
RETURN VALUES
-2 improper arguments (log_pos<0)
or slave not running, or master info changed
during the function's execution,
or client thread killed. -2 is translated to NULL by caller
-1 timed out
>=0 number of log events the function had to wait
before reaching the desired log/position
*/
int Relay_log_info::wait_for_pos(THD* thd, String* log_name,
longlong log_pos,
longlong timeout)
{
int event_count = 0;
ulong init_abort_pos_wait;
int error=0;
struct timespec abstime; // for timeout checking
const char *msg;
DBUG_ENTER("Relay_log_info::wait_for_pos");
if (!inited)
2008-03-14 17:52:57 +01:00
DBUG_RETURN(-2);
DBUG_PRINT("enter",("log_name: '%s' log_pos: %lu timeout: %lu",
log_name->c_ptr(), (ulong) log_pos, (ulong) timeout));
set_timespec(abstime,timeout);
mysql_mutex_lock(&data_lock);
msg= thd->enter_cond(&data_cond, &data_lock,
"Waiting for the slave SQL thread to "
"advance position");
/*
This function will abort when it notices that some CHANGE MASTER or
RESET MASTER has changed the master info.
To catch this, these commands modify abort_pos_wait ; We just monitor
abort_pos_wait and see if it has changed.
Why do we have this mechanism instead of simply monitoring slave_running
in the loop (we do this too), as CHANGE MASTER/RESET SLAVE require that
the SQL thread be stopped?
This is becasue if someones does:
STOP SLAVE;CHANGE MASTER/RESET SLAVE; START SLAVE;
the change may happen very quickly and we may not notice that
slave_running briefly switches between 1/0/1.
*/
init_abort_pos_wait= abort_pos_wait;
/*
We'll need to
handle all possible log names comparisons (e.g. 999 vs 1000).
We use ulong for string->number conversion ; this is no
stronger limitation than in find_uniq_filename in sql/log.cc
*/
ulong log_name_extension;
char log_name_tmp[FN_REFLEN]; //make a char[] from String
strmake(log_name_tmp, log_name->ptr(), min(log_name->length(), FN_REFLEN-1));
char *p= fn_ext(log_name_tmp);
char *p_end;
if (!*p || log_pos<0)
{
error= -2; //means improper arguments
goto err;
}
// Convert 0-3 to 4
log_pos= max(log_pos, BIN_LOG_HEADER_SIZE);
/* p points to '.' */
log_name_extension= strtoul(++p, &p_end, 10);
/*
p_end points to the first invalid character.
If it equals to p, no digits were found, error.
If it contains '\0' it means conversion went ok.
*/
if (p_end==p || *p_end)
{
error= -2;
goto err;
}
/* The "compare and wait" main loop */
while (!thd->killed &&
init_abort_pos_wait == abort_pos_wait &&
slave_running)
{
bool pos_reached;
int cmp_result= 0;
DBUG_PRINT("info",
("init_abort_pos_wait: %ld abort_pos_wait: %ld",
init_abort_pos_wait, abort_pos_wait));
DBUG_PRINT("info",("group_master_log_name: '%s' pos: %lu",
group_master_log_name, (ulong) group_master_log_pos));
/*
group_master_log_name can be "", if we are just after a fresh
replication start or after a CHANGE MASTER TO MASTER_HOST/PORT
(before we have executed one Rotate event from the master) or
(rare) if the user is doing a weird slave setup (see next
paragraph). If group_master_log_name is "", we assume we don't
have enough info to do the comparison yet, so we just wait until
more data. In this case master_log_pos is always 0 except if
somebody (wrongly) sets this slave to be a slave of itself
without using --replicate-same-server-id (an unsupported
configuration which does nothing), then group_master_log_pos
will grow and group_master_log_name will stay "".
*/
if (*group_master_log_name)
{
char *basename= (group_master_log_name +
dirname_length(group_master_log_name));
/*
First compare the parts before the extension.
Find the dot in the master's log basename,
and protect against user's input error :
if the names do not match up to '.' included, return error
*/
char *q= (char*)(fn_ext(basename)+1);
if (strncmp(basename, log_name_tmp, (int)(q-basename)))
{
error= -2;
break;
}
// Now compare extensions.
char *q_end;
ulong group_master_log_name_extension= strtoul(q, &q_end, 10);
if (group_master_log_name_extension < log_name_extension)
cmp_result= -1 ;
else
cmp_result= (group_master_log_name_extension > log_name_extension) ? 1 : 0 ;
pos_reached= ((!cmp_result && group_master_log_pos >= (ulonglong)log_pos) ||
cmp_result > 0);
if (pos_reached || thd->killed)
break;
}
//wait for master update, with optional timeout.
DBUG_PRINT("info",("Waiting for master update"));
/*
We are going to mysql_cond_(timed)wait(); if the SQL thread stops it
will wake us up.
*/
thd_wait_begin(thd, THD_WAIT_BINLOG);
if (timeout > 0)
{
/*
Note that mysql_cond_timedwait checks for the timeout
before for the condition ; i.e. it returns ETIMEDOUT
if the system time equals or exceeds the time specified by abstime
before the condition variable is signaled or broadcast, _or_ if
the absolute time specified by abstime has already passed at the time
of the call.
For that reason, mysql_cond_timedwait will do the "timeoutting" job
even if its condition is always immediately signaled (case of a loaded
master).
*/
error= mysql_cond_timedwait(&data_cond, &data_lock, &abstime);
}
else
mysql_cond_wait(&data_cond, &data_lock);
thd_wait_end(thd);
DBUG_PRINT("info",("Got signal of master update or timed out"));
if (error == ETIMEDOUT || error == ETIME)
{
error= -1;
break;
}
error=0;
event_count++;
DBUG_PRINT("info",("Testing if killed or SQL thread not running"));
}
err:
thd->exit_cond(msg);
DBUG_PRINT("exit",("killed: %d abort: %d slave_running: %d \
improper_arguments: %d timed_out: %d",
thd->killed_errno(),
(int) (init_abort_pos_wait != abort_pos_wait),
(int) slave_running,
(int) (error == -2),
(int) (error == -1)));
if (thd->killed || init_abort_pos_wait != abort_pos_wait ||
!slave_running)
{
error= -2;
}
DBUG_RETURN( error ? error : event_count );
}
void Relay_log_info::inc_group_relay_log_pos(ulonglong log_pos,
bool skip_lock)
{
DBUG_ENTER("Relay_log_info::inc_group_relay_log_pos");
if (!skip_lock)
mysql_mutex_lock(&data_lock);
inc_event_relay_log_pos();
group_relay_log_pos= event_relay_log_pos;
strmake(group_relay_log_name,event_relay_log_name,
sizeof(group_relay_log_name)-1);
notify_group_relay_log_name_update();
/*
If the slave does not support transactions and replicates a transaction,
users should not trust group_master_log_pos (which they can display with
SHOW SLAVE STATUS or read from relay-log.info), because to compute
group_master_log_pos the slave relies on log_pos stored in the master's
binlog, but if we are in a master's transaction these positions are always
the BEGIN's one (excepted for the COMMIT), so group_master_log_pos does
not advance as it should on the non-transactional slave (it advances by
big leaps, whereas it should advance by small leaps).
*/
/*
In 4.x we used the event's len to compute the positions here. This is
wrong if the event was 3.23/4.0 and has been converted to 5.0, because
then the event's len is not what is was in the master's binlog, so this
will make a wrong group_master_log_pos (yes it's a bug in 3.23->4.0
replication: Exec_master_log_pos is wrong). Only way to solve this is to
have the original offset of the end of the event the relay log. This is
what we do in 5.0: log_pos has become "end_log_pos" (because the real use
of log_pos in 4.0 was to compute the end_log_pos; so better to store
end_log_pos instead of begin_log_pos.
If we had not done this fix here, the problem would also have appeared
when the slave and master are 5.0 but with different event length (for
example the slave is more recent than the master and features the event
UID). It would give false MASTER_POS_WAIT, false Exec_master_log_pos in
SHOW SLAVE STATUS, and so the user would do some CHANGE MASTER using this
value which would lead to badly broken replication.
Even the relay_log_pos will be corrupted in this case, because the len is
the relay log is not "val".
With the end_log_pos solution, we avoid computations involving lengthes.
*/
DBUG_PRINT("info", ("log_pos: %lu group_master_log_pos: %lu",
(long) log_pos, (long) group_master_log_pos));
if (log_pos) // 3.23 binlogs don't have log_posx
{
group_master_log_pos= log_pos;
}
mysql_cond_broadcast(&data_cond);
if (!skip_lock)
mysql_mutex_unlock(&data_lock);
DBUG_VOID_RETURN;
}
void Relay_log_info::close_temporary_tables()
{
TABLE *table,*next;
DBUG_ENTER("Relay_log_info::close_temporary_tables");
for (table=save_temporary_tables ; table ; table=next)
{
next=table->next;
/*
Don't ask for disk deletion. For now, anyway they will be deleted when
slave restarts, but it is a better intention to not delete them.
*/
DBUG_PRINT("info", ("table: 0x%lx", (long) table));
close_temporary(table, 1, 0);
}
save_temporary_tables= 0;
slave_open_temp_tables= 0;
DBUG_VOID_RETURN;
}
/*
purge_relay_logs()
NOTES
Assumes to have a run lock on rli and that no slave thread are running.
*/
int purge_relay_logs(Relay_log_info* rli, THD *thd, bool just_reset,
const char** errmsg)
{
int error=0;
DBUG_ENTER("purge_relay_logs");
/*
Even if rli->inited==0, we still try to empty rli->master_log_* variables.
Indeed, rli->inited==0 does not imply that they already are empty.
It could be that slave's info initialization partly succeeded :
for example if relay-log.info existed but *relay-bin*.*
have been manually removed, init_relay_log_info reads the old
relay-log.info and fills rli->master_log_*, then init_relay_log_info
checks for the existence of the relay log, this fails and
init_relay_log_info leaves rli->inited to 0.
In that pathological case, rli->master_log_pos* will be properly reinited
at the next START SLAVE (as RESET SLAVE or CHANGE
MASTER, the callers of purge_relay_logs, will delete bogus *.info files
or replace them with correct files), however if the user does SHOW SLAVE
STATUS before START SLAVE, he will see old, confusing rli->master_log_*.
In other words, we reinit rli->master_log_* for SHOW SLAVE STATUS
to display fine in any case.
*/
rli->group_master_log_name[0]= 0;
rli->group_master_log_pos= 0;
if (!rli->inited)
{
DBUG_PRINT("info", ("rli->inited == 0"));
DBUG_RETURN(0);
}
DBUG_ASSERT(rli->slave_running == 0);
DBUG_ASSERT(rli->mi->slave_running == 0);
rli->slave_skip_counter=0;
mysql_mutex_lock(&rli->data_lock);
/*
we close the relay log fd possibly left open by the slave SQL thread,
to be able to delete it; the relay log fd possibly left open by the slave
I/O thread will be closed naturally in reset_logs() by the
close(LOG_CLOSE_TO_BE_OPENED) call
*/
if (rli->cur_log_fd >= 0)
{
end_io_cache(&rli->cache_buf);
mysql_file_close(rli->cur_log_fd, MYF(MY_WME));
rli->cur_log_fd= -1;
}
if (rli->relay_log.reset_logs(thd))
{
*errmsg = "Failed during log reset";
error=1;
goto err;
}
/* Save name of used relay log file */
strmake(rli->group_relay_log_name, rli->relay_log.get_log_fname(),
sizeof(rli->group_relay_log_name)-1);
strmake(rli->event_relay_log_name, rli->relay_log.get_log_fname(),
sizeof(rli->event_relay_log_name)-1);
rli->group_relay_log_pos= rli->event_relay_log_pos= BIN_LOG_HEADER_SIZE;
if (count_relay_log_space(rli))
{
*errmsg= "Error counting relay log space";
error=1;
goto err;
}
if (!just_reset)
error= init_relay_log_pos(rli, rli->group_relay_log_name,
rli->group_relay_log_pos,
0 /* do not need data lock */, errmsg, 0);
err:
#ifndef DBUG_OFF
char buf[22];
#endif
DBUG_PRINT("info",("log_space_total: %s",llstr(rli->log_space_total,buf)));
mysql_mutex_unlock(&rli->data_lock);
DBUG_RETURN(error);
}
/*
Check if condition stated in UNTIL clause of START SLAVE is reached.
SYNOPSYS
Relay_log_info::is_until_satisfied()
master_beg_pos position of the beginning of to be executed event
(not log_pos member of the event that points to the
beginning of the following event)
DESCRIPTION
Checks if UNTIL condition is reached. Uses caching result of last
comparison of current log file name and target log file name. So cached
value should be invalidated if current log file name changes
(see Relay_log_info::notify_... functions).
This caching is needed to avoid of expensive string comparisons and
strtol() conversions needed for log names comparison. We don't need to
compare them each time this function is called, we only need to do this
when current log name changes. If we have UNTIL_MASTER_POS condition we
need to do this only after Rotate_log_event::do_apply_event() (which is
rare, so caching gives real benifit), and if we have UNTIL_RELAY_POS
condition then we should invalidate cached comarison value after
inc_group_relay_log_pos() which called for each group of events (so we
have some benefit if we have something like queries that use
autoincrement or if we have transactions).
Should be called ONLY if until_condition != UNTIL_NONE !
RETURN VALUE
true - condition met or error happened (condition seems to have
bad log file name)
false - condition not met
*/
bool Relay_log_info::is_until_satisfied(THD *thd, Log_event *ev)
{
const char *log_name;
ulonglong log_pos;
DBUG_ENTER("Relay_log_info::is_until_satisfied");
DBUG_ASSERT(until_condition != UNTIL_NONE);
if (until_condition == UNTIL_MASTER_POS)
{
if (ev && ev->server_id == (uint32) ::server_id && !replicate_same_server_id)
DBUG_RETURN(FALSE);
log_name= group_master_log_name;
log_pos= (!ev)? group_master_log_pos :
WL#4738 streamline/simplify @@variable creation process Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables Bug#20415 Output of mysqld --help --verbose is incomplete Bug#25430 variable not found in SELECT @@global.ft_max_word_len; Bug#32902 plugin variables don't know their names Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting! Bug#34829 No default value for variable and setting default does not raise error Bug#34834 ? Is accepted as a valid sql mode Bug#34878 Few variables have default value according to documentation but error occurs Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var. Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status Bug#40988 log_output_basic.test succeeded though syntactically false. Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails) Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled Bug#44797 plugins w/o command-line options have no disabling option in --help Bug#46314 string system variables don't support expressions Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken Bug#46586 When using the plugin interface the type "set" for options caused a crash. Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds Bug#49417 some complaints about mysqld --help --verbose output Bug#49540 DEFAULT value of binlog_format isn't the default value Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix) Bug#49644 init_connect and \0 Bug#49645 init_slave and multi-byte characters Bug#49646 mysql --show-warnings crashes when server dies
2009-12-22 10:35:56 +01:00
((thd->variables.option_bits & OPTION_BEGIN || !ev->log_pos) ?
group_master_log_pos : ev->log_pos - ev->data_written);
}
else
{ /* until_condition == UNTIL_RELAY_POS */
log_name= group_relay_log_name;
log_pos= group_relay_log_pos;
}
#ifndef DBUG_OFF
{
char buf[32];
DBUG_PRINT("info", ("group_master_log_name='%s', group_master_log_pos=%s",
group_master_log_name, llstr(group_master_log_pos, buf)));
DBUG_PRINT("info", ("group_relay_log_name='%s', group_relay_log_pos=%s",
group_relay_log_name, llstr(group_relay_log_pos, buf)));
DBUG_PRINT("info", ("(%s) log_name='%s', log_pos=%s",
until_condition == UNTIL_MASTER_POS ? "master" : "relay",
log_name, llstr(log_pos, buf)));
DBUG_PRINT("info", ("(%s) until_log_name='%s', until_log_pos=%s",
until_condition == UNTIL_MASTER_POS ? "master" : "relay",
until_log_name, llstr(until_log_pos, buf)));
}
#endif
if (until_log_names_cmp_result == UNTIL_LOG_NAMES_CMP_UNKNOWN)
{
/*
We have no cached comparison results so we should compare log names
and cache result.
If we are after RESET SLAVE, and the SQL slave thread has not processed
any event yet, it could be that group_master_log_name is "". In that case,
just wait for more events (as there is no sensible comparison to do).
*/
if (*log_name)
{
const char *basename= log_name + dirname_length(log_name);
const char *q= (const char*)(fn_ext(basename)+1);
if (strncmp(basename, until_log_name, (int)(q-basename)) == 0)
{
/* Now compare extensions. */
char *q_end;
ulong log_name_extension= strtoul(q, &q_end, 10);
if (log_name_extension < until_log_name_extension)
until_log_names_cmp_result= UNTIL_LOG_NAMES_CMP_LESS;
else
until_log_names_cmp_result=
(log_name_extension > until_log_name_extension) ?
UNTIL_LOG_NAMES_CMP_GREATER : UNTIL_LOG_NAMES_CMP_EQUAL ;
}
else
{
/* Probably error so we aborting */
sql_print_error("Slave SQL thread is stopped because UNTIL "
"condition is bad.");
DBUG_RETURN(TRUE);
}
}
else
DBUG_RETURN(until_log_pos == 0);
}
DBUG_RETURN(((until_log_names_cmp_result == UNTIL_LOG_NAMES_CMP_EQUAL &&
log_pos >= until_log_pos) ||
until_log_names_cmp_result == UNTIL_LOG_NAMES_CMP_GREATER));
}
void Relay_log_info::cached_charset_invalidate()
{
DBUG_ENTER("Relay_log_info::cached_charset_invalidate");
/* Full of zeroes means uninitialized. */
bzero(cached_charset, sizeof(cached_charset));
DBUG_VOID_RETURN;
}
bool Relay_log_info::cached_charset_compare(char *charset) const
{
DBUG_ENTER("Relay_log_info::cached_charset_compare");
if (memcmp(cached_charset, charset, sizeof(cached_charset)))
{
memcpy(const_cast<char*>(cached_charset), charset, sizeof(cached_charset));
DBUG_RETURN(1);
}
DBUG_RETURN(0);
}
void Relay_log_info::stmt_done(my_off_t event_master_log_pos,
time_t event_creation_time)
{
2007-10-13 22:12:50 +02:00
#ifndef DBUG_OFF
extern uint debug_not_change_ts_if_art_event;
#endif
clear_flag(IN_STMT);
/*
If in a transaction, and if the slave supports transactions, just
inc_event_relay_log_pos(). We only have to check for OPTION_BEGIN
(not OPTION_NOT_AUTOCOMMIT) as transactions are logged with
BEGIN/COMMIT, not with SET AUTOCOMMIT= .
CAUTION: opt_using_transactions means innodb || bdb ; suppose the
master supports InnoDB and BDB, but the slave supports only BDB,
problems will arise: - suppose an InnoDB table is created on the
master, - then it will be MyISAM on the slave - but as
opt_using_transactions is true, the slave will believe he is
transactional with the MyISAM table. And problems will come when
one does START SLAVE; STOP SLAVE; START SLAVE; (the slave will
resume at BEGIN whereas there has not been any rollback). This is
the problem of using opt_using_transactions instead of a finer
"does the slave support _transactional handler used on the
master_".
More generally, we'll have problems when a query mixes a
transactional handler and MyISAM and STOP SLAVE is issued in the
middle of the "transaction". START SLAVE will resume at BEGIN
while the MyISAM table has already been updated.
*/
WL#4738 streamline/simplify @@variable creation process Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables Bug#20415 Output of mysqld --help --verbose is incomplete Bug#25430 variable not found in SELECT @@global.ft_max_word_len; Bug#32902 plugin variables don't know their names Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting! Bug#34829 No default value for variable and setting default does not raise error Bug#34834 ? Is accepted as a valid sql mode Bug#34878 Few variables have default value according to documentation but error occurs Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var. Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status Bug#40988 log_output_basic.test succeeded though syntactically false. Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails) Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled Bug#44797 plugins w/o command-line options have no disabling option in --help Bug#46314 string system variables don't support expressions Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken Bug#46586 When using the plugin interface the type "set" for options caused a crash. Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds Bug#49417 some complaints about mysqld --help --verbose output Bug#49540 DEFAULT value of binlog_format isn't the default value Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix) Bug#49644 init_connect and \0 Bug#49645 init_slave and multi-byte characters Bug#49646 mysql --show-warnings crashes when server dies
2009-12-22 10:35:56 +01:00
if ((sql_thd->variables.option_bits & OPTION_BEGIN) && opt_using_transactions)
inc_event_relay_log_pos();
else
{
inc_group_relay_log_pos(event_master_log_pos);
flush_relay_log_info(this);
/*
Note that Rotate_log_event::do_apply_event() does not call this
function, so there is no chance that a fake rotate event resets
last_master_timestamp. Note that we update without mutex
(probably ok - except in some very rare cases, only consequence
is that value may take some time to display in
Seconds_Behind_Master - not critical).
*/
2007-10-13 22:12:50 +02:00
#ifndef DBUG_OFF
if (!(event_creation_time == 0 && debug_not_change_ts_if_art_event > 0))
#else
if (event_creation_time != 0)
#endif
last_master_timestamp= event_creation_time;
}
}
#if !defined(MYSQL_CLIENT) && defined(HAVE_REPLICATION)
void Relay_log_info::cleanup_context(THD *thd, bool error)
{
DBUG_ENTER("Relay_log_info::cleanup_context");
DBUG_ASSERT(sql_thd == thd);
/*
1) Instances of Table_map_log_event, if ::do_apply_event() was called on them,
may have opened tables, which we cannot be sure have been closed (because
maybe the Rows_log_event have not been found or will not be, because slave
SQL thread is stopping, or relay log has a missing tail etc). So we close
all thread's tables. And so the table mappings have to be cancelled.
2) Rows_log_event::do_apply_event() may even have started statements or
transactions on them, which we need to rollback in case of error.
3) If finding a Format_description_log_event after a BEGIN, we also need
to rollback before continuing with the next events.
4) so we need this "context cleanup" function.
*/
if (error)
{
trans_rollback_stmt(thd); // if a "statement transaction"
trans_rollback(thd); // if a "real transaction"
}
m_table_map.clear_tables();
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
2009-11-30 16:55:03 +01:00
slave_close_thread_tables(thd);
A prerequisite patch for the fix for Bug#46224 "HANDLER statements within a transaction might lead to deadlocks". Introduce a notion of a sentinel to MDL_context. A sentinel is a ticket that separates all tickets in the context into two groups: before and after it. Currently we can have (and need) only one designated sentinel -- it separates all locks taken by LOCK TABLE or HANDLER statement, which must survive COMMIT and ROLLBACK and all other locks, which must be released at COMMIT or ROLLBACK. The tricky part is maintaining the sentinel up to date when someone release its corresponding ticket. This can happen, e.g. if someone issues DROP TABLE under LOCK TABLES (generally, see all calls to release_all_locks_for_name()). MDL_context::release_ticket() is modified to take care of it. ****** A fix and a test case for Bug#46224 "HANDLER statements within a transaction might lead to deadlocks". An attempt to mix HANDLER SQL statements, which are transaction- agnostic, an open multi-statement transaction, and DDL against the involved tables (in a concurrent connection) could lead to a deadlock. The deadlock would occur when HANDLER OPEN or HANDLER READ would have to wait on a conflicting metadata lock. If the connection that issued HANDLER statement also had other metadata locks (say, acquired in scope of a transaction), a classical deadlock situation of mutual wait could occur. Incompatible change: entering LOCK TABLES mode automatically closes all open HANDLERs in the current connection. Incompatible change: previously an attempt to wait on a lock in a connection that has an open HANDLER statement could wait indefinitely/deadlock. After this patch, an error ER_LOCK_DEADLOCK is produced. The idea of the fix is to merge thd->handler_mdl_context with the main mdl_context of the connection, used for transactional locks. This makes deadlock detection possible, since all waits with locks are "visible" and available to analysis in a single MDL context of the connection. Since HANDLER locks and transactional locks have a different life cycle -- HANDLERs are explicitly open and closed, and so are HANDLER locks, explicitly acquired and released, whereas transactional locks "accumulate" till the end of a transaction and are released only with COMMIT, ROLLBACK and ROLLBACK TO SAVEPOINT, a concept of "sentinel" was introduced to MDL_context. All locks, HANDLER and others, reside in the same linked list. However, a selected element of the list separates locks with different life cycle. HANDLER locks always reside at the end of the list, after the sentinel. Transactional locks are prepended to the beginning of the list, before the sentinel. Thus, ROLLBACK, COMMIT or ROLLBACK TO SAVEPOINT, only release those locks that reside before the sentinel. HANDLER locks must be released explicitly as part of HANDLER CLOSE statement, or an implicit close. The same approach with sentinel is also employed for LOCK TABLES locks. Since HANDLER and LOCK TABLES statement has never worked together, the implementation is made simple and only maintains one sentinel, which is used either for HANDLER locks, or for LOCK TABLES locks.
2009-12-22 17:09:15 +01:00
if (error)
thd->mdl_context.release_transactional_locks();
clear_flag(IN_STMT);
/*
Cleanup for the flags that have been set at do_apply_event.
*/
WL#4738 streamline/simplify @@variable creation process Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables Bug#20415 Output of mysqld --help --verbose is incomplete Bug#25430 variable not found in SELECT @@global.ft_max_word_len; Bug#32902 plugin variables don't know their names Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting! Bug#34829 No default value for variable and setting default does not raise error Bug#34834 ? Is accepted as a valid sql mode Bug#34878 Few variables have default value according to documentation but error occurs Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var. Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status Bug#40988 log_output_basic.test succeeded though syntactically false. Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails) Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled Bug#44797 plugins w/o command-line options have no disabling option in --help Bug#46314 string system variables don't support expressions Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken Bug#46586 When using the plugin interface the type "set" for options caused a crash. Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds Bug#49417 some complaints about mysqld --help --verbose output Bug#49540 DEFAULT value of binlog_format isn't the default value Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix) Bug#49644 init_connect and \0 Bug#49645 init_slave and multi-byte characters Bug#49646 mysql --show-warnings crashes when server dies
2009-12-22 10:35:56 +01:00
thd->variables.option_bits&= ~OPTION_NO_FOREIGN_KEY_CHECKS;
thd->variables.option_bits&= ~OPTION_RELAXED_UNIQUE_CHECKS;
/*
Reset state related to long_find_row notes in the error log:
- timestamp
- flag that decides whether the slave prints or not
*/
reset_row_stmt_start_timestamp();
unset_long_find_row_note_printed();
DBUG_VOID_RETURN;
}
void Relay_log_info::clear_tables_to_lock()
{
while (tables_to_lock)
{
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
uchar* to_free= reinterpret_cast<uchar*>(tables_to_lock);
if (tables_to_lock->m_tabledef_valid)
{
tables_to_lock->m_tabledef.table_def::~table_def();
tables_to_lock->m_tabledef_valid= FALSE;
}
/*
If blob fields were used during conversion of field values
from the master table into the slave table, then we need to
free the memory used temporarily to store their values before
copying into the slave's table.
*/
if (tables_to_lock->m_conv_table)
free_blobs(tables_to_lock->m_conv_table);
tables_to_lock=
static_cast<RPL_TABLE_LIST*>(tables_to_lock->next_global);
tables_to_lock_count--;
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
2010-07-08 23:20:08 +02:00
my_free(to_free);
}
DBUG_ASSERT(tables_to_lock == NULL && tables_to_lock_count == 0);
}
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
2009-11-30 16:55:03 +01:00
void Relay_log_info::slave_close_thread_tables(THD *thd)
{
thd->stmt_da->can_overwrite_status= TRUE;
thd->is_error() ? trans_rollback_stmt(thd) : trans_commit_stmt(thd);
thd->stmt_da->can_overwrite_status= FALSE;
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
2009-11-30 16:55:03 +01:00
close_thread_tables(thd);
/*
- If inside a multi-statement transaction,
defer the release of metadata locks until the current
transaction is either committed or rolled back. This prevents
other statements from modifying the table for the entire
duration of this transaction. This provides commit ordering
and guarantees serializability across multiple transactions.
- If in autocommit mode, or outside a transactional context,
automatically release metadata locks of the current statement.
*/
if (! thd->in_multi_stmt_transaction_mode())
thd->mdl_context.release_transactional_locks();
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
else
thd->mdl_context.release_statement_locks();
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
2009-11-30 16:55:03 +01:00
clear_tables_to_lock();
}
#endif