Problem: it is unsafe to read base64-printed events without first
reading the Format_description_log_event (FD). Currently, mysqlbinlog
cannot print the FD.
As a side effect, another bug has also been fixed: When mysqlbinlog
--start-position=X was specified, no ROLLBACK was printed. I changed
this, so that ROLLBACK is always printed.
This patch does several things:
- Format_description_log_event (FD) events now print themselves in
base64 format.
- mysqlbinlog is now able to print FD events. It has three modes:
  --base64-output=auto    Print row events in base64 output, and print
                          FD event. The FD event is printed even if
                          it is outside the range specified with
                          --start-position, because it would not be
                          safe to read row events otherwise. This is
                          the default.
  --base64-output=always  Like --base64-output=auto, but also print
                          base64 output for query events. This is
                          like the old --base64-output flag, which
                          is also a shorthand for
                          --base64-output=always
  --base64-output=never   Never print base64 output, generate error if
                          row events occur in binlog. This is
                          useful to suppress the FD event in binlogs
                          known not to contain row events (e.g.,
                          because BINLOG statement is unsafe,
                          requires root privileges, is not SQL, etc)
- the BINLOG statement now handles FD events correctly, by setting
the thread's rli's relay log's description_event_for_exec to the
loaded event.
In fact, executing a BINLOG statement is almost the same as reading
an event from a relay log. Before my patch, the code for this was
separated (exec_relay_log_event in slave.cc executes events from
the relay log, mysql_client_binlog_statement in sql_binlog.cc
executes BINLOG statements). I needed to augment
mysql_client_binlog_statement to do parts of what
exec_relay_log_event does. Hence, I did a small refactoring and
moved parts of exec_relay_log_event to a new function, which I
named apply_event_and_update_pos. apply_event_and_update_pos is
called both from exec_relay_log_event and from
mysql_client_binlog_statement.
- When a non-FD event is executed in a BINLOG statement without an FD
event having been executed in an earlier BINLOG statement, an error
is generated, because that is unsafe. I added a new error code for
this: ER_NO_FORMAT_DESCRIPTION_EVENT_BEFORE_BINLOG_STATEMENTS.
In order to get a decent error message containing the name of the
event, I added the class method char*
Log_event::get_type_str(Log_event_type type), which returns a
string name for the given Log_event_type. This is just like the
existing char* Log_event::get_type_str(), except it is a class
method that takes the log event type as parameter.
I also added PRE_GA_*_ROWS_LOG_EVENT to Log_event::get_type_str(),
so that the names of old rows events are printed properly.
- When reading an event, I added a check that the event type is known
by the current Format_description_log_event. Without this, reading
may crash on bad input (I ran into this several times).
- I patched the following test cases, which all contain BINLOG
statements for row events which must be preceded by BINLOG
statements for FD events:
- rpl_bug31076
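The three --base64-output modes described above can be summarized as a small decision function. This is only a sketch, not the mysqlbinlog code: the names Base64Mode, EventKind, Action and decide_output are hypothetical, and modelling the FD event as simply skipped under "never" is an assumption drawn from the description.

```cpp
#include <cassert>

// Hypothetical names; they only model the option values described above.
enum class Base64Mode { Auto, Always, Never };
enum class EventKind { FormatDescription, Query, Row };
enum class Action { PrintBase64, PrintPlain, Skip, Error };

// Decide how one event is printed under a given --base64-output mode.
inline Action decide_output(Base64Mode mode, EventKind kind) {
  switch (kind) {
    case EventKind::FormatDescription:
      // auto/always print the FD event; never suppresses it (assumption).
      return mode == Base64Mode::Never ? Action::Skip : Action::PrintBase64;
    case EventKind::Row:
      // Row events are only printable in base64, so "never" is an error.
      return mode == Base64Mode::Never ? Action::Error : Action::PrintBase64;
    case EventKind::Query:
      // Only "always" prints query events in base64.
      return mode == Base64Mode::Always ? Action::PrintBase64
                                        : Action::PrintPlain;
  }
  assert(false && "unknown event kind");
  return Action::Error;
}
```

For example, decide_output(Base64Mode::Never, EventKind::Row) yields Action::Error, matching "generate error if row events occur in binlog".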
While I was here, I fixed some small things in log_event.cc:
- replaced hard-coded 4 by EVENT_TYPE_OFFSET in 3 places
- replaced return by DBUG_VOID_RETURN in one place
- The name of the logfile can be '-' to indicate stdin. Before my
patch, the code just checked if the first character is '-'; now it
does a full strcmp(). Probably, all arguments that begin with a -
are already handled somewhere else as flags, but I still think it
is better that the code reflects what it is supposed to do, with as
few dependencies as possible on other parts of the code. If we
one day implement that all command line arguments after -- are
files (as most unix tools do), then we need this.
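The difference between the old and the new stdin check can be illustrated as follows. The function names are hypothetical; only the first-character test and the strcmp() comparison come from the description above.

```cpp
#include <cassert>
#include <cstring>

// Old check: any argument that starts with '-' was taken to mean stdin.
inline bool is_stdin_old(const char* logname) {
  assert(logname != nullptr);
  return logname[0] == '-';
}

// New check: only the exact string "-" means "read from stdin".
inline bool is_stdin_new(const char* logname) {
  assert(logname != nullptr);
  return std::strcmp(logname, "-") == 0;
}
```

With the old check a file named "-binlog.000001" would be mistaken for stdin; with the new check it is treated as a file name.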
I also fixed the following in slave.cc:
- next_event() was declared twice, and queue_event was not static but
should be static (not used outside the file).
cause ROLLBACK of statement", part 1. Review fixes.
Do not send OK/EOF packets to the client until we reached the end of
the current statement.
This is a consolidation, to keep the functionality that is shared by all
SQL statements in one place in the server.
Currently this functionality includes:
- close_thread_tables()
- log_slow_statement().
After this patch and the subsequent patch for Bug#12713, it shall also include:
- ha_autocommit_or_rollback()
- net_end_statement()
- query_cache_end_of_result().
In future it may also include:
- mysql_reset_thd_for_next_command().
Problem:
The "Slave I/O thread couldn't register on master" error sporadically
occurred in replication tests because the slave I/O thread got
killed by STOP SLAVE before or while registering on master.
Fixed by checking the state of the I/O thread, and issuing
the error only if it was not explicitly killed by a user.
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child while the MERGE table was in use
made the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. We therefore now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similarly to the old open. However, it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
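The open sequence described above (extend the query list with the children, open them, then remove them from the list and attach them to the parent) can be modelled in a few lines. This is a simplified sketch and not the server code: TableDef, OpenResult and open_tables_sketch are hypothetical names, tables are represented by their names only, and nested MERGE tables are not considered.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical catalog entry: a MERGE parent lists its MyISAM children.
struct TableDef {
  bool is_merge = false;
  std::vector<std::string> children;
};

struct OpenResult {
  std::vector<std::string> open_order;  // order in which tables were opened
  std::map<std::string, std::vector<std::string>> attached;  // parent -> children
  std::vector<std::string> query_list;  // list after the children were removed
};

inline OpenResult open_tables_sketch(
    const std::map<std::string, TableDef>& catalog,
    std::vector<std::string> query_list) {
  OpenResult result;
  for (std::size_t i = 0; i < query_list.size(); ++i) {
    const std::string name = query_list[i];  // copy: the list changes below
    result.open_order.push_back(name);
    const auto def = catalog.find(name);
    if (def == catalog.end() || !def->second.is_merge) continue;

    const std::vector<std::string>& children = def->second.children;
    const auto first = static_cast<std::ptrdiff_t>(i + 1);
    const auto count = static_cast<std::ptrdiff_t>(children.size());

    // 1. Extend the query list: the children follow their parent.
    query_list.insert(query_list.begin() + first,
                      children.begin(), children.end());
    // 2. Open each child from the extended list.
    for (std::ptrdiff_t k = 0; k < count; ++k)
      result.open_order.push_back(
          query_list[static_cast<std::size_t>(first + k)]);
    // 3. Last child opened: remove the children again and attach them.
    query_list.erase(query_list.begin() + first,
                     query_list.begin() + first + count);
    result.attached[name] = children;
  }
  result.query_list = query_list;
  return result;
}
```

After the call the query list is back to what the statement originally named, while the children remain open and attached to their parent.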
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. Where open_ltable() was used in a loop
over a list of tables, the list must now be temporarily terminated
after every table before calling open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
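The "temporarily terminate the list" technique can be sketched as below. The TableList struct and both functions are simplified stand-ins with hypothetical names; the real open_n_lock_single_table() also sets required_type and suppresses derived table handling, which is not modelled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for a singly linked table list element.
struct TableList {
  std::string name;
  TableList* next;
};

// Stand-in for a function that processes every table reachable from head.
inline std::vector<std::string> open_and_lock_all(const TableList* head) {
  std::vector<std::string> opened;
  for (const TableList* t = head; t != nullptr; t = t->next)
    opened.push_back(t->name);
  return opened;
}

// Process exactly one element of a longer list.
inline std::vector<std::string> open_single(TableList* table) {
  assert(table != nullptr);
  TableList* saved_next = table->next;
  table->next = nullptr;  // temporarily terminate the list after this table
  std::vector<std::string> opened = open_and_lock_all(table);
  table->next = saved_next;  // restore the original chain
  return opened;
}
```

The caller's list is unchanged afterwards, so a loop over the original list can continue with table->next as before.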
In reopen_tables() some of the tables opened by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
partitioned table
Trying INSERT DELAYED on a partitioned table that has not been
used right before crashes the server. "Right before" refers to the
period during which a table is kept open after it has been used for
a select or update.
Information about partitioning of a table is stored in form of
a string in the .frm file. Parsing of this string requires a
correctly set up lexical analyzer (lex). The partitioning code
uses a new temporary instance of a lex, but it still refers
to the previously active lex. The delayed insert thread, however,
does not initialize its lex.
Added initialization of thd->lex before opening the table in the delayed
insert thread, and at all other places where lex_start() must be called
in case tables are partitioned and the .frm file needs to be parsed.
If a stored function that contains a DROP TEMPORARY TABLE statement
is invoked by a CREATE TEMPORARY TABLE statement for a table of the same
name, the server may crash. The problem is that when dropping a table no
check is done to ensure that the table is not being used by some outer query
(or outer statement), potentially leaving the outer query with a
reference to a stale (freed) table.
The solution is to always check, when dropping a temporary table, whether
the table is being used by some outer statement, since a temporary
table can be dropped inside stored procedures.
The check is performed by looking at the TABLE::query_id value for
temporary tables. To simplify this check and to solve a bug related
to handling of temporary tables in prelocked mode, this patch changes
the way in which this member is used to track the fact that table is
used/unused. Now we ensure that TABLE::query_id is zero for unused
temporary tables (which means that all temporary tables which were
used by a statement should be marked as free for reuse after its
execution has been completed).
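The query_id convention can be illustrated with a minimal model. TempTable and the three helper functions are hypothetical; the only thing taken from the description is that a query_id of zero marks an unused temporary table and that a table in use must not be dropped.

```cpp
#include <cassert>

// Minimal stand-in for the part of TABLE that matters here.
struct TempTable {
  unsigned long query_id;  // 0 means "not used by any statement"
};

// A statement that starts using the table stamps it with its query id.
inline void mark_used(TempTable& table, unsigned long current_query_id) {
  assert(current_query_id != 0);
  table.query_id = current_query_id;
}

// Called once the statement that used the table has completed.
inline void mark_free_for_reuse(TempTable& table) { table.query_id = 0; }

// A temporary table may only be dropped while no statement is using it.
inline bool can_drop(const TempTable& table) { return table.query_id == 0; }
```

In this model, a DROP TEMPORARY TABLE issued from inside a stored function sees a non-zero query_id on the table used by the outer statement and is refused instead of freeing it.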
If a temporary error occurred inside a group on an event that was not the first
event of the group, the slave could get stuck because the retry counter was reset
whenever an event was executed successfully.
This patch resets the retry counter only when an entire group has been successfully
executed, or has failed with a non-transient error.
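The changed reset rule can be sketched as follows. Outcome, RetryState and after_event are hypothetical names; the sketch only models when the counter is incremented and when it is reset, not the actual retry of the group.

```cpp
#include <cassert>

enum class Outcome { Ok, TransientError, PermanentError };

struct RetryState {
  unsigned long retries;  // retries spent on the current group
};

// Update the retry counter after one event has been processed.
// ends_group tells whether the event terminates its group.
inline void after_event(RetryState& state, Outcome outcome, bool ends_group) {
  switch (outcome) {
    case Outcome::TransientError:
      ++state.retries;  // the group will be retried
      break;
    case Outcome::PermanentError:
      state.retries = 0;  // no retry will follow
      break;
    case Outcome::Ok:
      // Before the patch the counter was reset here unconditionally.
      if (ends_group) state.retries = 0;
      break;
  }
}
```

A successful event in the middle of a group therefore no longer clears the retries already spent on that group.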
is possible):
When skipping the beginning of a transaction starting with BEGIN, the OPTION_BEGIN
flag was not set correctly, which caused the slave to not recognize that it was
inside a group. This patch sets the OPTION_BEGIN flag for BEGIN, COMMIT, ROLLBACK,
and XID events. It also checks whether the slave is inside a group before
decreasing the slave skip counter to zero.
Begin_query_log_event was not marked as being unable to end a group; this is now
corrected.
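One way to read the skip counter rule is sketched below. This is an interpretation, not the server code: EventType, SlaveState and skip_event are hypothetical, and the in_group flag stands in for OPTION_BEGIN under the assumption that BEGIN sets it and COMMIT, ROLLBACK and XID clear it.

```cpp
#include <cassert>

enum class EventType { Begin, Commit, Rollback, Xid, Other };

struct SlaveState {
  unsigned long skip_counter;  // number of events still to be skipped
  bool in_group;               // stand-in for the OPTION_BEGIN flag
};

// Skip one event if the skip counter is active; returns true if it was skipped.
inline bool skip_event(SlaveState& state, EventType type) {
  if (state.skip_counter == 0) return false;
  // Track group boundaries even for skipped events.
  if (type == EventType::Begin) {
    state.in_group = true;
  } else if (type == EventType::Commit || type == EventType::Rollback ||
             type == EventType::Xid) {
    state.in_group = false;
  }
  // Never let the counter reach zero in the middle of a group.
  if (state.skip_counter > 1 || !state.in_group) --state.skip_counter;
  return true;
}
```

With skip_counter set to 1 and a transaction starting with BEGIN, the whole transaction is skipped and the counter reaches zero only at the terminating event.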
Report claims that Seconds_behind_master behaves unexpectedly.
Code analysis shows an evident flaw: the FormatDescription event is handled incorrectly,
so that after FLUSH LOGS on the slave the calculation of Seconds_behind_master slips and
SHOW SLAVE STATUS can report an incorrect value.
Even worse, the gap between the correct and incorrect deltas grows with time.
Fixed by prohibiting changes to rpl->last_master_timestamp by artificial events of any kind.
A suggestion is added as a comment on how to deal with the lack of information on the
slave side by means of the upcoming heartbeat feature.
The test cannot easily be made fully deterministic.
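The rule for artificial events can be sketched as a guard on the timestamp update. Event, RelayInfo and note_event are hypothetical names, and timestamps are plain integers here; only the member name last_master_timestamp is taken from the description.

```cpp
#include <cassert>

// Minimal stand-in for a replicated event.
struct Event {
  long long when;   // timestamp carried by the event
  bool artificial;  // true for events generated locally, not by the master
};

struct RelayInfo {
  long long last_master_timestamp;
};

// Only real master events may move last_master_timestamp.
inline void note_event(RelayInfo& rli, const Event& event) {
  if (!event.artificial) rli.last_master_timestamp = event.when;
}
```

An artificial event, such as one created on the slave after FLUSH LOGS, then leaves the value used for Seconds_behind_master untouched.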
Removed an unguarded read of the slave_running field from inside
terminate_slave_threads(). This could cause a premature exit in the event
that the slave thread was already shutting down but had not finished yet.
The fields slave_running, io_thd, and sql_thread are guarded by an
associated run_lock. A read of these fields was not guarded inside
terminate_slave_threads(), which caused an assertion to fire. The
assertion was removed, and the code reorganized slightly.
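The guarded read can be illustrated with a standard mutex. This is a sketch under assumptions: std::mutex stands in for the server's own mutex type, and SlaveThreadState, is_running and set_running are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

// Minimal stand-in for the state shared with a slave thread.
struct SlaveThreadState {
  std::mutex run_lock;         // guards slave_running
  bool slave_running = false;
};

// Read slave_running only while holding run_lock.
inline bool is_running(SlaveThreadState& state) {
  std::lock_guard<std::mutex> guard(state.run_lock);
  return state.slave_running;
}

// Writers take the same lock, so readers never see a torn update.
inline void set_running(SlaveThreadState& state, bool running) {
  std::lock_guard<std::mutex> guard(state.run_lock);
  state.slave_running = running;
}
```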
--long-query-time is now given in seconds with microseconds as decimals
--min_examined_row_limit added for slow query log
long_query_time user variable is now double with 6 decimals
Added functions to get time in microseconds
Added faster time() functions for systems that have gethrtime() (Solaris)
We now do fewer time() calls.
Added field->in_read_set() and field->in_write_set() for easier field manipulation by handlers
set_var.cc and my_getopt() can now handle DOUBLE variables.
All time() calls changed to my_time()
my_time() now retries if the time() call fails.
Added debug function for stopping in mysql_admin_table() when tables are locked
Some trivial function and struct variable renames to avoid merge errors.
Fixed compiler warnings
Initialization of some time variables on windows moved to my_init()
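How a fractional long_query_time and min_examined_row_limit could combine in the slow log decision is sketched below. The function names are hypothetical, and the exact comparison (strictly longer than the threshold, at least the row limit) is an assumption based on how these options are usually described.

```cpp
#include <cassert>
#include <cmath>

// Convert a threshold given in seconds (with microseconds as decimals).
inline unsigned long long seconds_to_micros(double seconds) {
  assert(seconds >= 0.0);
  return static_cast<unsigned long long>(std::llround(seconds * 1e6));
}

// Decide whether a finished query belongs in the slow query log.
inline bool should_log_slow(unsigned long long query_time_us,
                            unsigned long long examined_rows,
                            double long_query_time,
                            unsigned long long min_examined_row_limit) {
  return query_time_us > seconds_to_micros(long_query_time) &&
         examined_rows >= min_examined_row_limit;
}
```

For example, with long_query_time set to 0.5 a query that ran for 600000 microseconds qualifies, provided it examined at least min_examined_row_limit rows.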
Bug#25422 (Hang with log tables)
Bug 17876 (Truncating mysql.slow_log in a SP after using cursor locks the
thread)
Bug 23044 (Warnings on flush of a log table)
Bug 29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
a deadlock)
Prior to this fix, the server would hang when performing concurrent
ALTER TABLE or TRUNCATE TABLE statements against the LOG TABLES,
which are mysql.general_log and mysql.slow_log.
The root cause traces to the following code:
in sql_base.cc, open_table()
  if (table->in_use != thd)
  {
    /* wait_for_condition will unlock LOCK_open for us */
    wait_for_condition(thd, &LOCK_open, &COND_refresh);
  }
The problem with this code is that the current implementation of the
LOGGER creates 'fake' THD objects, like
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated with a real thread running in the server,
so that waiting for these non-existent threads to release table locks
causes the deadlock.
In general, the design of Log_to_csv_event_handler does not fit into the
general architecture of the server, so that the concept of general_log_thd
and slow_log_thd has to be abandoned:
- this implementation does not work with table locking
- it will not work with commands like SHOW PROCESSLIST
- having the log tables always opened does not integrate well with DDL
operations / FLUSH TABLES / SET GLOBAL READ_ONLY
With this patch, the fundamental design of the LOGGER has been changed to:
- always open and close a log table when writing a log
- completely remove the use of fake THD objects
- clarify how locking of log tables is implemented in general.
See WL#3984 for details related to the new locking design.
Additional changes (misc bugs exposed and fixed):
1)
mysqldump would ignore some tables in dump_all_tables_in_db(),
but forget to ignore the same tables in dump_all_views_in_db().
2)
mysqldump would also issue an empty "LOCK TABLE" command when all the tables
to lock are to be ignored (numrows == 0), instead of not issuing the query.
3)
Internal error handlers could intercept errors but not warnings
(see sql_error.cc).
4)
Implementing a nested call to open tables, for the performance schema tables,
exposed an existing bug in remove_table_from_cache(), which would perform:
in_use->some_tables_deleted=1;
against another thread, without any consideration about thread locking.
This call inside remove_table_from_cache() was not required anyway,
since calling mysql_lock_abort() takes care of aborting -- cleanly -- threads
that might hold a lock on a table.
This line (in_use->some_tables_deleted=1) has been removed.
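The fix for item 2) amounts to not emitting the statement when nothing is left to lock. A sketch with hypothetical names follows; the statement text is simplified and does not reproduce mysqldump's actual quoting or lock modifiers.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Build a lock statement for all tables that are not ignored.
// Returns an empty string when every table is ignored, so that the
// caller can skip issuing the query altogether.
inline std::string build_lock_statement(
    const std::vector<std::string>& tables,
    const std::set<std::string>& ignored) {
  std::string list;
  for (const std::string& table : tables) {
    if (ignored.count(table) != 0) continue;  // ignored: not locked
    if (!list.empty()) list += ", ";
    list += "`" + table + "` READ";
  }
  return list.empty() ? std::string() : "LOCK TABLES " + list;
}
```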
In case of an out-of-memory error received from the master, print the corresponding message to the error log and stop the slave I/O thread to avoid reconnecting with a wrong binary log position.
Problem: "Under high load, the slave registering to the master can timeout
during the COM_REGISTER_SLAVE execution. This causes an error, which
prevents the slave from connecting at all."
Fix: Do not abort immediately, but retry registering on master.
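The retry behaviour can be sketched as a bounded loop. Everything here is an assumption made for illustration: register_with_retry, the two callbacks and the max_attempts limit are hypothetical, and a real implementation would presumably wait between attempts.

```cpp
#include <cassert>
#include <functional>

// try_register stands in for one COM_REGISTER_SLAVE round trip and
// is_killed for a check whether the I/O thread has been stopped.
inline bool register_with_retry(const std::function<bool()>& try_register,
                                const std::function<bool()>& is_killed,
                                unsigned max_attempts) {
  for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
    if (is_killed()) return false;     // stopped by the user: give up quietly
    if (try_register()) return true;   // registered on the master
    // Registration failed (e.g. timed out): try again instead of aborting.
  }
  return false;
}
```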