'flush tables' crashes
The server crashes when 'show procedure status' and 'flush tables' are
run concurrently.
This is caused by the way mysql.proc table is added twice to the list
of table to lock although the requirements on the current locking API
assumes differently.
No test case is submitted because of the nature of the crash which is
currently difficult to reproduce in a deterministic way.
This is a backport from 5.1
> ------------------------------------------------------------
> revno: 2796
> revision-id: sergey.glukhov@sun.com-20090827102219-sgjz0v5t1rfccs14
> parent: joro@sun.com-20090824122803-1d5jlaysjc7a7j6q
> committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
> branch nick: mysql-5.0-bugteam
> timestamp: Thu 2009-08-27 15:22:19 +0500
> message:
> Bug#46184 Crash, SELECT ... FROM derived table procedure analyze
> The crash happens because select_union object is used as result set
> for queries which have derived tables.
> select_union use temporary table as data storage and if
> fields count exceeds 10(count of values for procedure ANALYSE())
> then we get a crash on fill_record() function.
> ------------------------------------------------------------
> revno: 2791.2.3
> revision-id: joro@sun.com-20090827114042-h55n7qp9990bl6ge
> parent: anurag.shekhar@sun.com-20090831073231-e55y1hsck6n08ux8
> committer: Georgi Kodinov <joro@sun.com>
> branch nick: B46749-5.0-bugteam
> timestamp: Thu 2009-08-27 14:40:42 +0300
> message:
> Bug #46749: Segfault in add_key_fields() with outer subquery level
> field references
>
> This error requires a combination of factors :
> 1. An "impossible where" in the outermost SELECT
> 2. An aggregate in the outermost SELECT
> 3. A correlated subquery with a WHERE clause that includes an outer
> field reference as a top level WHERE sargable predicate
>
> When JOIN::optimize detects an "impossible WHERE" it will bail out
> without doing the rest of the work and initializations. It will not
> call make_join_statistics() as well. And make_join_statistics fills
> in various structures for each table referenced.
> When processing the result of the "impossible WHERE" the query must
> send a single row of data if there are aggregate functions in it.
> In this case the server marks all the aggregates as having received
> no rows and calls the relevant Item::val_xxx() method on the SELECT
> list. However if this SELECT list happens to contain a correlated
> subquery this subquery is evaluated in a normal evaluation mode.
> And if this correlated subquery has a reference to a field from the
> outermost "impossible where" SELECT the add_key_fields will mistakenly
> consider the outer field reference as a "local" field reference when
> looking for sargable predicates.
> But since the SELECT where the outer field reference refers to is not
> completely initialized due to the "impossible WHERE" in this level
> we'll get a NULL pointer reference.
> Fixed by making a better condition for discovering if a field is "local"
> to the SELECT level being processed.
> It's not enough to look for OUTER_REF_TABLE_BIT in this case since
> for outer references to constant tables the Item_field::used_tables()
> will return 0 regardless of whether the field reference is from the
> local SELECT or not.
The 'BEGIN/COMMIT/ROLLBACK' log event could be filtered out if the
database is not selected by --database option of mysqlbinlog command.
This can result in problem if there are some statements in the
transaction are not filtered out.
To fix the problem, mysqlbinlog will output 'BEGIN/ROLLBACK/COMMIT'
in regardless of the database filtering rules.
The 'BEGIN/COMMIT/ROLLBACK' log event could be filtered out if the
database is not selected by --database option of mysqlbinlog command.
This can result in problem if there are some statements in the
transaction are not filtered out.
To fix the problem, mysqlbinlog will output 'BEGIN/ROLLBACK/COMMIT'
in regardless of the database filtering rules.
The problem is that safe_kill_win fails to detect a dead process. OpenProcess() will
succeed even after the process died, it will first fail after the last handle to process
is closed.
To fix the problem, check process status with GetExitCodeProcess() and consider
process to be dead if the exit code returned by this routine is not STILL_ALIVE.
Backport from 6.0 to 5.1.
Only those sync points are included, which are used in debug_sync.test.
The Debug Sync Facility allows to place synchronization points
in the code:
open_tables(...)
DEBUG_SYNC(thd, "after_open_tables");
lock_tables(...)
When activated, a sync point can
- Send a signal and/or
- Wait for a signal
Nomenclature:
- signal: A value of a global variable that persists
until overwritten by a new signal. The global
variable can also be seen as a "signal post"
or "flag mast". Then the signal is what is
attached to the "signal post" or "flag mast".
- send a signal: Assign the value (the signal) to the global
variable ("set a flag") and broadcast a
global condition to wake those waiting for
a signal.
- wait for a signal: Loop over waiting for the global condition until
the global value matches the wait-for signal.
Please find more information in the top comment in debug_sync.cc
or in the worklog entry.
replication
MySQL server uses wrong lock type (always TL_READ instead of
TL_READ_NO_INSERT when appropriate) for tables used in
subqueries of UPDATE statement. This leads in some cases to
a broken replication as statements are written in the wrong
order to the binlog.
NOTE: Backporting the patch to next-mr.
The fix proposed in BUG#35542 and BUG#31665 introduces a performance issue
when fsyncing the master.info, relay.info and relay-log.bin* after #th events.
Although such solution has been proposed to reduce the probability of corrupted
files due to a slave-crash, the performance penalty introduced by it has
made the approach impractical for highly intensive workloads.
In a nutshell, the option --syn-relay-log proposed in BUG#35542 and BUG#31665
simultaneously fsyncs master.info, relay-log.info and relay-log.bin* and
this is the main source of performance issues.
This patch introduces new options that give more control to the user on
what should be fsynced and how often:
1) (--sync-master-info, integer) which syncs the master.info after #th event;
2) (--sync-relay-log, integer) which syncs the relay-log.bin* after #th
events.
3) (--sync-relay-log-info, integer) which syncs the relay.info after #th
transactions.
To provide both performance and increased reliability, we recommend the following
setup:
1) --sync-master-info = 0 eventually the operating system will fsync it;
2) --sync-relay-log = 0 eventually the operating system will fsync it;
3) --sync-relay-log-info = 1 fsyncs it after every transaction;
Notice, that the previous setup does not reduce the probability of
corrupted master.info and relay-log.bin*. To overcome the issue, this patch also
introduces a recovery mechanism that right after restart throws away relay-log.bin*
retrieved from a master and updates the master.info based on the relay.info:
4) (--relay-log-recovery, boolean) which enables a recovery mechanism that
throws away relay-log.bin* after a crash.
However, it can only recover the incorrect binlog file and position in master.info,
if other informations (host, port password, etc) are corrupted or incorrect,
then this recovery mechanism will fail to work.
BUG#31665 sync_binlog should cause relay logs to be synchronized
NOTE: Backporting the patch to next-mr.
Add sync_relay_log option to server, this option works for relay log
the same as option sync_binlog for binlog. This option also synchronize
master info to disk when set to non-zero value.
Original patches from Sinisa and Mark, with some modifications
vs not null
NOTE: Backporting the patch to next-mr.
The replication was generating corrupted data, warning messages on Valgrind
and aborting on debug mode while replicating a "null" to "not null" field.
Specifically the unpack_row routine, was considering the slave's table
definition and trying to retrieve a field value, where there was nothing to be
retrieved, ignoring the fact that the value was defined as "null" by the master.
To fix the problem, we proceed as follows:
1 - If it is not STRICT sql_mode, implicit default values are used, regardless
if it is multi-row or single-row statement.
2 - However, if it is STRICT mode, then a we do what follows:
2.1 If it is a transactional engine, we do a rollback on the first NULL that is
to be set into a NOT NULL column and return an error.
2.2 If it is a non-transactional engine and it is the first row to be inserted
with multi-row, we also return the error. Otherwise, we proceed with the
execution, use implicit default values and print out warning messages.
Unfortunately, the current patch cannot mimic the behavior showed by the master
for updates on multi-tables and multi-row inserts. This happens because such
statements are unfolded in different row events. For instance, considering the
following updates and strict mode:
(master)
create table t1 (a int);
create table t2 (a int not null);
insert into t1 values (1);
insert into t2 values (2);
update t1, t2 SET t1.a=10, t2.a=NULL;
t1 would have (10) and t2 would have (0) as this would be handled as a
multi-row update. On the other hand, if we had the following updates:
(master)
create table t1 (a int);
create table t2 (a int);
(slave)
create table t1 (a int);
create table t2 (a int not null);
(master)
insert into t1 values (1);
insert into t2 values (2);
update t1, t2 SET t1.a=10, t2.a=NULL;
On the master t1 would have (10) and t2 would have (NULL). On
the slave, t1 would have (10) but the update on t1 would fail.
logging is disabled
NOTE: this is the backport to next-mr.
If one sets binlog-format but does NOT enable binary log, server
refuses to start up. The following messages appears in the error log:
090217 12:47:14 [ERROR] You need to use --log-bin to make
--binlog-format work.
090217 12:47:14 [ERROR] Aborting
This patch addresses this by making the server not to bail out if the
binlog-format is set without the log-bin option. Additionally, the
specified binlog-format is stored, in the global system variable
"binlog_format", and a warning is printed instead of an error.
beyond unsigned long.
BUG#44779: binlog.binlog_max_extension may be causing failure on
next test in PB
NOTE1: this is the backport to next-mr.
NOTE2: already includes patch for BUG#44779.
Binlog file extensions would turn into negative numbers once the
variable used to hold the value reached maximum for signed
long. Consequently, incrementing value to the next (negative) number
would lead to .000000 extension, causing the server to fail.
This patch addresses this issue by not allowing negative extensions
and by returning an error on find_uniq_filename, when the limit is
reached. Additionally, warnings are printed to the error log when the
limit is approaching. FLUSH LOGS will also report warnings to the
user, if the extension number has reached the limit. The limit has been
set to 0x7FFFFFFF as the maximum.
STATUS'
NOTE: this is the backport to next-mr.
SHOW SHOW STATUS LIKE 'Slave_running' command believes that
if active_mi->slave_running != 0, then io thread is running normally.
But it isn't so in fact. When some errors happen to make io thread
try to reconnect master, then it will become transitional status
(MYSQL_SLAVE_RUN_NOT_CONNECT == 1), which also doesn't equal 0.
Yet, "SHOW SLAVE STATUS" believes that only if
active_mi->slave_running == MYSQL_SLAVE_RUN_CONNECT, then io thread is running.
So "SHOW SLAVE STATUS" can get the correct result.
Fixed to make SHOW SHOW STATUS LIKE 'Slave_running' command have the same
check condition with "SHOW SLAVE STATUS". It only believe that the io thread
is running when active_mi->slave_running == MYSQL_SLAVE_RUN_CONNECT.
NOTE: this is the backport to next-mr.
This patch addresses the bug reported by checking wether
host argument is an empty string or not. If empty, an error is
reported to the client, otherwise continue normally.
This commit is based on the originally proposed patch and adds
a test case as requested during review as well as refines comments,
and makes test case result file less verbose (compared to previous patch).
NOTE: this is the backport to next-mr.
When using replication, the slave will not log any slow query logs queries
replicated from the master, even if the option "--log-slow-slave-statements"
is set and these take more than "log_query_time" to execute.
In order to log slow queries in replicated thread one needs to set the
--log-slow-slave-statements, so that the SQL thread is initialized with the
correct switch. Although setting this flag correctly configures the slave
thread option to log slow queries, there is an issue with the condition that
is used to check whether to log the slow query or not. When replaying binlog
events the statement contains the SET TIMESTAMP clause which will force the
slow logging condition check to fail. Consequently, the slow query logging will
not take place.
This patch addresses this issue by removing the second condition from the
log_slow_statements as it prevents slow queries to be binlogged and seems
to be deprecated.
NOTE: Backporting the patch to next-mr.
The reason of the bug was incompatibile with the master side behaviour.
INSERT query on the master is allowed to insert into a table without specifying
values of DEFAULT-less fields if sql_mode is not strict.
Fixed with checking sql_mode by the sql thread to decide how to react.
Non-strict sql_mode should allow Write_rows event to complete.
todo: warnings can be shown via show slave status, still this is a
separate rather general issue how to show warnings for the slave threads.
NOTE: Backporting the patch to next-mr.
WL#4828 Augment DBUG_ENTER/DBUG_EXIT to crash MySQL in different functions
-------
The assessment of the replication code in the presence of faults is extremely
import to increase reliability. In particular, one needs to know if servers
will either correctly recovery or print out appropriate error messages thus
avoiding unexpected problems in a production environment.
In order to accomplish this, the current patch refactories the debug macros
already provided in the source code and introduces three new macros that
allows to inject faults, specifically crashes, while entering or exiting a
function or method. For instance, to crash a server while returning from
the init_slave function (see module sql/slave.cc), one needs to do what
follows:
1 - Modify the source replacing DBUG_RETURN by DBUG_CRASH_RETURN;
DBUG_CRASH_RETURN(0);
2 - Use the debug variable to activate dbug instructions:
SET SESSION debug="+d,init_slave_crash_return";
The new macros are briefly described below:
DBUG_CRASH_ENTER (function) is equivalent to DBUG_ENTER which registers the
beginning of a function but in addition to it allows for crashing the server
while entering the function if the appropriate dbug instruction is activate.
In this case, the dbug instruction should be "+d,function_crash_enter".
DBUG_CRASH_RETURN (value) is equivalent to DBUG_RETURN which notifies the
end of a function but in addition to it allows for crashing the server
while returning from the function if the appropriate dbug instruction is
activate. In this case, the dbug instruction should be
"+d,function_crash_return". Note that "function" should be the same string
used by either the DBUG_ENTER or DBUG_CRASH_ENTER.
DBUG_CRASH_VOID_RETURN (value) is equivalent to DBUG_VOID_RETURN which
notifies the end of a function but in addition to it allows for crashing
the server while returning from the function if the appropriate dbug
instruction is activate. In this case, the dbug instruction should be
"+d,function_crash_return". Note that "function" should be the same string
used by either the DBUG_ENTER or DBUG_CRASH_ENTER.
To inject other faults, for instance, wrong return values, one should rely
on the macros already available. The current patch also removes a set of
macros that were either not being used or were redundant as other macros
could be used to provide the same feature. In the future, we also consider
dynamic instrumentation of the code.
BUG#45747 DBUG_CRASH_* is not setting the strict option
---------
When combining DBUG_CRASH_* with "--debug=d:t:i:A,file" the server crashes
due to a call to the abort function in the DBUG_CRASH_* macro althought the
appropriate keyword has not been set.
NOTE: Backporting the patch to next-mr.
The use of option log_slave_updates without log_bin was preventing the server
from starting. To fix the problem, we replaced the error message and the exit
call by a warning message.
The problem was that appending values to the end of an existing
ENUM or SET column was being treated as table data modification,
preventing a immediately (fast) table alteration that occurs when
only table metadata is being modified.
The cause was twofold: adding a enumeration or set members to the
end of the list of valid member values was not being considered
a "compatible" table alteration, and for SET columns, the check
was being done upon the max display length and not the underlying
(pack) length of the field.
The solution is to augment the function that checks wether two ENUM
or SET fields are compatible -- by comparing the pack lengths and
performing a limited comparison of the member values.
The bug is not related to MERGE table or TRIGGER. More correct description
would be 'assertion on multi-table UPDATE + NATURAL JOIN + MERGEABLE VIEW'.
On PREPARE stage(see test case) we call mark_common_columns() func which
creates ON condition for NATURAL JOIN and sets appropriate
table read_set bitmaps for fields which are used in ON condition.
On EXECUTE stage mark_common_columns() is not called, we set
necessary read_set bitmaps in setup_conds(). But 'B.f1' field
is already processed and related item alredy fixed before
setup_conds() as updated field and setup_conds can not set
read_set bitmap because of that.
The fix is to set read_set bitmap for appropriate table field even
if Item_direct_view_ref item which represents a refernce to this field
is fixed.
files
NOTE: this is the backport to next-mr.
SHOW BINLOG EVENTS does not work with relay log files. If issuing
"SHOW BINLOG EVENTS IN 'relay-log.000001'" in a non-empty relay
log file (relay-log.000001), mysql reports empty set.
This patch addresses this issue by extending the SHOW command
with RELAYLOG. Events in relay log files can now be inspected by
issuing SHOW RELAYLOG EVENTS [IN 'log_name'] [FROM pos] [LIMIT
[offset,] row_count].