FROM SUBSELECT
ISSUE : In function find_all_keys.
If selected row do not satisfy condition
then we call unlock_row to release the locked
row. Suppose if we have subquery in condition
and we have an innodb error during its execution.
Then we should not call the unlock_row. If the error
is because of deadlock, innodb will rollback the
transaction. And calling unlock_row without
transaction is an invalid case hence an assertion
failure.
SOLUTION : We call unlock_row only if only there is no
error occurred previously.
The solution is back ported from 5.6
defect number 14226481
sql/filesort.cc:
Now we call unlock_row only if there is no
previous error.
Analysis
--------
Running 'MYSQLD --help --verbose' as ROOT user without
using '--user' option displays the help contents but
aborts at the end with an exit code '1'.
While starting the server, a validation is performed to
ensure when the server is started as root user, it should
be done using '--user' option. Else we abort. In case
of help, we dump the help contents and abort.
Fix:
---
During the validation, we skip aborting the server incase
we are using the help option under the condition mentioned
above.
NOTE: Test case has not been added since it requires using
'root' user.
Problem: Drop Trigger succeeds even after setting read_only
variable to ON.
Fix: Fix is to report the standard error
(ER_OPTION_PREVENTS_STATEMENT)when global read_only variable
is set to ON.
AUTO_INC COLUMN ONLY ON SLAVE
In RBR, if the slave's table as one additional auto_inc column,
then, it will insert the value 0 instead of generating the next
auto_inc number.
We fix this by checking that if an auto_inc extra column exists,
when compared to column data of the row event, we explicitly set
it to NULL and flag the engine that a nulled auto_inc column will
be inserted.
(MYSQLBINLOG -V CRASHES WITH THAT BINLOG)
Problem: If slave receives a corrupted row event,
slave server is crashing.
Analysis: When slave is unpacking the row event, it is
not validating the data before applying the event. If the
data is corrupted for eg: the length of a field is wrong,
it could end up reading wrong data leading to a crash.
A similar problem happens when mysqlbinlog tool is used
against a corrupted binlog using '-v' option. Due to -v
option, the tool tries to print the values of all the
fields. Corrupted field length could lead to a crash.
Fix: Before unpacking the field, a verification
will be made on the length. If it falls into the event
range, only then it will be unpacked. Otherwise,
"ER_SLAVE_CORRUPT_EVENT" error will be thrown.
Incase mysqlbinlog -v case, the field value will not be
printed and the processing of the file will be stopped.
sql/field.h:
Removed a function which is not required anymore
sql/log_event.cc:
Adding a validation on the field length before
the tool tries to print the value.
sql/log_event.h:
Changing unpack_row call according to the new arguments
sql/log_event_old.h:
Changing unpack_row call according to the new arguments
sql/rpl_record.cc:
Adding a new argument 'row_end' which tells
the end position of the complete data in the
row event. It will be used to do validation
before doing 'unpack' field.
sql/rpl_record.h:
Adding a new argument 'row_end' which tells
the end position of the complete data in the
row event. It will be used to do validation
before doing 'unpack' field.
sql/rpl_utility.cc:
Now calc_field_size() is required for client too.
Bug#17867117 - ERROR RESULT WHEN "COUNT + DISTINCT + CASE WHEN" NEED MERGE_WALK
Problem:
COUNT DISTINCT gives incorrect result when it uses a Unique
Tree and its last inserted record has null value.
Here is how COUNT DISTINCT is processed, given that this query is not
using loose index scan.
When a row is produced as a result of joining tables (there is only
one table here), we store the SELECTed value in a Unique tree. This
allows elimination of any duplicates, and thus implements DISTINCT.
When we have processed all rows like this, we walk the Unique tree,
counting its elements, in Aggregator_distinct::endup() (tree->walk());
for each element we call Item_sum_count::add(). Such function wants to
ignore any NULL value, for that it checks item_sum -> args[0] ->
null_value. It is a mistake: when walking the Unique tree, the value
to be aggregated is not item_sum ->args[0] but rather table ->
field[0].
Solution:
instead of item_sum -> args[0] -> null_value, use arg_is_null(), which
knows where to look (like in fix for bug 57932).
As a consequence of this solution, we have to make arg_is_null() a
little more general:
1) Because it was so far only used for AVG() (which always has a
single argument), this function was looking at a single argument; now
that it has to work with COUNT(DISTINCT expression1,expression2), it
must look at all arguments.
2) Because we start using arg_is_null () for COUNT(DISTINCT), i.e. in
Item_sum_count::add (), it implies that we are also using it for
COUNT(no DISTINCT) (same add ()). For COUNT(no DISTINCT), the
nullness to check is that of item_sum -> args[0]. But the null_value
of such item is reliable only if val_*() has been called on it. So far
arg_is_null() was always used after a call to arg_val*(), so could
rely on null_value; but for COUNT, there is no call to arg_val*(), so
arg_is_null() has to call is_null() instead.
Testcase for 16539979 by Neeraj. Testcase for 17867117 contributed by
Xiaobin Lin from Taobao.
MALFORMED XPATH EXP
Problem:
A malformed XPATH expression in the ExtractValue query is causing
a server crash. This malformed XPATH expression is resulted when
the position attribute in the substring function contains ".." in
the beginning.
Solution:
The original crash is happening because the "../" is being evaluated
prematurely. It tries to access XML while it hasn't been parsed yet.
The premature evaluation is happening because the val_nodeset function
is being set to constant, in which case we proceed to evaluate them in
JOIN:prepare stage only. The solution to this is setting the val_nodeset
functions as non-constant. This forces us to evaluate the function in
the JOIN:exec stage and thus avoid any premature evaluation of the
XML strings.
WITH MALFORMED XPATH EXP
Problem:
A malformed XPATH expression in the ExtractValue query is
causing a server crash. This malformed XPATH expression is
resulted when the position attribute in the substring function
contains ".." in the beginning.
Solution:
The original crash is happening because the "../" is being
evaluated prematurely. It tries to access XML while it
hasn't been parsed yet. The premature evaluation is happening
because the val_nodeset function is being set to constant,
in which case we proceed to evaluate them in JOIN:prepare
stage only. The solution to this is setting the val_nodeset
functions as non-constant. This forces us to evaluate the function
in the JOIN:exec stage and thus avoid any premature evaluation of
the XML strings.
revid:mattias.jonsson@oracle.com-20131119103616-u6t82s8cpgp0q3ex
Use of uninitialized memory in the priority queue used for returning records
in sorted order.
It happens if no previous partition have returned a row since the
beginning of index_init + an index_read* call returned
HA_ERR_KEY_NOT_FOUND for all partitions (otherwise the record
buffer/priority queue would be initialized) + an index_next/prev
call where all partitions returns HA_ERR_END_OF_FILE.
WITH SORT ABORTED LEAKS FILE DESCRIPTORS
ISSUE : IO_CACHE used for index_merge quick select
is freed only on successful retrieval of all rows
from index merge.
Suppose if there is a interrupt( or failure) to
this operation of row retrieval (let it be a
KILL_QUERY signal) then we are not freeing the IO_CACHE
resources allocated by index_merge quick select.
And hence temp file associated with it is also not closed.
This lead to a file descriptor leak.
SOLUTION : As part of file sort operation now we always
free the IO_CACHE allocated by index_merge quick select.
sql/filesort.cc:
In filesort function we try to free if any
IO_CACHE allocated by index_merge quick select
and if it is not yet freed.
Problem: The "--local-install" service does not perform as expected for, at least,
Windows.
Fix: A NULL pointer was dereferenced due to which there was crash.A check was introduced
for NULL string before dereferencing it.No test cases written as it is a bug during
installation.
Problem:
When log_warnings is greater than 1, master prints binlog
dump thread information in mysqld.1.err file.
The information contains slave server id, binlog file and
binlog position. The slave server id is uint32 and the print
format was wrongly specifified (%d instead of %u).
Hence a server id which is more than 2 billion is getting
printed with a negative value.
Eg: Start binlog_dump to slave_server(-1340259414),
pos(mysql-bin.001663, 325187493)
Fix: Changed the uint32 format to %u.
Problem:-
We have created a table with UTF8_BIN collation.
In case, when in our query we have ORDER BY clause over a function
call we are getting result in incorrect order.
Note:the bug is not there in 5.5.
Analysis:
In 5.5, for UTF16_BIN, we have min and max multi-byte length is 2 and 4
respectively.In make_sortkey(),for 2 byte character character we are
assuming that the resultant length will be 2 byte/character. But when we
use my_strnxfrm_unicode_full_bin(), we store sorting weights using 3 bytes
per character.This result in truncated result.
Same thing happen for UTF8MB4, where we have 1 byte min multi-byte and
4 byte max multi-byte.We will accsume resultant data as 1 byte/character,
which result in truncated result.
Solution:-
use strnxfrm(means use of MY_CS_STRNXFRM macro) is used for sort, in
which the resultant length is not dependent on source length.
"SHOW BINLOG EVENTS"
Problem:
========
mysql was crashed after executing "show binlog events in
'mysql-bin.000005' from 99", the crash happened randomly.
Analysis:
========
During construction of LOAD EVENT or NEW LOAD EVENT object
if the starting offset is provided as incorrect value then
all the object members that are retrieved from the offset
are also invalid. Some times it will lead to out of bound
address offsets. In the bug scenario, the file name is
extracrated from an invalid address and the same is fed to
strlen(fname) function. Passing invalid address to strlen
will lead to crash.
Fix:
===
Validate if the given offset falls within the event boundary
or not.
sql/log_event.cc:
Added code to validate fname's address. "fname" should
be within event boundary. Added code to find invalid
invents.
CAN RETURN WRONG RESULT SET
PROBLEM
-------
In ha_partition::cmp_ref() we were only calling the
underlying cmp_ref() of storage engine if the records
are in the same partiton,else we sort by partition and
returns the result.But the index merge intersect
algorithm expects first to sort by row-id first and
then by partition id.
FIX
---
Compare the refernces first using storage engine cmp_ref
and then if references are equal(only happens if
non clustered index is used) then sort it by partition id.
[Approved by Mattiasj #rb3755]
-
ON DELETE FROM A PARTITIONED TABLE
PROBLEM
-------
The user first disables all the non unique indexes
in the table and then rebuilds one partition.
During rebuild the indexes on that particular
partition are enabled. Now when we give a query
the optimizer is unaware that on one partition
indexes are enabled and if the optimizer selects
that index,myisam thinks that the index is not
active and gives an error.
FIX
---
Before rebuilding a partition check whether non
unique indexes are disabled on the partitons.
If they are disabled then after rebuild disable
the index on the partition.
[Approved by Mattiasj #rb3469]
Problem:
COM_CHANGE_USER allows brute-force attempts to crack a password at a very high
rate as it does not cause any significant delay after a login attempt has
failed. This issue was reproduced using John-The-Ripper password
cracking tool through which about 5000 passwords per second could be attempted.
Solution:
The non-GA version's solution was to disconnect the connection when a login
attempt failed. Now since our aim to to reduce the rate at which passwords
are tested, we introduced a sleep(1) after every login attempt failed. This
significantly increased the delay with which the password was cracked.
AS A INNODB PARTITTION.
PROBLEM
-------
The correct engine_type was not being set during
rebuild of the partition due to which the handler
was always created with the default engine,
which is innodb for 5.5+ ,therefore even if the
table was myisam, after rebuilding the partitions
ended up as innodb partitions.
FIX
---
Set the correct engine type during rebuild.
[Approved by mattiasj #rb3599]
REPLICATION FILTERS ARE USED.
Problem:
When Filtered-slave applies Int_var_log_event and when it
tries to write the event to its own binlog, LAST_INSERT_ID
value is written wrongly.
Analysis:
THD::stmt_depends_on_first_successful_insert_id_in_prev_stmt
is a variable which is set when LAST_INSERT_ID() is used by
a statement. If it is set, first_successful_insert_id_in_
prev_stmt_for_binlog will be stored in the statement-based
binlog. This variable is CUMULATIVE along the execution of
a stored function or trigger: if one substatement sets it
to 1 it will stay 1 until the function/trigger ends,
thus making sure that first_successful_insert_id_in_
prev_stmt_for_binlog does not change anymore and is
propagated to the caller for binlogging. This is achieved
using the following code
if(!stmt_depends_on_first_successful_insert_id_in_prev_stmt)
{
/* It's the first time we read it */
first_successful_insert_id_in_prev_stmt_for_binlog=
first_successful_insert_id_in_prev_stmt;
stmt_depends_on_first_successful_insert_id_in_prev_stmt= 1;
}
Slave server, after receiving Int_var_log_event event from
master, it is setting
stmt_depends_on_first_successful_insert_id_in_prev_stmt
to true(*which is wrong*) and not setting
first_successful_insert_id_in_prev_stmt_for_binlog. Because
of this problem, when the actual DML statement with
LAST_INSERT_ID() is parsed by slave SQL thread,
first_successful_insert_id_in_prev_stmt_for_binlog is not
set. Hence the value zero (default value) is written to
slave's binlog.
Why only *Filtered slave* is effected when the code is
in common place:
-------------------------------------------------------
In Query_log_event::do_apply_event,
THD::stmt_depends_on_first_successful_insert_id_in_prev_stmt
is reset to zero at the end of the function. In case of
normal slave (No Filters), this variable will be reset.
In Filtered slave, Slave SQL thread defers all IRU events's
execution until IRU's Query_log event is received. Once it
receives Query_log_event it executes all pending IRU events
and then it executes Query_log_event. Hence the variable is
not getting reset to 0, causing this bug.
Fix: As described above, the root cause was setting
THD::stmt_depends_on_first_successful_insert_id_in_prev_stmt
when Int_var_log_event was executed by a SQL thread. Hence
removing the problematic line from the code.
Description: Fix for bug CVE-2012-5611 (bug 67685) is
incomplete. The ACL_KEY_LENGTH-sized buffers in acl_get() and
check_grant_db() can be overflown by up to two bytes. That's
probably not enough to do anything more serious than crashing
mysqld.
Analysis: In acl_get() when "copy_length" is calculated it
just adding the variable lengths. But when we are using them
with strmov() we are adding +1 to each. This will lead to a
three byte buffer overflow (i.e two +1's at strmov() and one
byte for the null added by strmov() function). Similarly it
happens for check_grant_db() function as well.
Fix: We need to add "+2" to "copy_length" in acl_get()
and "+1" to "copy_length" in check_grant_db().
REPEATED TWICE IN BINLOG
Problem:
=======
If LOAD DATA ... SET ... is used the last argument of SET is
repeated twice in replication binlog.
Analysis:
========
LOAD DATA statements are reconstructed once again before
they are written to the binary log. When SET clauses are
specified as part of LOAD DATA statement, these SET clause
user command strings need to be stored in order to rebuild
the original user command. During parsing each column and
the value in the SET command are stored in two differenet
lists. All the values are stored in a string list.
When SET expression has more than one value as shown in the
following example:
SET a = @a, b = CONCAT(@b, '| 123456789');
Parser extracts values in the following manner i.e Item name
, value string, actual length of the value of the item with
in the string.
Item a:
Value for a:"= @a, b = CONCAT(@b, '| 123456789')
str_length = 4
Item b:
Value for b:"= CONCAT(@b, '| 123456789')
str_length = 27
During reconstructing the LOAD DATA command the above
strings are retrived as it is and appended to the LOAD DATA
statement. Hence it becomes as shown below.
SET `a`= @a, b = CONCAT(@b, '| 123456789'),
`b`= CONCAT(@b, '| 123456789')
Fix:
===
During reconstruction of SET command, retrieve exact item
value string rather than reading the entire string.
sql/sql_load.cc:
Added code to extract the exact Item value.
AND 'KILL SESSION' LEAD TO CRASH
Analysis:
--------
This situation occurs when the connection executes query
"show engine innodb status" and this connection is killed by
executing statement "kill <con>" by another connection.
In function "innodb_show_status", function "stat_print"
is called to print the status but return value of function
is not checked. After killing connection, if write to
connection fails then error is returned and same is set
in Diagnostic area. Since FALSE is returned from
"innodb_show_status" now, assert to check no error
is set in function "set_eof_status" (called from
my_eof) is failing.
Fix:
----
Changed code to check return value of function "stat_print"
in "innodb_show_status".
Description:
------------
There are 2 issues reported in the bug report,
1. One session running a "long" select, then, from the other
session, you kill that first one, while select is
running, and it receives that message "Server shutdown in
progress".
Reported Date: 02-Apr-2006
=> Looks like this isuse is already fixed in 2009 by the patch
pushed for bug28141.
2. Killing query which goes to filesort, logs error entries like:
120416 9:17:28 [ERROR] mysqld: Sort aborted: Server shutdown in
progress
120416 9:18:48 [ERROR] mysqld: Sort aborted: Server shutdown in
progress
120416 9:19:39 [ERROR] mysqld: Sort aborted: Server shutdown in
progress
Reported Date: 16-Apr-2012
=> This issue is introduced in 5.5+ versions. Fixing this issue
in this patch.
Analysis:
---------
In function "filesort()", on error we are logging error message.
To the error message, the message related THD::killed_errno is
also appeneded, if it is set.(THD::kill_errno value is obtained
by calling member function THD::killed_errno)
In the scenario mentioned in this bug report, when we kill the
connection, THD::kill_errno is set to the THD::KILL_CONNECTION.
Enum type THD::KILL_CONNECTION corressponds to value
ER_SERVER_SHUTDOWN. Because of this, "Server shutdown in ...." is
appended to the message logged.
Fix:
----
Modified code of "filesort()" function to append "KILL_QUERY"
status to error message when thread is killed and server
shutdown is not in progress.
For queries like
update t1 set ... where <unique key> order by ... limit ...
we need to handle the fact that unique keys allow NULL
values, and hence can return more than one row.
sql/opt_range.cc:
When the unique key has multiple key parts,
check each key_part for nullability, rather than the first key part.
(s/key->part ==/key_tree->part ==/)
Also: revert the if() test, for better readability.
Dump thread may encounter an error when reading events from the active binlog
file. However the errors may be temporary, so dump thread will try to read
the event again. But dump thread seeked to an wrong position, it caused some
events was sent twice.
To fix the bug, prev_pos is defined out the while loop and is set the correct
position after reading every event correctly.
This patch also make binlog_can_be_corrupted more accurate, only the binlogs
not closed normally are marked binlog_can_be_corrupted.
Finally, two warnings are added when dump threads encounter the temporary
errors.
DO WHAT YOU EXPECT"
Fix info:
--------
Backport of the deprecation bug fix (WL#5265) for global variable
'THREAD_CONCURRENCY' from mysql-5.6 to mysql-5.5
Note: With this backport, certain additional deprecation warnings
would be reported under error conditions while setting the
global/session variables.
CONTAINING NULL
Problem:-
In MySQL, We can obtain the number of distinct expression
combinations that do not contain NULL by giving a list of
expressions in COUNT(DISTINCT).
However rows with NULL values are
incorrectly included in the count when loose index scan is
used.
Analysis:-
In case of loose index scan, we check whether the field is null or
not and increase the count in Item_sum_count::add().
But there we are checking for the first field in COUNT(DISTINCT),
not for every field. This is causing an incorrect result.
Solution:-
Check all field in Item_sum_count::add(), whether there values
are null or not. Then only increment the count.
******
Bug#17222452 - SELECT COUNT(DISTINCT A,B) INCORRECTLY COUNTS ROWS
CONTAINING NULL
Problem:-
In MySQL, We can obtain the number of distinct expression
combinations that do not contain NULL by giving a list of
expressions in COUNT(DISTINCT).
However rows with NULL values are
incorrectly included in the count when loose index scan is
used.
Analysis:-
In case of loose index scan, we check whether the field is null or
not and increase the count in Item_sum_count::add().
But there we are checking for the first field in COUNT(DISTINCT),
not for every field. This is causing an incorrect result.
Solution:-
Check all field in Item_sum_count::add(), whether there values
are null or not. Then only increment the count.
Problem:-
Second execution of prepared statement for query with
parameter in limit clause, causes an assert when using
connectors (e.g., Connector C).
Analysis:-
In prepared statement, LIMIT parameters can be
specified using '?' markers. Value for the parameter can
be supplied while executing the prepared statement.
Passing string, float or double values for LIMIT clause
works well from command-line client. That's because, while
setting the LIMIT parameter value from a user-variable,
the value is converted to integer value.
However, when prepared statement is executed from other
interfaces as J connectors, or C applications etc,
the value for the parameters are sent to the server
with execute command. Each item in command has value and
the data TYPE. So, while setting parameter values
from this log, value is set to all the parameters
with the same data type as passed.
Here, we have the logic to convert the value to change the
state and item_type if it is part of LIMIT parameter and
its item_type is not INT.
But when we reset this parameter we save the item_type but change
state. So on second execution we have old item_type but our state
has been changed, which make us to use string type variable
in Item_param::query_str_val(). This cause an assert.
Fix:
Instead of checking the item_type of the parameter, check for
the state of the parameter. As state value are reset everytime
we execute the statement.
ER_LOCK_WAIT_TIMEOUT".
The problem was that after changes caused by fix bug 14188793 "DEADLOCK
CAUSED BY ALTER TABLE DOEN'T CLEAR STATUS OF ROLLBACKED TRANSACTION"/
bug 17054007 "TRANSACTION IS NOT FULLY ROLLED BACK IN CASE OF INNODB
DEADLOCK implicit rollback of transaction which occurred on ER_LOCK_DEADLOCK
(and ER_LOCK_WAIT_TIMEOUT if innodb_rollback_on_timeout option was set)
didn't start new transaction in @@autocommit=1 mode.
Such behavior although consistent with behavior of explicit ROLLBACK has
broken expectations of users and backward compatibility assumptions.
This patch fixes problem by reverting to starting new transaction
in 5.5/5.6.
The plan is to keep new behavior in trunk so the code change from this
patch is to be null-merged there.
Problem:-
In a Procedure, when we are comparing value of select query
with IN clause and they both have different collation, cause
error on first time execution and assert second time.
procedure will have query like
set @x = ((select a from t1) in (select d from t2));<---proc1
sel1 sel2
Analysis:-
When we execute this proc1(first time)
While resolving the fields of user variable, we will call
Item_in_subselect::fix_fields while will resolve sel2. There
in Item_in_subselect::select_transformer, we evaluate the
left expression(sel1) and store it in Item_cache_* object
(to avoid re-evaluating it many times during subquery execution)
by making Item_in_optimizer class.
While evaluating left expression we will prepare sel1.
After that, we will put a new condition in sel2
in Item_in_subselect::select_transformer() which will compare
t2.d and sel1(which is cached in Item_in_optimizer).
Later while checking the collation in agg_item_collations()
we get error and we cleanup the item. While cleaning up we cleaned
the cached value in Item_in_optimizer object.
When we execute the procedure second time, we have condition for
sel2 and while setup_cond(), we can't able to find reference item
as it is cleanup while item cleanup.So it assert.
Solution:-
We should not cleanup the cached value for Item_in_optimizer object,
if we have put the condition to subselect.
Problem:-
In a Procedure, when we are comparing value of select query
with IN clause and they both have different collation, cause
error on first time execution and assert second time.
procedure will have query like
set @x = ((select a from t1) in (select d from t2));<---proc1
sel1 sel2
Analysis:-
When we execute this proc1(first time)
While resolving the fields of user variable, we will call
Item_in_subselect::fix_fields while will resolve sel2. There
in Item_in_subselect::select_transformer, we evaluate the
left expression(sel1) and store it in Item_cache_* object
(to avoid re-evaluating it many times during subquery execution)
by making Item_in_optimizer class.
While evaluating left expression we will prepare sel1.
After that, we will put a new condition in sel2
in Item_in_subselect::select_transformer() which will compare
t2.d and sel1(which is cached in Item_in_optimizer).
Later while checking the collation in agg_item_collations()
we get error and we cleanup the item. While cleaning up we cleaned
the cached value in Item_in_optimizer object.
When we execute the procedure second time, we have condition for
sel2 and while setup_cond(), we can't able to find reference item
as it is cleanup while item cleanup.So it assert.
Solution:-
We should not cleanup the cached value for Item_in_optimizer object,
if we have put the condition to subselect.
"SHOW PROCESSLIST"
Analysis:
----------
The problem here is, if one connection changes its
default db and at the same time another connection executes
"SHOW PROCESSLIST", when it wants to read db of the another
connection then there is a chance of accessing the invalid
memory.
The db name stored in THD is not guarded while changing user
DB and while reading the user DB in "SHOW PROCESSLIST".
So, if THD.db is freed by thd "owner" thread and if another
thread executing "SHOW PROCESSLIST" statement tries to read
and copy THD.db at the same time then we may endup in the issue
reported here.
Fix:
----------
Used mutex "LOCK_thd_data" to guard THD.db while freeing it
and while copying it to processlist.
STATUS OF ROLLBACKED TRANSACTION" and bug #17054007 - "TRANSACTION
IS NOT FULLY ROLLED BACK IN CASE OF INNODB DEADLOCK".
The problem in the first bug report was that although deadlock involving
metadata locks was reported using the same error code and message as InnoDB
deadlock it didn't rollback transaction like the latter. This caused
confusion to users as in some cases after ER_LOCK_DEADLOCK transaction
could have been restarted immediately and in some cases rollback was
required.
The problem in the second bug report was that although InnoDB deadlock
caused transaction rollback in all storage engines it didn't cause release
of metadata locks. So concurrent DDL on the tables used in transaction was
blocked until implicit or explicit COMMIT or ROLLBACK was issued in the
connection which got InnoDB deadlock.
The former issue has stemmed from the fact that when support for detection
and reporting metadata locks deadlocks was added we erroneously assumed
that InnoDB doesn't rollback transaction on deadlock but only last statement
(while this is what happens on InnoDB lock timeout actually) and so didn't
implement rollback of transactions on MDL deadlocks.
The latter issue was caused by the fact that rollback of transaction due
to deadlock is carried out by setting THD::transaction_rollback_request
flag at the point where deadlock is detected and performing rollback
inside of trans_rollback_stmt() call when this flag is set. And
trans_rollback_stmt() is not aware of MDL locks, so no MDL locks are
released.
This patch solves these two problems in the following way:
- In case when MDL deadlock is detect transaction rollback is requested
by setting THD::transaction_rollback_request flag.
- Code performing rollback of transaction if THD::transaction_rollback_request
is moved out from trans_rollback_stmt(). Now we handle rollback request
on the same level as we call trans_rollback_stmt() and release statement/
transaction MDL locks.