"SHOW PROCESSLIST"
Analysis:
----------
The problem here is, if one connection changes its
default db and at the same time another connection executes
"SHOW PROCESSLIST", when it wants to read db of the another
connection then there is a chance of accessing the invalid
memory.
The db name stored in THD is not guarded while changing user
DB and while reading the user DB in "SHOW PROCESSLIST".
So, if THD.db is freed by thd "owner" thread and if another
thread executing "SHOW PROCESSLIST" statement tries to read
and copy THD.db at the same time then we may endup in the issue
reported here.
Fix:
----------
Used mutex "LOCK_thd_data" to guard THD.db while freeing it
and while copying it to processlist.
STATUS OF ROLLBACKED TRANSACTION" and bug #17054007 - "TRANSACTION
IS NOT FULLY ROLLED BACK IN CASE OF INNODB DEADLOCK".
The problem in the first bug report was that although deadlock involving
metadata locks was reported using the same error code and message as InnoDB
deadlock it didn't rollback transaction like the latter. This caused
confusion to users as in some cases after ER_LOCK_DEADLOCK transaction
could have been restarted immediately and in some cases rollback was
required.
The problem in the second bug report was that although InnoDB deadlock
caused transaction rollback in all storage engines it didn't cause release
of metadata locks. So concurrent DDL on the tables used in transaction was
blocked until implicit or explicit COMMIT or ROLLBACK was issued in the
connection which got InnoDB deadlock.
The former issue has stemmed from the fact that when support for detection
and reporting metadata locks deadlocks was added we erroneously assumed
that InnoDB doesn't rollback transaction on deadlock but only last statement
(while this is what happens on InnoDB lock timeout actually) and so didn't
implement rollback of transactions on MDL deadlocks.
The latter issue was caused by the fact that rollback of transaction due
to deadlock is carried out by setting THD::transaction_rollback_request
flag at the point where deadlock is detected and performing rollback
inside of trans_rollback_stmt() call when this flag is set. And
trans_rollback_stmt() is not aware of MDL locks, so no MDL locks are
released.
This patch solves these two problems in the following way:
- In case when MDL deadlock is detect transaction rollback is requested
by setting THD::transaction_rollback_request flag.
- Code performing rollback of transaction if THD::transaction_rollback_request
is moved out from trans_rollback_stmt(). Now we handle rollback request
on the same level as we call trans_rollback_stmt() and release statement/
transaction MDL locks.
DICT_TABLE_GET_FORMAT(CLUST_INDEX->TABLE) >= 1
The function row_sel_sec_rec_is_for_clust_rec() was incorrectly
preparing to compare a NULL column prefix in a secondary index with a
non-NULL column in a clustered index.
This can trigger an assertion failure in 5.1 plugin and later. In the
built-in InnoDB of MySQL 5.1 and earlier, we would apparently only do
some extra work, by trimming the clustered index field for the
comparison.
The code might actually have worked properly apart from this debug
assertion failure. It is merely doing some extra work in fetching a
BLOB column, and then comparing it to NULL (which would return the
same result, no matter what the BLOB contents is).
While the test case involves CHECK TABLE, this could theoretically
occur during any read that uses a secondary index on a column prefix
of a column that can be NULL.
rb#3101 approved by Mattias Jonsson
There was a race condition in the rollback of TRX_UNDO_UPD_DEL_REC.
Once row_undo_mod_clust() has rolled back the changes by the rolling-back
transaction, it attempts to purge the delete-marked record, if possible, in a
separate mini-transaction.
However, row_undo_mod_remove_clust_low() fails to check if the DB_TRX_ID of
the record that it found after repositioning the cursor, is still the same.
If it is not, it means that the record was purged and another record was
inserted in its place.
So, the rollback would have performed an incorrect purge, breaking the
locking rules and causing corruption.
The problem was found by creating a table that contains a unique
secondary index and a primary key, and two threads running REPLACE
with only one value for the unique column, so that the uniqueness
constraint would be violated all the time, leading to statement
rollback.
This bug exists in all InnoDB versions (I checked MySQL 3.23.53).
It has become easier to repeat in 5.5 and 5.6 thanks to scalability
improvements and a dedicated purge thread.
rb#3085 approved by Jimmy Yang
FAILED BLOB WRITE
btr_store_big_rec_extern_fields(): Relax a debug assertion so that
some BLOB pointers may remain zero if an error occurs.
btr_free_externally_stored_field(), row_undo_ins(): Allow the BLOB
pointer to be zero on any rollback.
rb#3059 approved by Jimmy Yang, Kevin Lewis
Problem Description:
A mysqld_safe instance is started. An InnoDB crash recovery begins which takes
few seconds to complete. During this crash recovery process happening, another
mysqld_safe instance is started with the same server startup parameters. Since
the mysqld's pid file is absent during the crash recovery process the second
instance assumes there is no other process and tries to acquire a lock on the
ibdata files in the datadir. But this step fails and the 2nd instance keeps
retrying 100 times each with a delay of 1 second. Now after the 100 attempts,
the server goes down, but while going down it hits the mysqld_safe script's
cleanup section and without any check it blindly deletes the socket and pid
files. Since no lock is placed on the socket file, it gets deleted.
Solution:
We create a mysqld_safe.pid file in the datadir, which protects the presence
server instance resources by storing the mysqld_safe's process id in it. We
place a check if the mysqld_safe.pid file is existing in the datadir. If yes
then we check if the pid it contains is an active pid or not. If yes again,
then the scripts logs an error saying "A mysqld_safe instance is already
running". Otherwise it will log the present mysqld_safe's pid into the
mysqld_safe.pid file.
Problem Description:
A mysqld_safe instance is started. An InnoDB crash recovery begins which takes
few seconds to complete. During this crash recovery process happening, another
mysqld_safe instance is started with the same server startup parameters. Since
the mysqld's pid file is absent during the crash recovery process the second
instance assumes there is no other process and tries to acquire a lock on the
ibdata files in the datadir. But this step fails and the 2nd instance keeps
retrying 100 times each with a delay of 1 second. Now after the 100 attempts,
the server goes down, but while going down it hits the mysqld_safe script's
cleanup section and without any check it blindly deletes the socket and pid
files. Since no lock is placed on the socket file, it gets deleted.
Solution:
We create a mysqld_safe.pid file in the datadir, which protects the presence
server instance resources by storing the mysqld_safe's process id in it. We
place a check if the mysqld_safe.pid file is existing in the datadir. If yes
then we check if the pid it contains is an active pid or not. If yes again,
then the scripts logs an error saying "A mysqld_safe instance is already
running". Otherwise it will log the present mysqld_safe's pid into the
mysqld_safe.pid file.
AND PARTITION VALUES IN (NULL)
The code assumed there was at least one list element
in LIST partitioned table.
Fixed by checking the number of list elements.
Since the mtr_t struct is marked as invalid in DEBUG_VALGRIND build
during mtr_commit, checking mtr->inside_ibuf will cause this warning.
Also since mtr->inside_ibuf cannot be set in mtr_commit (assert check)
and mtr->state is set to MTR_COMMITTED, the 'ut_ad(!ibuf_inside(&mtr))'
check is not needed if 'ut_ad(mtr.state == MTR_COMMITTED)' is also
checked.
IN STORED ROUTINE
Inside a loop in a stored procedure, we create a partitioned
table. The CREATE statement is thus treated as a prepared statement:
it is prepared once, and then executed by each iteration. Thus its Lex
is reused many times. This Lex contains a part_info member, which
describes how the partitions should be laid out, including the
partitioning function. Each execution of the CREATE does this, in
open_table_from_share ():
tmp= mysql_unpack_partition(thd, share->partition_info_str,
share->partition_info_str_len,
outparam, is_create_table,
share->default_part_db_type,
&work_part_info_used);
...
tmp= fix_partition_func(thd, outparam, is_create_table);
The first line calls init_lex_with_single_table() which creates
a TABLE_LIST, necessary for the "field fixing" which will be
done by the second line; this is how it is created:
if ((!(table_ident= new Table_ident(thd,
table->s->db,
table->s->table_name, TRUE))) ||
(!(table_list= select_lex->add_table_to_list(thd,
table_ident,
NULL,
0))))
return TRUE;
it is allocated in the execution memory root.
Then the partitioning function ("id", stored in Lex -> part_info)
is fixed, which calls Item_ident:: fix_fields (), which resolves
"id" to the table_list above, and stores in the item's
cached_table a pointer to this table_list.
The table is created, later it is dropped by another statement,
then we execute again the prepared CREATE. This reuses the Lex,
thus also its part_info, thus also the item representing the
partitioning function (part_info is cloned but it's a shallow
cloning); CREATE wants to fix the item again (which is
normal, every execution fixes items again), fix_fields ()
sees that the cached_table pointer is set and picks up the
pointed table_list. But this last object does not exist
anymore (it was allocated in the execution memory root of
the previous execution, so it has been freed), so we access
invalid memory.
The solution: when creating the table_list, mark that it
cannot be cached.
OF OLD STYLE DECIMALS
Problem: In RBR, Slave is unable to read row buffer
properly when the row event contains MYSQL_TYPE_DECIMAL
(old style decimals) data type column.
Analysis: In RBR, Slave assumes that Master sends
meta data information for all column types like
text,blob,varchar,old decimal,new decimal,float,
and few other types along with row buffer event.
But Master is not sending this meta data information
for old style decimal columns. Hence Slave is crashing
due to unknown precision value for these column types.
Master cannot send this precision value to Slave which
will break replication cross-version compatibility.
Fix: To fix the crash, Slave will now throw error if it
receives old-style decimal datatype. User should
consider changing the old-style decimal to new style
decimal data type by executing "ALTER table modify column"
query as mentioned in http://dev.mysql.com/
doc/refman/5.0/en/upgrading-from-previous-series.html.
Problem:
=======
It was detected an incorrect behavior of my_strtoll10 function when
converting strings with numbers in the following format:
"184467440XXXXXXXXXYY"
Where XXXXXXXXX > 737095516 and YY <= 15
Samples of problematic numbers:
"18446744073709551915"
"18446744073709552001"
Instead of returning the larger unsigned long long value and setting overflow
in the returned error code, my_strtoll10 function returns the lower 64-bits
of the evaluated number and did not set overflow in the returned error code.
Analysis:
========
Once trying to fix bug 16820156, I've found this bug in the overflow check of
my_strtoll10 function.
This function, once receiving a string with an integer number larger than
18446744073709551615 (the larger unsigned long long number) should return the
larger unsigned long long number and set overflow in the returned error code.
Because of a wrong overflow evaluation, the function didn't catch the
overflow cases where (i == cutoff) && (j > cutoff2) && (k <= cutoff3). When
the overflow evaluation fails, the function return the lower 64-bits of the
evaluated number and do not set overflow in the returned error code.
Fix:
===
Corrected the overflow evaluation in my_strtoll10.
Description:
Original fix Bug#11765744 changed mutex to read write lock
to avoid multiple recursive lock acquire operation on
LOCK_status mutex.
On Windows, locking read-write lock recursively is not safe.
Slim read-write locks, which MySQL uses if they are supported by
Windows version, do not support recursion according to their
documentation. For our own implementation of read-write lock,
which is used in cases when Windows version doesn't support SRW,
recursive locking of read-write lock can easily lead to deadlock
if there are concurrent lock requests.
Fix:
This patch reverts the previous fix for bug#11765744 that used
read-write locks. Instead problem of recursive locking for
LOCK_status mutex is solved by tracking recursion level using
counter in THD object and acquiring lock only once when we enter
fill_status() function first time.
Description:
Original fix Bug#11765744 changed mutex to read write lock
to avoid multiple recursive lock acquire operation on
LOCK_status mutex.
On Windows, locking read-write lock recursively is not safe.
Slim read-write locks, which MySQL uses if they are supported by
Windows version, do not support recursion according to their
documentation. For our own implementation of read-write lock,
which is used in cases when Windows version doesn't support SRW,
recursive locking of read-write lock can easily lead to deadlock
if there are concurrent lock requests.
Fix:
This patch reverts the previous fix for bug#11765744 that used
read-write locks. Instead problem of recursive locking for
LOCK_status mutex is solved by tracking recursion level using
counter in THD object and acquiring lock only once when we enter
fill_status() function first time.
SHUTDOWN IS IN PROGRESS
PROBLEM
-------
In the background thread srv_master_thread() we have a
a one second delay loop which will continuously monitor
server activity .If the server is inactive (with out any
user activity) or in a shutdown state we do some background
activity like flushing the changes.In the current code
we are not checking if server is in shutdown state before
sleeping for one second.
FIX
---
If server is in shutdown state ,then dont go to one second
sleep.
PARTITIONS.
ANALYSIS
--------
Whenever we query I_S.partitions,
ha_partition::get_dynamic_partition_info()
is called which resets the cardinality
according to the number of rows in last
partition.
Fix
---
When we call get_dynamic_partition_info()
avoid passing the flag HA_STATUS_CONST
to info() since HA_STATUS_CONST should
ideally not be called for per partition.
[Approved by mattiasj rb#2830 ]
IN TIME RECOVERY FAILURE ON SLAVES
Problem:
DROP TEMP TABLE IF EXISTS commands can cause point
in time recovery (re-applying binlog) failures.
Analyses:
In RBR, 'DROP TEMPORARY TABLE' commands are
always binlogged by adding 'IF EXISTS' clauses.
Also, the slave SQL thread will not check replicate.* filter
rules for "DROP TEMPORARY TABLE IF EXISTS" queries.
If log-slave-updates is enabled on slave, these queries
will be binlogged in the format of "USE `db`;
DROP TEMPORARY TABLE IF EXISTS `t1`;" irrespective
of filtering rules and irrespective of the `db` existence.
When users try to recover slave from it's own binlog,
use `db` command might fail if `db` is not present on slave.
Fix:
At the time of writing the 'DROP TEMPORARY TABLE
IF EXISTS' query into the binlog, 'use `db`' will not be
present and the table name in the query will be a fully
qualified table name.
Eg:
'USE `db`; DROP TEMPORARY TABLE IF EXISTS `t1`;'
will be logged as
'DROP TEMPORARY TABLE IF EXISTS `db`.`t1`;'.
Problem:
When the user specified foreign key name contains "_ibfk_", InnoDB wrongly
tries to rename it.
Solution:
When a table is renamed, all its associated foreign keys will also be renamed,
only if the foreign key names are automatically generated. If the foreign key
names are given by the user, even if it has _ibfk_ in it, it must not be
renamed.
rb#2935 approved by Jimmy, Krunal and Satya
Inside a loop in a stored procedure, we create a partitioned
table. The CREATE statement is thus treated as a prepared statement:
it is prepared once, and then executed by each iteration. Thus its Lex
is reused many times. This Lex contains a part_info member, which
describes how the partitions should be laid out, including the
partitioning function. Each execution of the CREATE does this, in
open_table_from_share ():
tmp= mysql_unpack_partition(thd, share->partition_info_str,
share->partition_info_str_len,
outparam, is_create_table,
share->default_part_db_type,
&work_part_info_used);
...
tmp= fix_partition_func(thd, outparam, is_create_table);
The first line calls init_lex_with_single_table() which creates
a TABLE_LIST, necessary for the "field fixing" which will be
done by the second line; this is how it is created:
if ((!(table_ident= new Table_ident(thd,
table->s->db,
table->s->table_name, TRUE))) ||
(!(table_list= select_lex->add_table_to_list(thd,
table_ident,
NULL,
0))))
return TRUE;
it is allocated in the execution memory root.
Then the partitioning function ("id", stored in Lex -> part_info)
is fixed, which calls Item_ident:: fix_fields (), which resolves
"id" to the table_list above, and stores in the item's
cached_table a pointer to this table_list.
The table is created, later it is dropped by another statement,
then we execute again the prepared CREATE. This reuses the Lex,
thus also its part_info, thus also the item representing the
partitioning function (part_info is cloned but it's a shallow
cloning); CREATE wants to fix the item again (which is
normal, every execution fixes items again), fix_fields ()
sees that the cached_table pointer is set and picks up the
pointed table_list. But this last object does not exist
anymore (it was allocated in the execution memory root of
the previous execution, so it has been freed), so we access
invalid memory.
The solution: when creating the table_list, mark that it
cannot be cached.
Since log_throttle is not available in 5.5. Logging of
error message for failure of thread to create new connection
in "create_thread_to_handle_connection" is not backported.
Since, function "my_plugin_log_message" is not available in
5.5 version and since there is incompatibility between
sql_print_XXX function compiled with g++ and alog files with
gcc to use sql_print_error, changes related to audit log
plugin is not backported.
Problem:
sys_vars.rpl_init_slave_func test was failing sporadically
on 5.5+.
Fix:
Added assert condition after wait for checks.
Recorded test and enabled it.
BUG#12535301- SYS_VARS.RPL_INIT_SLAVE_FUNC MISMATCHES IN DAILY-5.5
Problem:
sys_vars.rpl_init_slave_func test was not recorded after
the last edit. It was disabled on 5.1 after seeing failures
due to the above reason.
No old failures as this suite never ran with pb2 on 5.1
Fix:
Added assert condition after wait for checks.
Recorded test and enabled it.
TO DUMP DATA FROM MYSQL-5.6
Analysis
--------
Dumping mysql-5.6 data using mysql-5.1/mysql-5.5 'myqldump'
utility fails with a syntax error.
Server system variable 'sql_quote_show_create' which quotes the
identifiers is set in the mysqldump utility. The mysldump utility
of mysql-5.1/mysql-5.5 uses deprecated syntax 'SET OPTION' to set
the 'sql_quote_show_create' option. The support for the syntax is
removed in mysql-5.6. Hence syntax error is reported while taking
the dump.
Fix:
---
Changed the 'mysqldump' code to use the syntax
'SET SQL_QUOTE_SHOW_CREATE' to set the 'sql_quote_show_create'
option. That syntax is supported on mysql-5.1, mysql-5.5 and
mysql-5.6.
NOTE: I have not added an mtr test case since it is difficult
to simulate the condition. Also the syntax may not be further
simplified in the future.
SERIALIZABLE
Problem:
The documentation claims that WITH CONSISTENT SNAPSHOT will work for both
REPEATABLE READ and SERIALIZABLE isolation levels. But it will work only
for REPEATABLE READ isolation level. Also, the clause WITH CONSISTENT
SNAPSHOT is silently ignored when it is not applicable to the given isolation
level.
Solution:
Generate a warning when the clause WITH CONSISTENT SNAPSHOT is ignored.
rb#2797 approved by Kevin.
Note: Support team wanted to push this to 5.5+.