TEMPORARY + HANDLER + LOCK + SP".
Server crashed when one:
1) Opened HANDLER or acquired global read lock
2) Then locked one or several temporary tables with
LOCK TABLES statement (but no base tables).
3) Then issued any statement causing commit (explicit
or implicit).
4) Issued statement which should have closed HANDLER
or released global read lock.
The problem was that when entering LOCK TABLES mode in the
scenario described above we incorrectly set the transactional
MDL sentinel to zero. As a result, all metadata locks were
released during commit (including the lock for the open
HANDLER or the global shared metadata lock). The subsequent
attempt to release a metadata lock for the second time,
which happened during HANDLER CLOSE or during release of
the global read lock, caused the crash.
This patch fixes the problem by changing MDL_context's
set_trans_sentinel() method to set the sentinel to the
correct value (it should point to the most recent ticket).
mysql-test/include/handler.inc:
Added test for bug #51136 "Crash in pthread_rwlock_rdlock on
TEMPORARY + HANDLER + LOCK + SP".
mysql-test/r/flush.result:
Updated test results (see flush.test).
mysql-test/r/handler_innodb.result:
Updated test results (see include/handler.inc).
mysql-test/r/handler_myisam.result:
Updated test results (see include/handler.inc).
mysql-test/t/flush.test:
Added additional coverage for bug #51136 "Crash in
pthread_rwlock_rdlock on TEMPORARY + HANDLER + LOCK +
SP".
sql/mdl.h:
When setting a new value of the transactional sentinel, use
a pointer to the most recent ticket instead of the value
returned by MDL_context::mdl_savepoint().
This correctly handles the situation in which the new
value of the sentinel should be the same as its current value
(MDL_context::mdl_savepoint() returns NULL in this case).
DDL workload".
When a RENAME TABLE or LOCK TABLE ... WRITE statement which
mentioned the same table several times was aborted while
acquiring metadata locks (because a deadlock was discovered
or because of a KILL statement), the server might have
crashed.
When the attempt to acquire all requested locks failed, we
went through the list of requests and released, one by one,
the locks we had managed to acquire by that moment. Since
in the scenario described above the list of requests
contained duplicates, this led to releasing the same ticket
twice and, as a result, a crash.
This patch solves the problem by taking a different approach
to releasing locks when acquiring all requested locks
fails.
Now we take an MDL savepoint before starting to acquire locks
and simply roll back to it if things go bad.
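The savepoint approach can be illustrated with a minimal sketch. This is
not the server code: ToyLockContext, its members and the fail_on parameter
are hypothetical names invented for illustration, and a "ticket" is
reduced to a lock name. The point it shows is that rolling back to a
savepoint truncates the ticket list, so duplicates in the request list
can never cause a double release.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a metadata lock context; NOT the real MDL_context.
class ToyLockContext {
public:
  // A savepoint is simply the number of tickets held when it was taken.
  std::size_t savepoint() const { return tickets_.size(); }

  // Release every ticket acquired after the savepoint.
  void rollback_to_savepoint(std::size_t sp) {
    assert(sp <= tickets_.size());
    tickets_.resize(sp);
  }

  // Acquire all requested locks or none. A request equal to `fail_on`
  // simulates a deadlock or KILL while waiting for that lock.
  bool acquire_locks(const std::vector<std::string>& requests,
                     const std::string& fail_on) {
    const std::size_t sp = savepoint();
    for (const std::string& name : requests) {
      if (name == fail_on) {
        // We never walk `requests` to release locks, so duplicates in
        // the request list are harmless here.
        rollback_to_savepoint(sp);
        return false;
      }
      if (!holds(name)) tickets_.push_back(name);
    }
    return true;
  }

  bool holds(const std::string& name) const {
    for (const std::string& t : tickets_)
      if (t == name) return true;
    return false;
  }

  std::size_t ticket_count() const { return tickets_.size(); }

private:
  std::vector<std::string> tickets_;  // oldest ticket first
};
```

Locks held before the savepoint survive the rollback; only tickets taken
by the failed statement are dropped.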
mysql-test/r/lock_multi.result:
Updated test results (see lock_multi.test).
mysql-test/t/lock_multi.test:
Added test case for bug #51134 "Crash in MDL_lock::destroy
on a concurrent DDL workload".
sql/mdl.cc:
MDL_context::acquire_locks():
When the attempt to acquire all requested locks has failed,
do not go through the list of requests and release, one by
one, the locks we have managed to acquire.
Since the list of requests can contain duplicates, such an
approach may lead to releasing the same ticket twice and,
as a result, a crash.
Instead, take an MDL savepoint before starting to acquire
locks and simply roll back to it if things go bad.
function with distinct.
Loose index scan finds MIN/MAX values using an appropriate index and
thus avoids grouping. For each row found, it updates the non-aggregated
fields with values from the row holding the MIN/MAX value.
Without loose index scan, non-aggregated fields are copied by the
end_send_group function. With loose index scan there is no need for
end_send_group, and end_send is used instead. Non-aggregated fields still
need to be copied, and this was wrongly implemented in
QUICK_GROUP_MIN_MAX_SELECT::get_next.
WL#3220 added a case in which loose index scan can be used with
end_send_group to optimize calculation of aggregate functions with distinct.
In this case the row found by QUICK_GROUP_MIN_MAX_SELECT::get_next might
belong to the next group, and copying it produces a wrong result.
The update of non-aggregated fields is moved from
QUICK_GROUP_MIN_MAX_SELECT::get_next to the end_send function.
mysql-test/r/group_min_max.result:
Added a test case for the bug#50539.
mysql-test/t/group_min_max.test:
Added a test case for the bug#50539.
sql/opt_range.cc:
Bug#50539: Wrong result when loose index scan is used for an aggregate
function with distinct.
Update of non-aggregated fields is moved to the end_send function from
QUICK_GROUP_MIN_MAX_SELECT::get_next.
sql/sql_select.cc:
Bug#50539: Wrong result when loose index scan is used for an aggregate
function with distinct.
Update of non-aggregated fields is moved to the end_send function from
QUICK_GROUP_MIN_MAX_SELECT::get_next.
failed in enter_locked_tables_mode".
Server was aborted due to assertion failure when one tried to
execute statement requiring prelocking (i.e. firing triggers
or using stored functions) while having open HANDLERs.
The problem was that THD::enter_locked_tables_mode() method
which was called at the beginning of execution of prelocked
statement assumed there are no open HANDLERs. It had to do
so because corresponding THD::leave_locked_tables_mode()
method was unable to properly restore MDL sentinel when
leaving LOCK TABLES/prelocked mode in the presence of open
HANDLERs.
This patch solves the problem by changing the latter method
to properly restore the MDL sentinel, thus removing the need
for this assumption. As a side effect, it lifts an unjustified
limitation by allowing HANDLERs to stay open when entering
LOCK TABLES mode.
mysql-test/include/handler.inc:
Adjusted tests after making LOCK TABLES not to close
open HANDLERs. Added coverage for bug #50908
"Assertion `handler_tables_hash.records == 0' failed
in enter_locked_tables_mode".
mysql-test/r/handler_innodb.result:
Updated test results (see include/handler.inc).
mysql-test/r/handler_myisam.result:
Updated test results (see include/handler.inc).
sql/mysql_priv.h:
Introduced the mysql_ha_move_tickets_after_trans_sentinel()
routine, which moves tickets for metadata locks
corresponding to open HANDLERs after the transaction sentinel.
sql/sql_class.cc:
Changed THD::leave_locked_tables_mode() to correctly restore
MDL sentinel value in the presence of open HANDLERs.
sql/sql_class.h:
Removed assert from THD::enter_locked_tables_mode() as we
no longer have to close HANDLERs when entering LOCK TABLES
or prelocked modes. Instead we keep them open and correctly
restore MDL sentinel value after leaving them.
Removal of assert also fixes problem from the bug report.
sql/sql_handler.cc:
Introduced the mysql_ha_move_tickets_after_trans_sentinel()
routine, which moves tickets for metadata locks
corresponding to open HANDLERs after the transaction sentinel.
sql/sql_parse.cc:
We no longer have to close HANDLERs when entering LOCK TABLES
mode. Instead we keep them open and simply correctly restore
MDL sentinel value after leaving this mode.
--slave-load-tm
The MDL_SHARED lock was introduced for an object in 5.4, but the 'TABLE_LIST'
object was not initialized with the MDL_SHARED lock when applying an event with
LOAD DATA INFILE into a table. The failure therefore occurred when checking the
MDL_SHARED lock for the object.
To fix the problem, the 'TABLE_LIST' object is now initialized with the
MDL_SHARED lock when applying an event with LOAD DATA INFILE into a table.
mysql-test/suite/rpl/t/disabled.def:
Got rid of the line for enabling 'rpl_cross_version' test.
this includes a major whitespace (formatting) alignment
and sequence changes to better agree with other spec files.
Further changes:
- All features are controlled by "%define" set from call
options or builtin.
- "bundled zlib" is on by default.
- "with libgcc" is controlled by runtime detection of gcc.
- Handling of "CFLAGS" and "CXXFLAGS" is more concentrated.
- Several missing man pages were added.
causing crashes!
Adding a SPATIAL INDEX on a non-geometrical column caused a
segmentation fault when the table was subsequently
inserted into.
A test was added in mysql_prepare_create_table to explicitly
check whether non-geometrical columns are used in a
spatial index, and throw an error if so.
mysql-test/t/gis.test:
Added test cases to verify that only geometrical
columns can get a spatial index.
In addition, verify that only a single geom.
column can participate in a spatial index.
corruption and crash results
An index creation statement where the index key
is larger/wider than the column it references
should throw an error.
A statement like:
CREATE TABLE t1 (a CHAR(1), PRIMARY KEY (A(255)))
did not raise an error, but a segmentation fault followed
when an insertion was attempted on the table.
The partial key validation clause has been
restructured to (hopefully) better document which
uses of partial keys are valid.
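The core of the rule can be sketched as follows. This is a simplified
illustration, not the restructured server code: ToyColumn and
prefix_key_is_valid() are hypothetical names, lengths are plain character
counts, and character sets and column types that have their own prefix
rules are ignored.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a fixed-width character column.
struct ToyColumn {
  std::size_t length;  // declared length, e.g. 1 for CHAR(1)
};

// Returns true when a key prefix of `prefix_length` characters is
// acceptable for `col`. A prefix of 0 stands for "index the whole column".
bool prefix_key_is_valid(const ToyColumn& col, std::size_t prefix_length) {
  assert(col.length > 0);
  // The key part must never be wider than the column it references.
  return prefix_length <= col.length;
}
```

Under this check the statement quoted above, with CHAR(1) and a prefix of
255, is rejected at table creation instead of failing later on insert.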
The test case for this bug relies on getting a ER_LOCK_WAIT_TIMEOUT
error. However with the introduction of MDL, the test would hang
forever since the metadata locks would not timeout.
MDL timeouts are now introduced in the scope of Bug#45225. This
patch changes the testcase for Bug#34604 to set the new server
variable "lock_wait_timeout" to one second which makes the test
generate the necessary timeout again.
This patch introduces timeouts for metadata locks.
The timeout is specified in seconds using the new dynamic system
variable "lock_wait_timeout" which has both GLOBAL and SESSION
scopes. Allowed values range from 1 to 31536000 seconds (= 1 year).
The default value is 1 year.
The new server parameter "lock-wait-timeout" can be used to set
the default value at server startup.
"lock_wait_timeout" applies to all statements that use metadata locks.
These include DML and DDL operations on tables, views, stored procedures
and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH
READ LOCK and HANDLER statements.
The patch also changes thr_lock.c code (table data locks used by MyISAM
and other simplistic engines) to use the same system variable.
InnoDB row locks are unaffected.
One exception to the handling of the "lock_wait_timeout" variable
is delayed inserts. All delayed inserts are executed with a timeout
of 1 year regardless of the setting for the global variable. As the
connection issuing the delayed insert gets no notification of
delayed insert timeouts, we want to avoid unnecessary timeouts.
It's important to note that the timeout value is used for each lock
acquired and that one statement can take more than one lock.
A statement can therefore block for longer than the lock_wait_timeout
value before reporting a timeout error. When lock timeout occurs,
ER_LOCK_WAIT_TIMEOUT is reported.
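The per-lock nature of the timeout can be illustrated with a small
simulation. It is a sketch only: WaitResult and simulate_statement() are
hypothetical names, and each lock is reduced to the number of seconds the
statement would need to wait before that lock becomes available.

```cpp
#include <cassert>
#include <vector>

struct WaitResult {
  bool timed_out;       // true corresponds to ER_LOCK_WAIT_TIMEOUT
  long total_waited_s;  // total time the statement was blocked
};

// `needed_wait_s[i]` is how long lock i stays unavailable once the
// statement starts waiting for it. The timeout applies to EACH lock.
WaitResult simulate_statement(const std::vector<long>& needed_wait_s,
                              long lock_wait_timeout_s) {
  assert(lock_wait_timeout_s >= 1);
  WaitResult result{false, 0};
  for (long needed : needed_wait_s) {
    if (needed > lock_wait_timeout_s) {
      // This single wait exceeded the limit: give up after the timeout.
      result.total_waited_s += lock_wait_timeout_s;
      result.timed_out = true;
      return result;
    }
    result.total_waited_s += needed;  // lock obtained within the limit
  }
  return result;
}
```

With a timeout of 5 seconds, a statement that waits 4 seconds for each of
three locks is blocked for 12 seconds in total and still succeeds, because
no individual wait exceeded the limit.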
Test case added to lock_multi.test.
include/my_pthread.h:
Added macros for comparing two timespec structs.
include/thr_lock.h:
Introduced timeouts for thr_lock.c locks.
mysql-test/r/mysqld--help-notwin.result:
Updated result file with the new server variable.
mysql-test/r/mysqld--help-win.result:
Updated result file with the new server variable.
mysql-test/suite/sys_vars/r/lock_wait_timeout_basic.result:
Added basic test for the new server variable.
mysql-test/suite/sys_vars/t/lock_wait_timeout_basic.test:
Added basic test for the new server variable.
mysys/thr_lock.c:
Introduced timeouts for thr_lock.c locks.
sql/mdl.cc:
Introduced timeouts for metadata locks.
sql/mdl.h:
Introduced timeouts for metadata locks.
sql/sql_base.cc:
Introduced timeouts in tdc_wait_for_old_versions().
sql/sql_class.h:
Added new server variable lock_wait_timeout.
sql/sys_vars.cc:
Added new server variable lock_wait_timeout.
A closely related problem, hardly worth a new bug report:
Removed a spurious call to:
thd->set_current_stmt_binlog_format_row_if_mixed()
in sql_base.cc:lock_tables().
rqg_mdl_stability".
When a statement that started waiting on a metadata lock
created more than one loop in the waiters graph, the server
might have deadlocked.
The problem was that in the case described above the MDL
deadlock detector had to perform several searches for a
deadlock but forgot to reset Deadlock_detection_context
before performing each new search.
Failure to do so broke an assumption in the code responsible
for choosing a victim: that if Deadlock_detection_context::victim
is set, we also hold a read lock on m_waiting_for_lock for this
context. As a result this lock could have been unlocked more
times than it was acquired, which corrupted the rwlock's state
and led to a server deadlock.
This fix ensures that such reset is done before each attempt
to find a deadlock.
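The pattern of resetting per-search state before every search can be
sketched on a toy waiters graph. Everything here is hypothetical and much
simpler than the real detector: ToySearchContext, find_loop() and
break_all_loops() are invented names, the victim is chosen as the highest
node id on the loop, and "aborting" a victim just removes its outgoing
wait edges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // node -> nodes it waits for

// Per-search state; loosely inspired by the idea of a detection context.
struct ToySearchContext {
  int start = -1;
  int victim = -1;  // -1 means "no victim chosen in this search"
  std::vector<bool> visited;

  void reset(int start_node, std::size_t node_count) {
    start = start_node;
    victim = -1;  // a stale victim from an earlier search must not leak
    visited.assign(node_count, false);
    visited[static_cast<std::size_t>(start_node)] = true;
  }
};

// Depth-first search for a loop that leads back to ctx.start.
bool find_loop(const Graph& g, int node, ToySearchContext& ctx) {
  for (int next : g[static_cast<std::size_t>(node)]) {
    const bool closes_loop = (next == ctx.start);
    if (!closes_loop) {
      if (ctx.visited[static_cast<std::size_t>(next)]) continue;
      ctx.visited[static_cast<std::size_t>(next)] = true;
    }
    if (closes_loop || find_loop(g, next, ctx)) {
      ctx.victim = std::max(ctx.victim, node);  // toy victim rule
      return true;
    }
  }
  return false;
}

// Repeats the search until no loop through `start` remains.
std::vector<int> break_all_loops(Graph& g, int start) {
  assert(start >= 0 && static_cast<std::size_t>(start) < g.size());
  std::vector<int> victims;
  ToySearchContext ctx;
  for (;;) {
    ctx.reset(start, g.size());  // the step the fix makes unconditional
    if (!find_loop(g, start, ctx)) break;
    victims.push_back(ctx.victim);
    g[static_cast<std::size_t>(ctx.victim)].clear();  // victim stops waiting
  }
  return victims;
}
```

If reset() were skipped, the second search would start with the first
search's victim and visited set still in place, which is the kind of stale
state the fix guards against.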
mysql-test/r/mdl_sync.result:
Added test for bug #50998 "Deadlock in MDL code during test
rqg_mdl_stability" as well as coverage for the case when
addition of statement waiting for metadata lock adds several
loops in the waiters graph and therefore several searches
for deadlock should be performed by MDL deadlock detector.
mysql-test/t/mdl_sync.test:
Added test for bug #50998 "Deadlock in MDL code during test
rqg_mdl_stability" as well as coverage for the case when
addition of statement waiting for metadata lock adds several
loops in the waiters graph and therefore several searches
for deadlock should be performed by MDL deadlock detector.
sql/mdl.cc:
Ensure that when the MDL deadlock detector has to perform
several searches for a deadlock, because several loops in
the waiters graph are possible, we reset
Deadlock_detection_context before performing each search.
Failure to do so broke an assumption in the code responsible
for choosing a victim: that if Deadlock_detection_context::victim
is set, we also hold a read lock on m_waiting_for_lock for this
context. As a result this lock could have been unlocked more
times than it was acquired, which corrupted the rwlock's state
(no one was able to acquire a write lock on it anymore).
We found that there are some tests that are not cleaning
up properly:
1. rpl_tmp_table_and_DDL
2. rpl_do_grant
3. rpl_sync
For #1 and #2 we found that the slave would not, for some
cases, replicate all the instructions the master processed
in the cleanup section. We fix these by deploying some
synchronization commands in the test cases so that slave
processes all clean up instructions.
As for #3, this is tracked as part of another bug
(BUG@50442).
The problem was that in mysql-trunk the ER() macro now depends on current_thd,
and the innodb monitor thread has no binding to that thd object. This caused
the crash because of a bad dereference.
The solution was to add a new macro which takes the thd as an argument (which
the innodb thread uses for the call).
(Updated according to reviewers comments, i.e. added ER_THD_OR_DEFAULT and
moved test to suite parts.)
mysql-test/suite/parts/r/partition_innodb_status_file.result:
Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table
New test result file
mysql-test/suite/parts/t/partition_innodb_status_file-master.opt:
Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table
New test opt file
mysql-test/suite/parts/t/partition_innodb_status_file.test:
Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table
New test.
Note that the innodb monitor thread only runs every 15 seconds, so this
test will take at least 15 seconds, so I have moved it to the parts suite.
sql/sql_table.cc:
Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table
Using thd safe ER macro.
sql/unireg.h:
Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table
Added ER macros for use with specified thd pointer.
As part of BUG@39934 fix, the public:
- THD::current_stmt_binlog_row_based
variable had been removed and replaced by a private variable:
- THD::current_stmt_binlog_format.
THD was refactored and some modifiers and accessors were
implemented for the new variable.
However, due to a bad merge, the
THD::current_stmt_binlog_row_based variable is back as a public
member of THD. This in itself is already potentially
harmful. What's even worse is that while merging some more
patches and resolving conflicts, the variable started being used
again, which is obviously wrong.
To fix this we:
1. remove the extraneous variable from sql_class.h
2. revert a bad merge for BUG#49132
3. merge BUG#49132 properly again (actually, making use of the
cset used to merge the original patch to mysql-pe).