Analysis:
--------
Certain queries using intrinsic temporary tables may fail due to
name clashes in the file name for the temporary table when the
'temp-pool' enabled.
'temp-pool' tries to reduce the number of different filenames used for
temp tables by allocating them from small pool in order to avoid
problems in the Linux kernel by using a three part filename:
<tmp_file_prefix>_<pid>_<temp_pool_slot_num>.
The bit corresponding to the temp_pool_slot_num is set in the bit
map maintained for the temp-pool when it used for the file name.
It is cleared after the temp table is deleted for re-use.
The 'create_tmp_table()' function call under error condition
tries to clear the same bit twice by calling 'free_tmp_table()'
and 'bitmap_lock_clear_bit()'. 'free_tmp_table()' does a delete
of the table/file and clears the bit by calling the same function
'bitmap_lock_clear_bit()'.
The issue reported can be triggered under the timing window mentioned
below for an error condition while creating the temp table:
a) THD1: Due to an error clears the temp pool slot number used by it
by calling 'free_tmp_table'.
b) THD2: In the process of creating the temp table by using an unused
slot number in the bit map.
c) THD1: Clears the slot number used THD2 by calling
'bitmap_lock_clear_bit()' after completing the call 'free_tmp_table'.
d) THD3: Uses the slot number used the THD2 since it is freed by THD1.
When it tries to create the temp file using that slot number,
an error is reported since it is currently in use by THD2.
[The error: Error 'Can't create/write to file
'/tmp/#sql_277e_0.MYD' (Errcode: 17)']
Another issue which may occur in 5.6 and trunk is that:
When the open temporary table fails after its creation(due to ulimit
or OOM error), the file is not deleted. Thus further attempts to use
the same slot number in the 'temp-pool' results in failure.
Fix:
---
a) Under the error condition calling the 'bitmap_lock_clear_bit()'
function to clear the bit is unnecessary since 'free_tmp_table()'
deletes the table/file and clears the bit. Hence removed the
redundant call 'bitmap_lock_clear_bit()' in 'create_tmp_table()'
This prevents the timing window under which the issue reported
can be seen.
b) If open of the temporary table fails, then the file is deleted
thus allowing the temp-pool slot number to be utilized for the
subsequent temporary table creation.
c) Also if the attempt to create temp table fails since it already
exists, the temp-pool slot for it is marked as used, to avoid
the problem from re-appearing.
Problematic query:
insert ignore into `t1_federated` (`c1`) select `c1` from `t1_local` a
where not exists (select 1 from `t1_federated` b where a.c1 = b.c1);
When this query is killed in another connection it could lead to crash.
The problem is follwing:
An attempt to obtain table statistics for subselect table in killed query
fails with an error. So JOIN::optimize() for subquery is failed but
it does not prevent further subquery evaluation.
At the first subquery execution JOIN::optimize() is called
(see subselect_single_select_engine::exec()) and fails with
an error. 'executed' flag is set to TRUE and it prevents
further subquery evaluation. At the second call
JOIN::optimize() does not happen as 'JOIN::optimized' is TRUE
and in case of uncacheable subquery the 'executed' flag is set
to FALSE before subquery evaluation. So we loose 'optimize stage'
error indication (see subselect_single_select_engine::exec()).
In other words 'executed' flag is used for two purposes, for
error indication at JOIN::optimize() stage and for an
indication of subquery execution. And it seems it's wrong
as the flag could be reset.
mysql-test/r/error_simulation.result:
test case
mysql-test/t/error_simulation.test:
test case
sql/item_subselect.cc:
added new flag subselect_single_select_engine::optimize_error
which is used for error detection which could happen at optimize
stage.
sql/item_subselect.h:
added new flag subselect_single_select_engine::optimize_error
sql/sql_select.cc:
test case
Problematic query:
insert ignore into `t1_federated` (`c1`) select `c1` from `t1_local` a
where not exists (select 1 from `t1_federated` b where a.c1 = b.c1);
When this query is killed in another connection it could lead to crash.
The problem is follwing:
An attempt to obtain table statistics for subselect table in killed query
fails with an error. So JOIN::optimize() for subquery is failed but
it does not prevent further subquery evaluation.
At the first subquery execution JOIN::optimize() is called
(see subselect_single_select_engine::exec()) and fails with
an error. 'executed' flag is set to TRUE and it prevents
further subquery evaluation. At the second call
JOIN::optimize() does not happen as 'JOIN::optimized' is TRUE
and in case of uncacheable subquery the 'executed' flag is set
to FALSE before subquery evaluation. So we loose 'optimize stage'
error indication (see subselect_single_select_engine::exec()).
In other words 'executed' flag is used for two purposes, for
error indication at JOIN::optimize() stage and for an
indication of subquery execution. And it seems it's wrong
as the flag could be reset.
require O(#scans) memory
When an index merge operation was restarted, it would
re-allocate the Unique object controlling the duplicate row
ID elimination. Fixed by making the Unique object a member
of QUICK_INDEX_MERGE_SELECT and thus reusing it throughout
the lifetime of this object.
require O(#scans) memory
When an index merge operation was restarted, it would
re-allocate the Unique object controlling the duplicate row
ID elimination. Fixed by making the Unique object a member
of QUICK_INDEX_MERGE_SELECT and thus reusing it throughout
the lifetime of this object.
Item_hex_string::Item_hex_string
The status of memory allocation in the Lex_input_stream (called
from the Parser_state constructor) was not checked which led to
a parser crash in case of the out-of-memory error.
The solution is to introduce new init() member function in
Parser_state and Lex_input_stream so that status of memory
allocation can be returned to the caller.
mysql-test/r/error_simulation.result:
Added a test case for bug #42064.
mysql-test/t/error_simulation.test:
Added a test case for bug #42064.
mysys/my_alloc.c:
Added error injection code for the regression test.
mysys/my_malloc.c:
Added error injection code for the regression test.
mysys/safemalloc.c:
Added error injection code for the regression test.
sql/event_data_objects.cc:
Use the new init() member function of Parser_state and check
its return value to handle memory allocation failures.
sql/mysqld.cc:
Added error injection code for the regression test.
sql/sp.cc:
Use the new init() member function of Parser_state and check
its return value to handle memory allocation failures.
sql/sql_lex.cc:
Moved memory allocation from constructor to the separate init()
member function.
Added error injection code for the regression test.
sql/sql_lex.h:
Moved memory allocation from constructor to the separate init()
member function.
sql/sql_parse.cc:
Use the new init() member function of Parser_state and check
its return value to handle memory allocation failures.
sql/sql_partition.cc:
Use the new init() member function of Parser_state and check
its return value to handle memory allocation failures.
sql/sql_prepare.cc:
Use the new init() member function of Parser_state and check
its return value to handle memory allocation failures.
sql/sql_trigger.cc:
Use the new init() member function of Parser_state and check
its return value to handle memory allocation failures.
sql/sql_view.cc:
Use the new init() member function of Parser_state and check
its return value to handle memory allocation failures..
sql/thr_malloc.cc:
Added error injection code for the regression test.
Item_hex_string::Item_hex_string
The status of memory allocation in the Lex_input_stream (called
from the Parser_state constructor) was not checked which led to
a parser crash in case of the out-of-memory error.
The solution is to introduce new init() member function in
Parser_state and Lex_input_stream so that status of memory
allocation can be returned to the caller.
Problem: ALTER TABLE ADD INDEX may lead to table copying if there's
numeric field(s) with non-default display width modificator specified.
Fix: compare numeric field's storage lenghts when we decide whether
they can be considered 'equal' for table alteration purposes.
mysql-test/r/error_simulation.result:
Fix for bug#50946: fast index creation still seems to copy the table
- test result.
mysql-test/t/error_simulation.test:
Fix for bug#50946: fast index creation still seems to copy the table
- test case.
sql/field.cc:
Fix for bug#50946: fast index creation still seems to copy the table
- check numeric field's pack lengths instead of it's display lenghts
comparing fields equality for table alteration purposes.
sql/sql_table.cc:
Fix for bug#50946: fast index creation still seems to copy the table
- check compare_tables() result for testing purposes.
Problem: ALTER TABLE ADD INDEX may lead to table copying if there's
numeric field(s) with non-default display width modificator specified.
Fix: compare numeric field's storage lenghts when we decide whether
they can be considered 'equal' for table alteration purposes.
- avoid restart
- keep all in one file
- fix --check-testcase
BitKeeper/deleted/.del-error_simulation-master.opt:
Delete: mysql-test/t/error_simulation-master.opt
mysql-test/r/error_simulation.result:
Update result file
mysql-test/t/error_simulation.test:
Dynamically set debug flag for session only
Post merge fix.
mysql-test/t/error_simulation.test:
Post merge fix.
mysql-test/r/subselect.result:
Post merge fix.
mysql-test/r/error_simulation.result:
Post merge fix.
sql/item.cc:
Post merge fix.
a temporary table has grown out of heap memory reserved for it and
the remaining disk space is not big enough to store the table as
a MyISAM table.
The crash happens because the function create_myisam_from_heap
does not handle safely the mem_root structure associated
with the converted table in the case when an error has occurred.
sql/sql_select.cc:
Fixed bug #28449: a crash may happen at some rare conditions when
a temporary table has grown out of heap memory reserved for it and
the remaining disk space is not big enough to store the table as
a MyISAM table.
The crash happens because the function create_myisam_from_heap
does not handle safely the mem_root structure associated
with the converted table in the case when an error has occurred.
As it's hard to create a sitiation that would throw an error
a special code has been added that raises an error for a newly
created test called error_simulation.
mysql-test/r/error_simulation.result:
New BitKeeper file ``mysql-test/r/error_simulation.result''
Added a test case for bug #28449.
mysql-test/t/error_simulation-master.opt:
New BitKeeper file ``mysql-test/t/error_simulation-master.opt''
mysql-test/t/error_simulation.test:
New BitKeeper file ``mysql-test/t/error_simulation.test''
Added a test case for bug #28449.
a temporary table has grown out of heap memory reserved for it and
the remaining disk space is not big enough to store the table as
a MyISAM table.
The crash happens because the function create_myisam_from_heap
does not handle safely the mem_root structure associated
with the converted table in the case when an error has occurred.