Backport from mysql-5.5 to mysql-5.1
Bug #19612819 : FILESORT: ASSERTION FAILED: POS->FIELD != 0 || POS->ITEM != 0
Problem:
While getting the temp table field for a REF_ITEM,
make_sortorder uses the real_item. As a result, the
server later fails with an assert.
Solution:
Do not use real_item to get the temp table field.
Instead use the REF_ITEM itself, as temp table fields
are created for the REF_ITEM, not the real_item.
JOIN::cur_dups_producing_tables was not maintained correctly in
the cases of greedy optimization (search_depth < n_tables).
Moved it to POSITION structure where it will be maintained automatically.
Removed POSITION::prefix_dups_producing_tables since its value can now
be calculated.
Show total execution time (r_total_time_ms) for various parts of the
query:
1. time spent in SELECTs
2. time spent reading rows from storage engines
#2 currently gets the data from P_S.
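A minimal sketch of how a per-node timer could accumulate a value such as
r_total_time_ms. The class TimeTracker and its methods are hypothetical and
are not the server's API; std::chrono is used here only for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: accumulates wall-clock time over repeated
// start/stop pairs, e.g. one pair per execution of a SELECT.
class TimeTracker {
  using Clock = std::chrono::steady_clock;
  Clock::time_point started_{};
  Clock::duration total_{Clock::duration::zero()};
  std::uint64_t loops_ = 0;

 public:
  void start() { started_ = Clock::now(); }
  void stop() {
    total_ += Clock::now() - started_;
    ++loops_;
  }
  // Number of completed start/stop pairs.
  std::uint64_t loops() const { return loops_; }
  // Accumulated time in milliseconds, as a floating-point value.
  double total_time_ms() const {
    return std::chrono::duration<double, std::milli>(total_).count();
  }
};
```

Keeping the counter next to the accumulated time lets a caller derive an
average per execution as well as the total.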
This bug manifests due to wrong computation and evaluation of
keyinfo->key_length. The issues were:
* Using table->file->max_key_length() as an absolute value that must not be
reached for a key, while it represents the maximum number of bytes
possible for a table key.
* Incorrectly computing the keyinfo->key_length size during
KEY_PART_INFO creation. The metadata information regarding the key,
such as the field length (for strings), was added twice.
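The two points above can be sketched as follows. KeyPart, key_length() and
key_fits() are hypothetical simplifications, not the server's KEY_PART_INFO
or handler API; the sketch only shows counting each metadata byte once and
treating the maximum as an inclusive bound.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified key part.
struct KeyPart {
  std::size_t data_length;   // bytes of field data stored in the key
  std::size_t length_bytes;  // length prefix for strings (0 if none)
  bool nullable;             // one extra byte for the NULL indicator
};

// Each metadata component is counted exactly once per key part.
std::size_t key_length(const std::vector<KeyPart>& parts) {
  std::size_t total = 0;
  for (const KeyPart& p : parts)
    total += p.data_length + p.length_bytes + (p.nullable ? 1 : 0);
  return total;
}

// The maximum is the largest length a key may have, so a key of exactly
// that length is still valid; only a longer key is rejected.
bool key_fits(const std::vector<KeyPart>& parts, std::size_t max_key_length) {
  return key_length(parts) <= max_key_length;
}
```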
Temporary table count fix. The number of temporary tables was incremented
even when the table was not actually created (when do_not_open was passed
as TRUE to create_tmp_table).
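A minimal sketch of the intended counting rule. Status, TmpTable and
create_tmp_table_sketch() are hypothetical names for illustration; the real
create_tmp_table() takes many more arguments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical status counters and result type.
struct Status { std::uint64_t created_tmp_tables = 0; };
struct TmpTable { bool opened = false; };

// Count the table only when it is actually created and opened.
TmpTable create_tmp_table_sketch(Status& status, bool do_not_open) {
  TmpTable table;
  if (do_not_open)
    return table;  // definition only: nothing created, nothing counted
  table.opened = true;
  ++status.created_tmp_tables;
  return table;
}
```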
Problem:
find_order_by_list does not update the address of order_item
correctly after resolving.
Solution:
Change the ref_by address for an order_by field, if it is a
SUM_FUNC_ITEM, to the address of the field present in
all_fields.
Redefine FT_KEYPART in a way that it does not conflict with Hash Join.
Hash join stores field->field_index in KEYUSE::keypart, so we must
use a value of FT_KEYPART that's greater than MAX_FIELDS.
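The constraint can be sketched with a compile-time check. The numeric values
below are assumptions for illustration and may differ from the server's
definitions; KeyUse is a simplified stand-in for KEYUSE.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration only.
constexpr std::uint32_t MAX_FIELDS = 4096;
constexpr std::uint32_t FT_KEYPART = MAX_FIELDS + 10;

// Hash join may store any field index below MAX_FIELDS in the keypart
// slot, so the full-text marker has to lie outside that range.
static_assert(FT_KEYPART > MAX_FIELDS,
              "FT_KEYPART must not collide with a field index");

// Simplified stand-in for KEYUSE: only the member discussed here.
struct KeyUse { std::uint32_t keypart; };

bool is_fulltext(const KeyUse& k) { return k.keypart == FT_KEYPART; }
```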
- Fixed compiler warnings
- Added include/wait_for_binlog_checkpoint.inc, as suggested by JonasO
- Updated 'build-tags' to work with git (Patch by Serg)
- The code that tested if
WHERE expr=value AND expr=const
can be rewritten to:
WHERE const=value AND expr=const
was incomplete in case of STRING_RESULT.
- Moving the test into a new function, to reduce duplicate code.
generate_derived_keys_for_table() did not work correctly in the case where
- it had a potential index on a derived table
- however, TABLE::check_tmp_key() would disallow creation of this index
after looking at its future key parts (because of the key parts exceeding
max. index length)
- the code would leave a KEYUSE structure that refers to a nonexistent index.
Depending on further optimizer calculations, this could cause a crash.
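One way such a dangling reference could be avoided is sketched below. The
message does not describe the actual fix, so this is only an illustration;
KeyUse, NO_KEY and assign_derived_key() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins; not the server's structures.
constexpr int NO_KEY = -1;
struct KeyUse {
  std::size_t part_length;  // bytes this key part would add to the index
  int key;                  // index number this entry refers to, or NO_KEY
};

// Assign key_no to the entries only if the whole candidate index is
// acceptable. On rejection every entry is reset to NO_KEY, so later
// optimizer stages cannot follow a reference to an index that was
// never created.
bool assign_derived_key(std::vector<KeyUse>& uses, int key_no,
                        std::size_t max_key_length) {
  std::size_t total = 0;
  for (const KeyUse& u : uses) total += u.part_length;
  if (total > max_key_length) {
    for (KeyUse& u : uses) u.key = NO_KEY;
    return false;
  }
  for (KeyUse& u : uses) u.key = key_no;
  return true;
}
```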
- Changed 0x%lx -> %p
array.c:
- Static (preallocated) buffer can now be anywhere
my_sys.h
- Define MY_INIT_BUFFER_USED
sql_delete.cc & sql_lex.cc
- Use memroot when allocating classes (avoids call to current_thd)
sql_explain.h:
- Use preallocated buffers
sql_explain.cc:
- Use preallocated buffers and memroot
sql_select.cc:
- Use multi_alloc_root() instead of many alloc_root()
- Update calls to Explain
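The preallocated-buffer idea can be sketched as an array that starts in
caller-supplied storage and moves to the heap only when that storage is
full. SmallIntArray is a hypothetical illustration, not the array.c
implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical int array backed by a caller-supplied buffer at first.
class SmallIntArray {
  int* data_;
  std::size_t size_ = 0;
  std::size_t capacity_;
  bool on_heap_ = false;  // false while the preallocated buffer is in use

 public:
  SmallIntArray(int* prealloc, std::size_t prealloc_count)
      : data_(prealloc), capacity_(prealloc_count) {}
  ~SmallIntArray() { if (on_heap_) std::free(data_); }
  SmallIntArray(const SmallIntArray&) = delete;
  SmallIntArray& operator=(const SmallIntArray&) = delete;

  bool push(int v) {
    if (size_ == capacity_) {
      std::size_t new_cap = capacity_ ? capacity_ * 2 : 4;
      // The preallocated buffer is not heap memory, so its contents are
      // copied out instead of being passed to realloc().
      void* p = on_heap_ ? std::realloc(data_, new_cap * sizeof(int))
                         : std::malloc(new_cap * sizeof(int));
      if (!p) return false;
      if (!on_heap_ && size_) std::memcpy(p, data_, size_ * sizeof(int));
      data_ = static_cast<int*>(p);
      capacity_ = new_cap;
      on_heap_ = true;
    }
    data_[size_++] = v;
    return true;
  }
  std::size_t size() const { return size_; }
  int at(std::size_t i) const { return data_[i]; }
  bool uses_prealloc() const { return !on_heap_; }
};
```

Small arrays then cost no allocation at all, and the buffer can live
wherever the caller finds convenient, for example on the stack or inside
another object.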
The GEOMETRY field metadata is stored in the FRM file.
The SRID for a spatial column can now be stored; it was added to the CREATE TABLE syntax,
so the AddGeometryData() stored procedure is now possible. Added a script that creates the required Add/DropGeometryColumn stored procedures.
The problem was a race between the debug code in the server and the SHOW
EXPLAIN FOR in the test case.
The test case would wait for a query to reach the first point of interest
(inside dbug_serve_apcs()), then send it a SHOW EXPLAIN FOR, then wait for the
query to reach the next point of interest. However, the second wait was
insufficient. It was possible for the second wait to complete immediately,
causing both the first and the second SHOW EXPLAIN FOR to hit the same
invocation of dbug_serve_apcs(). Then a later invocation would miss its
intended SHOW EXPLAIN FOR and hang, and the test case would eventually time
out.
The fix is to make sure that the second wait cannot trigger during the first
invocation of dbug_serve_apcs(). We do this by clearing the thd_proc_info
(that the wait is looking for) before processing the SHOW EXPLAIN FOR; this
way the second wait can not start until the thd_proc_info from the first
invocation has been cleared.
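The handshake can be modelled single-threaded to show why clearing the
marker first closes the window. Session, TRAP and the functions below are
hypothetical; in particular the marker string is an assumption, not
necessarily the value the test waits for.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the state the test's wait condition polls.
struct Session {
  std::string proc_info;  // stands in for thd_proc_info
  int invocation = 0;     // which call of the debug hook we are in
};

const char* const TRAP = "show_explain_trap";  // assumed marker value

// Debug hook entered by the query at each point of interest.
void enter_trap(Session& s) {
  ++s.invocation;
  s.proc_info = TRAP;
}

// The test's wait condition: true once the query sits in the trap.
bool wait_satisfied(const Session& s) { return s.proc_info == TRAP; }

// Serve one SHOW EXPLAIN request. With clear_first set, the marker is
// removed before the request is processed, so a following wait cannot
// succeed until the query enters the trap again.
int serve_request(Session& s, bool clear_first) {
  if (clear_first) s.proc_info.clear();
  return s.invocation;  // the invocation that handled this request
}
```

Without clear_first, wait_satisfied() stays true after the first request is
served, which is the stale state that let the second SHOW EXPLAIN FOR land
in the first invocation.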
If we didn't do this, the SJ-Materialization table would appear to
the EXPLAIN JSON code as having different keyparts than it actually
has. This caused unpredictable content in "used_key_parts".
- Switch Explain data structure from "flat" representation of
SJ-Materialization into nested one.
- Update functions that print tabular output to operate on the
nested structure.
- Add function to generate JSON output.
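A minimal sketch of the nested representation and a recursive JSON printer.
ExplainNode and to_json() are hypothetical, the key names are assumptions,
and no string escaping is done; the server's Explain classes are much
richer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified explain node.
struct ExplainNode {
  std::string table_name;
  // Tables of an SJ-Materialization nest hang off the node that reads
  // the materialized result, instead of following it in a flat list.
  std::vector<ExplainNode> materialized;
};

// Emit compact JSON by recursing into the nested children.
std::string to_json(const ExplainNode& n) {
  std::string out = "{\"table_name\": \"" + n.table_name + "\"";
  if (!n.materialized.empty()) {
    out += ", \"materialized\": [";
    for (std::size_t i = 0; i < n.materialized.size(); ++i) {
      if (i) out += ", ";
      out += to_json(n.materialized[i]);
    }
    out += "]";
  }
  return out + "}";
}
```

With a nested structure the JSON printer and the tabular printer both walk
the same tree, and the tabular one simply flattens it while visiting.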
- Basic support for JOIN buffering
- The output is not polished but catches the main point:
tab->select_cond and tab->cache_select->cond are printed separately.
- Hash join support is poor still.
- Also fixed indentation in JOIN_TAB::save_explain_data
Analysis:
--------
Certain queries using intrinsic temporary tables may fail due to
name clashes in the file name for the temporary table when
'temp-pool' is enabled.
'temp-pool' tries to reduce the number of different filenames used for
temp tables by allocating them from a small pool in order to avoid
problems in the Linux kernel by using a three part filename:
<tmp_file_prefix>_<pid>_<temp_pool_slot_num>.
The bit corresponding to the temp_pool_slot_num is set in the bit
map maintained for the temp-pool when it is used for the file name.
It is cleared after the temp table is deleted, for re-use.
The 'create_tmp_table()' function call under error condition
tries to clear the same bit twice by calling 'free_tmp_table()'
and 'bitmap_lock_clear_bit()'. 'free_tmp_table()' does a delete
of the table/file and clears the bit by calling the same function
'bitmap_lock_clear_bit()'.
The issue reported can be triggered under the timing window mentioned
below for an error condition while creating the temp table:
a) THD1: Due to an error clears the temp pool slot number used by it
by calling 'free_tmp_table'.
b) THD2: In the process of creating the temp table by using an unused
slot number in the bit map.
c) THD1: Clears the slot number used by THD2 by calling
'bitmap_lock_clear_bit()' after completing the call 'free_tmp_table'.
d) THD3: Uses the slot number used by THD2 since it is freed by THD1.
When it tries to create the temp file using that slot number,
an error is reported since it is currently in use by THD2.
[The error: Error 'Can't create/write to file
'/tmp/#sql_277e_0.MYD' (Errcode: 17)']
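Steps a) to d) can be replayed deterministically on a small model of the
bitmap. TempPool and replay_double_clear() are hypothetical; locking is
omitted because the steps are executed in a fixed order.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical model of the temp-pool bitmap; a set bit means "in use".
class TempPool {
  std::bitset<8> used_;

 public:
  // Take the lowest free slot, or return -1 if the pool is exhausted.
  int acquire() {
    for (std::size_t i = 0; i < used_.size(); ++i)
      if (!used_.test(i)) {
        used_.set(i);
        return static_cast<int>(i);
      }
    return -1;
  }
  void release(int slot) { used_.reset(static_cast<std::size_t>(slot)); }
};

// Returns true when THD3 is handed the slot that THD2 still holds,
// which is the reported name clash.
bool replay_double_clear() {
  TempPool pool;
  int thd1 = pool.acquire();
  pool.release(thd1);         // a) free_tmp_table() clears the bit
  int thd2 = pool.acquire();  // b) THD2 takes the freed slot
  pool.release(thd1);         // c) redundant second clear by THD1
  int thd3 = pool.acquire();  // d) THD3 is given THD2's slot
  return thd3 == thd2;
}
```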
Another issue which may occur in 5.6 and trunk is that:
When the open of the temporary table fails after its creation (due to ulimit
or OOM error), the file is not deleted. Thus further attempts to use
the same slot number in the 'temp-pool' results in failure.
Fix:
---
a) Under the error condition calling the 'bitmap_lock_clear_bit()'
function to clear the bit is unnecessary since 'free_tmp_table()'
deletes the table/file and clears the bit. Hence removed the
redundant call 'bitmap_lock_clear_bit()' in 'create_tmp_table()'.
This prevents the timing window under which the issue reported
can be seen.
b) If open of the temporary table fails, then the file is deleted
thus allowing the temp-pool slot number to be utilized for the
subsequent temporary table creation.
c) Also if the attempt to create temp table fails since it already
exists, the temp-pool slot for it is marked as used, to avoid
the problem from re-appearing.
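The three parts of the fix can be sketched on a small model. PoolState,
create_tmp_file() and free_tmp_file() are hypothetical, and existing_files
stands in for files already present in the temporary directory.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical model of the pool bitmap plus the files on disk.
struct PoolState {
  std::bitset<8> used;
  std::set<int> existing_files;  // keyed by slot number
};

// Try to create a temp file in the lowest free slot. If a file for that
// slot already exists, the call fails but the bit stays set (fix c), so
// the slot is not handed out again. Returns the slot, or -1 on failure.
int create_tmp_file(PoolState& st) {
  std::size_t i = 0;
  while (i < st.used.size() && st.used.test(i)) ++i;
  if (i == st.used.size()) return -1;  // pool exhausted
  st.used.set(i);
  int slot = static_cast<int>(i);
  if (st.existing_files.count(slot)) return -1;  // clash: keep slot reserved
  st.existing_files.insert(slot);
  return slot;
}

// Error path after creation (fixes a and b): delete the file and clear
// the bit exactly once.
void free_tmp_file(PoolState& st, int slot) {
  st.existing_files.erase(slot);
  st.used.reset(static_cast<std::size_t>(slot));
}
```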