Analysis:
For some of the re-executions of the correlated subquery the
where clause is false. In these cases the execution of the
subquery detects that it must generate a NULL row because of
implicit grouping. In this case the subquery execution reaches
the following code in do_select():
while ((table= li++))
mark_as_null_row(table->table);
This code marks all rows in the table as complete NULL rows.
In the example, when evaluating the field t2.f10 for the second
row, all bits of Field::null_ptr[0] are set by the previous call
to mark_as_null_row(). Then the call to Field::is_null()
returns true, resulting in a NULL for the MAX function.
Thus the lines above are not suitable for subquery re-execution
because mark_as_null_row() changes the NULL bits of each table
field, and there is no logic to restore these fields.
Solution:
The call to mark_as_null_row() was added by the fix for bug
lp:613029. Therefore removing the fix for lp:613029 corrects
this wrong result. At the same time the test for lp:613029
behaves correctly because the changes of MWL#89 result in a
different execution path where:
- the constant subquery is evaluated via JOIN::exec_const_cond
- detecting that it has an empty result triggers the branch
if (zero_result_cause)
return_zero_rows()
- return_zero_rows() calls mark_as_null_row().
The bitmap of used tables must be evaluated for the select list of every
materialized derived table / view and saved in a dedicated field.
This is also applied to materialized subqueries.
The value of THD::used tables should be re-evaluated after merges
of views and derived tables into the main query.
Now it's done in the function SELECT_LEX::update_used_tables.
The re-evaluation of the 'used_table' bitmaps for the items
in HAVING, GROUP BY and ORDER BY clauses has been added as well.
The bug was caused by an incorrect code of the function
Item_direct_view_ref::replace_equal_field introduced in the
patch for bugs 717577, 724942. The function erroneously
returned the wrapped field instead of the Item_direct_view_ref
object itself in the cases when no replacement happened.
The bug masked two other minor bugs that could result in not
quite correct output of the EXPLAIN command for some queries.
They were fixed in the patch as well.
The function generate_derived_keys_for_table incorrectly handled
the cases when a materialized view or derived table could be accessed
by different keys on the same fields if these keys depended on the
same tables.
Analysis:
This bug consists of two related problems that are
result of too early evaluation of single-row subqueries
during the optimization phase of the outer query.
Several optimizer code paths try to evaluate single-row
subqueries in order to produce a constant and use that
constant for further optimzation.
When the execution of the subquery peforms destructive
changes to the representation of the subquery, and these
changes are not anticipated by the subsequent optimization
phases of the outer query, we tipically get a crash or
failed assert.
Specifically, in this bug the inner-most suqbuery with
DISTINCT triggers a substitution of the original JOIN
object by a single-table JOIN object with a temp table
needed to perform the DISTINCT operation (created by
JOIN::make_simple_join).
This substitution breaks EXPLAIN because:
a) in the first example JOIN::cleanup no longer can
reach the original table of the innermost subquery, and
close all indexes, and
b) in this second test query, EXPLAIN attempts to print
the name of the internal temp table, and crashes because
the temp table has no name (NULL pointer instead).
Solution:
a) fully disable subquery evaluation during optimization
in all cases - both for constant propagation and range
optimization, and
b) change JOIN::join_free() to perform cleanup irrespective
of EXPLAIN or not.
- JOIN::prepare would have set JOIN::table_count to incorrect value (bad merge of MWL 106)
- optimize_keyuse() would use table-bit as table number
(the change in optimize_keyuse is also the reason for query plan changes. Not
expected to have much effect because only handles cases of no index statistics)
- st_select_lex::register_dependency_item() ignored the fact that some of the
selects on the dependency paths could have been merged to their parents (because they
were mergeable VIEWs)
- Undo the incorrect fix in Item_subselect::recalc_used_tables(): do not call
fix_after_pullout() for Item_subselect::Ref_to_outside members.
If the expression for a derived table contained a clause LIMIT 0
SELECT from such derived table incorrectly returned a non-empty set.
Fixed by ensuring JOIN::do_send_rows to be updated after the call
of st_select_lex_unit::set_limit that sets the value of
JOIN::unit->select_limit_cnt.
Due to this bug in the function generate_derived_keys_for_table some
key definitions to access materialized derived tables or materialized
views were constructed with invalid info for their key parts.
This could make the server crash when it optimized queries using
materialized derived tables or materialized views.
- The crash was because a NOT NULL table column inside the subquery was considered NULLable
because the code thought it was on the inner side of an outer join nest.
- Fixed by making correct distinction between tables inside outer join nests and inside semi-join nests.
Split status variable Rows_read to Rows_read and Rows_tmp_read so that one can see how much real data is read.
Same was done with with Handler_update and Handler_write.
Fixed bug in MEMORY tables where some variables was counted twice.
Added new internal handler call 'ha_close()' to have one place to gather statistics.
Fixed bug where thd->open_options was set to wrong value when doing admin_recreate_table()
mysql-test/r/status.result:
Updated test results and added new tests
mysql-test/r/status_user.result:
Udated test results
mysql-test/t/status.test:
Added new test for temporary table status variables
sql/ha_partition.cc:
Changed to call ha_close() instead of close()
sql/handler.cc:
Added internal_tmp_table variable for easy checking of temporary tables.
Added new internal handler call 'ha_close()' to have one place to gather statistics.
Gather statistics for internal temporary tables.
sql/handler.h:
Added handler variables internal_tmp_table, rows_tmp_read.
Split function update_index_statistics() to two.
Added ha_update_tmp_row() for faster tmp table handling with more statistics.
sql/item_sum.cc:
ha_write_row() -> ha_write_tmp_row()
sql/multi_range_read.cc:
close() -> ha_close()
sql/mysqld.cc:
New status variables: Rows_tmp_read, Handler_tmp_update and Handler_tmp_write
sql/opt_range.cc:
close() -> ha_close()
sql/sql_base.cc:
close() -> ha_close()
sql/sql_class.cc:
Added handling of rows_tmp_read
sql/sql_class.h:
Added new satistics variables.
rows_read++ -> update_rows_read() to be able to correctly count reads to internal temp tables.
Added handler::ha_update_tmp_row()
sql/sql_connect.cc:
Added comment
sql/sql_expression_cache.cc:
ha_write_row() -> ha_write_tmp_row()
sql/sql_select.cc:
close() -> ha_close()
ha_update_row() -> ha_update_tmp_row()
sql/sql_show.cc:
ha_write_row() -> ha_write_tmp_row()
sql/sql_table.cc:
Fixed bug where thd->open_options was set to wrong value when doing admin_recreate_table()
sql/sql_union.cc:
ha_write_row() -> ha_write_tmp_row()
sql/sql_update.cc:
ha_write_row() -> ha_write_tmp_row()
sql/table.cc:
close() -> ha_close()
storage/heap/ha_heap.cc:
Removed double counting of statistic variables.
close -> ha_close() to get tmp table statistics.
storage/maria/ha_maria.cc:
close -> ha_close() to get tmp table statistics.
The following were missing in the patch for mwl106:
- KEY_PART_INFO::fieldnr were not set for generated keys to access
tmp tables storing the rows of materialized derived tables/views
- TABLE_SHARE::column_bitmap_size was not set for tmp tables storing
the rows of materialized derived tables/views.
These could cause crashes or memory overwrite.
If a view/derived table is non-mergeable then the definition of the tmp table
to store the rows for it is created at the prepare stage. In this case if the
view definition uses outer joins and a view column belongs to an inner table
of one of them then the column should be considered as nullable independently
on nullability of the underlying column. If the underlying column happens to be
defined as non-nullable then the function create_tmp_field_from_item rather
than the function create_tmp_field_from_field should be employed to create
the definition of the interesting column in the tmp table.
mysql-test/r/join.result:
Test case for LP:798597
mysql-test/t/join.test:
Test case for LP:798597
sql/sql_select.cc:
In simplify_joins we reset table->maybe_null for outer join tables that can't ever be NULL.
This caused a conflict between the previously calculated items and the group_buffer against the fields
in the temporary table that are created as not null thanks to the optimization.
The fix is to correct the group by items to also be not_null so that they match the used fields and keys.
The function setup_tables should handle table_list elements for
semijoin materialized tables in a special way when executing
a prepared statement for the second time.
The patch for bugs 717577 and 724942 has missed to make adjustments for the
call item_equal->add_const(const_item, orig_field_item) in the function
check_simple_equality that builds multiple equality for a field and a constant.
As a result, when this field happens to be a view field and the corresponding
Item_field object F is wrapped in an Item_direct_view_ref object R the object
F is placed in the multiple equality instead of the object R.
A substitution of an equal item for F potentially can cause very serious
problems and in some cases can lead to crashes of the server.
- Added regression test with queries over the WORLD database.
- Discovered and fixed several bugs in the related cost calculation
functionality both in the semijoin and non-semijon subquery code.
- Added DBUG printing of the cost variables used to decide between
IN-EXISTS and MATERIALIZATION.
- move attempt to evaluate join->exec_const_cond() out of the "Extract constant part of each ON expression" loop
(it got there by mistake when merging).
Fixed reference to not initialized memory detected by valgrind
sql/sql_select.cc:
A bit better fix for tmp-table problem:
Use only dynamic_record format for group by and distinct.
storage/maria/ma_create.c:
DYNAMIC_RECORD format doesn't pack VARCHAR fields.
This change fixes a non-fatal uninitialized memory copy.
The function generate_derived_keys did not take into account the fact
that the last element in the array of keyuses could be just a barrier
element. In some cases it could lead to a crash of the server.
Also fixed a couple of other bugs in generate_derived_keys: the inner
loop in the body of if this function did not change the cycle variables
properly.
The reason for this is that BLOCK_RECORD format is not good when there is a lot of duplicated keys as it first writes the data (to get the row position) and
then writes the key (and thus checks for duplicates).
The code that added semi-join transformations missed checking
the state of the fixed flag for the items built with the
and_items function before calls of the fix_fields method.
This could lead to an abort failure when the first argument
of and_items() happened to be NULL.
'derived_merge':
- if the flag is off then all derived tables are materialized
- if the flag is on then mergeable derived tables are merged
'derived_with_keys':
- if the flag is off then no keys are created for derived tables
- if the flag is on then for any derived table a key to access
the derived table may be created.
Now by default both flags are on.
Later the default values for the flags will be off to comply with
the current behaviour of mysql-5.1.
Uncommented previously commented out test case from parts.partition_repair_myisam
after having added an explicit requirement to materialize the derived
table used in the test case.