greedy_search optimizer_search_depth=0
The algorithm inside restore_prev_nj_state failed to
properly update the counters within the NESTED_JOIN
tree. The counter was decremented each time a table in the
node was removed from the QEP, the correct thing to do being
only to decrement it when the last table in the child node
was removed from the plan. This lead to node counters
getting negative values and the plan thus appeared
impossible. An assertion caught this.
Fixed by not recursing up the tree unless the last table in
the join nest node is removed from the plan
WHERE predicates containing references to empty tables in a
subquery were handled incorrectly by the optimizer when
executing EXPLAIN. As a result, the optimizer could try to
evaluate such predicates rather than just stop with
"Impossible WHERE noticed after reading const tables" as
it would do in a non-subquery case. This led to valgrind
errors and crashes.
Fixed the code checking the above condition so that subqueries
are not excluded and hence are handled in the same way as top
level SELECTs.
The problem was in an incorrect debug assertion. The expression
used in the failing assertion states that when finding
references matching ORDER BY expressions, there can be only one
reference to a single table. But that does not make any sense,
all test cases for this bug are valid examples with multiple
identical WHERE expressions referencing the same table which
are also present in the ORDER BY list.
Fixed by removing the failing assertion. We also have to take
care of the 'found' counter so that we count multiple
references only once. We rely on this fact later in
eq_ref_table().
The problem is that we can not use make_cond_for_table().
This function relies on used_tables() condition
which is not set properly for subqueries.
As result subquery is not filtered out.
The fix is to use remove_eq_conds() function instead
of make_cond_for_table() func. 'remove_eq_conds()'
algorithm relies on const_item() value and it allows
to handle subqueries in right way.
The crash is the result of an attempt made by JOIN::optimize to evaluate
the WHERE condition when no records have been actually read.
The fix is to remove erroneous 'outer_join' variable check.
The crash happens because greedy_serach
can not determine best plan due to
wrong inner table dependences. These
dependences affects join table sorting
which performs before greedy_search starting.
In our case table which has real 'no dependences'
should be put on top of the list but it does not
happen as inner tables have no dependences as well.
The fix is to exclude RAND_TABLE_BIT mask from
condition which checks if table dependences
should be updated.
The problem is that when we make conditon for
grouped result const part of condition is cut off.
It happens because some parts of 'having' condition
which refer to outer join become const after
make_join_statistics. These parts may be lost
during further having condition transformation
in JOIN::exec. The fix is adding 'having'
condition check for const tables after
make_join_statistics is performed.
The crash happens because of discrepancy between values of
conts_tables and join->const_table_map(make_join_statisctics).
Calculation of conts_tables used condition with
HA_STATS_RECORDS_IS_EXACT flag check. Calculation of
join->const_table_map does not use this flag check.
In case of MERGE table without union with index
the table does not become const table and
thus join_read_const_table() is not called
for the table. join->const_table_map supposes
this table is const and later in make_join_select
this table is used for making&calculation const
condition. As table record buffer is not populated
it leads to crash.
The fix is adding a check if an engine supports
HA_STATS_RECORDS_IS_EXACT flag before updating
join->const_table_map.
SunStudio
SunStudio compilers of late warn about methods that might hide
methods in base classes due to the use of overloading combined
with overriding. SunStudio also warns about variables defined
in local socpe or method arguments that have the same name as
a member attribute of the class.
This patch renames methods that might hide base class methods,
to make it easier both for humans and compilers to see what is
actually called. It also renames variables in local scope.
The problem is that not all column names retrieved from a SELECT
statement can be used as view column names due to length and format
restrictions. The server failed to properly check the conformity
of those automatically generated column names before storing the
final view definition on disk.
Since columns retrieved from a SELECT statement can be anything
ranging from functions to constants values of any format and length,
the solution is to rewrite to a pre-defined format any names that
are not acceptable as a view column name.
The name is rewritten to "Name_exp_%u" where %u translates to the
position of the column. To avoid this conversion scheme, define
explict names for the view columns via the column_list clause.
Also, aliases are now only generated for top level statements.
consider clustered primary keys
Choosing a shortest index for the covering index scan,
the optimizer ignored the fact, that the clustered primary
key read involves whole table data.
The find_shortest_key function has been modified to
take into account that fact that a clustered PK has a
longest key of possible covering indices.
The problem is that cond->fix_fields(thd, 0) breaks
condition(cuts off 'having'). The reason of that is
that NULL valued Item pointer is present in the
middle of Item list and it breaks the Item processing
loop.
performance degradation.
Filesort + join cache combination is preferred to full index scan because it
is usually faster. But it's not the case when the index is clustered one.
Now test_if_skip_sort_order function prefers filesort only if index isn't
clustered.
The problem was in an incorrect debug assertion. The expression
used in the failing assertion states that when finding
references matching ORDER BY expressions, there can be only one
reference to a single table. But that does not make any sense,
all test cases for this bug are valid examples with multiple
identical WHERE expressions referencing the same table which
are also present in the ORDER BY list.
Fixed by removing the failing assertion. We also have to take
care of the 'found' counter so that we count multiple
references only once. We rely on this fact later in
eq_ref_table().
The problem is that during temporary table creation uneven bits
are not taken into account for hidden fields. It leads to incorrect
calculation&allocation of null bytes size for table record. And
if grouped value is null we set wrong bit for this value(see end_update()).
Fixed by adding separate calculation of uneven bit for hidden fields.
column is used for ORDER BY
Problem: filesort isn't meant for null length sort data
(e.g. char(0)), that leads to a server crash.
Fix: disregard sort order if sort data record length is 0 (nothing
to sort).
The problem becomes apparent only if HAVE_purify is undefined.
It related to the part of code placed in open_table_from_share() fuction
where we initialize record buffer only if HAVE_purify is enabled.
So in case of HAVE_purify=OFF record buffer is not initialized
on open table stage.
Next we read key, find NULL value and update appropriate null bit
but do not update record buffer. After that the record is stored
in the join cache(store_record_in_cache). For CHAR fields we
strip trailing spaces and in our case this procedure uses
uninitialized record buffer.
The fix is to skip stripping space procedure in case of null values
for CHAR fields(partially based on 6.0 JOIN_CACHE implementation).
Queries optimized with GROUP_MIN_MAX didn't cleanup KEYREAD
optimization properly. As a result subsequent queries may
return incomplete rows (fields are initialized to default
values).
Grouping by a subquery in a query with a distinct aggregate
function lead to a wrong result (wrong and unordered
grouping values).
There are two related problems:
1) The query like this:
SELECT (SELECT t1.a) aa, COUNT(DISTINCT b) c
FROM t1 GROUP BY aa
returned wrong result, because the outer reference "t1.a"
in the subquery was substituted with the Item_ref item.
The Item_ref item obtains data from the result_field object
that refreshes once after the end of each group. This data
is not applicable to filesort since filesort() doesn't care
about groups (and doesn't update result_field objects with
copy_fields() and so on). Also that data is not applicable
to group separation algorithm: end_send_group() checks every
record with test_if_group_changed() that evaluates Item_ref
items, but it refreshes those Item_ref-s only after the end
of group, that is a vicious circle and the grouped column
values in the output are shifted.
Fix: if
a) we grouping by a subquery and
b) that subquery has outer references to FROM list
of the grouping query,
then we substitute these outer references with
Item_direct_ref like references under aggregate
functions: Item_direct_ref obtains data directly
from the current record.
2) The query with a non-trivial grouping expression like:
SELECT (SELECT t1.a) aa, COUNT(DISTINCT b) c
FROM t1 GROUP BY aa+0
also returned wrong result, since JOIN::exec() substitutes
references to top-level aliases in SELECT list with Item_copy
caching items. Item_copy items have same refreshing policy
as Item_ref items, so the whole groping expression with
Item_copy inside returns wrong result in filesort() and
end_send_group().
Fix: include aliased items into GROUP BY item tree instead
of Item_ref references to them.
Fixed 2 problems :
1. test_if_order_by_key() was continuing on the primary key
as if it has a primary key suffix (as the secondary keys do).
This leads to crashes in ORDER BY <pk>,<pk>.
Fixed by not treating the primary key as the secondary one
and not depending on it being clustered with a primary key.
2. The cost calculation was trying to read the records
per key when operating on ORDER BYs that order on all of the
secondary key + some of the primary key.
This leads to crashes because of out-of-bounds array access.
Fixed by assuming we'll find 1 record per key in such cases.
subselect_single_select_engine::exec()
When a subquery doesn't need to be evaluated because
it returns only aggregate functions and these aggregates
can be calculated from the metadata about the table it
was not updating all the relevant members of the JOIN
structure to reflect that this is a constant query.
This caused problems to the enclosing subquery
('<> SOME' in the test case above) trying to read some
data about the tables.
Fixed by setting const_tables to the number of tables
when the SELECT is optimized away.
error in the query.
Fixes a leak after materializing a GROUP BY subquery to a
temp table when the subquery has a blob column in the SELECT
list.
Fixed by correctly destructing temporary buffers after doing
the conversion.
flush_cached_records() was not correctly checking for errors after calling
Item::val_xxx() methods. The expressions may contain subqueries
or stored procedures that cause errors that should stop the statement.
Fixed by correctly checking for errors and propagating them up the call stack.
Need to make sure the tmp join doesn't point to the structure already
freed by the cleanup() for the "base" join, as this can lead to
double free, because sometimes both tmp_join and join point to the
same tmp_table_params.copy_field array.
error in the query.
Fixes a leak after materializing a GROUP BY subquery to a
temp table when the subquery has a blob column in the SELECT
list.
Fixed by correctly destructing temporary buffers for re-usable
queries
fulltext search and row op.
The search for fulltext indexes is searching for some special
predicate layouts. While doing so it's not checking for the number
of columns of the expressions it tries to calculate.
And since row expressions can't return a single scalar value there
was a crash.
Fixed by checking if the expressions are scalar (in addition to
being constant) before calling Item::val_xxx() methods.
Several problems fixed :
1. Non constant expressions in UNION ... ORDER BY were not correctly cleaned up
in st_select_lex_unit::cleanup() causing crashes in EXPLAIN EXTENDED because of
fields quoted by these expressions pointing to the already freed temporary table
used to calculate the UNION.
Fixed by correctly cleaning up expressions of any depth.
2. Subqueries in the order by part of UNION ... ORDER BY ... caused a crash in
EXPLAIN EXTENDED because of a transformation attempt made during EXPLAIN EXTENDED
execution. Fixed by not doing the transformation when in EXPLAIN.
3. Fulltext functions caused crash when in the ORDER BY part of an un-parenthesized
UNION that gets "promoted" to be valid for the whole union, e.g.
SELECT * FROM t1 UNION SELECT * FROM t2 ORDER BY MATCHES (a) AGAINST ('abc' IN BOOLEAN MODE).
This is a case that demonstrates a more general problem of parts of the query being
moved to another level. When doing such transformation late in the optimization run
when most of the flags about the contents of the query are already aggregated it's possible
to "split" the flags so that they correctly reflect the new queries after the transformation.
In specific the ST_SELECT_LEX::ftfunc_list is holding all the free text function for all the
parts of the second SELECT in the UNION and we don't know what part of that is in the ORDER BY
that we're to move to the UNION level and what part is about the other parts of the second SELECT.
Fixed by throwing and error when such statements are about to be processed by adding a check
for the presence of MATCH() inside the ORDER BY clause that's going to get promoted to UNION.
To workaround this new limitation one must parenthesize the UNION SELECTs and provide a real
global ORDER BY for the UNION outside of the parenthesis.
on re-execution of prepared statement
Problem: some (see eq_ref_table()) ORDER BY/GROUP BY optimization
is called before each PS execution. However, we don't properly
initialize its stucture every time before the call.
Fix: properly initialize the sturture used.
returns incorrect results with where
An outer join of a const table (outer) and a normal table
(inner) with GROUP BY on a field from the outer table would
optimize away GROUP BY, and thus trigger the optimization to
do away with a temporary table if grouping was performed on
columns from the const table, hence executing the query with
filesort without temporary table. But this should not be
done if there is a non-indexed access to the inner table,
since filesort does not handle joins. It expects either ref
access, range ditto or table scan. The join condition will
thus not be applied.
Fixed by always forcing execution with temporary table in
the case of ROLLUP with a query involving an outer join. This
is a slightly broader class of queries than need fixing, but
it is hard to ascertain the position of a ROLLUP field wrt
outer join with current query representation.
int join_read_key(JOIN_TAB*)
The eq_ref access method TABLE_REF (accessed through
JOIN_TAB) to save state and to track if this is the
first row it finds or not.
This state was not reset on subquery re-execution
causing an assert.
Fixed by resetting the state before the subquery
re-execution.
int join_read_key(JOIN_TAB*)
The eq_ref access method TABLE_REF (accessed through
JOIN_TAB) to save state and to track if this is the
first row it finds or not.
This state was not reset on subquery re-execution
causing an assert.
Fixed by resetting the state before the subquery
re-execution.
This fix has been proposed by Sergey Petrunya and has been contributed
under SCA by sca@askmonty.org.
The cause for this valgrind error is that in the function
add_cond_and_fix() in sql_select.cc an Item_cond_and object is
created. This is marked as fixed but does not have a correct
table_map() attribute. Later, in make_join_select(), if
engine_condition_pushdown is in use, this table map is used and
results in the valgrind error.
The fix is to add a call to update_used_tables() in add_cond_and_fix()
so that the table map is updated correctly.
This patch is tested by multiple existing tests (e.g. the tests
innodb_mysql, innodb, fulltext, compress all produces this valgrind
warning/error without this fix).
Part 2 :
There was a special optimization on the ref access method for
ORDER BY ... DESC that was set without actually looking on the type of the
selected index for ORDER BY.
Fixed the SELECT ... ORDER BY .. DESC (it uses a different code path compared
to the ASC that has been fixed with the previous fix).
field='const1' AND field='const2' in some cases
Building multiple equality predicates containing
a constant which is compared as a datetime (with a field)
we should take this fact into account and compare the
constant with another possible constatns as datetimes
as well.
E.g. for the
SELECT ... WHERE a='2001-01-01' AND a='2001-01-01 00:00:00'
we should compare '2001-01-01' with '2001-01-01 00:00:00' as
datetimes but not as strings.