The name resolution for correlated subqueries and HAVING clauses
failed to distinguish which of two was being performed when there
was a reference to an outer aliased field.
Fixed by adding the condition that HAVING clause name resulotion
is being performed.
The bug is a regression introduced by the fix for bug30596. The problem
was that in cases when groups in GROUP BY correspond to only one row,
and there is ORDER BY, the GROUP BY was removed and the ORDER BY
rewritten to ORDER BY <group_by_columns> without checking if the
columns in GROUP BY and ORDER BY are compatible. This led to
incorrect ordering of the result set as it was sorted using the
GROUP BY columns. Additionaly, the code discarded ASC/DESC modifiers
from ORDER BY even if its columns were compatible with the GROUP BY
ones.
This patch fixes the regression by checking if ORDER BY columns form a
prefix of the GROUP BY ones, and rewriting ORDER BY only in that case,
preserving the ASC/DESC modifiers. That check is sufficient, since the
GROUP BY columns contain a unique index.
The optimizer takes different execution paths during EXPLAIN than SELECT,
this fix relates only to EXPLAIN, hence no behavior changes.
The test of sort keys for ORDER BY was prohibited from considering keys
that were mentioned in IGNORE KEYS FOR ORDER BY. This led to two
inconsistencies: One was that IGNORE INDEX FOR GROUP BY and
IGNORE INDEX FOR ORDER BY gave apparently different EXPLAINs; the latter
erroneously claimed to do filesort. The second inconsistency
is that the test of sort keys is called twice, finding a sort key the first
time but not the second time, leading to the mentioned filesort.
Fixed by making the test of sort keys consider all enabled
keys on the table. This test rejects keys that are not covering, and for
covering keys the hint should be ignored anyway.
and strategy (explain)
The fix for WL3527 adds tests that test if the index usage hints
combinations don't cause syntax errors.
The EXPLAIN for one of these tests can be affected by the size of the
rowid on the disk (affected by the presence of large file support).
Fixed to avoid the platform dependent test result by removing the
irrelevant columns from the EXPLAIN result.
The optimization that uses a unique index to remove GROUP BY did not
ensure that the index was actually used, thus violating the ORDER BY
that is implied by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index over non-nullable field(s). In case GROUP BY ... ORDER BY
null is used, GROUP BY is simply removed.
The optimization that uses a unique index to remove GROUP BY, did not
ensure that the index was actually used, thus violating the ORDER BY
that is impled by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index. In case GROUP BY ... ORDER BY null is used, GROUP BY is
simply removed.
This patch adds cost estimation for the queries with ORDER BY / GROUP BY
and LIMIT.
If there was a ref/range access to the table whose rows were required
to be ordered in the result set the optimizer always employed this access
though a scan by a different index that was compatible with the required
order could be cheaper to produce the first L rows of the result set.
Now for such queries the optimizer makes a choice between the cheapest
ref/range accesses not compatible with the given order and index scans
compatible with it.
On many architectures, e.g. 68000, x86, the double registers have higher precision
than the IEEE standard prescribes. When compiled with flags -O and higher, some double's
go into registers and therefore have higher precision. In one test case the cost
information of the best and second-best key were close enough to be influenced by this
effect, causing a failed test in distribution builds.
Fixed by removing some rows from the table in question so that cost information is not
influenced by decimals beyond standard definition of double.
- added join cache indication in EXPLAIN (Extra column).
- prefer filesort over full scan over
index for ORDER BY (because it's faster).
- when switching from REF to RANGE because
RANGE uses longer key turn off sort on
the head table only as the resulting
RANGE access is a candidate for join cache
and we don't want to disable it by sorting
on the first table only.
When fields are inserted instead of * in the select list they were not marked
for check for the ONLY_FULL_GROUP_BY mode.
The Field_iterator_table::create_item() function now marks newly created
items for check when in the ONLY_FULL_GROUP_BY mode.
The setup_wild() and the insert_fields() functions now maintain the
cur_pos_in_select_list counter for the ONLY_FULL_GROUP_BY mode.
can be specified
Currently MySQL allows one to specify what indexes to ignore during
join optimization. The scope of the current USE/FORCE/IGNORE INDEX
statement is only the FROM clause, while all other clauses are not
affected.
However, in certain cases, the optimizer
may incorrectly choose an index for sorting and/or grouping, and
produce an inefficient query plan.
This task provides the means to specify what indexes are
ignored/used for what operation in a more fine-grained manner, thus
making it possible to manually force a better plan. We do this
by extending the current IGNORE/USE/FORCE INDEX syntax to:
IGNORE/USE/FORCE INDEX [FOR {JOIN | ORDER | GROUP BY}]
so that:
- if no FOR is specified, the index hint will apply everywhere.
- if MySQL is started with the compatibility option --old_mode then
an index hint without a FOR clause works as in 5.0 (i.e, the
index will only be ignored for JOINs, but can still be used to
compute ORDER BY).
See the WL#3527 for further details.
Currently in the ONLY_FULL_GROUP_BY mode no hidden fields are allowed in the
select list. To ensure this each expression in the select list is checked
to be a constant, an aggregate function or to occur in the GROUP BY list.
The last two requirements are wrong and doesn't allow valid expressions like
"MAX(b) - MIN(b)" or "a + 1" in a query with grouping by a.
The correct check implemented by the patch will ensure that:
any field reference in the [sub]expressions of the select list
is under an aggregate function or
is mentioned as member of the group list or
is an outer reference or
is part of the select list element that coincide with a grouping element.
The Item_field objects now can contain the position of the select list
expression which they belong to. The position is saved during the
field's Item_field::fix_fields() call.
The non_agg_fields list for non-aggregated fields is added to the SELECT_LEX
class. The SELECT_LEX::cur_pos_in_select_list now contains the position in the
select list of the expression being currently fixed.
When resolving unqualified name references MySQL was not
checking what is the item type for the reference. Thus
e.g a string literal item that has by convention a name
equal to its string value will also work as a reference to
a SELECT list item or a table field.
Fixed by allowing only Item_ref or Item_field to referenced by
(unqualified) name.
Currently SQL_BIG_RESULT is checked only at compile time.
However, additional optimizations may take place after
this check that change the sort method from 'filesort'
to sorting via index. As a result the actual plan
executed is not the one specified by the SQL_BIG_RESULT
hint. Similarly, there is no such test when executing
EXPLAIN, resulting in incorrect output.
The patch corrects the problem by testing for
SQL_BIG_RESULT both during the explain and execution
phases.
Select_type in the EXPLAIN output for the query SELECT * FROM t1 was
'SIMPLE', while for the query SELECT * FROM v1, where the view v1
was defined as SELECT * FROM t1, the EXPLAIN output contained 'PRIMARY'
for the select_type column.
optimizer does not honor IGNORE INDEX
- Allow an index to be used for sorting the table
instead of filesort only if it is not disabled by
IGNORE INDEX.
- Make the range-et-al optimizer produce E(#table records after table
condition is applied),
- Make the join optimizer use this value,
- Add "filtered" column to EXPLAIN EXTENDED to show
fraction of records left after table condition is applied
- Adjust test results, add comments
When making a place to store field values at the start of each group
the real item (not the reference) must be used when deciding which column
to copy.
length.
When temporary field created for DATE(LEFT(column,8)) expression, max_length
value is taken from Item_date_typecast, and it is getting it from underlaid
Item_func_left and it's max_length is 8 in given expression. And all this
results in stripping last 2 digits.
To Item_date_typecast class added its own fix_length_and_dec() function
that sets max_length value to 10, which is proper for DATE field.