Bug#17867117 - ERROR RESULT WHEN "COUNT + DISTINCT + CASE WHEN" NEED MERGE_WALK
Problem:
COUNT DISTINCT gives an incorrect result when it uses a Unique
tree and its last inserted record has a NULL value.
Here is how COUNT DISTINCT is processed, given that this query is not
using loose index scan.
When a row is produced as a result of joining tables (there is only
one table here), we store the SELECTed value in a Unique tree. This
allows elimination of any duplicates, and thus implements DISTINCT.
When we have processed all rows like this, we walk the Unique tree,
counting its elements, in Aggregator_distinct::endup() (tree->walk());
for each element we call Item_sum_count::add(). This function must
ignore any NULL value; to do so it checks
item_sum->args[0]->null_value. This is a mistake: when walking the
Unique tree, the value to be aggregated is not item_sum->args[0] but
rather table->field[0].
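A minimal sketch of the failing pattern (hypothetical table and data; the
original bug additionally required the Unique tree to be merge-walked):

  CREATE TABLE t1 (a INT);
  INSERT INTO t1 VALUES (1), (2), (NULL);
  SELECT COUNT(DISTINCT a) FROM t1;
  -- expected 2: the NULL element walked last must be ignored, but the
  -- stale item_sum->args[0]->null_value made the count wrong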
Solution:
Instead of item_sum->args[0]->null_value, use arg_is_null(), which
knows where to look (as in the fix for bug 57932).
As a consequence of this solution, we have to make arg_is_null() a
little more general:
1) Because it was so far only used for AVG() (which always has a
single argument), this function looked at a single argument; now
that it has to work with COUNT(DISTINCT expression1, expression2), it
must look at all arguments.
2) Because we start using arg_is_null() for COUNT(DISTINCT), i.e. in
Item_sum_count::add(), it implies that we are also using it for
COUNT without DISTINCT (same add()). For COUNT without DISTINCT, the
nullness to check is that of item_sum->args[0]. But the null_value
of such an item is reliable only if val_*() has been called on it. So
far arg_is_null() was always used after a call to arg_val*(), so it
could rely on null_value; but for COUNT there is no call to arg_val*(),
so arg_is_null() has to call is_null() instead.
Testcase for 16539979 by Neeraj. Testcase for 17867117 contributed by
Xiaobin Lin from Taobao.
The 5.5 version of the patch.
The server doesn't restrict the data that can be inserted into integer columns
with an explicitly specified length that's smaller than what the type can
handle, e.g. 1234 can be inserted into an INT(2) column just fine.
Thus, when calculating the maximum width of expressions involving such
restricted integer columns we need to use the implicit maximum width of
the field instead of the explicitly specified one.
Fixed the server to use the implicit maximum in such cases and made sure
the implicit maximum is adjusted the same way as the explicit one with
respect to signedness.
Fixed several test case results (ctype_*.result, metadata.result and
type_ranges.result) to reflect the extended column widths.
Added a regression test case in distinct.test.
Note: this is the behavior-preserving fix that makes 5.5 behave as 5.1 and
earlier. In the mysql trunk we'll add an insert-time check for the explicit
maximum size.
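A hedged illustration of the behavior (hypothetical table t1):

  CREATE TABLE t1 (a INT(2));
  INSERT INTO t1 VALUES (1234);  -- accepted: INT(2) is only a display width
  SELECT a + 0 FROM t1;
  -- the width of expressions over a must be derived from INT's implicit
  -- maximum width, not from the declared width of 2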
The outer 'for' loop in remove_dup_with_compare() handled
HA_ERR_RECORD_DELETED by just starting over without advancing
to the next record, which caused an infinite loop.
This condition could be triggered on certain data by a SELECT
query containing DISTINCT, GROUP BY and HAVING clauses.
Fixed remove_dup_with_compare() so that we always advance to
the next record when receiving HA_ERR_RECORD_DELETED from
rnd_next().
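A hedged sketch of the triggering query shape (hypothetical table and
columns; the loop also required HA_ERR_RECORD_DELETED from the engine):

  SELECT DISTINCT a, b FROM t1
  GROUP BY a, b
  HAVING b > 0;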
Problem: when storing "SELECT ... INTO @var ..." results in variables, we
used the val_xxx() methods, which return results for the current row.
So, in some cases (e.g. SELECT DISTINCT, GROUP BY or HAVING) we got data
from the first row of a new group (where we evaluate a clause) instead of
data from the last row of the previous group.
Fix: use the val_xxx_result() counterparts to get proper results.
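A hedged sketch of the pattern (hypothetical table and values):

  CREATE TABLE t1 (a INT, b INT);
  INSERT INTO t1 VALUES (1, 10), (1, 20);
  SELECT b INTO @v FROM t1 GROUP BY a;
  -- before the fix, val_xxx() could read b from the current row being
  -- scanned instead of the row belonging to the finished group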
The function test_if_skip_sort_order ignored any covering index used for ref
access of a table in a query with ORDER BY if this index was incompatible
with the ORDER BY list and there was another covering index compatible with
this list.
As a result sub-optimal execution plans were chosen for some queries with
ORDER BY clause.
The bug is a regression introduced in 5.1 by the patch for bug 28404.
Under some circumstances test_if_skip_sort_order() could leave some
data structures in an inconsistent state, so that some parts of the code
could assume that the selected execution strategy for GROUP BY/DISTINCT
was a loose index scan (e.g. JOIN_TAB::is_using_loose_index_scan()),
while the actual strategy chosen was an ordered index scan, which led
to wrong data being returned.
Fixed test_if_skip_sort_order() so that when changing the type for a
join table, select->quick is reset not only for EXPLAIN, but for the
actual join execution as well, to not confuse code that depends on its
value to determine the chosen GROUP BY/DISTINCT strategy.
The optimization that uses a unique index to remove GROUP BY did not
ensure that the index was actually used, thus violating the ORDER BY
that is implied by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index over non-nullable field(s). In case GROUP BY ... ORDER BY
null is used, GROUP BY is simply removed.
The optimization that uses a unique index to remove GROUP BY, did not
ensure that the index was actually used, thus violating the ORDER BY
that is implied by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index. In case GROUP BY ... ORDER BY null is used, GROUP BY is
simply removed.
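For illustration, a hedged sketch (assuming a UNIQUE NOT NULL index on
t1(a)):

  SELECT a, b FROM t1 GROUP BY a;
  -- rewritten to ... ORDER BY a, preserving the implied ordering
  SELECT a, b FROM t1 GROUP BY a ORDER BY NULL;
  -- here the GROUP BY is simply removed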
This patch adds cost estimation for the queries with ORDER BY / GROUP BY
and LIMIT.
If there was a ref/range access to the table whose rows were required
to be ordered in the result set, the optimizer always employed this access,
even though a scan over a different index compatible with the required
order could be cheaper for producing the first L rows (L being the LIMIT
value) of the result set.
Now for such queries the optimizer makes a choice between the cheapest
ref/range accesses not compatible with the given order and index scans
compatible with it.
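A hedged sketch of the trade-off (hypothetical table with indexes on a
and b):

  SELECT * FROM t1 WHERE a BETWEEN 1 AND 100 ORDER BY b LIMIT 10;
  -- the optimizer now weighs the range access on a (followed by a sort)
  -- against an ordered scan over the index on b that can stop after the
  -- first 10 matching rows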
- Added join cache indication in EXPLAIN (Extra column).
- Prefer filesort over a full scan over an index for ORDER BY
  (because it's faster).
- When switching from REF to RANGE because RANGE uses a longer key,
  turn off the sort on the head table only, as the resulting RANGE
  access is a candidate for the join cache and we don't want to
  disable it by sorting on the first table.
DATE and DATETIME can be compared either as strings or as ints. Both
methods have their disadvantages. Strings can contain a valid DATETIME value
but have insignificant zeros omitted, thus becoming non-comparable with
other DATETIME strings. Comparison as int usually requires conversion
from the string representation, and the automatic conversion is in most
cases carried out in a wrong way, thus producing a wrong comparison result.
Another problem occurs when one tries to compare a DATE field with a
DATETIME constant: the constant is converted to DATE, losing its precision,
i.e. losing the time part.
This fix addresses the problems described above by adding a special
DATE/DATETIME comparator. The comparator correctly converts DATE/DATETIME
string values to int when necessary, and adds a zero time part (00:00:00)
to DATE values to compare them correctly to DATETIME values. Due to the
correct conversion, malformed DATETIME string values are correctly compared
to other DATE/DATETIME values.
As of this patch a DATE value equals a DATETIME value with a zero time part.
For example '2001-01-01' equals '2001-01-01 00:00:00'.
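A hedged illustration (hypothetical table t1):

  CREATE TABLE t1 (d DATE);
  INSERT INTO t1 VALUES ('2001-01-01');
  SELECT * FROM t1 WHERE d = '2001-01-01 00:00:00';
  -- matches: d is extended with a 00:00:00 time part instead of the
  -- constant being truncated to a DATE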
The compare_datetime() function is added to the Arg_comparator class.
It implements the correct comparator for DATE/DATETIME values.
Two supplementary functions called get_date_from_str() and get_datetime_value()
are added. The first one extracts DATE/DATETIME value from a string and the
second one retrieves the correct DATE/DATETIME value from an item.
The new Arg_comparator::can_compare_as_dates() function is added and used
to check whether two given items can be compared by the compare_datetime()
comparator.
Two caching variables were added to the Arg_comparator class to speed up
DATE/DATETIME comparison.
One more store() method was added to the Item_cache_int class to cache int
values.
The new is_datetime() function was added to the Item class. It indicates
whether the item returns a DATE/DATETIME value.
The optimizer transforms DISTINCT into a GROUP BY
when possible.
It does that by constructing the same structure
(a list of ORDER instances) the parser makes when
parsing GROUP BY.
While doing that it also eliminates duplicates.
But if a duplicate is found it doesn't advance the
pointer into ref_pointer_array, so the next
(and subsequent) ORDER structures point to the wrong
element in the SELECT list.
Fixed by advancing the pointer in ref_pointer_array
even in the case of a duplicate.
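A hedged sketch of a query shape exercising the bug (hypothetical table):

  SELECT DISTINCT a, a, b FROM t1;
  -- the duplicate a is skipped while building the GROUP BY list; before
  -- the fix the entry built for b then pointed at the wrong element of
  -- the SELECT list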
The optimizer takes away columns from GROUP BY/DISTINCT if they constitute
all the parts of a unique index.
However, if some of the columns can contain NULLs this cannot be done
(because a UNIQUE index can have multiple rows with NULL values).
Fixed by not using UNIQUE indexes with nullable columns to remove
grouping columns from GROUP BY/DISTINCT.
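A hedged illustration of why nullable UNIQUE keys don't qualify:

  CREATE TABLE t1 (a INT, UNIQUE KEY (a));
  INSERT INTO t1 VALUES (NULL), (NULL);  -- both rows are allowed
  SELECT DISTINCT a FROM t1;
  -- must return a single NULL row; dropping DISTINCT would return two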
The optimizer removes expressions from GROUP BY/DISTINCT
if they happen to participate in <expression> = <const>
predicates of the WHERE clause (the idea being that if
it's always equal to a constant it can't have multiple
values).
However, for predicates where the expression and the
constant item are of different result types this is not
valid (e.g. a string column compared to 0).
Fixed by an additional check of the result types of the
expression and the constant: if they differ, the
expression doesn't get removed from the group by list.
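A hedged illustration (hypothetical table with a string column a):

  CREATE TABLE t1 (a VARCHAR(10));
  INSERT INTO t1 VALUES ('foo'), ('bar');
  SELECT DISTINCT a FROM t1 WHERE a = 0;
  -- both 'foo' and 'bar' compare equal to 0 numerically, so a still
  -- takes multiple values and must stay in the grouping list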
GROUP BY/DISTINCT pruning optimization must be done before ORDER BY
optimization because ORDER BY may be removed when GROUP BY/DISTINCT
sorts as a side effect, e.g. in
  SELECT DISTINCT <non-key-col>, <pk> FROM t1 ORDER BY <non-key-col>
DISTINCT must be removed before ORDER BY, as if done the other way around
it will remove both.
- Make the range-et-al optimizer produce E(#table records after table
condition is applied),
- Make the join optimizer use this value,
- Add "filtered" column to EXPLAIN EXTENDED to show
fraction of records left after table condition is applied
- Adjust test results, add comments
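For example (hypothetical query; the estimates depend on the data):

  EXPLAIN EXTENDED SELECT * FROM t1 WHERE a > 10;
  -- the new "filtered" column reports the estimated fraction of rows
  -- remaining after the table condition is applied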
'SELECT DISTINCT a,b FROM t1' should not use a temp table if there is a
unique index (or primary key) on a.
There are a number of other similar cases that can be calculated without the
use of a temp table: multi-part unique indexes, primary keys or using GROUP
BY instead of DISTINCT.
When a GROUP BY/DISTINCT clause contains all key parts of a unique
index, then it is guaranteed that the fields of the clause will be
unique, therefore we can optimize away GROUP BY/DISTINCT altogether.
This optimization has two effects:
* there is no need to create a temporary table to compute the
GROUP/DISTINCT operation (or the temporary table will be smaller if only GROUP
is removed and DISTINCT stays or if DISTINCT is removed and GROUP BY stays)
* this causes the statement in effect to become updatable in Connector/Java
because the result set columns will be a direct reference to the primary key
of the table (instead of the temporary table it currently references).
Implemented a check that will optimize away GROUP BY/DISTINCT for queries like
the above.
Currently it will work only for single non-constant table in the FROM clause.
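A hedged sketch (assuming a primary key on t1(a)):

  SELECT DISTINCT a, b FROM t1;
  -- every row is already unique on a, so the DISTINCT (or an equivalent
  -- GROUP BY) is optimized away and no temporary table is needed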
When converting DISTINCT to GROUP BY, where the columns come from the covering
index and are mentioned twice in the SELECT list, the optimizer created an
improper processing sequence. This is because the columns
of the covering index were not recognized as such and were treated as
non-index columns.
Generally speaking duplicate columns can safely be removed from the GROUP
BY/DISTINCT list because this will not add or remove new rows in the
resulting set. Duplicates can be removed even if they are not consecutive
(as is the case for ORDER BY, where the duplicate columns can be removed
only if they are consecutive).
So we can safely transform "SELECT DISTINCT a,a FROM ... ORDER BY a" to
"SELECT a,a FROM ... GROUP BY a ORDER BY a" instead of
"SELECT a,a FROM .. GROUP BY a,a ORDER BY a". We can even transform
"SELECT DISTINCT a,b,a FROM ... ORDER BY a,b" to
"SELECT a,b,a FROM ... GROUP BY a,b ORDER BY a,b".
The fix for this bug consists of checking for duplicate columns in the SELECT
list when constructing the GROUP BY list while transforming DISTINCT to GROUP
BY, and skipping the ones that are already in it.
Added test cases for bug #12625.
sql_select.cc:
Fixed bug #12625.
Fixed invalid removal of constant items from the DISTINCT
list in the function create_distinct_group.