returns wrong results
Casting AVG() to DECIMAL led to incorrect results when the arguments
had a non-DECIMAL type, because in this case
Item_sum_avg::val_decimal() performed the division by the number of
arguments twice.
Fixed by changing Item_sum_avg::val_decimal() to not rely on
Item_sum_sum::val_decimal(), i.e. calculate the sum and divide using
DECIMAL arithmetic for DECIMAL arguments, and use val_real() with
subsequent conversion to DECIMAL otherwise.
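
The double division can be illustrated outside the server. The sketch below is not the server code: the function names are hypothetical, and the assumption that the inner helper already returns an average for non-DECIMAL input is made only to reproduce the symptom described above.

```python
from decimal import Decimal

def avg_decimal_buggy(values):
    # Hypothetical model of the old behaviour for non-DECIMAL arguments:
    # the inner value is already sum/count, and it is divided by count again.
    count = len(values)
    if count == 0:
        return None  # AVG over no rows is NULL
    already_averaged = Decimal(str(sum(values) / count))
    return already_averaged / count

def avg_decimal_fixed(values):
    # Model of the fix: DECIMAL arithmetic for DECIMAL arguments,
    # floating point followed by conversion to DECIMAL otherwise.
    count = len(values)
    if count == 0:
        return None
    if all(isinstance(v, Decimal) for v in values):
        return sum(values, Decimal(0)) / count
    return Decimal(str(sum(values) / count))
```

With the integer arguments 1, 2, 3 the fixed version yields 2.0, while the buggy model divides that 2.0 by 3 once more.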
When resolving references we need to take into consideration
the view "fields" and allow qualified access to them.
Fixed by extending the reference resolution to process view
fields correctly.
The HAVING clause is subject to the same rules as the SELECT list
about using aggregated and non-aggregated columns.
But this was not enforced when processing implicit grouping from
using aggregate functions.
Fixed by performing the same checks for HAVING as for SELECT.
file .\opt_sum.cc, line
The optimizer pre-calculates the MIN/MAX values for queries like
SELECT MIN(kp_k) WHERE kp_1 = const AND ... AND kp_k-1 = const
when there is a key over kp_1...kp_k
In doing so it did not check nullability correctly, and
there was a superfluous assert().
Fixed by making sure that the field can be null before checking and
taking out the wrong assert().
Introduced a correct check for nullability.
MIN(field) can return NULL when all the row values in the group
are NULL or if there are no rows.
Fixed the assertion to reflect the case when there are no rows.
Item_sum_distinct::setup(THD*): Assertion
There was an assertion to detect a bug in ROLLUP
implementation. However the assertion is not true
when used in a subquery context with non-cacheable
statements.
Fixed by turning the asserted condition into an accepted case
(just like it's done for the other aggregate functions).
to NULL
For queries of the form SELECT MIN(key_part_k) FROM t1
WHERE key_part_1 = const and ... and key_part_k-1 = const,
the opt_sum_query optimization tries to
use an index to substitute MIN/MAX functions with their values according
to the following rules:
1) Insert the minimum non-null values where the WHERE clause still matches, or
3) A row of nulls
However, correct semantics require an additional case 2):
a NULL value is substituted if there are only NULL values for
key_part_k.
The patch modifies opt_sum_query() to handle this missing case.
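
The three cases can be modelled on an in-memory list of rows. This is only an illustrative sketch: `min_for_prefix` and its return labels are hypothetical and do not correspond to anything in opt_sum_query().

```python
def min_for_prefix(rows, prefix):
    """Model MIN(key_part_k) for rows whose first k-1 key parts equal prefix.

    rows   -- list of tuples (key_part_1, ..., key_part_k); None models NULL
    prefix -- the constants for key_part_1 .. key_part_k-1
    Returns a (case, value) pair; the case labels are hypothetical.
    """
    matching = [row[-1] for row in rows if row[:-1] == tuple(prefix)]
    if not matching:
        # Case 3: the WHERE clause matches nothing, so the result is a row of NULLs.
        return ("no_rows", None)
    non_null = [value for value in matching if value is not None]
    if non_null:
        # Case 1: substitute the minimum non-NULL value.
        return ("value", min(non_null))
    # Case 2 (the missing one): rows match, but key_part_k is NULL in all of them.
    return ("all_null", None)
```

Cases 2 and 3 both produce NULL for MIN, but they differ in whether any row matched the WHERE clause, which is why the sketch reports them separately.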
When only one row was present, the subtraction of nearly the same number
resulted in catastrophic cancellation, introducing an error in the
VARIANCE calculation near 1e-15. When that was passed through sqrt() to
get STDDEV, the error was escalated to near 1e-8.
The simple fix of testing for a row count of 1 and forcing that to yield
0.0 is insufficient, as two rows of the same value should also have a
variance of 0.0, yet the error would be about the same.
So, this patch changes the formula that computes the VARIANCE to be one
that is not subject to catastrophic cancellation.
In addition, it now uses only (faster-than-decimal) floating point numbers
to calculate, and renders that to other types on demand.
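
The message does not name the replacement formula. As an illustration only, the sketch below contrasts the textbook sum-of-squares formula with Welford's recurrence, one well-known formula that avoids subtracting two nearly equal large numbers. Both function names are hypothetical, and both compute the population variance.

```python
def variance_naive(values):
    # E[x^2] - (E[x])^2: subtracts two nearly equal numbers, so it is
    # exposed to catastrophic cancellation.
    n = len(values)
    if n == 0:
        return None
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return total_sq / n - (total / n) ** 2

def variance_welford(values):
    # Welford's recurrence: keeps a running mean and a running sum of
    # squared deviations from that mean, using floating point only.
    count = 0
    mean = 0.0
    m2 = 0.0
    for v in values:
        count += 1
        delta = v - mean
        mean += delta / count
        m2 += delta * (v - mean)
    return m2 / count if count else None
```

With the recurrence, a single row or several rows of the same value give exactly 0.0, because every deviation from the running mean is exactly zero. No special case for a row count of 1 is needed.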
We use val_int() calls (followed by a null_value check) to determine
nullness in some Item_sum_count and Item_sum_count_distinct methods.
As a side effect, extra warnings are raised in val_int().
Fix: use is_null() instead.
Item::val_xxx() may be called by the server several times at execute time
for a single query. Calls to val_xxx() may be very expensive and sometimes
(count(distinct), sum(distinct), avg(distinct)) not possible.
To avoid that problem the results of calculation for these aggregate
functions are cached so that val_xxx() methods just return the calculated
value for the second and subsequent calls.
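
The caching idea can be sketched as follows. The class is hypothetical and models only the pattern: the input can be consumed once, so a second evaluation is not possible and the first result has to be kept.

```python
class CachedDistinctSum:
    """Hypothetical model of a SUM(DISTINCT ...) item whose input is single-use."""

    def __init__(self, source):
        self._source = iter(source)  # an iterator can be walked only once
        self._result = None
        self._has_result = False

    def val(self):
        # The first call computes and stores the result; later calls
        # return the stored value without touching the exhausted input.
        if not self._has_result:
            self._result = sum(set(self._source))
            self._has_result = True
        return self._result
```

Without the `_has_result` flag, a second call would sum an exhausted iterator and return 0 instead of the real result.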
wrong results
Mark the containing Item(s) (usually an Item_subselect descendant) of
a subselect as containing aggregate functions if it has references
to aggregate functions that are calculated outside its context.
This tells end_send_group() not to make a copy of an Item_subselect
descendant in the select list and causes the correct value to be returned.
Treat queries with no FROM and aggregate functions as normal queries,
so the aggregate functions are correctly calculated as if there is 1 row.
This means that they will be considered to have one row, so COUNT(*) will return
1 instead of 0. Other aggregates will behave in a compatible manner.
The problem was that opt_sum_query() replaced MIN/MAX functions
with the corresponding constant found in a key, but due to the imprecise
representation of float numbers, the comparison with that constant
failed when the where clause was evaluated.
When MIN/MAX optimization detects that all tables can be removed,
also remove all conjuncts in a where clause that refer to these
tables. As a result of this fix, these conditions are not evaluated
twice, and in the case of float number comparisons we do not discard
result rows due to imprecise float representation.
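
The representation problem can be reproduced in isolation. The sketch below assumes, for illustration only, that the column is stored in single precision and the constant in the where clause is a double; the message itself says only that float representation is imprecise. `as_float32` is a hypothetical helper.

```python
import struct

def as_float32(x):
    # Round-trip a Python float (a double) through 4-byte single precision.
    return struct.unpack("f", struct.pack("f", x))[0]

stored = as_float32(0.1)   # the value as a single-precision column would hold it
literal = 0.1              # the same constant kept as a double

# The two differ, so re-evaluating "column = 0.1" against the stored
# value would discard the row.
print(stored == literal)
# A value that is exactly representable in both widths compares equal.
print(as_float32(0.5) == 0.5)
```

Removing the conjuncts that refer to the eliminated tables means this second comparison is never made.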
As a side-effect this fix also corrects an unnoticed problem in
bug 12882.
An aggregate function reference was resolved incorrectly and
caused a crash in count_field_types.
Must use real_item() to get to the real Item instance through
the reference.
The bug report revealed two problems related to min/max optimization:
1. If the length of a constant key used in a SARGable condition
for the MIN/MAX fields is greater than the length of the field, an
unwanted warning on key truncation is issued;
2. If MIN/MAX optimization is applied to a partial index, like INDEX(b(4)),
that can lead to returning a wrong result set.
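
The second problem can be seen with plain strings. In this sketch a partial index is modelled as a sorted list of fixed-length prefixes; the function name and the sample data are hypothetical.

```python
def min_via_prefix_index(values, prefix_len):
    # A partial index such as INDEX(b(4)) stores only the first
    # prefix_len characters of each value.
    index_keys = sorted(value[:prefix_len] for value in values)
    return index_keys[0]

column_b = ["abcdz", "abcda", "abce"]

# Reading MIN from the prefix index yields the truncated key "abcd",
# while the real minimum of the column is "abcda".
print(min_via_prefix_index(column_b, 4))
print(min(column_b))
```

The first index key is only a prefix of a column value, so it cannot stand in for MIN(b) itself.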
select result
Item_equal objects are employed only at the optimize phase. Usually they are not
supposed to be evaluated. Yet in some cases we call the method val_int() for
them. Here we have to take care to restrict the predicate f1=f2=...=fn that such
an object represents to the projection onto the known fields fi1=...=fik.
Added a check for field's table being const in Item_equal::val_int().
If the field's table is not const val_int() just skips that field when
evaluating Item_equal.
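
The restriction to known fields can be sketched as below. This is a simplified model with a hypothetical function name, not Item_equal::val_int() itself. Returning true when no field is known yet is an assumption of the sketch.

```python
def item_equal_val(fields):
    """Evaluate f1 = f2 = ... = fn over the fields whose table is const.

    fields -- list of (value, table_is_const) pairs
    """
    # Fields from non-const tables have no known value yet and are skipped.
    known = [value for value, table_is_const in fields if table_is_const]
    if not known:
        return True  # assumption: nothing known, so nothing contradicts equality
    return all(value == known[0] for value in known[1:])
```

For example, `item_equal_val([(5, True), (7, False), (5, True)])` compares only the two known values and ignores the 7 from the non-const table.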
The problem was that the MIN/MAX optimization in opt_sum_query was
replacing MIN/MAX functions with their constant argument without
taking into account that a query has no result rows.
Added a test case for bug #9210.
sql_select.cc:
Fixed bug #9210.
The function calc_group_buffer did not cover the case
when the GROUP BY expression was decimal.
Slightly optimized the other code.
func_group.result, func_group.test:
Added a test case for bug #8893.
opt_sum.cc:
A misplaced initialization for the returned parameter
prefix_len in the function find_key_for_maxmin caused
usage of a wrong key prefix by the min/max optimization
in cases when the matching index was not the first index
that contained the min/max field.