Commit graph

90 commits

Author SHA1 Message Date
Guilhem Bichot
9418fea133 Bug#16539979 - BASIC SELECT COUNT(DISTINCT ID) IS BROKEN
Bug#17867117 - ERROR RESULT WHEN "COUNT + DISTINCT + CASE WHEN" NEED MERGE_WALK 

Problem:
COUNT DISTINCT gives incorrect result when it uses a Unique
Tree and its last inserted record has null value.

Here is how COUNT DISTINCT is processed, given that this query is not
using loose index scan.

When a row is produced as a result of joining tables (there is only
one table here), we store the SELECTed value in a Unique tree. This
allows elimination of any duplicates, and thus implements DISTINCT.

When we have processed all rows like this, we walk the Unique tree,
counting its elements, in Aggregator_distinct::endup() (tree->walk());
for each element we call Item_sum_count::add(). Such function wants to
ignore any NULL value, for that it checks item_sum -> args[0] ->
null_value. It is a mistake: when walking the Unique tree, the value
to be aggregated is not item_sum ->args[0] but rather table ->
field[0].

Solution:
instead of item_sum -> args[0] -> null_value, use arg_is_null(), which
knows where to look (like in fix for bug 57932).

As a consequence of this solution, we have to make arg_is_null() a
little more general:
1) Because it was so far only used for AVG() (which always has a
single argument), this function was looking at a single argument; now
that it has to work with COUNT(DISTINCT expression1,expression2), it
must look at all arguments.
2) Because we start using arg_is_null () for COUNT(DISTINCT), i.e. in
Item_sum_count::add (), it implies that we are also using it for
COUNT(no DISTINCT) (same add ()). For COUNT(no DISTINCT), the
nullness to check is that of item_sum -> args[0]. But the null_value
of such item is reliable only if val_*() has been called on it. So far
arg_is_null() was always used after a call to arg_val*(), so could
rely on null_value; but for COUNT, there is no call to arg_val*(), so
arg_is_null() has to call is_null() instead.

Testcase for 16539979 by Neeraj. Testcase for 17867117 contributed by
Xiaobin Lin from Taobao.
2013-12-04 12:32:42 +01:00
Georgi Kodinov
a914a32191 Bug #11744875: 4082: integer lengths cause truncation with distinct concat
and innodb

The 5.5 version of the patch.

The server doesn't restrict the data that can be inserted into integer columns 
with explicitly specified length that's smaller than what the type can handle,
e.g. 1234 can be inserted into an INT(2) column just fine.
Thus, when calcualting the maximum width of expressions involving such 
restricted integer columns we need to use the implicit maximum width of 
the field instead of the explicitly speficied one.
Fixed the server to use the implicit maximum in such cases and made sure 
the implicit maximum is addjusted the same way as the explicit one wrt
signedness.

Fixed several test case results (ctype_*.result, metadata.result and 
type_ranges.result) to reflect the extended column widths.

Added a regression test case in distinct.test.

Note : this is the behavior preserving fix that makes 5.5 behave as 5.1 and 
earlier. In the mysql trunk we'll add a insert time check for the explict 
maximum size.
2011-05-11 14:11:57 +03:00
Alexey Kopytov
f161723821 Bug #46159: simple query that never returns
The external 'for' loop in remove_dup_with_compare() handled 
HA_ERR_RECORD_DELETED by just starting over without advancing 
to the next record which caused an infinite loop. 
 
This condition could be triggered on certain data by a SELECT 
query containing DISTINCT, GROUP BY and HAVING clauses. 

Fixed remove_dup_with_compare() so that we always advance to 
the next record when receiving HA_ERR_RECORD_DELETED from 
rnd_next().
2009-09-06 00:42:17 +04:00
Ramil Kalimullin
c754cc84c1 Manual merge. 2009-05-10 21:20:35 +05:00
Ramil Kalimullin
71943e3628 Fix for bug#42009: SELECT into variable gives different results to direct SELECT
Problem: storing "SELECT ... INTO @var ..." results in variables we used val_xxx()
methods which returned results of the current row. 
So, in some cases (e.g. SELECT DISTINCT, GROUP BY or HAVING) we got data
from the first row of a new group (where we evaluate a clause) instead of
data from the last row of the previous group.

Fix: use val_xxx_result() counterparts to get proper results.
2009-02-24 18:47:12 +04:00
Mats Kindahl
450cc26b47 Post-merge fixes to fix test cases. 2008-10-29 18:38:18 +01:00
Patrick Crews
99abc81c5f Merge 5.0 -> 5.1 2008-09-30 15:32:35 -04:00
Patrick Crews
51c40c5bd0 Bug#38311 - Fix of some cruft from remove_files in ndb_autodiscover.test, clean up of distinct.test,
and replacing error numbers with error names.
2008-09-23 05:24:32 -04:00
igor@olga.mysql.com
b210e5df6d Fixed bug#35844.
The function test_if_skip_sort_order ignored any covering index used for ref
access of a table in a query with ORDER BY if this index was incompatible 
with the ORDER BY list and there was another covering index compatible with
this list. 
As a result sub-optimal execution plans were chosen for some queries with
ORDER BY clause.
2008-04-22 21:49:39 -07:00
kaa@kaamos.(none)
b753e4a02e Fix for bug #34928: Confusion by having Primary Key and Index
The bug is a regression introduced in 5.1 by the patch for bug28404.

Under some circumstances test_if_skip_sort_order() could leave some
data structures in an inconsistent state so that some parts of code
could assume the selected execution strategy for GROUP BY/DISTINCT as
a loose index scan (e.g. JOIN_TAB::is_using_loose_index_scan()), while
the actual strategy chosen was an ordered index scan, which led to
wrong data being returned.

Fixed test_if_skip_sort_order() so that when changing the type for a
join table, select->quick is reset not only for EXPLAIN, but for the 
actual join execution as well, to not confuse code that depends on its
value to determine the chosen GROUP BY/DISTINCT strategy.
2008-03-26 19:37:36 +03:00
mhansson/martin@linux-st28.site
98d34d620c Bug #30596 GROUP BY optimization gives wrong result order
The optimization that uses a unique index to remove GROUP BY did not 
ensure that the index was actually used, thus violating the ORDER BY
that is implied by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index over non-nullable field(s). In case GROUP BY ... ORDER BY 
null is used, GROUP BY is simply removed.
2007-08-28 18:01:29 +02:00
mhansson/martin@linux-st28.site
a4d5d9204d Bug #30596 GROUP BY optimization gives wrong result order
The optimization that uses a unique index to remove GROUP BY, did not 
ensure that the index was actually used, thus violating the ORDER BY
that is impled by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index. In case GROUP BY ... ORDER BY null is used, GROUP BY is
simply removed.
2007-08-27 17:33:41 +02:00
igor@olga.mysql.com
cf39429295 Fixed bug#28404.
This patch adds cost estimation for the queries with ORDER BY / GROUP BY
and LIMIT. 
If there was a ref/range access to the table whose rows were required
to be ordered in the result set the optimizer always employed this access
though a scan by a different index that was compatible with the required 
order could be cheaper to produce the first L rows of the result set.
Now for such queries the optimizer makes a choice between the cheapest
ref/range accesses not compatible with the given order and index scans
compatible with it.
2007-08-02 12:45:56 -07:00
gkodinov/kgeorge@macbook.gmz
fe03f6bbe0 Bug #27531: 5.1 part of the fix
- Renamed "Using join cache" to "Using join buffer".
- "Using join buffer" is now printed on the last
  table that "reads" from the join buffer cache.
2007-05-29 15:58:18 +03:00
gkodinov/kgeorge@magare.gmz
306371a850 bug #27531: 5.1 part of the fix:
- added join cache indication in EXPLAIN (Extra column).
 - prefer filesort over full scan over 
   index for ORDER BY (because it's faster).
 - when switching from REF to RANGE because
   RANGE uses longer key turn off sort on
   the head table only as the resulting 
   RANGE access is a candidate for join cache
   and we don't want to disable it by sorting
   on the first table only.
2007-05-04 18:06:06 +03:00
holyfoot/hf@hfmain.(none)
2fcebef31f Merge mysql.com:/d2/hf/mrg/mysql-5.0-opt
into  mysql.com:/d2/hf/mrg/mysql-5.1-opt
2007-04-29 13:19:32 +05:00
evgen@moonbone.local
4747fa0c03 Bug#27590: Wrong DATE/DATETIME comparison.
DATE and DATETIME can be compared either as strings or as int. Both
methods have their disadvantages. Strings can contain valid DATETIME value
but have insignificant zeros omitted thus became non-comparable with
other DATETIME strings. The comparison as int usually will require conversion
from the string representation and the automatic conversion in most cases is
carried out in a wrong way thus producing wrong comparison result. Another
problem occurs when one tries to compare DATE field with a DATETIME constant.
The constant is converted to DATE losing its precision i.e. losing time part.

This fix addresses the problems described above by adding a special
DATE/DATETIME comparator. The comparator correctly converts DATE/DATETIME
string values to int when it's necessary, adds zero time part (00:00:00)
to DATE values to compare them correctly to DATETIME values. Due to correct
conversion malformed DATETIME string values are correctly compared to other
DATE/DATETIME values.

As of this patch a DATE value equals to DATETIME value with zero time part.
For example '2001-01-01' equals to '2001-01-01 00:00:00'.

The compare_datetime() function is added to the Arg_comparator class.
It implements the correct comparator for DATE/DATETIME values.
Two supplementary functions called get_date_from_str() and get_datetime_value()
are added. The first one extracts DATE/DATETIME value from a string and the
second one retrieves the correct DATE/DATETIME value from an item.
The new Arg_comparator::can_compare_as_dates() function is added and used
to check whether two given items can be compared by the compare_datetime()
comparator.
Two caching variables were added to the Arg_comparator class to speedup the
DATE/DATETIME comparison.
One more store() method was added to the Item_cache_int class to cache int
values.
The new is_datetime() function was added to the Item class. It indicates
whether the item returns a DATE/DATETIME value.
2007-04-27 00:12:09 +04:00
igor@olga.mysql.com
e04289704d Merge olga.mysql.com:/home/igor/mysql-5.0-opt
into  olga.mysql.com:/home/igor/mysql-5.1-opt
2007-04-11 15:12:49 -07:00
gkodinov/kgeorge@magare.gmz
aefc060fe5 Bug #27659:
The optimizer transforms DISTINCT into a GROUP BY
when possible.
It does that by constructing the same structure
(a list of ORDER instances) the parser makes when
parsing GROUP BY.
While doing that it also eliminates duplicates.
But if a duplicate is found it doesn't advance the
pointer to ref_pointer array, so the next 
(and subsequent) ORDER structures point to the wrong
element in the SELECT list.
Fixed by advancing the pointer in ref_pointer_array
even in the case of a duplicate.
2007-04-10 16:55:48 +03:00
gluh@eagle.(none)
7849d31923 Merge mysql.com:/home/gluh/MySQL/Merge/5.0-opt
into  mysql.com:/home/gluh/MySQL/Merge/5.1-opt
2007-02-02 10:25:45 +04:00
gkodinov/kgeorge@macbook.gmz
2ba683f2ee Bug #25551: inconsistent behaviour in grouping NULL, depending on index type
The optimizer takes away columns from GROUP BY/DISTINCT if they constitute
 all the parts of an unique index.
 However if some of the columns can contain NULLs this cannot be done 
(because an UNIQUE index can have multiple rows with NULL values).
 Fixed by not using UNIQUE indexes with nullable columns to remove
 grouping columns from GROUP BY/DISTINCT.
2007-01-31 10:18:26 +02:00
gkodinov/kgeorge@macbook.local
a30c09c33f Merge macbook.local:/Users/kgeorge/mysql/work/mysql-5.0-opt
into  macbook.local:/Users/kgeorge/mysql/work/merge-5.1-opt
2007-01-08 12:32:48 +02:00
gkodinov/kgeorge@macbook.gmz
a63df24a68 Bug #15881: cast problems
The optimizer removes expressions from GROUP BY/DISTINCT
  if they happen to participate in a <expression> = <const>
  predicates of the WHERE clause (the idea being that if
  it's always equal to a constant it can't have multiple 
  values).
  However for predicates where the expression and the 
  constant item are of different result type this is not
  valid (e.g. a string column compared to 0).
  Fixed by additional check of the result types of the 
  expression and the constant and if they differ the 
  expression don't get removed from the group by list.
2007-01-05 14:02:50 +02:00
timour/timka@lamia.home
0368914b9a Merge lamia.home:/home/timka/mysql/src/5.0-bug-21456
into  lamia.home:/home/timka/mysql/src/5.1-bug-21456
2006-08-29 16:39:09 +03:00
timour/timka@lamia.home
70f76e457f Merge lamia.home:/home/timka/mysql/src/4.1-bug-21456
into  lamia.home:/home/timka/mysql/src/5.0-bug-21456
2006-08-23 18:22:53 +03:00
timour/timka@lamia.home
de723f2998 Bug #21456: SELECT DISTINCT(x) produces incorrect results when using order by
GROUP BY/DISTINCT pruning optimization must be done before ORDER BY 
optimization because ORDER BY may be removed when GROUP BY/DISTINCT
sorts as a side effect, e.g. in 
  SELECT DISTINCT <non-key-col>,<pk> FROM t1
  ORDER BY <non-key-col> DISTINCT
must be removed before ORDER BY as if done the other way around
it will remove both.
2006-08-23 16:46:57 +03:00
sergefp@mysql.com
699291a8e6 BUG#14940 "MySQL choose wrong index", v.2
- Make the range-et-al optimizer produce E(#table records after table 
                                           condition is applied),
- Make the join optimizer use this value,
- Add "filtered" column to EXPLAIN EXTENDED to show 
  fraction of records left after table condition is applied
- Adjust test results, add comments
2006-07-28 21:27:01 +04:00
gkodinov@mysql.com
2cda7f5d80 4.1->5.0 merge for bug #16458 2006-06-28 15:53:54 +03:00
gkodinov@mysql.com
7149f48d97 Merge mysql.com:/home/kgeorge/mysql/4.1/B16458
into  mysql.com:/home/kgeorge/mysql/5.0/B16458
2006-06-27 17:59:49 +03:00
gkodinov@mysql.com
9ec681ef35 Bug #16458: Simple SELECT FOR UPDATE causes "Result Set not updatable" error
'SELECT DISTINCT a,b FROM t1' should not use temp table if there is unique 
index (or primary key) on a.
There are a number of other similar cases that can be calculated without the
use of a temp table : multi-part unique indexes, primary keys or using GROUP BY 
instead of DISTINCT.
When a GROUP BY/DISTINCT clause contains all key parts of a unique
index, then it is guaranteed that the fields of the clause will be
unique, therefore we can optimize away GROUP BY/DISTINCT altogether.
This optimization has two effects:
* there is no need to create a temporary table to compute the
   GROUP/DISTINCT operation (or the temporary table will be smaller if only GROUP 
   is removed and DISTINCT stays or if DISTINCT is removed and GROUP BY stays)
* this causes the statement in effect to become updatable in Connector/Java
because the result set columns will be direct reference to the primary key of 
the table (instead to the temporary table that it currently references). 

Implemented a check that will optimize away GROUP BY/DISTINCT for queries like 
the above.
Currently it will work only for single non-constant table in the FROM clause.
2006-06-27 17:40:19 +03:00
gkodinov@mysql.com
7bae0de398 BUG#18068: SELECT DISTINCT (with duplicates and covering index)
When converting DISTINCT to GROUP BY where the columns are from the covering
index and they are quoted twice in the SELECT list the optimizer is creating
improper processing sequence. This is because of the fact that the columns
of the covering index are not recognized as such and treated as non-index
columns.

Generally speaking duplicate columns can safely be removed from the GROUP
BY/DISTINCT list because this will not add or remove new rows in the
resulting set. Duplicates can be removed even if they are not consecutive
(as is the case for ORDER BY, where the duplicate columns can be removed
only if they are consecutive).

So we can safely transform "SELECT DISTINCT a,a FROM ... ORDER BY a" to
"SELECT a,a FROM ... GROUP BY a ORDER BY a" instead of 
"SELECT a,a FROM .. GROUP BY a,a ORDER BY a". We can even transform 
"SELECT DISTINCT a,b,a FROM ... ORDER BY a,b" to
"SELECT a,b,a FROM ... GROUP BY a,b ORDER BY a,b".

The fix to this bug consists of checking for duplicate columns in the SELECT
list when constructing the GROUP BY list in transforming DISTINCT to GROUP
BY and skipping the ones that are already in.
2006-05-09 18:13:01 +03:00
holyfoot@deer.(none)
a920d0df86 bug #15745 (COUNT(DISTINCT CONCAT(x,y)) returns wrong result 2006-03-05 20:48:31 +04:00
serg@serg.mylan
db896a66ca bad merge fixed 2005-09-15 21:05:42 +02:00
igor@rurik.mysql.com
8c34d8e578 Merge ibabaev@bk-internal.mysql.com:/home/bk/mysql-5.0
into rurik.mysql.com:/home/igor/mysql-5.0
2005-08-22 09:40:10 -07:00
pappa@c-4a09e253.1238-1-64736c10.cust.bredbandsbolaget.se
4084c13605 Merge c-4a09e253.1238-1-64736c10.cust.bredbandsbolaget.se:/home/pappa/mysql-4.1
into  c-4a09e253.1238-1-64736c10.cust.bredbandsbolaget.se:/home/pappa/mysql-5.0
2005-08-20 16:38:55 -04:00
igor@rurik.mysql.com
c4f677feb5 Manual merge 2005-08-19 20:21:09 -07:00
igor@rurik.mysql.com
2d32b77693 distinct.test, distinct.result:
Added test cases for bug #12625.
sql_select.cc:
  Fixed bug #12625.
  Fixed invalid removal of constant items from the DISTINCT
  list in the function create_distinct_group.
2005-08-19 01:57:22 -07:00
hf@deer.(none)
6c148a7969 Fix for bug #9764 (DISTINCT IFNULL truncates data) 2005-06-08 20:35:37 +05:00
holyfoot@hf-ibm.(none)
8f3647005c Tests and results fixed with last precision/decimal related modifications 2005-05-06 01:01:39 +05:00
igor@rurik.mysql.com
cdb40830d2 Manual merge 2005-02-11 15:00:29 -08:00
igor@rurik.mysql.com
1bb04b0642 distinct.result:
Adjustment of the result file after the revision of the fix for bug #7520.
2005-02-11 13:44:54 -08:00
hf@deer.(none)
b94a482ee9 Precision Math implementation 2005-02-09 02:50:45 +04:00
sergefp@mysql.com
034c59ed59 Fix test results to account for difference between release BDB, debug BDB and MyISAM. 2004-12-31 14:48:11 +03:00
timour@mysql.com
3bb2c4e325 After merge adjustment of tests due to changes in cost estimates. 2004-10-11 20:10:07 +03:00
timour@mysql.com
e2cd3dd1ce WL#1724 "Min/Max Optimization for Queries with Group By Clause"
- after-review changes
- merged with the source tree from 204-08-27
2004-08-27 16:37:13 +03:00
timour@mysql.com
a840d8daad Complete implementation of WL#1469 "Greedy algorithm to search for an optimal execution plan",
consisting of pos-review fixes and improvements.
2004-05-10 15:48:50 +03:00
pem@mysql.com
975061bb5a Merge 4.1 to 5.0. 2003-12-16 16:12:28 +01:00
antony@ltantony.rdg.cyberkinetica.homeunix.net
fcf96dbb18 WorkLog#1323
Deprecate the use of TYPE=... Preferred syntax is ENGINE=
2003-12-10 04:31:42 +00:00
pem@mysql.com
337238b78a Merging 4.1->5.0 2003-10-22 16:10:22 +02:00
monty@narttu.mysql.fi
d9ff665102 Fixes after merge 2003-10-08 12:01:58 +03:00