Sometimes the RANGE access method misses records when the index uses
partial key segments.
Example:
CREATE TABLE t1 (a CHAR(2), KEY(a(1)));
INSERT INTO t1 VALUES ('a'), ('xx');
SELECT a FROM t1 WHERE a > 'x';  -- must return 'xx'
We call index_read() in handler::read_range_first(), passing the 'x' key
and the HA_READ_AFTER_KEY flag, which is wrong because we have a partial
key segment for the field and might miss records like 'xx'.
Fix: don't use open segments in such a case.
When using an index for GROUP BY and range access, the server isolates
a set of ranges based on the conditions over the key parts of the
index used. It then uses only the ranges over the GROUP BY fields to
jump from one group to another. Since the GROUP BY fields may form a
prefix of the index, we may use only a prefix of the ranges produced
by the range optimizer.
Each range carries a flag saying whether it includes its border values.
The problem is that when using a range prefix, the last range is open
because it assumes that there is a range on the next key part. Thus,
using a prefix range as is excludes all border values.
The solution is that when ignoring the suffix of the range conditions
(to jump over the GROUP BY prefix only), the server must change the
remaining intervals so they always contain their borders. For example,
if the whole range was
(1,-inf) <= (<group_by_col>,<min_max_arg_col>) < (1, 3), we must make it
(1) <= (<group_by_col>) <= (1), because (a,b) < (c1,c2) means
a < c1 OR (a = c1 AND b < c2).
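A minimal sketch of the situation (the table and data here are assumed
for illustration, not taken from the original report):
-- The condition yields roughly the range (1,-inf) <= (a,b) < (1,3);
-- the loose index scan keeps only the prefix over the GROUP BY
-- column a. Used as is, the open prefix range would exclude its
-- border a = 1 and the group would be skipped entirely.
CREATE TABLE t1 (a INT, b INT, KEY(a,b));
INSERT INTO t1 VALUES (1,1), (1,2), (2,1);
SELECT a, MIN(b) FROM t1 WHERE a = 1 AND b < 3 GROUP BY a;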
The IN function aggregates the result types of all its expressions and
uses that type when comparing the left expression with the expressions
on the right side. This approach works in most cases, but consider the
case when the right side contains both strings and integers: the
approach may then produce wrong results, because every string that does
not start with a digit is evaluated as 0.
CASE uses the same approach when a CASE expression is given, so it is
affected as well.
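An illustrative case (assumed, not taken from the original report):
-- The list mixes a string and an integer, so the aggregated
-- comparison type is numeric; 'a' and 'b' both evaluate to 0:
SELECT 'a' IN ('b', 2);                                     -- wrongly returned 1
SELECT CASE 'a' WHEN 'b' THEN 1 WHEN 2 THEN 2 ELSE 0 END;   -- same issue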
The idea behind this fix is to make the IN function compare expressions
of different result types differently. For example, a string on the left
side is compared as a string with the strings on the right side, and is
converted to real for comparison with int or real items on the right
side.
A new function called collect_cmp_types() is added. It collects the
distinct result types needed to compare the first item of the provided
list with each other item in the list.
The Item_func_in class can now refer to up to five cmp_item objects, one
per result type, for comparison purposes. cmp_item objects are allocated
according to the result types found. The comparison of the left
expression with any right-side expression is now based only on the
result types of those two expressions.
The Item_func_case class is modified in a similar way when a CASE
expression is specified. It can now allocate up to five cmp_item objects
to compare the CASE expression with WHEN expressions of different types.
The comparison of the CASE expression with any WHEN expression is now
based only on the result types of those expressions.
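Under the new per-type comparison, the example above behaves as one
would expect (again an assumed illustration):
-- 'a' vs 'b' is compared as strings (no match); 'a' vs 2 is
-- compared as reals, where 'a' converts to 0 (no match):
SELECT 'a' IN ('b', 2);   -- now returns 0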
this key does not stop" (5.1 version).
An UPDATE statement whose WHERE clause used a key, and which invoked a
trigger that modified a field in that key, ran indefinitely.
The problem occurred because when the UPDATE statement was executed in
update-on-the-fly mode (in which a row is updated right during
evaluation of the select for the WHERE clause), the new version of the
row became visible to the select representing the WHERE clause and was
updated again and again.
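A hypothetical repro sketch (the table, trigger, and data are
assumptions):
CREATE TABLE t1 (a INT, KEY(a));
CREATE TRIGGER t1_bu BEFORE UPDATE ON t1
  FOR EACH ROW SET NEW.a = NEW.a + 1;
INSERT INTO t1 VALUES (1);
-- Each updated row gets a larger key value, stays ahead of the
-- index scan, and is visited again; before the fix this never
-- terminated:
UPDATE t1 SET a = a + 1 WHERE a > 0;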
We had already solved this problem for UPDATE statements that do not
invoke triggers, by detecting that we are about to update a field in the
key used for scanning and then performing the update in two steps:
during the first step we gather information about the rows to be
updated, and only then do the actual updates. We also do this for
MULTI-UPDATE, and in that case we even detect the situation when such
fields are updated in triggers (actually, we simply assume that we
always update fields used in the key if there is a BEFORE UPDATE
trigger).
The fix simply extends this check, done by the
check_if_key_used()/QUICK_SELECT_I::check_if_keys_used()
routine/method, so that it also detects cases when a field used in the
key is updated in a trigger. We do this by changing check_if_key_used()
to take a field bitmap instead of a field list as its argument and
passing TABLE::write_set to it (we also have to add information about
fields used in triggers to this bitmap a bit earlier).
As a nice side effect, we get a more precise, and thus performance-wise
more optimal, check for MULTI-UPDATE.
Also, the check_if_key_used() routine and the similar method were
renamed to is_key_used()/is_keys_used() to better reflect that they are
simple boolean predicates.
Finally, the partition_key_modified() routine now also takes a field
bitmap instead of a field list as its argument.
this key does not stop" (version for 5.0 only).
An UPDATE statement whose WHERE clause used a key, and which invoked a
trigger that modified a field in that key, ran indefinitely.
The problem occurred because when the UPDATE statement was executed in
update-on-the-fly mode (in which a row is updated right during
evaluation of the select for the WHERE clause), the new version of the
row became visible to the select representing the WHERE clause and was
updated again and again.
We had already solved this problem for UPDATE statements that do not
invoke triggers, by detecting that we are about to update a field in the
key used for scanning and then performing the update in two steps:
during the first step we gather information about the rows to be
updated, and only then do the actual updates. We also do this for
MULTI-UPDATE, and in that case we even detect the situation when such
fields are updated in triggers (actually, we simply assume that we
always update fields used in the key if there is a BEFORE UPDATE
trigger).
The fix simply extends this check, done by the
check_if_key_used()/QUICK_SELECT_I::check_if_keys_used()
routine/method, so that it also detects cases when a field used in the
key is updated in a trigger.
As a nice side effect, we get a more precise, and thus performance-wise
more optimal, check for MULTI-UPDATE.
Also, check_if_key_used()/QUICK_SELECT_I::check_if_keys_used() were
renamed to is_key_used()/QUICK_SELECT_I::is_keys_used() to better
reflect that they are simple boolean predicates.
Note that this check is implemented in a much more elegant way in 5.1.
Only MyISAM tables locked with LOCK TABLES ... WRITE were affected.
A query that is optimized with index_merge doesn't reflect rows
inserted within LOCK TABLES.
MyISAM doesn't flush its state within LOCK TABLES, and the index_merge
optimization creates a copy of the handler, which therefore gets an
outdated MyISAM state.
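A hypothetical repro sketch (the table layout and query are
assumptions):
CREATE TABLE t1 (a INT, b INT, KEY(a), KEY(b)) ENGINE=MyISAM;
LOCK TABLES t1 WRITE;
INSERT INTO t1 VALUES (1, 1);
-- index_merge clones the handler; the clone carries the stale
-- pre-INSERT state and misses the row just inserted:
SELECT * FROM t1 WHERE a = 1 OR b = 1;
UNLOCK TABLES;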
A new handler::clone() method is introduced to fix this problem.
For non-MyISAM storage engines it allocates a handler and opens it with
ha_open(). For MyISAM it additionally copies the MyISAM state pointer
to the cloned handler.
when a range condition uses an invalid DATETIME constant.
Now we do not use invalid DATETIME constants to form end keys for
range intervals: range analysis simply ignores predicates with such
constants.
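For instance (an assumed illustration):
CREATE TABLE t1 (d DATETIME, KEY(d));
-- '2005-02-30' is not a valid date; range analysis no longer
-- builds a range interval from it and ignores the predicate:
SELECT * FROM t1 WHERE d >= '2005-02-30';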
buffer for a MY_BITMAP temporary buffer allocated on the stack in the
function get_best_covering_ror_intersect().
Now a buffer of the proper size is requested by this function and
allocated in mem_root.
We managed to demonstrate the bug only on Windows with a very large
database, which is why the patch provides no test case.
Remove the code that cleared the "read fields set" for merged scans.
That code was based on the assumption that "we're going to just read
rowids", while in fact the QUICK_RANGE_SELECT code also needs the key
part values to check that retrieved records fall within the scanned
intervals.
Fix the case when __attribute__() is stubbed out, add ATTRIBUTE_FORMAT()
for specifying __attribute__((format(...))) safely, make more use of the
format attribute, and fix some of the warnings that this turns up (plus
a bonus unrelated one).
In the fix for BUG#15872, a condition of the form "t.key NOT IN (c1, ..., cN)"
with N > 1000 was incorrectly converted to
(-inf < X < c_min) OR (c_max < X).
This conversion drops the intervals between consecutive constants, so
rows whose key values fall into those gaps were wrongly excluded. Now
the conversion is removed: we don't produce any range lists for such
conditions.
- Make the range-et-al optimizer produce E(#table records after the
  table condition is applied).
- Make the join optimizer use this value.
- Add a "filtered" column to EXPLAIN EXTENDED to show the fraction of
  records left after the table condition is applied.
- Adjust test results, add comments.
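A sketch of how the new column surfaces the estimate (the table and the
numbers shown are assumed for illustration):
-- EXPLAIN EXTENDED now reports the expected percentage of examined
-- rows that survive the table condition:
EXPLAIN EXTENDED SELECT * FROM t1 WHERE a < 10;
-- ... | rows | filtered | ...
-- ... | 100  |    10.00 | ...   (10% expected to remain)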