When using an index for GROUP BY and range access, the server isolates
a set of ranges based on the conditions over the key parts of the
index used. It then uses only the ranges over the GROUP BY fields to
jump from one group to another. Since the GROUP BY fields may form a
prefix of the index, we may use only a prefix of the ranges produced
by the range optimizer.
Each range carries a notion of whether it includes its border values.
The problem is that when using a range prefix, the last range is open
because it assumes that there is a range on the next key part. Thus,
using a prefix range as is excludes all border values.
The solution is that when ignoring the suffix of the range conditions
(to jump over the GROUP BY prefix only), the server must change the
remaining intervals so that they always contain their borders, e.g.
if the whole range was:
(1,-inf) <= (<group_by_col>,<min_max_arg_col>) < (1, 3) we must make
(1) <= (<group_by_col>) <= (1), because (a,b) < (c1,c2) means:
a < c1 OR (a = c1 AND b < c2), so a = 1 must remain possible after the
second key part is dropped.
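As a minimal self-contained illustration of that border rule (the types
below are hypothetical, not the server's SEL_ARG machinery):

  #include <cassert>

  /* A one-keypart interval left over after dropping the last keypart. */
  struct Interval {
    int min_key, max_key;
    bool min_open, max_open;   /* true => border value excluded */
  };

  /* Dropping <min_max_arg_col> from (1,-inf) <= (a,b) < (1,3) must give
     (1) <= (a) <= (1): since (a,b) < (c1,c2) is a < c1 OR (a = c1 AND
     b < c2), a = c1 is still possible, so the truncated prefix interval
     must keep its borders. */
  Interval use_prefix(Interval full) {
    full.min_open = false;     /* close both borders after truncation */
    full.max_open = false;
    return full;
  }

  int main() {
    Interval full = {1, 1, false, true};   /* (1,-inf) <= ... < (1,3) */
    Interval prefix = use_prefix(full);
    assert(!prefix.min_open && !prefix.max_open);  /* (1) <= a <= (1) */
    return 0;
  }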
Only MyISAM tables locked with LOCK TABLES ... WRITE were affected.
A query optimized with index_merge did not reflect rows inserted
within LOCK TABLES: MyISAM does not flush its state within LOCK
TABLES, and the index_merge optimization creates a copy of the
handler, which thus gets an outdated MyISAM state.
A new handler->clone() method is introduced to fix this problem.
For non-MyISAM storage engines it allocates a handler and opens it
with ha_open(). For MyISAM it additionally copies the MyISAM state
pointer to the cloned handler.
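A sketch of the clone idea, under the assumption of a simplified
handler interface (these classes are illustrative, not the actual
storage engine API):

  #include <cstddef>

  struct MyisamState {};                 /* stands in for shared MyISAM state */

  struct Handler {
    virtual ~Handler() {}
    virtual Handler *allocate() const = 0;
    virtual void open() {}               /* stands in for ha_open() */
    virtual Handler *clone() const {     /* generic engines: allocate + open */
      Handler *h = allocate();
      h->open();
      return h;
    }
  };

  struct MyisamHandler : Handler {
    MyisamState *state;                  /* pointer to the shared table state */
    MyisamHandler() : state(NULL) {}
    Handler *allocate() const { return new MyisamHandler(); }
    Handler *clone() const {             /* MyISAM: additionally copy state */
      MyisamHandler *h = static_cast<MyisamHandler *>(Handler::clone());
      h->state = state;                  /* clone now sees the current state */
      return h;
    }
  };

  int main() {
    MyisamHandler original;
    Handler *copy = original.clone();    /* shares the MyISAM state pointer */
    delete copy;
    return 0;
  }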
The bug appeared when a range condition used an invalid DATETIME
constant. Now we do not use invalid DATETIME constants to form end
keys for range intervals: range analysis simply ignores predicates
with such constants.
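The shape of the check, as a hedged sketch (the date validation and
function names here are made up for illustration):

  #include <cstdio>

  struct Date { int y, m, d; };

  static int days_in_month(int y, int m) {
    static const int t[] = {31,28,31,30,31,30,31,31,30,31,30,31};
    bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    return (m == 2 && leap) ? 29 : t[m - 1];
  }

  static bool is_valid(const Date &dt) {
    return dt.m >= 1 && dt.m <= 12 &&
           dt.d >= 1 && dt.d <= days_in_month(dt.y, dt.m);
  }

  /* Returning false means "no range produced": the range analyzer just
     ignores the predicate instead of forming an end key from it. */
  bool make_end_key(const Date &constant) {
    if (!is_valid(constant))
      return false;            /* e.g. '2000-02-32' is simply skipped */
    /* ... otherwise build the range end key from the constant ... */
    return true;
  }

  int main() {
    Date bad = {2000, 2, 32};
    std::printf("%d\n", (int) make_end_key(bad));  /* prints 0: ignored */
    return 0;
  }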
The bug was caused by the buffer for a MY_BITMAP temporary bitmap
being allocated on the stack in the function
get_best_covering_ror_intersect().
Now a buffer of the proper size is allocated in mem_root at the
request of this function.
We managed to demonstrate the bug only on Windows with a very large
database. That is why no test case is provided with the patch.
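An illustrative contrast between the two allocation strategies (the
Arena type below stands in for MEM_ROOT; it is not the server's
allocator):

  #include <cstddef>
  #include <cstring>
  #include <memory>
  #include <vector>

  /* Toy arena standing in for MEM_ROOT. */
  struct Arena {
    std::vector<std::unique_ptr<unsigned char[]>> blocks;
    unsigned char *alloc(std::size_t n) {
      blocks.emplace_back(new unsigned char[n]);
      return blocks.back().get();
    }
  };

  /* Before: a fixed-size buffer on the stack, independent of the bitmap
     width. After: a buffer of the proper size requested from the arena,
     so it always matches the number of bits needed. */
  unsigned char *alloc_bitmap_buffer(Arena &root, std::size_t n_bits) {
    std::size_t n_bytes = (n_bits + 7) / 8;
    unsigned char *buf = root.alloc(n_bytes);
    std::memset(buf, 0, n_bytes);
    return buf;
  }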
In the fix for BUG#15872, a condition of the form "t.key NOT IN (c1, ..., cN)"
with N > 1000 was incorrectly converted to
(-inf < X < c_min) OR (c_max < X).
This loses rows whose key value lies strictly between two of the
constants without being in the list: such a value satisfies NOT IN but
falls outside both converted ranges. The conversion is now removed; we
don't produce any range lists for such conditions.
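To see why the conversion was lossy, here is a sketch of the exact
range list for NOT IN (types and names are illustrative, not the
optimizer's):

  #include <algorithm>
  #include <climits>
  #include <cstdio>
  #include <utility>
  #include <vector>

  /* X NOT IN (c1..cN) is exactly (-inf,c1) OR (c1,c2) OR ... OR
     (cN,+inf); collapsing this to (-inf,c_min) OR (c_max,+inf) drops
     the interior gaps. LONG_MIN/LONG_MAX stand in for -inf/+inf. */
  std::vector<std::pair<long, long> > not_in_ranges(std::vector<long> consts) {
    std::sort(consts.begin(), consts.end());
    std::vector<std::pair<long, long> > out;
    long prev = LONG_MIN;
    for (std::size_t i = 0; i < consts.size(); i++) { /* one open range per gap */
      out.push_back(std::make_pair(prev, consts[i]));
      prev = consts[i];
    }
    out.push_back(std::make_pair(prev, LONG_MAX));
    return out;
  }

  int main() {
    /* X NOT IN (3, 5): the gap (3,5) is kept, so X = 4 still matches. */
    for (const auto &r : not_in_ranges({3, 5}))
      std::printf("(%ld, %ld)\n", r.first, r.second);
    return 0;
  }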
3.23 regression test failure
The member SEL_ARG::min_flag was left uninitialized, due to which the
check for the absence of GEOM_FLAG in the function key_or() did not
choose "Range checked for each record" as the correct access method.
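The shape of such a fix, sketched on a stand-in type (SelArg here is
illustrative, not the real SEL_ARG):

  /* Initialize flag members at construction so later checks, like the
     GEOM_FLAG test in key_or(), never read garbage. */
  struct SelArg {
    unsigned min_flag;
    unsigned max_flag;
    SelArg() : min_flag(0), max_flag(0) {}  /* was left uninitialized */
  };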
- When manually constructing a SEL_TREE for "t.key NOT IN (...)", take into
  account that get_mm_parts() may return a tree of type SEL_TREE::IMPOSSIBLE
- Added missing OOM checks
- Added comments
Re-work best_access_path() and find_best() to reuse E(#rows(range access)) as
E(#rows(ref[_or_null](const) access)) only when it is appropriate.
[This is the final cumulative patch]
Don't run the range analyzer on huge "t.key NOT IN (...)" lists, as that uses
too much memory. Instead, either create the equivalent SEL_TREE manually, or
create only two ranges that strictly include the area to scan.
(Note: just to re-iterate: increasing NOT_IN_IGNORE_THRESHOLD will make the
optimization run slower for big IN-lists, but the server will not run out of
memory. The O(N^2) memory use has been eliminated.)
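A hedged sketch of the threshold guard (the constant's value follows
the N > 1000 figure above; the function itself is hypothetical):

  #include <cstddef>

  static const std::size_t NOT_IN_IGNORE_THRESHOLD = 1000;

  /* For huge NOT IN lists, produce no range list at all: memory stays
     bounded and the optimizer falls back to other access methods. */
  bool should_build_not_in_ranges(std::size_t n_constants) {
    return n_constants <= NOT_IN_IGNORE_THRESHOLD;
  }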
The bug was due to a missed case in the detection of whether an index
can be used for loose scan. More precisely, the range optimizer chose
to use loose index scan for queries for which the condition(s) over
an index key part could not be pushed to the index together with the
loose scan.
As a result, loose index scan selected the first row in the index
with a given GROUP BY prefix and applied the WHERE clause after that,
while it should have inspected all rows with the given prefix and
applied the WHERE clause to each of them.
The fix detects and skips such cases.
- Added empty constructors and virtual destructors to many classes and structs
- Removed some usage of the offsetof() macro, using C++ class pointers instead
If check_quick_select() returns a non-empty range, the function
cost_group_min_max() must not return 0 as the estimate of the number
of retrieved records.
Yet in some situations the function erroneously returned 0 as that
estimate.
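The invariant can be sketched as a simple clamp (illustrative, not the
actual cost_group_min_max() code):

  /* If check_quick_select() found a non-empty range, at least one
     record will be retrieved, so the estimate must never be 0. */
  unsigned long group_min_max_records(bool range_non_empty,
                                      unsigned long estimate) {
    if (range_non_empty && estimate == 0)
      return 1;              /* clamp: non-empty range => >= 1 record */
    return estimate;
  }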
- Fixed tests
- Optimized new code
- Fixed some unlikely core dumps
- Better bug fixes for:
- #14397 - OPTIMIZE TABLE with an open HANDLER causes a crash
- #14850 (ERROR 1062 when querying a view using a GROUP BY on a column that can be null)
Allow for configuration of the maximum number of indexes per table.
Added and used a configure.in macro.
Replaced fixed limits by the configurable limit.
Limited the number of MyISAM indexes to the engine's hard limit.
Fixed a bug in opt_range.cc for many indexes with InnoDB.
Tested for 2, 63, 64, 65, 127, 128, 129, 255, 256, and 257 indexes.
Testing this part of the bugfix requires rebuilding of the server
with different options. This cannot be done with our test suite.
Therefore I added the necessary test files to the bug report.
If you repeat the tests, please note that the ps_* tests fail for
everything but 64 indexes. This is because of differences in the
meta data, namely field lengths for index names etc.
A covering ROR-intersection is now constructed by starting with an
empty index set and adding indexes to it until it becomes covering.
If the set becomes covering after adding the first index, return NULL
and don't try constructing a ROR-intersection of one index (which
caused a crash).
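A greedy sketch of the fixed construction, using bitmasks as a
stand-in for field coverage (illustrative, not the server's types):

  #include <cstddef>
  #include <vector>

  /* Start with an empty index set and OR in each index's field mask
     until the set covers all needed fields. If it becomes covering
     after the very first index, report failure ("return NULL") instead
     of building a one-index ROR-intersection, which used to crash. */
  bool pick_covering_set(const std::vector<unsigned long> &index_field_masks,
                         unsigned long needed_fields, std::size_t *n_used) {
    unsigned long covered = 0;
    *n_used = 0;
    for (std::size_t i = 0; i < index_field_masks.size(); i++) {
      covered |= index_field_masks[i];
      ++*n_used;
      if ((covered & needed_fields) == needed_fields) /* now covering */
        return *n_used > 1;  /* a single covering index: don't intersect */
    }
    return false;            /* the set never became covering */
  }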
A loose index scan using only the second part of a multipart index was
chosen, which resulted in wrong keys being created and an endless
loop.
get_best_group_min_max() now allows loose index scan for DISTINCT only
if the used key parts form a prefix of the index.
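The added prefix check can be sketched like this (illustrative;
used[i] says whether key part i of the index is used by the query):

  #include <cstddef>
  #include <vector>

  /* The used key parts must be exactly parts 0..k-1 of the index: a
     leading run with no holes. E.g. {true,false} is a prefix, while
     {false,true} (only the second key part, the buggy case) is not. */
  bool used_parts_form_prefix(const std::vector<bool> &used) {
    std::size_t i = 0;
    while (i < used.size() && used[i]) i++;  /* consume the leading run */
    for (; i < used.size(); i++)
      if (used[i]) return false;             /* a used part after a hole */
    return true;
  }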
Bad usages of strings with a fixed length were corrected.
The incorrect length in the trigger file configuration descriptor was
fixed (BUG#14090).
A hook for unknown keys was added to the parser to support old .TRG
files.
... large table gives server crash": make sure that when a MyISAM
temporary table is created for a cursor, it is created in the cursor's
memory root, not the memory root of the current query.
An invalid date like 2000-02-32 wasn't converted to int, which led to
the index not being used and the field being compared as a string,
resulting in slow query execution.
convert_constant_item() and get_mm_leaf() now force MODE_INVALID_DATES
to allow such a conversion.
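The idea behind the fix, as a hedged sketch (the validation rules and
the function below are illustrative, not the server's
convert_constant_item()/get_mm_leaf()):

  #include <cstdio>

  /* With a permissive "invalid dates" mode the constant is still
     packed into its integer YYYYMMDD form, so the index stays usable;
     in strict mode the conversion fails and the column would be
     compared as a string instead. */
  bool date_to_int(int y, int m, int d, bool allow_invalid_dates, long *out) {
    static const int days[] = {31,28,31,30,31,30,31,31,30,31,30,31};
    bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    int max_day = (m == 2 && leap) ? 29 : days[m - 1];
    if (d > max_day && !allow_invalid_dates)
      return false;                         /* strict mode: refuse */
    *out = (long) y * 10000 + m * 100 + d;  /* packed YYYYMMDD */
    return true;
  }

  int main() {
    long v;
    if (date_to_int(2000, 2, 32, /* allow_invalid_dates */ true, &v))
      std::printf("%ld\n", v);              /* 20000232, comparable as int */
    return 0;
  }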