The problem was in incorrect handling of predicates involving
NULL as a constant value by the range optimizer.
For example, when creating a SEL_ARG node from a condition of
the form "field < const" (which would normally result in the
"NULL < field < const" SEL_ARG), the special case when "const"
is NULL was not taken into account, so "NULL < field < NULL"
was produced for the "field < NULL" condition.
As a result, SEL_ARG structures of this form could not be
further optimized which in turn could lead to incorrectly
constructed SEL_ARG trees. In particular, code assuming SEL_ARG
structures to always form a sequence of ordered disjoint
intervals could enter an infinite loop under some
circumstances.
Fixed by changing get_mm_leaf() so that for any sargable
predicate except "<=>" involving NULL as a constant, "empty"
SEL_ARG is returned, since such a predicate is always false.
Problem was that the partition containing NULL values
was pruned away, since '2001-01-01' < '2001-02-00' but
TO_DAYS('2001-02-00') is NULL.
Added the NULL partition for RANGE/LIST partitioning on TO_DAYS()
function to be scanned too.
Also fixed a bug that added ALLOW_INVALID_DATES to sql_mode
(SELECT * FROM t WHERE date_col < '1999-99-99' on a RANGE/LIST
partitioned table would add it).
There were a problem since pruning uses the field
for comparison (while evaluate_join_record uses longlong),
resulting in pruning failures when comparing DATE to DATETIME.
Fix was to always comparing DATE vs DATETIME as DATETIME,
by adding ' 00:00:00' to the DATE string.
And adding optimization for comparing with 23:59:59, so that
DATETIME_col > '2001-02-03 23:59:59' ->
TO_DAYS(DATETIME_col) > TO_DAYS('2001-02-03 23:59:59') instead
of '>='.
Problem was an errornous date that lead to end partition
was before the start, leading to a crash.
Solution was to check greater or equal instead of only
equal between start and end partition.
NOTE: partitioning pruning handles incorrect dates
differently than index lookup, which can give different
results in a partitioned table versus a non partitioned
table for queries having 'bad' dates in the where clause.
problem was that ha_partition::records was not implemented, thus
using the default handler::records, which is not correct if the engine
does not support HA_STATS_RECORDS_IS_EXACT.
Solution was to implement ha_partition::records as a wrapper around
the underlying partitions records.
The rows column in explain partitions will now include the total
number of records in the partitioned table.
(recommit after removing out-commented code)
- Introduced val_int_endpoint() function which converts between func
argument intervals and func value intervals for monotonic functions.
- Made partition interval analyzer use part_expr->val_int_endpoint()
to check if the edge values should be included.
In the ha_partition::position() we didn't calculate the number
of the partition of the record. We used m_last_part value instead,
relying on that it is set in other place like previous call of a method
like ::write_row(). In replication we don't call any of these befor
position(). Delete_rows_log_event::do_exec_row calls find_and_fetch_row.
In case of InnoDB-based PARTITION table, we have HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
enabled, so use position() / rnd_pos() calls to fetch the record.
Fixed by adding partition_id calculation to the ha_partition::position()
- Make the range-et-al optimizer produce E(#table records after table
condition is applied),
- Make the join optimizer use this value,
- Add "filtered" column to EXPLAIN EXTENDED to show
fraction of records left after table condition is applied
- Adjust test results, add comments
The problem appeared because the same values produced different hash
during INSERT and SELECT for VARCHAR data type.
Fix:
VARCHAR required special treatment to avoid hashing of length bytes
(leftmost one or two bytes) as well as trailing bytes beyond real length,
which could contain garbage. Fix is done by introducing hash() - new method
in the Field class.
In select_describe(), make the String object that holds the value of
"partitions" column to "own" the value buffer, so the buffer isn't
prematurely freed.
[this is the second attempt with review fixes]
(for example, such objects can be returned from get_mm_parts() for "string_field = int_val).
Make find_used_partitions_imerge() to handle such SEL_TREE objects.
* Produce right results for conditions that were transformed to "(partitioning_range) AND
(list_of_subpartitioning_ranges)": make each partition id set iterator auto-reset itself
after it has returned all partition ids in the sequence
* Fix "Range mapping" and "Range mapping" partitioning interval analysis functions to
correctly deal with NULL values.
obtain partition number, call partition_info->get_part_partition_id() when
the table has subpartitions, and get_partition_id() otherwise. (The bug
was that we were always doing the latter)
Final patch
-----------
This WL is about using this bitmap in all parts of the partition handler.
Thus for:
rnd_init/rnd_next
index_init/index_next and all other variants of index scans
read_range_... the various range scans implemented in the partition handler.
Also use those bitmaps in the various other calls that currently loop over all
partitions.
- post-...-post review fixes
- Added "integer range walking" that allows to do partition pruning for "a <=? t.field <=? b"
by finding used partitions for a, a+1, a+2, ..., b-1, b.
- Added more comments.
- Added a RANGE_OPT_PARAM::remove_jump_scans flag that disables construction of index_merge
SEL_TREEs that represent unusable conditions like "key1part1<c1 OR key2part2<c2"
- make prune_partitions() function handle the case where range analysis produces a list of
index_merge trees (it turned out that this is possible, appropriate test case added).
- Other small fixes.