Summary:
This patch enables possible index optimization when
the WHERE clause has an IN condition of the form:
signed_or_unsigned_column IN (signed_or_unsigned_constant,
signed_or_unsigned_constant
[,signed_or_unsigned_constant]*)
when the IN list constants are of different signess, e.g.:
WHERE signed_column IN (signed_constant, unsigned_constant ...)
WHERE unsigned_column IN (signed_constant, unsigned_constant ...)
Details:
In a condition like:
WHERE unsigned_predicant IN (1, LONGLONG_MAX + 1)
comparison handlers for individual (predicant,value) pairs are
calculated as follows:
* unsigned_predicant and 1 produce &type_handler_newdecimal
* unsigned_predicant and (LONGLONG_MAX + 1) produce &type_handler_slonglong
The old code decided that it could not use bisection because
the two pairs had different comparison handlers.
As a result, bisection was not allowed, and, in case of
an indexed integer column predicant the index on the column was not used.
The new code catches special cases like:
signed_predicant IN (signed_constant, unsigned_constant)
unsigned_predicant IN (signed_constant, unsigned_constant)
It enables bisection using in_longlong, which supports a mixture
of predicant and values of different signess.
In case when the predicant is an indexed column this change
automatically enables index range optimization.
Thanks to Vicențiu Ciorbaru for proposing the idea and for preparing MTR tests.
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
How the pushdown is performed on the example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.
Consider an IN predicate with ROW-type arguments:
predicant IN (value1, ..., valueM)
where predicant and all values consist of N elements.
When performing IN for these arguments, at every position i (1..N)
only data type of i-th element of predicant was taken into account,
while data types on i-th elements of value1..valueM were not taken.
These led to bad comparison data type detection, e.g. when
mixing unsigned and signed integer values.
After this change all element data types are taken into account.
So, for example, a mixture of unsigned and signed values is
now calculated using decimal and does not overflow any more.
Detailed changes:
1. All comparators for ROW elements are now created recursively
at fix_fields() time, inside cmp_item_row::prepare_comparators().
Previously prepare_comparators() installed comparators only
for temporal data types, while comparators for other types were
installed at execution time, in cmp_item_row::store_value().
2. Removing comparator creating code from cmp_item_row::store_value().
It was responsible for non-temporal data types.
3. Removing find_date_time_item(). It's not needed any more.
All ROW-element data types are now covered by
cmp_item_row::prepare_comparators().
4. Adding a helper method Item_args::alloc_and_extract_row_elements()
to extract elements from an array of ROW-type Items, from the given
position. Using this method to collect elements from the i-th
position and further pass them to
Type_handler_hybrid_field_type::aggregate_for_comparison().
5. Moving the call for alloc_comparators() inside
cmp_item_row::prepare_comparators(). This helps
to call prepare_comparators() for ROW elements recursively
(if elements appear to be ROWs again).
Moving alloc_comparators() from "public" to "private".
MDEV-16426 Optimizer erroneously treats equal constants of different formats as same
A cleanup for MDEV-14630: fixing a crash in Item_decimal::eq().
Problems:
- old implementations of Item_decimal::eq() and
Item_temporal_literal::eq() were not symmetric
with Item_param::eq(), this caused MDEV-11361.
- old implementations for DECIMAL and temporal data types
did not take into account that in case when eq() is called
with binary_cmp==true, {{eq()}} should check not only equality
of the two values, but also equality if their decimal precision.
This cuases MDEV-16426.
- Item_decimal::eq() crashes with "item" pointing
to a non-DECIMAL value. Before MDEV-14630
non-DECIMAL values were filtered out by the test:
type() == item->type()
as literals of different types had different type().
After MDEV-14630 type() for literals of all data types return CONST_ITEM.
This caused failures in tests:
./mtr engines/iuds.insert_number
./mtr --ps --embedded main.explain_slowquerylog
(revealed by buildbot)
The essence of the fix:
Making literals and Item_param reuse the same code to avoid
asymmetries between Item_param::eq(Item_literal) and
Item_literal::eq(Item_param), now and in the future, and to
avoid code duplication between Item_literal and Item_param.
Adding tests for "decimals" for DECIMAL and temporal data types,
to treat constants of different scale as not equal when "binary_cmp"
is "true".
Details:
1. Adding a helper class Item_const to extract constant values from Items easier
2. Deriving Item_basic_value from Item_const
3. Joining Type_handler::Item_basic_value_eq() and Item_basic_value_bin_eq()
into a single method with an extra "binary_cmp" argument
(it looks simple this way) and renaming the new method to Item_const_eq().
Modifying its implementations to operate with
Item_const instead of Item_basic_value.
4. Adding a new class Type_handler_hex_hybrid,
to handle hex constants like 0x616263.
5. Removing Item::VARBIN_ITEM and fixing Item_hex_constant to
use type_handler_hex_hybrid instead of type_handler_varchar.
Item_hex_hybrid::type() now returns CONST_ITEM, like all
other literals do.
6. Move virtual methods Item::type_handler_for_system_time() and
Item::cast_to_int_type_handler() from Item to Type_handler.
7. Removing Item_decimal::eq() and Item_temporal_literal::eq().
These classes are now handled by the generic Item_basic_value::eq().
8. Implementing Type_handler_temporal_result::Item_const_eq()
and Type_handler_decimal_result::Item_const_eq(),
this fixes MDEV-11361.
9. Adding tests for "decimals" into
Type_handler_decimal_result::Item_const_eq() and
Type_handler_temporal_result::Item_const_eq()
in case if "binary_cmp" is true.
This fixes MDEV-16426.
10. Moving Item_cache out of Item_basic_value.
They share nothing. It simplifies implementation
of Item_basic_value::eq(). Deriving Item_cache
directly from Item.
11. Adding class DbugStringItemTypeValue, which
used Item::print() internally, and using
in instead of the old debug printing code.
This gives nicer output in func_debug.result.
Changes N5 and N6 do not directly relate to the bugs fixed,
but make the code fully symmetric across all literal types.
Without a new handler Type_handler_hex_hybrid we'd have
to keep two code branches (for regular literals and for
hex hybrid literals).
reorder items in args[] array. Instead of
when1,then1,when2,then2,...[,case][,else]
sort them as
[case,]when1,when2,...,then1,then2,...[,else]
in this case all items used for comparison take a continuous part
of the array and can be aggregated directly. and all items that
can be returned take a continuous part of the array and can be
aggregated directly. Old code had to copy them to a temporary
array before aggreation, and then copy back (thd->change_item_tree)
everything that was changed.
this is a 10.3 version of bf1ca14ff3