We hit this assert during the creation of a temporary table field
because the current code does not handle the case when the value
of the NAME_CONST function is NULL.
Fixed this by allowing creation of temporary table fields even
for the case when NAME_CONST returns a NULL value.
Introduced a tmp_table_field_from_field_type_maybe_null() function
in the Item class so that both Item_basic_value and Item_name_const can use it.
Introduced a virtual method get_func_item() in the Item class.
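A minimal hypothetical sketch (the failing statement is not quoted in this
message; table and alias names are made up) where NAME_CONST with a NULL
value has to be materialized into a temporary table field:

  create table t1 (a int);
  insert into t1 values (1),(2);
  -- NAME_CONST('x', NULL) evaluates to NULL; grouping by it forces
  -- creation of a temporary table field from the item
  select NAME_CONST('x', NULL) as x, count(*) from t1 group by x;
  drop table t1;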
The problem happened in the derived condition pushdown code:
- When Item_func_regex::build_clone() was called, it created a copy of
the original Item_func_regex, and this copy got registered in free_list.
Class-specific additional dynamic members (such as "re") were
copied shallowly, rather than deeply, into the cloned Item_func_regex.
As a result, the Regexp_processor_pcre::m_pcre of the cloned Item_func_regex
and of the original Item_func_regex pointed to the same compiled regular
expression.
- On cleanup_items(), both original and cloned copies of Item_func_regex
called re.cleanup(), which called pcre_free(m_pcre). So the same compiled
regular expression was freed two times, which was noticed by ASAN.
The same problem was repeatable for Item_func_regexp_instr.
A similar problem happened for Item_func_sp, for the sp_result_field member.
Both original and cloned copies of Item_func_sp pointed to the same Field instance
and both deleted it on cleanup().
A possible solution would be to fix build_clone() to create deep
(instead of shallow) copies for the dynamic members of the affected classes
(Item_func_regex, Item_func_regexp_instr, Item_func_sp).
However, this would be too complex.
As agreed with Galina and Igor, this patch disallows using these
affected classes in derived condition pushdown by overriding get_clone()
to return NULL.
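A hedged sketch (not the original ASAN repro, which is not quoted here) of the
kind of query that exercises this path: a REGEXP condition on a materialized
derived table, so that cloning of the condition for pushdown is attempted:

  create table t1 (a varchar(10));
  insert into t1 values ('abc'),('xyz');
  -- the REGEXP condition is a candidate for pushdown into the
  -- materialized derived table, which clones Item_func_regex
  select * from (select a, count(*) from t1 group by a) as dt
  where dt.a regexp 'a';
  drop table t1;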
Consider an IN predicate with ROW-type arguments:
predicant IN (value1, ..., valueM)
where predicant and all values consist of N elements.
When performing IN for these arguments, at every position i (1..N)
only the data type of the i-th element of the predicant was taken into
account, while the data types of the i-th elements of value1..valueM
were ignored.
This led to bad comparison data type detection, e.g. when
mixing unsigned and signed integer values.
After this change all element data types are taken into account.
So, for example, a mixture of unsigned and signed values is
now calculated using decimal and does not overflow any more.
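A hedged illustration (not the original test case) of such a signed/unsigned
mixture at the same ROW position:

  -- -1 (signed) vs 18446744073709551615 (unsigned BIGINT) at position 2:
  -- with per-position aggregation of all element data types the
  -- comparison is done in DECIMAL and no longer overflows
  select (1, -1) in ((1, 18446744073709551615), (2, 2)) as found;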
Detailed changes:
1. All comparators for ROW elements are now created recursively
at fix_fields() time, inside cmp_item_row::prepare_comparators().
Previously prepare_comparators() installed comparators only
for temporal data types, while comparators for other types were
installed at execution time, in cmp_item_row::store_value().
2. Removing the comparator creation code from cmp_item_row::store_value().
It was responsible for non-temporal data types.
3. Removing find_date_time_item(). It's not needed any more.
All ROW-element data types are now covered by
cmp_item_row::prepare_comparators().
4. Adding a helper method Item_args::alloc_and_extract_row_elements()
to extract elements from an array of ROW-type Items, from the given
position. Using this method to collect elements from the i-th
position and pass them to
Type_handler_hybrid_field_type::aggregate_for_comparison().
5. Moving the call for alloc_comparators() inside
cmp_item_row::prepare_comparators(). This helps
to call prepare_comparators() for ROW elements recursively
(if elements appear to be ROWs again).
Moving alloc_comparators() from "public" to "private".
Now the boolean data type is preserved in hybrid functions and MIN/MAX,
so COALESCE(bool_expr,bool_expr) and MAX(bool_expr) are correctly
detected by JSON_OBJECT() as being boolean rather than numeric expressions.
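A short hedged illustration (the exact output format may vary by version):

  -- COALESCE/MAX of boolean expressions keep the boolean type,
  -- so JSON_OBJECT emits true/false rather than 1/0
  select JSON_OBJECT('c', COALESCE(1=1, NULL),
                     'm', (select MAX(1=0))) as j;
  -- expected: {"c": true, "m": false}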
The logic and the implementation scheme are similar to those of
MDEV-9197 Pushdown conditions into non-mergeable views/derived tables.
How the pushdown is made, shown on an example:
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y) from t2 group by x);
-->
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y)
from t2
where x>3
group by x
having max(y)>10);
The implementation scheme:
1. Search for the condition cond that depends only on the fields
from the left part of the IN subquery (left_part)
2. Find fields F_group in the select of the right part of the
IN subquery (right_part) that are used in the GROUP BY
3. Extract from cond the condition cond_where that depends only on the
fields of the left_part that stay at the same positions in the left_part
(have the same indexes) as the F_group fields in the projection of the
right_part
4. Transform cond_where so it can be pushed into the WHERE clause of the
right_part and remove cond_where from cond
5. Transform cond so it can be pushed into the HAVING clause of the right_part
The optimization is made in
Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the
variable condition_pushdown_for_subquery.
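Assuming the control is exposed as an optimizer_switch flag of that name (an
assumption here, check the actual server variable), the pushdown can be
toggled and inspected for the example above like this:

  -- assumption: condition_pushdown_for_subquery is an optimizer_switch flag
  set optimizer_switch='condition_pushdown_for_subquery=on';
  explain extended
  select * from t1
  where a>3 and b>10 and
        (a,b) in (select x,max(y) from t2 group by x);
  show warnings;  -- the note shows the query as seen after optimization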
New test file in_subq_cond_pushdown.test is created.
There are also some changes made to setup_jtbm_semi_joins().
It is now decomposed into two procedures: setup_degenerate_jtbm_semi_joins(),
which is called before optimize_cond() for cond, and setup_jtbm_semi_joins(),
which is called after optimize_cond().
The new setup_jtbm_semi_joins() is written so that the result of its work is
the same as if it were called before optimize_cond().
The code that is common for pushdown into materialized derived tables and into
materialized IN subqueries is factored out into pushdown_cond_for_derived(),
Item_in_subselect::pushdown_cond_for_in_subquery() and
st_select_lex::pushdown_cond_into_where_clause().
Failure upon SELECT with impossible condition.
The problem appears because of a wrong implementation of the
Item_func_in::build_clone() method: it didn't clone the 'array' and
'cmp_fields' fields for the cloned IN predicate, and this could cause crashes.
The Item_func_in::fix_length_and_dec() method was refactored and a new method
Item_func_in::create_array() was introduced, which allows creating 'array'
for cloned IN predicates in a proper way.
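A hypothetical sketch (the actual failing statement is not quoted here) of a
query shape where an IN predicate is cloned during condition pushdown into a
materialized derived table, together with an impossible condition:

  create table t1 (a int);
  insert into t1 values (1),(2),(3);
  -- the IN predicate (with its 'array' of constants) is cloned when
  -- the outer condition is pushed into the materialized derived table
  select * from (select a, count(*) from t1 group by a) as dt
  where dt.a in (1,2) and 1=2;
  drop table t1;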
This is a 10.3 version of 27d94b7e03.
It disables caching of the first argument of IN
if it's of a temporal type, because other types are not
cached in this context.
Reorder items in the args[] array. Instead of
when1,then1,when2,then2,...[,case][,else]
sort them as
[case,]when1,when2,...,then1,then2,...[,else]
This way all items used for comparison take a contiguous part
of the array and can be aggregated directly, and all items that
can be returned take a contiguous part of the array and can be
aggregated directly. The old code had to copy them to a temporary
array before aggregation, and then copy back (thd->change_item_tree)
everything that was changed.
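For a concrete simple CASE expression (illustrative only), the two layouts
described above map onto the items like this:

  -- for: CASE expr WHEN 1 THEN 'x' WHEN 2 THEN 'y' ELSE 'z' END
  -- old args[] layout: when1, then1, when2, then2, case, else
  -- new args[] layout: case, when1, when2, then1, then2, else
  --   comparison items (case, when1, when2) are now contiguous,
  --   returnable items (then1, then2, else) are now contiguous
  select case 2 when 1 then 'x' when 2 then 'y' else 'z' end as r;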
This is a 10.3 version of bf1ca14ff3.
Conversion of a subquery to a semi-join is blocked when we have an
IN subquery predicate in the on_expr of an outer join. Currently this
scenario is handled, but in cases when the IN subquery predicate is wrapped
inside an Item_in_optimizer item this blocking is not done.
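A hedged sketch (table names illustrative) of the problematic shape: an IN
subquery predicate inside the ON expression of an outer join, where it may
already be wrapped into an Item_in_optimizer item:

  create table t1 (a int);
  create table t2 (b int);
  create table t3 (c int);
  -- the IN predicate sits in the ON expression of the LEFT JOIN;
  -- semi-join conversion must stay blocked here even when the predicate
  -- is wrapped inside an Item_in_optimizer item
  select * from t1 left join t2
         on t1.a = t2.b and t2.b in (select c from t3);
  drop table t1, t2, t3;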
cmp_item_sort_string::store_value() did not cache the string returned
from item->val_str(), whose result can point to various private members
such as Item_char_typecast::tmp_value.
- cmp_item_sort_string::store_value() remembered the pointer returned
from item->val_str(), pointing to tmp_value, in cmp_item_string::value_res.
- Later, cmp_item_real::store_value() was called, which called
Item_str_func::val_real(), which called Item_char_typecast::val_str(&tmp)
using a local stack variable "String tmp". Item_char_typecast::tmp_value
was overwritten and became a link to "tmp":
tmp_value.Ptr freed its own buffer and was set to point to the buffer
owned by "tmp".
- On return from Item_str_func::val_real(), "String tmp" was destructed,
but "tmp_value" still pointed to the buffer owned by "tmp",
so tmp_value.Ptr became invalid.
- Then cmp_item_sort_string() passed cmp_item_string::value_res to sortcmp().
At this point, value_res still pointed to an invalid value of
Item_char_typecast::tmp_value.
Fix:
changing cmp_item_sort_string::store_value() to force copying
to cmp_item_string::value if item->val_str(&value) returned
a different pointer (instead of &value).
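A hypothetical illustration (not the original repro) of a mixed-type
comparison where both the string and the real comparator store the result
of a CAST, whose val_str() may point into Item_char_typecast::tmp_value:

  create table t1 (a varchar(10));
  insert into t1 values ('b');
  -- the mixed string/numeric IN list makes the IN predicate use both a
  -- string comparator and a real comparator on the CAST result
  select * from t1 where cast(a as char(2)) in ('b', 1.5);
  drop table t1;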
Refactor get_datetime_value() not to create Item_cache_temporal(),
but do it always in ::fix_fields() or ::fix_length_and_dec().
Creating items at execution time doesn't work very well with
virtual columns and check constraints, which are fixed and executed
in different THDs.
It's a generic function, not using anything from Arg_comparator.
Make it a static function, not a class method, to be able to use
it later without Arg_comparator.
The problem was that Item_func_hybrid_field_type::get_date() did not
convert the result to the correct data type, so MYSQL_TIME::time_type
of the get_date() result could be out of sync with field_type().
Changes:
1. Adding two new classes Datetime and Date to store MYSQL_TIMESTAMP_DATETIME
and MYSQL_TIMESTAMP_DATE values respectively
(in addition to the earlier added class Time, for MYSQL_TIMESTAMP_TIME values).
2. Adding Item_func_hybrid_field_type::time_op().
It performs the operation using TIME representation,
and always returns a MYSQL_TIME value with time_type=MYSQL_TIMESTAMP_TIME.
Implementing time_op() for all affected children classes.
3. Fixing all implementations of date_op() to perform the operation
using strictly DATETIME representation. Now they always return a MYSQL_TIME
value with time_type=MYSQL_TIMESTAMP_{DATE|DATETIME},
according to the result data type.
4. Removing the assignment of ltime.time_type to mysql_timestamp_type()
from all val_xxx_from_date_op(), because now date_op() makes sure
to return a proper MYSQL_TIME value with a good time_type (and other members).
5. Adding Item_func_hybrid_field_type::val_xxx_from_time_op().
6. Overriding Type_handler_time_common::Item_func_hybrid_field_type_val_xxx()
to call val_xxx_from_time_op() instead of val_xxx_from_date_op().
7. Modifying Item_func::get_arg0_date() to return strictly a TIME value
if TIME_TIME_ONLY is passed, or return strictly a DATETIME value otherwise.
If args[0] returns a value of a different temporal type
(for example a TIME value when TIME_TIME_ONLY was not passed,
or a DATETIME value when TIME_TIME_ONLY was passed), the conversion
is automatically applied.
Earlier, get_arg0_date() did not guarantee a result in
accordance with the TIME_TIME_ONLY flag.
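A small hedged example of the behaviour described above (the expected results
follow from the description, not from a quoted test): hybrid functions whose
result data type is TIME go through time_op() and keep a true TIME value,
including negative values:

  -- the result data type is TIME, so time_op() is used and the returned
  -- MYSQL_TIME has time_type=MYSQL_TIMESTAMP_TIME
  select least(time'-10:00:00', time'10:00:00') as t1,
         coalesce(time'838:59:59', null)        as t2;
  -- expected: t1 = '-10:00:00', t2 = '838:59:59'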
There were two problems related to the bug report:
1. Item_datetime::get_date() was not implemented.
So execution went through val_int() followed
by int-to-datetime or int-to-time conversion.
This was the reason why the optimizer did not
work well on data with fractional seconds.
2. Item_datetime::set() did not have TIME specific code
to mix months and days into hours after unpack_time().
This is why the optimizer did not work well with negative
TIME values, as well as with huge time values.
Changes:
1. Overriding Item_datetime::get_date() to return ltime.
This fixes problem N1.
2. Cleanup: Moving pack_time() and unpack_time() from
sql-common/my_time.c and include/my_time.h to
sql/sql_time.cc and sql/sql_time.h, as they are not needed
on the client side.
3. Adding a new "enum_mysql_timestamp_type ts_type" parameter
to unpack_time() and moving the TIME specific code to mix
months and days into hours inside unpack_time().
Adding a new "ts_type" parameter to Item_datetime::set(),
to pass it from the caller down to unpack_time().
So now the TIME specific code is automatically called
from Item_datetime::set(). This fixes problem N2.
This change also helped to get rid of duplicate TIME specific code
in three other places, where mixing months/days into hours
was done immediately after unpack_time().
Moving the DATE specific code that zeroes hhmmssff
from Item_func_min_max::get_date_native to inside unpack_time(),
for symmetry.
4. Removing the virtual method in_vector::result_type(),
adding in_vector::type_handler() instead.
This makes it easier to get result_type(), field_type() and
mysql_timestamp_type() of an in_vector.
Passing type_handler()->mysql_timestamp_type() as
a new parameter to Item_datetime::set() inside
in_temporal::value_to_item().
5. Cleanup: Removing separate implementations of in_datetime::get_value()
and in_time::get_value(). Adding a single implementation
in_temporal::get_value() instead.
Passing type_handler()->field_type() to get_value_internal().
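A hedged illustration (made-up table and data) of the two problems described
above (fractional seconds; negative and huge TIME values), both going through
Item_datetime in IN lists and range optimization:

  create table t1 (t time(6), key(t));
  insert into t1 values ('-10:20:30.123456'), ('838:59:59'), ('10:20:30.5');
  -- fractional seconds and negative/huge TIME values now survive the
  -- round trip through Item_datetime::set()/get_date()
  select * from t1
  where t in (time'-10:20:30.123456', time'838:59:59');
  drop table t1;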
Handle string length as size_t, consistently (almost always :)).
Change function prototypes to accept size_t, where in the past
ulong or uint were used. Change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider, sphinx and connect for now.
MDEV-14957: JOIN::prepare gets unusable "conds" as argument
- Do not touch merged derived tables (the merge is irreversible).
- Fix the first argument of in_optimizer for calls possible before fix_fields().