MDEV-17625 Different warnings when comparing a garbage to DATETIME vs TIME
- Splitting processes of data type conversion (to TIME/DATE,DATETIME)
and warning generation.
Warning are now only get collected during conversion (in an "int" variable),
and are pushed in the very end of conversion (not in parallel).
Warnings generated by the low level routines str_to_xxx() and number_to_xxx()
can now be changed at the end, when TIME_FUZZY_DATES is applied,
from "Invalid value" to "Truncated invalid value".
Now "Illegal value" is issued only when the low level routine returned
an error and TIME_FUZZY_DATES was not set. Otherwise, if the low level
routine returned "false" (success), or if NULL was converted to a zero
datetime by TIME_FUZZY_DATES, then "Truncated illegal value"
is issued. This gives better warnings.
- Methods Type_handler::Item_get_date() and
Type_handler::Item_func_hybrid_field_type_get_date() now only
convert and collect warning information, but do not push warnings.
- Changing the return data type for Type_handler::Item_get_date()
and Type_handler::Item_func_hybrid_field_type_get_date() from
"bool" to "void". The conversion result (success vs error) can be
checked by testing ltime->time_type. MYSQL_TIME_{NONE|ERROR}
mean mean error, other values mean success.
- Adding new wrapper methods Type_handler::Item_get_date_with_warn() and
Type_handler::Item_func_hybrid_field_type_get_date_with_warn()
to do conversion followed by raising warnings, and changing
the code to call new Type_handler::***_with_warn() methods.
- Adding a helper class Temporal::Status, a wrapper
for MYSQL_TIME_STATUS with automatic initialization.
- Adding a helper class Temporal::Warn, to collect warnings
but without actually raising them. Moving a part of ErrConv
into a separate class ErrBuff, and deriving both Temporal::Warn
and ErrConv from ErrBuff. The ErrBuff part of Temporal::Warn
is used to collect textual representation of the input data.
- Adding a helper class Temporal::Warn_push. It's used
to collect warning information during conversion, and
automatically pushes warnings to the diagnostics area
on its destructor time (in case of non-zero warning).
- Moving more code from various functions inside class Temporal.
- Adding more Temporal_hybrid constructors and
protected Temporal methods make_from_xxx(),
which convert and only collect warning information, but do not
actually raise warnings.
- Now the low level functions str_to_datetime() and str_to_time()
always set status->warning if the return value is "true" (error).
- Now the low level functions number_to_time() and number_to_datetime()
set the "*was_cut" argument if the return value is "true" (error).
- Adding a few DBUG_ASSERTs to make sure that str_to_xxx() and
number_to_xxx() always set warnings on error.
- Adding new warning flags MYSQL_TIME_WARN_EDOM and MYSQL_TIME_WARN_ZERO_DATE
for the code symmetry. Before this change there was a special
code path for (rc==true && was_cut==0) which was treated by
Field_temporal::store_invalid_with_warning as "zero date violation".
Now was_cut==0 always means that there are no any error/warnings/notes
to be raised, not matter what rc is.
- Using new Temporal_hybrid constructors in combination with
Temporal::Warn_push inside str_to_datetime_with_warn(),
double_to_datetime_with_warn(), int_to_datetime_with_warn(),
Field::get_date(), Item::get_date_from_string(), and a few other places.
- Removing methods Dec_ptr::to_datetime_with_warn(),
Year::to_time_with_warn(), my_decimal::to_datetime_with_warn(),
Dec_ptr::to_datetime_with_warn().
Fixing Sec6::to_time() and Sec6::to_datetime() to
convert and only collect warnings, without raising warnings.
Now warning raising functionality resides in Temporal::Warn_push.
- Adding classes Longlong_hybrid_null and Double_null, to
return both value and the "IS NULL" flag. Adding methods
Item::to_double_null(), to_longlong_hybrid_null(),
Item_func_hybrid_field_type::to_longlong_hybrid_null_op(),
Item_func_hybrid_field_type::to_double_null_op().
Removing separate classes VInt and VInt_op, as they
have been replaced by a single class Longlong_hybrid_null.
- Adding a helper method Temporal::type_name_by_timestamp_type(),
moving a part of make_truncated_value_warning() into it,
and reusing in Temporal::Warn::push_conversion_warnings().
- Removing Item::make_zero_date() and
Item_func_hybrid_field_type::make_zero_mysql_time().
They provided duplicate functionality.
Now this code resides in Temporal::make_fuzzy_date().
The latter is now called for all Item types when data type
conversion (to DATE/TIME/DATETIME) is involved, including
Item_field and Item_direct_view_ref.
This fixes MDEV-17563: Item_direct_view_ref now correctly converts
NULL to a zero date when TIME_FUZZY_DATES says so.
C++ does not guarantee the order of parameter evaluation.
It was wrong to pass item->val_int() and item->null_value
at the same time to any function or constructor.
Adding a new helper class Longlong_null, and new methods
Item::to_longlong_null() and Item_func_hybrid_field_type::to_longlong_null_op(),
which make sure to properly call val_int()/int_op() and test null_value.
Reorganizing the rest of the code accordingly.
We hit this assert during the create of a temporary table field
because the current code does not handle the case when the value
of the NAME_CONST function is NULL.
Fixed this by allowing creation of temporary table fields even
for the case when NAME_CONST returns NULL value.
Introduced tmp_table_field_from_field_type_maybe_null() function
in Item class so both Item_basic_value and Item_name_const can use it.
Introduced a virtual method get_func_item() in the Item class.
The problem happened in the derived condition pushdown code:
- When Item_func_regex::build_clone() was called, it created a copy of
the original Item_func_regex, and this copy got registered in free_list.
Class specific additional dynamic members (such as "re") made
a shallow copy, rather than a deep copy, in the cloned Item_func_regex.
As a result, the Regexp_processor_pcre::m_pcre of the cloned Item_func_regex
and of the original Item_func_regex pointed to the same compiled regular
expression.
- On cleanup_items(), both original and cloned copies of Item_func_regex
called re.cleanup(), which called pcre_free(m_pcre). So the same compiled
regular expression was freed two times, which was noticed by ASAN.
The same problem was repeatable for Item_func_regexp_instr.
A similar problem happened for Item_func_sp, for the sp_result_field member.
Both original and cloned copies of Item_func_sp pointed the same Field instance
and both deleted it on cleanup().
A possible solution would be to fix build_clone() to create deep
(instead of shallow) copies for the dynamic members of the affected classes
(Item_func_regex, Item_func_regexp_instr, Item_func sp).
However, this would be too complex.
As agreed with Galina and Igor, this patch disallows using using these
affected classes in derived condition pushdown by overriding get_clone()
to return NULL.
Detailed: changes:
1. Moving Field specific code into new methods on Field:
- Field *Field::create_tmp_field(...)
- virtual void init_for_tmp_table(...)
2. Removing virtual Item::create_tmp_field().
Adding instead a new virtual method Item::create_tmp_field_ex().
Note, a virtual create_tmp_field() still exists, but only for Item_sum.
This resembles 10.0 code structure. Perhaps create_tmp_field() should
be removed from Item_sum and Item_sum descendants should override
create_tmp_field_ex() directly. This can be done in a separate commit.
3. Adding helper classes Tmp_field_src and Tmp_field_param,
to make the API for Item::create_tmp_field_ex() smaller
and easier to extend in the future.
4. Decomposing the public function create_tmp_field() into
virtual implementations for Item and a number of its descendants:
- Item_basic_value
- Item_sp_variable
- Item_name_const
- Item_result_field
- Item_field
- Item_ref
- Item_type_holder
- Item_row
- Item_func_sp
- Item_func_user_var
- Item_sum
- Item_sum_field
- Item_proc
5. Adding DBUG_ASSERT-only virtual implementations for
Item types that should not appear in create_tmp_table_ex(),
for easier debugging:
- Item_nodeset_func
- Item_nodeset_to_const_comparator
- Item_null_result
- Item_copy
- Item_ident_for_show
- Item_user_var_as_out_param
6. Moving public function create_tmp_field_from_field()
as a method to Item_field.
7. Removing Item::set_result_field(). It's not needed any more.
8. Cleanup: Removing the enum value "EXPR_CACHE_ITEM",
as it's not used for a very long time.
preserve positions if the multi-update join is using tmp table:
* store positions in the tmp table if needed
JOIN::add_fields_for_current_rowid()
* take positions from the tmp table, not from file->position():
multi_update::prepare2()
The logic and the implementation scheme are similar with the
MDEV-9197 Pushdown conditions into non-mergeable views/derived tables
How the push down is made on the example:
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y) from t2 group by x);
-->
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y)
from t2
where x>3
group by x
having max(y)>10);
The implementation scheme:
1. Search for the condition cond that depends only on the fields
from the left part of the IN subquery (left_part)
2. Find fields F_group in the select of the right part of the
IN subquery (right_part) that are used in the GROUP BY
3. Extract from the cond condition cond_where that depends only on the
fields from the left_part that stay at the same places in the left_part
(have the same indexes) as the F_group fields in the projection of the
right_part
4. Transform cond_where so it can be pushed into the WHERE clause of the
right_part and delete cond_where from the cond
5. Transform cond so it can be pushed into the HAVING clause of the right_part
The optimization is made in the
Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the
variable condition_pushdown_for_subquery.
New test file in_subq_cond_pushdown.test is created.
There are also some changes made for setup_jtbm_semi_joins().
Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins()
that is called before optimize_cond() for cond and setup_jtbm_semi_joins()
that is called after optimize_cond().
New setup_jtbm_semi_joins() is made in the way so that the result of its work is
the same as if it was called before optimize_cond().
The code that is common for pushdown into materialized derived and into materialized
IN subqueries is factored out into pushdown_cond_for_derived(),
Item_in_subselect::pushdown_cond_for_in_subquery() and
st_select_lex::pushdown_cond_into_where_clause().
multiple times with different arguments.
If the ON expression of an outer join is an OR formula with one
of the disjunct being a constant formula then the expression
cannot be null-rejected if the constant formula is true. Otherwise
it can be null-rejected and if so the outer join can be converted
into inner join. This optimization was added in the patch for
mdev-4817. Yet the code had a defect: if the query was used in
a stored procedure with parameters and the constant item contained
some of them then the value of this constant item depended on the
values of the parameters. With some parameters it may be true,
for others not. The validity of conversion to inner join is checked
only once and it happens only for the first call of procedure.
So if the parameters in the first call allowed the conversion it
was done and next calls used the transformed query though there
could be calls whose parameters made the conversion invalid.
Fixed by cheking whether the constant disjunct in the ON expression
originally contained an SP parameter. If so the expression is not
considered as null-rejected. For this check a new item's attribute
was intruduced: Item::with_param. It is calculated for each item
by fix fields() functions.
Also moved the call of optimize_constant_subqueries() in
JOIN::optimize after the call of simplify_joins(). The reason
for this is that after the optimization introduced by the patch
for mdev-4817 simplify_joins() can use the results of execution
of non-expensive constant subqueries and this is not valid.
Problems:
1. Unlike Item_field::fix_fields(),
Item_sum_sp::fix_length_and_dec() and Item_func_sp::fix_length_and_dec()
did not run the code which resided in adjust_max_effective_column_length(),
therefore they did not extend max_length for the integer return data types
from the user-specified length to the maximum length according to
the data type capacity.
2. The code in adjust_max_effective_column_length() was not correct
for TEXT data, because Field_blob::max_display_length()
multiplies to mbmaxlen. So TEXT variants were unintentionally
promoted to the next longer data type for multi-byte character
sets: TINYTEXT->TEXT, TEXT->MEDIUMTEXT, MEDIUMTEXT->LONGTEXT.
3. Item_sum_sp::create_table_field_from_handler()
Item_func_sp::create_table_field_from_handler()
erroneously called tmp_table_field_from_field_type(),
which converted VARCHAR(>512) to TEXT variants.
So "CREATE..SELECT spfunc()" erroneously converted
VARCHAR to TEXT. This was wrong, because stored
functions have explicitly declared data types,
which should be preserved.
Solution:
- Removing Type_std_attributes(const Field *)
and using instead Type_std_attributes::set() in combination
with field->type_str_attributes() all around the code, e.g.:
Type_std_attributes::set(field->type_std_attributes())
These two ways of copying attributes from a Field
to an Item duplicated each other, and were slightly
different in how to mix max_length and mbmaxlen.
- Removing adjust_max_effective_column_length() and
fixing Field::type_std_attributes() to do all necessary
type-specific calculations , so no further adjustments
is needed.
Field::type_std_attributes() is now called from all affected methods:
Item_field::fix_fields()
Item_sum_sp::fix_length_and_dec()
Item_func_sp::fix_length_and_dec()
This fixes the problem N1.
- Making Field::type_std_attributes() virtual, to make
sure that type-specific adjustments a properly done
by individual Field_xxx classes. Implementing
Field_blob::type_std_attributes() in the way that
no TEXT promotion is done.
This fixes the problem N2.
- Fixing Item_sum_sp::create_table_field_from_handler()
Item_func_sp::create_table_field_from_handler() to
call create_table_field_from_handler() instead of
tmp_table_field_from_field_type() to avoid
VARCHAR->TEXT conversion on "CREATE..SELECT spfunc()".
- Recording mysql-test/suite/compat/oracle/r/sp-param.result
as "CREATE..SELECT spfunc()" now correctly
preserve the data type as specified in the RETURNS clause.
- Adding new tests
Renaming methods:
- Field::make_field(Send_field*) to make_send_field(..)
- Item::make_field(THD *,Send_field *) to make_send_field(..)
- Item::init_make_field(Send_field *, enum_field_type) to init_make_send_field(..)
These names looked similar to other functions that are used
for a very different purpose (creating Field instances):
- Public function "Field * make_field(..)"
- Method "Field *Column_defitinion::make_field(..)"
The rename makes it's easier to search the code using "grep".