1. Implementing the task according to the description:
a. Adding Type_handler::type_handler_for_tmp_table().
b. Adding Type_handler::type_handler_for_union_table.
c. Adding helper methods Type_handler::varstring_type_handler(const Item*),
Type_handler::blob_type_handler(const Item*)
d. Removing Item::make_string_field() and
Item_func_group_concat::make_string_field().
They are not needed any more.
e. Simplifying Item::tmp_table_field_from_field_type() to just two lines.
f. Renaming Item_type_holder::make_field_by_type() and implementing
virtual Item_type_holder::create_tmp_field() instead.
The new implementation is also as simple as two lines.
g. Adding a new virtual method Type_all_attributes::get_typelib(),
to access to TYPELIB definitions for ENUM and SET columns.
h. Simplifying the code branch for TIME_RESULT, DECIMAL_RESULT, STRING_RESULT
in Item::create_tmp_field(). It's now just one line.
i. Implementing Type_handler_enum::make_table_field() and
Type_handler_set::make_table_field().
2. Code simplification in Field_str constructor calls.
a. Changing the "CHARSET_INFO *cs" argument in constuctors for Field_str
and its descendants to "const DTCollation &collation". This is to
avoid two step initialization:
- setting Field_str::derivation and Field_str::repertoire to the
default values first
- then resetting them using:
set_derivation(item->derivation, item->repertoire).
b. Removing Field::set_derivation()
c. Adding a new constructor DTCollation(CHARSET_INFO *cs),
for the old code compatibility.
3. Changes in test results
As a side effect some test results have changed, because
in the old version Item::make_string_field() converted
TINYBLOB to VARCHAR(255). Now TINYBLOB is preserved.
a. sp-row.result
This query:
CREATE TABLE t1 AS SELECT tinyblob_sp_variable;
Now preserves TINYBLOB as the data type.
Before the patch a VARCHAR(255) was created.
b. gis-debug.result
This is a debug test, to make sure that + and - operators
are commutative and non-commutative correspondingly.
The exact data type is not really important.
(But anyway, it now chooses a better data type that fits the result)
This is a joint patch for:
- MDEV-12426 Add Field::type_handler()
- MDEV-12432 Range optimizer for ENUM and SET does not return "Impossible WHERE" in some case
With the new type handler approach being added to Field, it was easier to fix
MDEV-12432 rather than to reproduce the old ENUM/SET behavior.
The patch does the following:
1. Adds Field::type_handler(), according to the task description.
2. Fixes the asymmetry between Fields and Items of ENUM and SET field types.
Field_enum::cmp_type() returned INT_RESULT
Item*::cmp_type() returned STRING_RESULT for ENUM and SET expressions
This asymmetry was originally done for easier coding in the optimizer sources.
However, in 10.1 we moved a lot of code to methods of the class Field:
- test_if_equality_guarantees_uniqueness()
- can_be_substituted_to_equal_item()
- get_equal_const_item()
- can_optimize_keypart_ref()
- can_optimize_hash_join()
- can_optimize_group_min_max()
- can_optimize_range()
- can_optimize_outer_join_table_elimination()
As of 10.2 only a few lines of the code in opt_range.cc, field.cc and field.h
still relayed on the fact that Field_enum::cmp_type() returns INT_RESULT:
- Some asserts in field.cc
- Field_year::get_copy_func()
- Item_func_like::get_mm_leaf()
- Item_bool_func::get_mm_leaf()
These lines have been fixed.
3. Item_bool_func::get_mm_leaf() did not work well for ENUM/SET,
see MDEV-12432. So the ENUM/SET code was rewritten, and the relevant
code in Field_enum::store() and Field_set::store() was fixed to
properly return errors to the caller.
4. The result of Field_decimal::result_type() was changed from REAL_RESULT
to DECIMAL_RESULT. Data type aggregation (e.g. in COALESCE()) is now more
precise for old DECIMAL, because Item::decimal_precision() now goes through
the DECIMAL_RESULT branch. Earlier it went through the REAL_RESULT branch.
- Adding Type_handler::make_table_field() and moving pieces of the code
from Item::tmp_table_field_from_field_type() to virtual implementations
for various type handlers.
- Adding a new Type_all_attributes, to access to Item's extended
attributes, such as decimal_precision() and geometry_type().
- Adding a new class Record_addr, to pass record related information
to Type_handler methods (ptr, null_ptr and null_bit) as a single structure.
Note, later it will possibly be extended for BIT-alike field purposes,
by adding new members (bit_ptr_arg, bit_ofs_arg).
- Moving the code from Field_new_decimal::create_from_item()
to Type_handler_newdecimal::make_table_field().
- Removing Field_new_decimal() and Field_geom() helper constructor
variants that were used for temporary field creation.
- Adding Item_field::type_handler(), Field::type_handler() and
Field_blob::type_handler() to return correct type handlers for
blob variants, according to Field_blob::packlength.
- Adding Type_handler_blob_common, as a common parent for
Type_handler_tiny_blob, Type_handler_blob, Type_handler_medium_blob
and Type_handler_long_blob.
- Implementing Type_handler_blob_common::Item_hybrid_func_fix_attributes().
It's needed for cases when TEXT variants of different character sets are mixed
in LEAST, GREATEST, CASE and its abreviations (IF, IFNULL, COALESCE), e.g.:
CREATE TABLE t1 (
a TINYTEXT CHARACTER SET latin1,
b TINYTEXT CHARACTER SET utf8
);
CREATE TABLE t2 AS SELECT COALESCE(a,b) FROM t1;
Type handler aggregation returns TINYTEXT as a common data type
for the two columns. But as conversion from latin1 to utf8
happens for "a", the maximum possible length of "a" grows from 255 to 255*3.
Type_handler_blob_common::Item_hybrid_func_fix_attributes() makes sure
to update the blob type handler according to max_length.
- Adding Type_handler::blob_type_handler(uint max_octet_length).
- Adding a few m_type_aggregator_for_result.add() pairs, because
now Item_xxx::type_handler() can return pointers to type_handler_tiny_blob,
type_handler_blob, type_handler_medium_blob, type_handler_long_blob.
Before the patch only type_handler_blob was possible result of type_handler().
- Making type_handler_tiny_blob, type_handler_blob, type_handler_medium_blob,
type_handler_long_blob public.
- Removing the condition in Item_sum_avg::create_tmp_field()
checking Item_sum_avg::result_type() against DECIMAL_RESULT.
Now both REAL_RESULT and DECIMAL_RESULT are symmetrically handled
by tmp_table_field_from_field_type().
- Removing Item_geometry_func::create_field_for_create_select(),
as the inherited version perfectly works.
- Fixing Item_func_as_wkb::field_type() to return MYSQL_TYPE_LONG_BLOB
rather than MYSQL_TYPE_BLOB. It's needed to make sure that
tmp_table_field_from_field_type() creates a LONGBLOB field for AsWKB().
- Fixing Item_func_as_wkt::fix_length_and_dec() to set max_length to
UINT32_MAX rather than MAX_BLOB_WIDTH, to make sure that
tmp_table_field_from_field_type() creates a LONGTEXT field for AsWKT().
- Removing Item_func_set_user_var::create_field_for_create_select(),
as the inherited version works fine.
- Adding Item_func_get_user_var::create_field_for_create_select() to
make sure that "CREATE TABLE t1 AS SELECT @string_user variable"
always creates a field of LONGTEXT/LONGBLOB type.
- Item_func_ifnull::create_field_for_create_select()
behavior has changed. Before the patch it passed set_blob_packflag=false,
which meant to create LONGBLOB for all blob variants.
Now it takes into account max_length, which gives better column
data types for:
CREATE TABLE t2 AS SELECT IFNULL(blob_column1, blob_column2) FROM t1;
- Fixing Item_func_nullif::fix_length_and_dec() to use
set_handler(args[2]->type_handler()) instead of
set_handler_by_field_type(args[2]->field_type()).
This is needed to distinguish between BLOB variants.
- Implementing Item_blob::type_handler(), to make sure to create
proper BLOB field variant, according to max_length, for queries like:
CREATE TABLE t1 AS
SELECT some_blob_field FROM INFORMATION_SCHEMA.SOME_TABLE;
- Fixing Item_field::real_type_handler() to make sure that
the code aggregating fields for UNION gets a proper BLOB
variant type handler from fields.
- Adding a special code into Item_type_holder::make_field_by_type(),
to make sure that after aggregating field types it also properly
takes into account max_length when mixing TEXT variants of different
character sets and chooses a proper TEXT variant:
CREATE TABLE t1 (
a TINYTEXT CHARACTER SET latin1,
b TINYTEXT CHARACTER SET utf8
);
CREATE TABLE t2 AS SELECT a FROM t1 UNION SELECT b FROM t1;
- Adding tests, for better coverage of IFNULL, NULLIF, UNION.
- The fact that tmp_table_field_from_field_type() now takes
into account BLOB variants (instead of always creating LONGBLOB),
tests results for WEIGHT_STRING() and NULLIF() and UNION
have become more precise.
Benefits of this patch:
- Removed a lot of calls to strlen(), especially for field_string
- Strings generated by parser are now const strings, less chance of
accidently changing a string
- Removed a lot of calls with LEX_STRING as parameter (changed to pointer)
- More uniform code
- Item::name_length was not kept up to date. Now fixed
- Several bugs found and fixed (Access to null pointers,
access of freed memory, wrong arguments to printf like functions)
- Removed a lot of casts from (const char*) to (char*)
Changes:
- This caused some ABI changes
- lex_string_set now uses LEX_CSTRING
- Some fucntions are now taking const char* instead of char*
- Create_field::change and after changed to LEX_CSTRING
- handler::connect_string, comment and engine_name() changed to LEX_CSTRING
- Checked printf() related calls to find bugs. Found and fixed several
errors in old code.
- A lot of changes from LEX_STRING to LEX_CSTRING, especially related to
parsing and events.
- Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING*
- Some changes for char* to const char*
- Added printf argument checking for my_snprintf()
- Introduced null_clex_str, star_clex_string, temp_lex_str to simplify
code
- Added item_empty_name and item_used_name to be able to distingush between
items that was given an empty name and items that was not given a name
This is used in sql_yacc.yy to know when to give an item a name.
- select table_name."*' is not anymore same as table_name.*
- removed not used function Item::rename()
- Added comparision of item->name_length before some calls to
my_strcasecmp() to speed up comparison
- Moved Item_sp_variable::make_field() from item.h to item.cc
- Some minimal code changes to avoid copying to const char *
- Fixed wrong error message in wsrep_mysql_parse()
- Fixed wrong code in find_field_in_natural_join() where real_item() was
set when it shouldn't
- ER_ERROR_ON_RENAME was used with extra arguments.
- Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already
give the error.
TODO:
- Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c
- Change code to not modify LEX_CSTRING for database name
(as part of lower_case_table_names)
- Adding a new virtual method Type_handler::Item_time_precision()
- Adding a new virtual method Type_handler::Item_datetime_precision()
- Removing Item::temporal_precision() and adding Item::time_precision()
and Item::datetime_precision() instead.
- Moving Item_func_convert_tz::fix_length_and_dec() from item_timefunc.cc
to item_timefunc.h. It's only two lines, and we're changing it anyway.
- Removing Item_temporal_typecast::fix_length_and_dec_generic(),
moving this code to
Type_handler::Item_{date|time|datetime}_typecast_fix_length_and_dec().
This allows to get rid of one more field_type() call.
Also, in the old reduction, Item_date_typecast::fix_length_and_dec()
unnecessarily called args[0]->temporal_precision(). The new reduction
does not call args[0]->datetime_precision(), as DATE does not
have fractional digits.
This was wrong because:
- There was no reason to rollback name for item that will be deleted
after query.
- name_length was not rolled back
- Changing real_item() doesn't work as it may be used many times in the
same query
After removing all the old code and extending the test case, all the
related test cases passes.
Sanja and I concluded that the old code isn't needed anymore. If it
still needed for some scenario not covered by our test system, it needs
to be coded in some other way, so better to remove the wrong code.
failed with SELECT SQ, TEXT field
The functon find_all_keys does call Item_subselect::walk, which calls walk() for the subquery
The issue is that when a field is represented by Item_outer_ref(Item_direct_ref(Item_copy_string( ...))).
Item_copy_string does have a pointer to an Item_field in Item_copy::item but does not implement Item::walk method, so we are not
able to set the bitmap for that field. This is the reason why the assert fails.
Fixed by adding the walk method to Item_copy class.
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong. Change some parameters to this type.
Use size_t in a few more places.
Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.
When applying the unary ~ operator to enum constants, explictly
cast the result to an unsigned type, because enum constants can
be treated as signed.
In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
Optionally do table->update_default_fields() even for INSERT
that supposedly provides values for all column. Because these
"values" might be DEFAULT, which would need table->update_default_fields()
at the end.
Also set Item_default_value::used_tables() from the default expression.
Non-zero used_field() means that mysql_insert() will initialize all
fields to their default values (with restore_record()) even if
all columns are later provided with values. Because default expressions
may refer to other columns and they must be initialized.
in Item_partition_func_safe_string(THD *thd, const char *name_arg,
uint length, CHARSET_INFO *cs= NULL), the 'name_arg' is the value
of the string constant and 'length' is the length of this constant,
so length == strlen(name_arg).
This patch makes the following changes (according to the task description):
- Adds Type_handler::Item_func_round_fix_length_and_dec().
- Splits the code from Item_func_round::fix_length_and_dec() into new
Item_func_round methods fix_arg_int(), fix_arg_decimal(), fix_arg_double().
- Calls the new Item_func_round methods from the relevant implementations of
Type_handler_xxx::Item_func_round_fix_length_and_dec().
- Adds a new error message ER_ILLEGAL_PARAMETER_DATA_TYPE_FOR_OPERATION
- Makes ROUND() return the new error for GEOMETRY
Additionally:
- Inherits Item_func_round directly from Item_func_numhybrid as it
uses nothing from Item_func_num1.
- Fixes "MDEV-12000 ROUND(expr,const_expr_returning_NULL) creates DOUBLE(0,0)".
Now if args[1] returns NULL, the data type is set to DOUBLE with
NOT_FIXED_DEC decimals instead of 0 decimals.
Before "MDEV-10709 Expressions as parameters to Dynamic SQL" only
user variables were syntactically allowed as EXECUTE parameters.
User variables were OK as both IN and OUT parameters.
When Item_param was bound to an actual parameter (a user variable),
it automatically meant that the bound Item was settable.
The DBUG_ASSERT() in Protocol_text::send_out_parameters() guarded that
the actual parameter is really settable.
After MDEV-10709, any kind of expressions are allowed as EXECUTE IN parameters.
But the patch for MDEV-10709 forgot to check that only descendants of
Settable_routine_parameter should be allowed as OUT parameters.
So an attempt to pass a non-settable parameter as an OUT parameter
made server crash on the above mentioned DBUG_ASSERT.
This patch changes Item_param::get_settable_routine_parameter(),
which previously always returned "this". Now, when Item_param is bound
to some Item, it caches if the bound Item is settable.
Item_param::get_settable_routine_parameter() now returns "this" only
if the bound actual parameter is settable, and returns NULL otherwise.
Problem: Item_param::basic_const_item() returned true when fixed==false.
This unexpected combination made Item::const_charset_converter() crash
on asserts.
Fix:
- Changing all Item_param::set_xxx() to set "fixed" to true.
This fixes the problem.
- Additionally, changing all Item_param::set_xxx() to set
Item_param::item_type, to avoid duplicate code, and for consistency,
to make the code symmetric between different constant types.
Before this patch only set_null() set item_type.
- Moving Item_param::state and Item_param::item_type from public to private,
to make sure easier that these members are in sync with "fixed" and to
each other.
- Adding a new argument "unsigned_arg" to Item::set_decimal(),
and reusing it in two places instead of duplicate code.
- Adding a new method Item_param::fix_temporal() and reusing it in two places.
- Adding methods has_no_value(), has_long_data_value(), has_int_value(),
instead of direct access to Item_param::state.
in Item_partition_func_safe_string(THD *thd, const char *name_arg,
uint length, CHARSET_INFO *cs= NULL), the 'name_arg' is the value
of the string constant and 'length' is the length of this constant,
so length == strlen(name_arg).
The problem happened because Item_ident_for_show::field_type() always
returned MYSQL_TYPE_DOUBLE and ignored the actual data type of the
referenced Field. As a result, the execution always used
Item_ident_for_show::val_real() to send the default value of the field,
so most default values for non-numeric types were displayed as '0'.
This patch:
1. Cleanup:
a. Removes Send_field::charsetnr, as it's been unused since
introduction of Item::charset_for_protocol() in MySQL-5.5.
b. Adds the "const" qualifier to Field::char_length().
This is needed for (5.a), see below.
2. Introduces a new virtual method Type_handler::charset_for_protocol(),
returning item->collation.collation for string data types, or
&my_charset_bin for non-string data types.
3. Changes Item::charset_for_protocol() from virtual to non-virtual.
It now calls type_handler()->charset_for_protocol().
As a good side effect, duplicate code in Item::charset_for_protocol() and
Item_temporal_hybrid_func::charset_for_protocol() is now gone.
4. Fixes Item_ident_for_show::field_type() to correctly return
its data type according to the data type of the referenced field.
This actually fixes the problem reported in MDEV-11672.
Now the default value is sent using a correct method, e.g.
val_str() for VARCHAR/TEXT, or val_int() for INT/BIGINT.
This required additional changes:
a. in DBUG_ASSERT in Protocol::store(const char *,size_t,CHARSET_INFO),
This method is now used by mysqld_list_fields(), which
(unlike normal SELECT queries) does not set
field_types/field_pos/field_count.
b. Item_ident_for_show::Item_ident_for_show() now set standard attributes
(collation, decimals, max_length, unsigned_flag) according to the
referenced field, to make charset_for_protocol() return the correct
value and to make mysqld_list_fields() correctly send default
values.
5. In order to share the code between Item_field::set_field() and
Item_ident_for_show::Item_ident_for_show():
a. Introduces a new method Type_std_attributes::set(const Field*)
b. To make (a) possible, moves Item::fix_char_length() from Item
to Type_std_attributes, also moves char_to_byte_length_safe()
from item.h to sql_type.h
c. Additionally, moves Item::fix_length_and_charset() and
Item::max_char_length() from Item to Type_std_attributes.
This is not directly needed for the fix and is done just for symmetry
with fix_char_length(), as these three methods are directly related
to each other.
This patch fixes a number of data type aggregation problems in IN and CASE:
- MDEV-11497 Wrong result for (int_expr IN (mixture of signed and unsigned expressions))
- MDEV-11514 IN with a mixture of TIME and DATETIME returns a wrong result
- MDEV-11554 Wrong result for CASE on a mixture of signed and unsigned expressions
- MDEV-11555 CASE with a mixture of TIME and DATETIME returns a wrong result
1. The problem reported in MDEV-11514 and MDEV-11555 was in the wrong assumption
that items having the same cmp_type() can reuse the same cmp_item instance.
So Item_func_case and Item_func_in used a static array of cmp_item*,
one element per one XXX_RESULT.
TIME and DATETIME cannot reuse the same cmp_item, because arguments of
these types are compared very differently. TIME and DATETIME must have
different instances in the cmp_item array. Reusing the same cmp_item
for TIME and DATETIME leads to unexpected result and unexpected warnings.
Note, after adding more data types soon (e.g. INET6), the problem would
become more serious, as INET6 will most likely have STRING_RESULT, but
it won't be able to reuse the same cmp_item with VARCHAR/TEXT.
This patch introduces a new class Predicant_to_list_comparator,
which maintains an array of cmp_items, one element per distinct
Type_handler rather than one element per XXX_RESULT.
2. The problem reported in MDEV-11497 and MDEV-11554 happened because
Item_func_in and Item_func_case did not take into account the fact
that UNSIGNED and SIGNED values must be compared as DECIMAL rather than INT,
because they used item_cmp_type() to aggregate the arguments.
The relevant code now resides in Predicant_to_list_comparator::add_value()
and uses Type_handler_hybrid_field_type::aggregate_for_comparison(),
like Item_func_between does.
This patch:
- Adds a new virtual method Type_handler::Item_get_cache
- Splits moves Item_cache::get_cache() into the new method, every
"case XXX_RESULT" to the corresponding Type_handler_xxx::Item_get_cache.
- Adds Item::get_cache as a convenience wrapper, to make the caller code
shorter.
- Changes the last argument of Arg_comparator::cache_converted_constant()
from Item_result to "const Type_handler *".
- Removes subselect_engine::cmp_type, subselect_engine::res_type,
subselect_engine::res_field_type and derives subselect_engine
from Type_handler_hybrid_field_type instead.
- Makes Type_handler_varchar public, as it's now needed as the
default data type handler for subselect_engine.
Also fixes:
MDEV-11331 Wrong result for INSERT INTO t1 (datetime_field) VALUES (hybrid_function_of_TIME_data_type)
MDEV-11333 Expect "Impossible where condition" for WHERE timestamp_field>=DATE_ADD(TIMESTAMP'9999-01-01 00:00:00',INTERVAL 1000 YEAR)
This patch does the following:
1. Splits the function Item::save_in_field() into pieces:
- Item::save_str_in_field()
- Item::save_real_in_field()
- Item::save_decimal_in_field()
- Item::save_int_in_field()
2. Adds the missing "no_conversion" parameters to
Item::save_time_in_field() and Item::save_date_in_field(),
so this parameter is now correctly passed to
set_field_to_null_with_conversions().
This fixes the problem reported in 11333.
3. Introduces a new virtual method Type_handler::Item_save_in_field()
and uses the methods Item::save_xxx_in_field() from the implementations
of Type_handler_xxx::Item_save_in_field().
These changes additionally fix the problem reported in MDEV-11331,
as the old code erroneously handled expressions like
COALESE(datetime-expression) through the STRING_RESULT branch of
Item::save_in_field() and therefore they looked like string type expressions
for the target fields. Now such expressions are correctly handled by
Item::save_date_in_field().
The patch for bug mdev-10882 tried to fix it by providing an
implementation of the virtual method build_clone for the class
Item_cache. It's turned out that it is not easy provide a valid
implementation for Item_cache::build_clone(). At the same time
if the condition that can be pushed into a materialized view
contains a cached item this item can be substituted for a basic
constant of the same value. In such a way we can avoid building
proper clones for Item_cache objects when constructing pushdown
conditions.
otherwise we'd need to store sql_mode *per vcol*
(consider CREATE INDEX...) and how SHOW CREATE TABLE would
support that?
Additionally, get rid of vcol::expr_str, just to make sure
the string is always generated and never leaked in the
original form.