1. Implementing the task according to the description:
a. Adding Type_handler::type_handler_for_tmp_table().
b. Adding Type_handler::type_handler_for_union_table.
c. Adding helper methods Type_handler::varstring_type_handler(const Item*),
Type_handler::blob_type_handler(const Item*)
d. Removing Item::make_string_field() and
Item_func_group_concat::make_string_field().
They are not needed any more.
e. Simplifying Item::tmp_table_field_from_field_type() to just two lines.
f. Renaming Item_type_holder::make_field_by_type() and implementing
virtual Item_type_holder::create_tmp_field() instead.
The new implementation is also as simple as two lines.
g. Adding a new virtual method Type_all_attributes::get_typelib(),
to access to TYPELIB definitions for ENUM and SET columns.
h. Simplifying the code branch for TIME_RESULT, DECIMAL_RESULT, STRING_RESULT
in Item::create_tmp_field(). It's now just one line.
i. Implementing Type_handler_enum::make_table_field() and
Type_handler_set::make_table_field().
2. Code simplification in Field_str constructor calls.
a. Changing the "CHARSET_INFO *cs" argument in constuctors for Field_str
and its descendants to "const DTCollation &collation". This is to
avoid two step initialization:
- setting Field_str::derivation and Field_str::repertoire to the
default values first
- then resetting them using:
set_derivation(item->derivation, item->repertoire).
b. Removing Field::set_derivation()
c. Adding a new constructor DTCollation(CHARSET_INFO *cs),
for the old code compatibility.
3. Changes in test results
As a side effect some test results have changed, because
in the old version Item::make_string_field() converted
TINYBLOB to VARCHAR(255). Now TINYBLOB is preserved.
a. sp-row.result
This query:
CREATE TABLE t1 AS SELECT tinyblob_sp_variable;
Now preserves TINYBLOB as the data type.
Before the patch a VARCHAR(255) was created.
b. gis-debug.result
This is a debug test, to make sure that + and - operators
are commutative and non-commutative correspondingly.
The exact data type is not really important.
(But anyway, it now chooses a better data type that fits the result)
- Adding Type_handler::make_table_field() and moving pieces of the code
from Item::tmp_table_field_from_field_type() to virtual implementations
for various type handlers.
- Adding a new Type_all_attributes, to access to Item's extended
attributes, such as decimal_precision() and geometry_type().
- Adding a new class Record_addr, to pass record related information
to Type_handler methods (ptr, null_ptr and null_bit) as a single structure.
Note, later it will possibly be extended for BIT-alike field purposes,
by adding new members (bit_ptr_arg, bit_ofs_arg).
- Moving the code from Field_new_decimal::create_from_item()
to Type_handler_newdecimal::make_table_field().
- Removing Field_new_decimal() and Field_geom() helper constructor
variants that were used for temporary field creation.
- Adding Item_field::type_handler(), Field::type_handler() and
Field_blob::type_handler() to return correct type handlers for
blob variants, according to Field_blob::packlength.
- Adding Type_handler_blob_common, as a common parent for
Type_handler_tiny_blob, Type_handler_blob, Type_handler_medium_blob
and Type_handler_long_blob.
- Implementing Type_handler_blob_common::Item_hybrid_func_fix_attributes().
It's needed for cases when TEXT variants of different character sets are mixed
in LEAST, GREATEST, CASE and its abreviations (IF, IFNULL, COALESCE), e.g.:
CREATE TABLE t1 (
a TINYTEXT CHARACTER SET latin1,
b TINYTEXT CHARACTER SET utf8
);
CREATE TABLE t2 AS SELECT COALESCE(a,b) FROM t1;
Type handler aggregation returns TINYTEXT as a common data type
for the two columns. But as conversion from latin1 to utf8
happens for "a", the maximum possible length of "a" grows from 255 to 255*3.
Type_handler_blob_common::Item_hybrid_func_fix_attributes() makes sure
to update the blob type handler according to max_length.
- Adding Type_handler::blob_type_handler(uint max_octet_length).
- Adding a few m_type_aggregator_for_result.add() pairs, because
now Item_xxx::type_handler() can return pointers to type_handler_tiny_blob,
type_handler_blob, type_handler_medium_blob, type_handler_long_blob.
Before the patch only type_handler_blob was possible result of type_handler().
- Making type_handler_tiny_blob, type_handler_blob, type_handler_medium_blob,
type_handler_long_blob public.
- Removing the condition in Item_sum_avg::create_tmp_field()
checking Item_sum_avg::result_type() against DECIMAL_RESULT.
Now both REAL_RESULT and DECIMAL_RESULT are symmetrically handled
by tmp_table_field_from_field_type().
- Removing Item_geometry_func::create_field_for_create_select(),
as the inherited version perfectly works.
- Fixing Item_func_as_wkb::field_type() to return MYSQL_TYPE_LONG_BLOB
rather than MYSQL_TYPE_BLOB. It's needed to make sure that
tmp_table_field_from_field_type() creates a LONGBLOB field for AsWKB().
- Fixing Item_func_as_wkt::fix_length_and_dec() to set max_length to
UINT32_MAX rather than MAX_BLOB_WIDTH, to make sure that
tmp_table_field_from_field_type() creates a LONGTEXT field for AsWKT().
- Removing Item_func_set_user_var::create_field_for_create_select(),
as the inherited version works fine.
- Adding Item_func_get_user_var::create_field_for_create_select() to
make sure that "CREATE TABLE t1 AS SELECT @string_user variable"
always creates a field of LONGTEXT/LONGBLOB type.
- Item_func_ifnull::create_field_for_create_select()
behavior has changed. Before the patch it passed set_blob_packflag=false,
which meant to create LONGBLOB for all blob variants.
Now it takes into account max_length, which gives better column
data types for:
CREATE TABLE t2 AS SELECT IFNULL(blob_column1, blob_column2) FROM t1;
- Fixing Item_func_nullif::fix_length_and_dec() to use
set_handler(args[2]->type_handler()) instead of
set_handler_by_field_type(args[2]->field_type()).
This is needed to distinguish between BLOB variants.
- Implementing Item_blob::type_handler(), to make sure to create
proper BLOB field variant, according to max_length, for queries like:
CREATE TABLE t1 AS
SELECT some_blob_field FROM INFORMATION_SCHEMA.SOME_TABLE;
- Fixing Item_field::real_type_handler() to make sure that
the code aggregating fields for UNION gets a proper BLOB
variant type handler from fields.
- Adding a special code into Item_type_holder::make_field_by_type(),
to make sure that after aggregating field types it also properly
takes into account max_length when mixing TEXT variants of different
character sets and chooses a proper TEXT variant:
CREATE TABLE t1 (
a TINYTEXT CHARACTER SET latin1,
b TINYTEXT CHARACTER SET utf8
);
CREATE TABLE t2 AS SELECT a FROM t1 UNION SELECT b FROM t1;
- Adding tests, for better coverage of IFNULL, NULLIF, UNION.
- The fact that tmp_table_field_from_field_type() now takes
into account BLOB variants (instead of always creating LONGBLOB),
tests results for WEIGHT_STRING() and NULLIF() and UNION
have become more precise.
Benefits of this patch:
- Removed a lot of calls to strlen(), especially for field_string
- Strings generated by parser are now const strings, less chance of
accidently changing a string
- Removed a lot of calls with LEX_STRING as parameter (changed to pointer)
- More uniform code
- Item::name_length was not kept up to date. Now fixed
- Several bugs found and fixed (Access to null pointers,
access of freed memory, wrong arguments to printf like functions)
- Removed a lot of casts from (const char*) to (char*)
Changes:
- This caused some ABI changes
- lex_string_set now uses LEX_CSTRING
- Some fucntions are now taking const char* instead of char*
- Create_field::change and after changed to LEX_CSTRING
- handler::connect_string, comment and engine_name() changed to LEX_CSTRING
- Checked printf() related calls to find bugs. Found and fixed several
errors in old code.
- A lot of changes from LEX_STRING to LEX_CSTRING, especially related to
parsing and events.
- Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING*
- Some changes for char* to const char*
- Added printf argument checking for my_snprintf()
- Introduced null_clex_str, star_clex_string, temp_lex_str to simplify
code
- Added item_empty_name and item_used_name to be able to distingush between
items that was given an empty name and items that was not given a name
This is used in sql_yacc.yy to know when to give an item a name.
- select table_name."*' is not anymore same as table_name.*
- removed not used function Item::rename()
- Added comparision of item->name_length before some calls to
my_strcasecmp() to speed up comparison
- Moved Item_sp_variable::make_field() from item.h to item.cc
- Some minimal code changes to avoid copying to const char *
- Fixed wrong error message in wsrep_mysql_parse()
- Fixed wrong code in find_field_in_natural_join() where real_item() was
set when it shouldn't
- ER_ERROR_ON_RENAME was used with extra arguments.
- Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already
give the error.
TODO:
- Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c
- Change code to not modify LEX_CSTRING for database name
(as part of lower_case_table_names)
Introduced 2f63e2e2a
Compiler error was:
sql/log_event.cc: In function ‘size_t log_event_print_value(IO_CACHE*, const uchar*, uint, uint, char*, size_t)’:
sql/log_event.cc:2897:7: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
if (!ptr)
^~
sql/log_event.cc:2900:9: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘if’
int32 i32= uint2korr(ptr);
^~~~~
Added in 950abd526
Error was:
sql/item_sum.cc:3478:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
if ((!args[i]->fixed &&
^~
/sql/item_sum.cc:3482:7: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘if’
with_subselect|= args[i]->with_subselect;
^~~~~~~~~~~~~~
Signed-off-by: Daniel Black <daniel.black@au.ibm.com>
Due to this bug many queries that contained a window function
with MIN/MAX aggregation returned wrong results.
Calculation of a MIN/MAX aggregate function uses cache objects
and a comparator object that are created and set up in
Item_sum_hybrid::fix_fields () by a call of Item_sum_hybrid::setup_hybrid().
The latter binds the objects to the first argument of the
MIN/MAX function. Meanwhile window function perform aggregation
over fields of a temporary table. So binding must be done rather to
these fields. The earliest moment when setup the objects used in
MIN/max functions can be done is after all calls of the method
split_sum_func().
This patch introduces this late setup, but only for aggregate
functions used in window functions.
Probably it makes sense to use this late setup for all MIN/MAX
objects.
The method Item_sum::print did not print opening '(' after the name
of simple window functions (like rank, dense_rank etc).
As a result the view definitions with such window functions
were formed invalid in .frm files.
If a window function with aggregation is over the result
set of a grouping query then the argument of the aggregate
function from the window function is allowed to be an
aggregate function itself.
This patch:
- Implements the task according to the description
- Adds a new class Type_handler_numeric as a common parent
for Type_handler_real_result, Type_handler_int_result,
Type_handler_decimal_result, to share the common code
between numeric data type handlers.
- Removes the dedundant call for collation.set(item->collation) in
Item_sum_hybrid::setup_hybrid(), because setup_hybrid() is called
either after fix_length_and_dec() or afte ther constructor
Item_sum_hybrid(THD *thd, Item_sum_hybrid *item),
so the collation is already properly set in all cases.
This patch:
- Adds a new virtual method Type_handler::Item_get_cache
- Splits moves Item_cache::get_cache() into the new method, every
"case XXX_RESULT" to the corresponding Type_handler_xxx::Item_get_cache.
- Adds Item::get_cache as a convenience wrapper, to make the caller code
shorter.
- Changes the last argument of Arg_comparator::cache_converted_constant()
from Item_result to "const Type_handler *".
- Removes subselect_engine::cmp_type, subselect_engine::res_type,
subselect_engine::res_field_type and derives subselect_engine
from Type_handler_hybrid_field_type instead.
- Makes Type_handler_varchar public, as it's now needed as the
default data type handler for subselect_engine.
We assume all around the code that null_value==true is in sync
with NULL value returned by val_str()/val_decimal().
Item_sum_sum::val_decimal() erroneously returned a non-NULL value together
with null_value set to true. Fixing to return NULL instead.
Decimals with float, double and decimal now works the following way:
- DECIMAL_NOT_SPECIFIED is used when declaring DECIMALS without a firm number
of decimals. It's only used in asserts and my_decimal_int_part.
- FLOATING_POINT_DECIMALS (31) is used to mark that a FLOAT or DOUBLE
was defined without decimals. This is regarded as a floating point value.
- Max decimals allowed for FLOAT and DOUBLE is FLOATING_POINT_DECIMALS-1
- Clients assumes that float and double with decimals >= NOT_FIXED_DEC are
floating point values (no decimals)
- In the .frm decimals=FLOATING_POINT_DECIMALS are used to define
floating point for float and double (31, like before)
To ensure compatibility with old clients we do:
- When storing float and double, we change NOT_FIXED_DEC to
FLOATING_POINT_DECIMALS.
- When creating fields from .frm we change for float and double
FLOATING_POINT_DEC to NOT_FIXED_DEC
- When sending definition for a float/decimal field without decimals
to the client as part of a result set we convert NOT_FIXED_DEC to
FLOATING_POINT_DECIMALS.
- variance() and std() has changed to limit the decimals to
FLOATING_POINT_DECIMALS -1 to not get the double converted floating point.
(This was to preserve compatiblity)
- FLOAT and DOUBLE still have 30 as max number of decimals.
Bugs fixed:
variance() printed more decimals than we support for double values.
New behaviour:
- Strings now have 38 decimals instead of 30 when converted to decimal
- CREATE ... SELECT with a decimal with > 30 decimals will create a column
with a smaller range than before as we are trying to preserve the number of
decimals.
Other changes
- We are now using the obsolete bit FIELDFLAG_LEFT_FULLSCREEN to specify
decimals > 31
- NOT_FIXED_DEC is now declared in one place
- For clients, NOT_FIXED_DEC is always 31 (to ensure compatibility).
On the server NOT_FIXED_DEC is DECIMAL_NOT_SPECIFIED (39)
- AUTO_SEC_PART_DIGITS is taken from DECIMAL_NOT_SPECIFIED
- DOUBLE conversion functions are now using DECIMAL_NOT_SPECIFIED instead of
NOT_FIXED_DEC
filesort and init_read_record() for the same table.
This will simplify code for WINDOW FUNCTIONS (MDEV-6115)
- Filesort_info renamed to SORT_INFO and moved to filesort.h
- filesort now returns SORT_INFO
- init_read_record() now takes a SORT_INFO parameter.
- unique declaration is moved to uniques.h
- subselect caching of buffers is now more explicit than before
- filesort_buffer is now reusable even if rec_length has changed.
- filsort_free_buffers() and free_io_cache() calls are removed
- Remove one malloc() when using get_addon_fields()
Other things:
- Added --debug-assert-on-not-freed-memory option to make it easier to
debug some not-freed-memory issues.
It is based on the sum() function, thus much of the logic is shared.
Considering that there are 2 counters stored within the function, one
that handles the null value, while the other that handles the divisor
for the avg computation, it is possible to remove the counter from the
Item_sum_avg. I have not removed it in this patch as we may choose to
refactor the whole code into a separate class.
This remains to be dicussed.
The counter keeps track of the number of items added to the sum
function. It is increased when we add a value to the sum function and
decreased when it is removed.
- Item_sum_count::remove() should check if the argument's value is NULL.
- Window Function item must have its Item_window_func::split_sum_func
called,
- and it must call split_sum_func for aggregate's arguments (see the
comment near Item_window_func::split_sum_func for explanation why)
- Add temporary code: clone_read_record() clones READ_RECORD structure,
as long as it is used for reading filesort() result that fits into
memory.
- Add frame bounds for ROWS-type frames. ROWS n PRECEDING|FOLLOWING,
ROWS UNBOUNDED PRECEDING|FOLLOWING, CURRENT ROW are supported.
- Add Item_sum_count::remove() which allows "streaming" computation
of COUNT() over a moving frame.
"Re-factor the code for post-join operations".
The patch mainly contains the code ported from mysql-5.6 and
created for two essential architectural changes:
1. WL#5558: Resolve ORDER BY execution method at the optimization stage
2. WL#6071: Inline tmp tables into the nested loops algorithm
The first task was implemented for mysql-5.6 by Ole John Aske.
It allows to make all decisions on ORDER BY operation at the optimization
stage.
The second task implemented for mysql-5.6 by Evgeny Potemkin adds JOIN_TAB
nodes for post-join operations that require temporary tables. It allows
to execute these operations within the nested loops algorithm that used to
be used before this task only for join queries. Besides these task moves
all planning on the execution of these operations from the execution phase
to the optimization phase.
Some other re-factoring changes of mysql-5.6 were pulled in, mainly because
it was easier to pull them in than roll them back. In particular all
changes concerning Ref_ptr_array were incorporated.
The port required some changes in the MariaDB code that concerned the
functionality of EXPLAIN and ANALYZE. This was done mainly by Sergey
Petrunia.
MDEV-9408 CREATE TABLE SELECT MAX(int_column) creates different columns for table vs view
There were three almost identical pieces of the code:
- Field *Item_func::tmp_table_field();
- Field *Item_sum::create_tmp_field();
- Field *create_tmp_field_from_item();
with a difference in very small details (hence the bugs):
Only Item_func::tmp_table_field() was correct, the other two were not.
Removing the two incorrect pieces of the redundant code.
Joining these three functions/methods into a single virtual method
Item::create_tmp_field().
Additionally, moving Item::make_string_field() and
Item::tmp_table_field_from_field_type() from the public into the
protected section of the class declaration, as they are now not
needed outside of Item.
- Moving Item_xxx_field declarations after Item_sum_xxx declarations,
so Item_xxx_field constructors can be defined directly in item_sum.h
rather than item_sum.cc. This removes some duplicate code, e.g.
initialization of the following members at constructor time:
name, decimals, max_length, unsigned_flag, field, maybe_null.
- Adding Item_sum_field as a common parent for Item_avg_field and
Item_variance_field
- Deriving Item_sum_field directly from Item rather that Item_result_field,
as Item_sum_field descendants do not need anything from Item_result_field.
- Removing hybrid infrastructure from Item_avg_field,
adding Item_avg_field_decimal and Item_avg_field_double instead,
as desired result type is already known at constructor time
(not only at fix_fields time). This simplifies the code.
- Changing Item_avg_field_decimal::val_int() to call val_int_from_decimal()
instead of doing { return (longlong) rint(val_real()); }
This is the fix itself.