"Re-factor the code for post-join operations".
The patch mainly contains the code ported from mysql-5.6 and
created for two essential architectural changes:
1. WL#5558: Resolve ORDER BY execution method at the optimization stage
2. WL#6071: Inline tmp tables into the nested loops algorithm
The first task was implemented for mysql-5.6 by Ole John Aske.
It allows to make all decisions on ORDER BY operation at the optimization
stage.
The second task implemented for mysql-5.6 by Evgeny Potemkin adds JOIN_TAB
nodes for post-join operations that require temporary tables. It allows
to execute these operations within the nested loops algorithm that used to
be used before this task only for join queries. Besides these task moves
all planning on the execution of these operations from the execution phase
to the optimization phase.
Some other re-factoring changes of mysql-5.6 were pulled in, mainly because
it was easier to pull them in than roll them back. In particular all
changes concerning Ref_ptr_array were incorporated.
The port required some changes in the MariaDB code that concerned the
functionality of EXPLAIN and ANALYZE. This was done mainly by Sergey
Petrunia.
MDEV-9408 CREATE TABLE SELECT MAX(int_column) creates different columns for table vs view
There were three almost identical pieces of the code:
- Field *Item_func::tmp_table_field();
- Field *Item_sum::create_tmp_field();
- Field *create_tmp_field_from_item();
with a difference in very small details (hence the bugs):
Only Item_func::tmp_table_field() was correct, the other two were not.
Removing the two incorrect pieces of the redundant code.
Joining these three functions/methods into a single virtual method
Item::create_tmp_field().
Additionally, moving Item::make_string_field() and
Item::tmp_table_field_from_field_type() from the public into the
protected section of the class declaration, as they are now not
needed outside of Item.
It was used only temporary, during udf_handler::fix_fields() time,
and then copied to the owner Item_func_or_sum object.
Changing the code to use the Used_tables_and_const_cache part
of the owner Item_sum_or_func object directly.
- MDEV-8875 Wrong metadata for MAX(CAST(time_column AS DATETIME))
- MDEV-8873 Wrong field type or metadata for LEAST(int_column,string_column)
- MDEV-8912 Wrong metadata or type for @c:=string_or_blob_field
Adding Item_hybrid_func as a common parent for Item_func_hybrid_field_type,
Item_func_min_max, Item_func_user_var. This removes some duplicate code.
- Turning get_mm_tree_for_const() from a static function into
a protected method in Item.
- Adding a new class Item_bool_func2_with_rev, for the functions and operators
that have a reverse function and can use the range optimizer for
to optimize "value OP field" as "field REV_OP value". Deriving
Item_bool_rowready_func2 and Item_funt_spatial_rel from the new class.
- Removing Item_bool_func2::have_rev_func().
Item_func_hybrid_field_type did not return correct field_type(), cmp_type()
and result_type() in some cases, because cached_result_type and
cached_field_type were set in independent pieces of the code and
did not properly match to each other.
Fix:
- Removing Item_func_hybrid_result_type
- Deriving Item_func_hybrid_field_type directly from Item_func
- Introducing a new class Type_handler which guarantees that
field_type(), cmp_type() and result_type() are always properly synchronized
and using the new class in Item_func_hybrid_field_type.
- Part 4: Removing calls to sql_alloc() and sql_calloc()
Other things:
- Added current_thd in some places to make it clear that it's called (easier to remove later)
- Move memory allocation from Item_func_case::fix_length_and_dec() to Item_func_case::fix_fields()
- Added mem_root to some new calls
- Fixed some wrong UNINIT_VAR() calls
- Fixed a bug in generate_partition_syntax() in case of errors
- Added mem_root to argument to new thread_info
- Simplified my_parse_error() call in sql_yacc.yy
Added mandatory thd parameter to Item (and all derivative classes) constructor.
Added thd parameter to all routines that may create items.
Also removed "current_thd" from Item::Item. This reduced number of
pthread_getspecific() calls from 290 to 177 per OLTP RO transaction.
Fixed several optimizer issues relatied to GROUP BY:
a) Refering to a SELECT column in HAVING sometimes calculated it twice, which caused problems with non determinstic functions
b) Removing duplicate fields and constants from GROUP BY was done too late for "using index for group by" optimization to work
c) EXPLAIN SELECT ... GROUP BY did wrongly show 'Using filesort' in some cases involving "Using index for group-by"
a) was fixed by:
- Changed last argument to Item::split_sum_func2() from bool to int to allow more flags
- Added flag argument to Item::split_sum_func() to allow on to specify if the item was in the SELECT part
- Mark all split_sum_func() calls from SELECT with SPLIT_SUM_SELECT
- Changed split_sum_func2() to do nothing if called with an argument that is not a sum function and doesn't include sum functions, if we are not an argument to SELECT.
This ensures that in a case like
select a*sum(b) as f1 from t1 where a=1 group by c having f1 <= 10;
That 'a' in the SELECT part is stored as a reference in the temporary table togeher with sum(b) while the 'a' in having isn't (not needed as 'a' is already a reference to a column in the result)
b) was fixed by:
- Added an extra remove_const() pass for GROUP BY arguments before make_join_statistics() in case of one table SELECT.
This allowes get_best_group_min_max() to optimize things better.
c) was fixed by:
- Added test for group by optimization in JOIN::exec_inner for
select->quick->get_type() == QUICK_SELECT_I::QS_TYPE_GROUP_MIN_MAX
item.cc:
- Simplifed Item::split_sum_func2()
- Split test to make them faster and easier to read
- Changed last argument to Item::split_sum_func2() from bool to int to allow more flags
- Added flag argument to Item::split_sum_func() to allow on to specify if the item was in the SELECT part
- Changed split_sum_func2() to do nothing if called with an argument that is not a sum function and doesn't include sum functions, if we are not an argument to SELECT.
opt_range.cc:
- Simplified get_best_group_min_max() by calcuating first how many group_by elements.
- Use join->group instead of join->group_list to test if group by, as join->group_list may be NULL if everything was optimized away.
sql_select.cc:
- Added an extra remove_const() pass for GROUP BY arguments before make_join_statistics() in case of one table SELECT.
- Use group instead of group_list to test if group by, as group_list may be NULL if everything was optimized away.
- Moved printing of "Error in remove_const" to remove_const() instead of having it in caller.
- Simplified some if tests by re-ordering code.
- update_depend_map_for_order() and remove_const() fixed to handle the case where make_join_statistics() has not yet been called (join->join_tab is 0 in this case)
count_sargable_conds() instead for Item_func_in, Item_func_null_predicate,
Item_bool_func2. There other Item_int_func descendants that used to set
"sargable" to true (Item_func_between, Item_equal) already have their
own implementation of count_sargable_conds(). There is no sense to
have two parallel coding models for the same thing.
Step #8: Adding get_mm_tree() in Item_func, Item_func_between,
Item_func_in, Item_equal. This removes one virtual call item->type()
in queries like:
SELECT * FROM t1 WHERE c BETWEEN const1 AND const2;
SELECT * FROM t1 WHERE c>const;
SELECT * FROM t1 WHERE c IN (const_list);
Step 2c:
After discussion with Igor, it appeared that Item_field and Item_ref
could not appear in this context in the old function build_equal_item_for_cond:
else if (cond->type() == Item::FUNC_ITEM ||
cond->real_item()->type() == Item::FIELD_ITEM)
The part of the condition checking for Item_field::FIELD_ITEM was a dead code.
- Moving implementation of Item_ident_or_func_or_sum::build_equal_items()
to Item_func::build_equal_items()
- Restoring deriving of Item_ident and Item_sum_or_func from Item_result_field.
Removing Item_ident_or_func_or_sum.
- Adding a new class Item_args, represending regular function or
aggregate function arguments array.
- Adding a new class Item_func_or_sum,
a parent class for Item_func and Item_sum
- Moving Item_result_field::name() to Item_func_or_sum(),
as name() is not needed on Item_result_field level.
MDEV-6789 segfault in Item_func_from_unixtime::get_date on updating table with virtual columns
* prohibit VALUES in partitioning expression
* prohibit user and system variables in virtual column expressions
* fix Item_func_date_format to cache locale (for %M/%W to return the same as MONTHNAME/DAYNAME)
* fix Item_func_from_unixtime to cache time_zone directly, not THD (and not to crash)
* added tests for other incorrectly allowed (in vcols) functions to see that they don't crash
Issue :
-------
This seems for some platform -(LONGLONG_MIN) is
not flagged as out of range.
Fix:
----
Fix is backported from mysql-5.6 bug 14314156.
Fixed by adding an explicit test for this value in
Item_func_neg::int_op().
sql/item_func.cc:
For some platforms we need special handling of
LONGLONG_MIN to guarantee overflow.