1. Removing the legacy code that disabled equal field propagation in cases
when comparison is done as VARBINARY. This is now correctly handled by
the new propagation code in Item_xxx::propagate_equal_fields() and
Field_str::can_be_substituted_to_equal_item (the bug fix).
2. Also, removing legacy (pre-MySQL-4.1) Arg_comparator methods
compare_binary_string() and compare_e_binary_string(), as VARBINARY
comparison is correcty handled in compare_string() and compare_e_string() by
the corresponding VARBINARY collation handler implemented in my_charset_bin.
(not really a part of the bug fix)
WHERE COALESCE(time_column)=TIME('00:00:00')
AND COALESCE(time_column)=DATE('2015-09-11')
MDEV-8814 Wrong result for WHERE datetime_column > TIME('00:00:00')
MDEV-8754 Wrong result for SELECT..WHERE year_field=2020 AND NULLIF(year_field,2010)='2020'
Problems:
1. Item_func_nullif stored a copy of args[0] in a private member m_args0_copy,
which was invisible for the inherited Item_func menthods, like
update_used_tables(). As a result, after equal field propagation
things like Item_func_nullif::const_item() could return wrong result
and a non-constant NULLIF() was erroneously treated as a constant
at optimize_cond() time.
Solution: removing m_args0_copy and storing the return value item
in args[2] instead.
2. Equal field propagation did not work well for Item_fun_nullif.
Solution: using ANY_SUBST for args[0] and args[1], as they are in
comparison, and IDENTITY_SUBST for args[2], as it's not in comparison.
removing IMPOSSIBLE_RESULT from Item_result, as it's not
needed any more. The fact that an Item is not in a comparison
context is now always designated by IDENTITY_SUBST in Subst_constraint.
Previously IMPOSSIBLE_RESULT and IDENTITY_SUBST co-existed but
actually meant the same thing.
Item::cmp_context was inconsistently used in combination with cmp_type()
and result_type() in different places of the code. Fixed to use cmp_type()
in all places where cmp_context is involved, to avoid unexpected results
for temporal data types (which have result_type()==STRING_RESULT and
cmp_type==TIME_RESULT).
- Part 4: Removing calls to sql_alloc() and sql_calloc()
Other things:
- Added current_thd in some places to make it clear that it's called (easier to remove later)
- Move memory allocation from Item_func_case::fix_length_and_dec() to Item_func_case::fix_fields()
- Added mem_root to some new calls
- Fixed some wrong UNINIT_VAR() calls
- Fixed a bug in generate_partition_syntax() in case of errors
- Added mem_root to argument to new thread_info
- Simplified my_parse_error() call in sql_yacc.yy
- Part 3: Adding mem_root to push_back() and push_front()
Other things:
- Added THD as an argument to some partition functions.
- Added memory overflow checking for XML tag's in read_xml()
- Added mem_root to all calls to new Item
- Added private method operator new(size_t size) to Item to ensure that
we always use a mem_root when creating an item.
This saves use once call to current_thd per Item creation
Added mandatory thd parameter to Item (and all derivative classes) constructor.
Added thd parameter to all routines that may create items.
Also removed "current_thd" from Item::Item. This reduced number of
pthread_getspecific() calls from 290 to 177 per OLTP RO transaction.
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd
Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)
Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
examined_rows -> join_examined_rows
record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()
Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
Fixed several optimizer issues relatied to GROUP BY:
a) Refering to a SELECT column in HAVING sometimes calculated it twice, which caused problems with non determinstic functions
b) Removing duplicate fields and constants from GROUP BY was done too late for "using index for group by" optimization to work
c) EXPLAIN SELECT ... GROUP BY did wrongly show 'Using filesort' in some cases involving "Using index for group-by"
a) was fixed by:
- Changed last argument to Item::split_sum_func2() from bool to int to allow more flags
- Added flag argument to Item::split_sum_func() to allow on to specify if the item was in the SELECT part
- Mark all split_sum_func() calls from SELECT with SPLIT_SUM_SELECT
- Changed split_sum_func2() to do nothing if called with an argument that is not a sum function and doesn't include sum functions, if we are not an argument to SELECT.
This ensures that in a case like
select a*sum(b) as f1 from t1 where a=1 group by c having f1 <= 10;
That 'a' in the SELECT part is stored as a reference in the temporary table togeher with sum(b) while the 'a' in having isn't (not needed as 'a' is already a reference to a column in the result)
b) was fixed by:
- Added an extra remove_const() pass for GROUP BY arguments before make_join_statistics() in case of one table SELECT.
This allowes get_best_group_min_max() to optimize things better.
c) was fixed by:
- Added test for group by optimization in JOIN::exec_inner for
select->quick->get_type() == QUICK_SELECT_I::QS_TYPE_GROUP_MIN_MAX
item.cc:
- Simplifed Item::split_sum_func2()
- Split test to make them faster and easier to read
- Changed last argument to Item::split_sum_func2() from bool to int to allow more flags
- Added flag argument to Item::split_sum_func() to allow on to specify if the item was in the SELECT part
- Changed split_sum_func2() to do nothing if called with an argument that is not a sum function and doesn't include sum functions, if we are not an argument to SELECT.
opt_range.cc:
- Simplified get_best_group_min_max() by calcuating first how many group_by elements.
- Use join->group instead of join->group_list to test if group by, as join->group_list may be NULL if everything was optimized away.
sql_select.cc:
- Added an extra remove_const() pass for GROUP BY arguments before make_join_statistics() in case of one table SELECT.
- Use group instead of group_list to test if group by, as group_list may be NULL if everything was optimized away.
- Moved printing of "Error in remove_const" to remove_const() instead of having it in caller.
- Simplified some if tests by re-ordering code.
- update_depend_map_for_order() and remove_const() fixed to handle the case where make_join_statistics() has not yet been called (join->join_tab is 0 in this case)
count_sargable_conds() instead for Item_func_in, Item_func_null_predicate,
Item_bool_func2. There other Item_int_func descendants that used to set
"sargable" to true (Item_func_between, Item_equal) already have their
own implementation of count_sargable_conds(). There is no sense to
have two parallel coding models for the same thing.
Pass THD to find_all_keys() and Item_equal::Item_equal().
In MRR use table->in_use instead of current_thd.
This reduces number of pthread_getspecific() calls from 354 to 320.
- Changing Comp_creator::create() and create_swap() to return
Item_bool_rowready_func2 instead of Item_bool_func2, as they
can never return neither Item_func_like nor Item_func_xor
- Changing the first argument of Comp_create::create() and create_swap()
from THD to MEM_ROOT, so the method implementations can now reside in
item_cmpfunc.h instead of item_cmpfunc.cc and thus make the code slightly
easier to read.
sql_alloc() has additional costs compared to direct mem_root allocation:
- function call: it is defined in a separate translation unit and can't be
inlined
- it needs to call pthread_getspecific() to get THD::mem_root
It is called dozens of times implicitely at least by:
- List<>::push_back()
- List<>::push_front()
- new (for Sql_alloc derived classes)
- sql_memdup()
Replaced lots of implicit sql_alloc() calls with direct mem_root allocation,
passing through THD pointer whenever it is needed.
Number of sql_alloc() calls reduced 345 -> 41 per OLTP RO transaction.
pthread_getspecific() overhead dropped 0.76 -> 0.59
sql_alloc() overhed dropped 0.25 -> 0.06
- Adding a new class Item_args, represending regular function or
aggregate function arguments array.
- Adding a new class Item_func_or_sum,
a parent class for Item_func and Item_sum
- Moving Item_result_field::name() to Item_func_or_sum(),
as name() is not needed on Item_result_field level.
Fixing a wrong assymetric code in Arg_comparator::set_cmp_func().
It existed for a long time, but showed up in 10.0.14 after the fix
for "MDEV-6666 Malformed result for CONCAT(utf8_column, binary_string)".
The bug is not very important per se, but it was helpful to move
Item_func_strcmp out of Item_bool_func2 (to Item_int_func),
for the purposes of "MDEV-4912 Add a plugin to field types (column types)".