test case:
t1 is a spider engine table;
CREATE TABLE `t1` (
`id` int(11) NOT NULL DEFAULT '0',
`name` char(64) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=SPIDER
query: "select * from t1 where name not like 'x%' " would dispatch "select xxx name name like 'x%' " to remote mysqld, is wrong
This bug is caused by pushdown from HAVING into WHERE.
It appears because condition that is pushed wasn't fixed.
It is also discovered that condition pushdown from HAVING into
WHERE is done wrong. There is no need to build clones for some
conditions that can be pushed. They can be simply moved from HAVING
into WHERE without cloning.
build_pushable_cond_for_having_pushdown(),
remove_pushed_top_conjuncts_for_having() methods are changed.
It is found that there is no transformation made for fields of
pushed condition.
field_transformer_for_having_pushdown transformer is added.
New tests are added. Some comments are changed.
using Item_cond
This bug is similar to the bug MDEV-16765.
It appears because of the wrong pushdown into HAVING clause while this
pushdown shouldn't be made at all.
This happens because function that checks if Item_cond can be pushed
always returns that it can be pushed.
To fix it new method Item_cond::excl_dep_on_table() was added.
Optimized the code that removed multiple equalities pushed from HAVING
into WHERE. Now this removal is postponed until all multiple equalities
are eliminated in substitute_for_best_equal_field().
in the tree bb-10.4-mdev7486
The crash was caused because of the similar problem as in mdev-16765:
Item_cond::excl_dep_on_group_fields_for_having_pushdown() was missing.
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
How the pushdown is performed on the example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.
The bug appears because of the wrong pushdown into the WHERE clause of the
materialized derived table/view work. For the excl_dep_on_grouping_fields()
method that checks if the condition can be pushed into the WHERE clause
the case when Item_cond is used is missing. For Item_cond elements this
method always returns positive result (that condition can be pushed).
So this condition is pushed even if is shouldn't be pushed.
To fix it new Item_cond::excl_dep_on_grouping_fields() method is added.
We hit this assert during the create of a temporary table field
because the current code does not handle the case when the value
of the NAME_CONST function is NULL.
Fixed this by allowing creation of temporary table fields even
for the case when NAME_CONST returns NULL value.
Introduced tmp_table_field_from_field_type_maybe_null() function
in Item class so both Item_basic_value and Item_name_const can use it.
Introduced a virtual method get_func_item() in the Item class.
The problem happened in the derived condition pushdown code:
- When Item_func_regex::build_clone() was called, it created a copy of
the original Item_func_regex, and this copy got registered in free_list.
Class specific additional dynamic members (such as "re") made
a shallow copy, rather than a deep copy, in the cloned Item_func_regex.
As a result, the Regexp_processor_pcre::m_pcre of the cloned Item_func_regex
and of the original Item_func_regex pointed to the same compiled regular
expression.
- On cleanup_items(), both original and cloned copies of Item_func_regex
called re.cleanup(), which called pcre_free(m_pcre). So the same compiled
regular expression was freed two times, which was noticed by ASAN.
The same problem was repeatable for Item_func_regexp_instr.
A similar problem happened for Item_func_sp, for the sp_result_field member.
Both original and cloned copies of Item_func_sp pointed the same Field instance
and both deleted it on cleanup().
A possible solution would be to fix build_clone() to create deep
(instead of shallow) copies for the dynamic members of the affected classes
(Item_func_regex, Item_func_regexp_instr, Item_func sp).
However, this would be too complex.
As agreed with Galina and Igor, this patch disallows using using these
affected classes in derived condition pushdown by overriding get_clone()
to return NULL.
Consider an IN predicate with ROW-type arguments:
predicant IN (value1, ..., valueM)
where predicant and all values consist of N elements.
When performing IN for these arguments, at every position i (1..N)
only data type of i-th element of predicant was taken into account,
while data types on i-th elements of value1..valueM were not taken.
These led to bad comparison data type detection, e.g. when
mixing unsigned and signed integer values.
After this change all element data types are taken into account.
So, for example, a mixture of unsigned and signed values is
now calculated using decimal and does not overflow any more.
Detailed changes:
1. All comparators for ROW elements are now created recursively
at fix_fields() time, inside cmp_item_row::prepare_comparators().
Previously prepare_comparators() installed comparators only
for temporal data types, while comparators for other types were
installed at execution time, in cmp_item_row::store_value().
2. Removing comparator creating code from cmp_item_row::store_value().
It was responsible for non-temporal data types.
3. Removing find_date_time_item(). It's not needed any more.
All ROW-element data types are now covered by
cmp_item_row::prepare_comparators().
4. Adding a helper method Item_args::alloc_and_extract_row_elements()
to extract elements from an array of ROW-type Items, from the given
position. Using this method to collect elements from the i-th
position and further pass them to
Type_handler_hybrid_field_type::aggregate_for_comparison().
5. Moving the call for alloc_comparators() inside
cmp_item_row::prepare_comparators(). This helps
to call prepare_comparators() for ROW elements recursively
(if elements appear to be ROWs again).
Moving alloc_comparators() from "public" to "private".
Now the boolean data type is preserved in hybrid functions and MIN/MAX,
so COALESCE(bool_expr,bool_expr) and MAX(bool_expr) are correctly
detected by JSON_OBJECT() as being boolean rather than numeric expressions.
The logic and the implementation scheme are similar with the
MDEV-9197 Pushdown conditions into non-mergeable views/derived tables
How the push down is made on the example:
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y) from t2 group by x);
-->
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y)
from t2
where x>3
group by x
having max(y)>10);
The implementation scheme:
1. Search for the condition cond that depends only on the fields
from the left part of the IN subquery (left_part)
2. Find fields F_group in the select of the right part of the
IN subquery (right_part) that are used in the GROUP BY
3. Extract from the cond condition cond_where that depends only on the
fields from the left_part that stay at the same places in the left_part
(have the same indexes) as the F_group fields in the projection of the
right_part
4. Transform cond_where so it can be pushed into the WHERE clause of the
right_part and delete cond_where from the cond
5. Transform cond so it can be pushed into the HAVING clause of the right_part
The optimization is made in the
Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the
variable condition_pushdown_for_subquery.
New test file in_subq_cond_pushdown.test is created.
There are also some changes made for setup_jtbm_semi_joins().
Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins()
that is called before optimize_cond() for cond and setup_jtbm_semi_joins()
that is called after optimize_cond().
New setup_jtbm_semi_joins() is made in the way so that the result of its work is
the same as if it was called before optimize_cond().
The code that is common for pushdown into materialized derived and into materialized
IN subqueries is factored out into pushdown_cond_for_derived(),
Item_in_subselect::pushdown_cond_for_in_subquery() and
st_select_lex::pushdown_cond_into_where_clause().
failure upon SELECT with impossible condition
The problem appears because of a wrong implementation of the
Item_func_in::build_clone() method. It didn't clone 'array' and 'cmp_fields'
fields for the cloned IN predicate and this could cause crashes.
The Item_func_in::fix_length_and_dec() method was refactored and a new method
named Item_func_in::create_array() was created. It allowed to create 'array'
for cloned IN predicates in a proper way.
this is a 10.3 version of 27d94b7e03
It disables caching of the first argument of IN,
if it's of a temporal type. Because other types are not
cached in this context.
reorder items in args[] array. Instead of
when1,then1,when2,then2,...[,case][,else]
sort them as
[case,]when1,when2,...,then1,then2,...[,else]
in this case all items used for comparison take a continuous part
of the array and can be aggregated directly. and all items that
can be returned take a continuous part of the array and can be
aggregated directly. Old code had to copy them to a temporary
array before aggreation, and then copy back (thd->change_item_tree)
everything that was changed.
this is a 10.3 version of bf1ca14ff3
Conversion of a subquery to a semi-join is blocked when we have an
IN subquery predicate in the on_expr of an outer join. Currently this
scenario is handled but the cases when an IN subquery predicate is wrapped
inside a Item_in_optimizer item then this blocking is not done.
cmp_item_sort_string::store_value() did not cache the string returned
from item->val_str(), whose result can point to various private members
such as Item_char_typecast::tmp_value.
- cmp_item_sort_string::store_value() remembered the pointer returned
from item->val_str() poiting to tmp_value into cmp_item_string::value_res.
- Later, cmp_item_real::store_value() was called, which called
Item_str_func::val_real(), which called Item_char_typecast::val_str(&tmp)
using a local stack variable "String tmp". Item_char_typecast::tmp_value
was overwritten and become a link to "tmp":
tmp_value.Ptr freed its own buffer and set to point to the buffer
owned by "tmp".
- On return from Item_str_func::val_real(), "String tmp" was destructed,
but "tmp_value" still pointed to the buffer owned by "tmp",
So tmp_value.Ptr became invalid.
- Then cmp_item_sort_string() passed cmp_item_string::value_res to sortcmp().
At this point, value_res still pointed to an invalid value of
Item_char_typecast::tmp_value.
Fix:
changing cmp_item_sort_string::store_value() to force copying
to cmp_item_string::value if item->val_str(&value) returned
a different pointer (instead of &value).
Refactor get_datetime_value() not to create Item_cache_temporal(),
but do it always in ::fix_fields() or ::fix_length_and_dec().
Creating items at the execution time doesn't work very well with
virtual columns and check constraints that are fixed and executed
in different THDs.
It's a generic function, not using anything from Arg_comparator.
Make it a static function, not a class method, to be able to use
it later without Arg_comparator
reorder items in args[] array. Instead of
when1,then1,when2,then2,...[,case][,else]
sort them as
[case,]when1,when2,...,then1,then2,...[,else]
in this case all items used for comparison take a continuous part
of the array and can be aggregated directly. and all items that
can be returned take a continuous part of the array and can be
aggregated directly. Old code had to copy them to a temporary
array before aggreation, and then copy back (thd->change_item_tree)
everything that was changed.
The problem was that Item_func_hybrid_field_type::get_date() did not
convert the result to the correct data type, so MYSQL_TIME::time_type
of the get_date() result could be not in sync with field_type().
Changes:
1. Adding two new classes Datetime and Date to store MYSQL_TIMESTAMP_DATETIME
and MYSQL_TIMESTAMP_DATE values respectively
(in addition to earlier added class Time, for MYSQL_TIMESTAMP_TIME values).
2. Adding Item_func_hybrid_field_type::time_op().
It performs the operation using TIME representation,
and always returns a MYSQL_TIME value with time_type=MYSQL_TIMESTAMP_TIME.
Implementing time_op() for all affected children classes.
3. Fixing all implementations of date_op() to perform the operation
using strictly DATETIME representation. Now they always return a MYSQL_TIME
value with time_type=MYSQL_TIMESTAMP_{DATE|DATETIME},
according to the result data type.
4. Removing assignment of ltime.time_type to mysql_timestamp_type()
from all val_xxx_from_date_op(), because now date_op() makes sure
to return a proper MYSQL_TIME value with a good time_type (and other member)
5. Adding Item_func_hybrid_field_type::val_xxx_from_time_op().
6. Overriding Type_handler_time_common::Item_func_hybrid_field_type_val_xxx()
to call val_xxx_from_time_op() instead of val_xxx_from_date_op().
7. Modified Item_func::get_arg0_date() to return strictly a TIME value
if TIME_TIME_ONLY is passed, or return strictly a DATETIME value otherwise.
If args[0] returned a value of a different temporal type,
(for example a TIME value when TIME_TIME_ONLY was not passed,
or a DATETIME value when TIME_TIME_ONLY was passed), the conversion
is automatically applied.
Earlier, get_arg0_date() did not guarantee a result in
accordance to TIME_TIME_ONLY flag.
There were two problems related to the bug report:
1. Item_datetime::get_date() was not implemented.
So execution went through val_int() followed
by int-to-datetime or int-to-time conversion.
This was the reason why the optimizer did not
work well on data with fractional seconds.
2. Item_datetime::set() did not have a TIME specific code
to mix months and days to hours after unpack_time().
This is why the optimizer did not work well with negative
TIME values, as well as huge time values.
Changes:
1. Overriding Item_datetime::get_date(), to return ltime.
This fixes the problem N1.
2. Cleanup: Moving pack_time() and unpack_time() from
sql-common/my_time.c and include/my_time.h to
sql/sql_time.cc and sql/sql_time.h, as they are not needed
on the client side.
3. Adding a new "enum_mysql_timestamp_type ts_type" parameter
to unpack_time() and moving the TIME specific code to mix
months and days with hours inside unpack_time().
Adding a new "ts_type" parameter to Item_datetime::set(),
to pass it from the caller down to unpack_time().
So now the TIME specific code is automatically called
from Item_datetime::set(). This fixes the problem N2.
This change also helped to get rid of duplicate TIME specific code
from other three places, where mixing month/days to hours
was done immediately after unpack_time().
Moving the DATE specific code to zero hhmmssff
from Item_func_min_max::get_date_native to inside unpack_time(),
for symmetry.
4. Removing the virtual method in_vector::result_type(),
adding in_vector::type_handler() instead.
This helps to get result_type(), field_type(),
mysql_timestamp_type() of an in_vector easier.
Passing type_handler()->mysql_timestamp_type() as
a new parameter to Item_datetime::set() inside
in_temporal::value_to_item().
5. Cleaup: Removing separate implementations of in_datetime::get_value()
and in_time::get_value(). Adding a single implementation
in_temporal::get_value() instead.
Passing type_handler()->field_type() to get_value_internal().
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
MDEV-14957: JOIN::prepare gets unusable "conds" as argument
Do not touch merged derived (it is irreversible)
Fix first argument of in_optimizer for calls possible before fix_fields()
TODO:
- Make get_thd_memroot() inline
- To do this, we need to reduce dependence of include files, especially
so that sql_class.h is not depending in item.h
Make differentiation between pullout for merge and pulout of outer field during exists2in transformation.
In last case the field was outer and so we can safely start from name resolution context of the SELECT where it was pulled.
Old behavior lead to inconsistence between list of tables and outer name resolution context (which skips one SELECT for merge purposes) which creates problem vor name resolution.
As a result of this merge the code for the following tasks appears in 10.3:
- MDEV-12172 Implement tables specified by table value constructors
- MDEV-12176 Transform [NOT] IN predicate with long list of values INTO
[NOT] IN subquery.
Side effect: the second debug Note in cache_temporal_4265.result disappeared.
Before this change:
- During JOIN::cache_const_exprs(),
Item::get_cache() for Item_date_add_interval() was called.
The data type for date_add('2001-01-01',interval 5 day) is VARCHAR,
because the first argument is VARCHAR (not temporal).
Item_get_cache() created Item_cache_str('2001-01-06').
- During evaluate_join_record(), get_datetime_value() was called,
which called Item::get_date() for Item_cache_str('2001-01-06').
This gave the second Note. Then, get_datetime_value() created
a new cache, now Item_cache_temporal for '2001-01-06', so not
further str_to_datetime() happened.
After this change:
- During tem_bool_rowready_func2::fix_length_and_dec(),
Arg_comparator::set_cmp_func_datetime() is called,
which immediately creates an instance of Item_cache_date for
the result of date_add('2001-01-01',interval 5 day).
So later no str_to_datetime happens any more,
neither during JOIN::cache_const_exprs(),
nor during evaluate_join_record().
Fixing the data type for the "fuzzydate" parameter to
Item_func_hybrid_field_type::date_op() from uint to ulonglong,
for consistency with Item::get_date().
- Implementing stricter data type control for Item_long_func descendants
- Cleanup: renaming Type_handler::can_return_str_ascii() to can_return_text()
(a better name).
This is a preparatory step for MDEV-13864.
It does not change behavior in any ways. It simply splits methods into smaller peaces.
The intent of this separate patch is to make more readable the main patch for
MDEV-13864 (which will actually move the predicant to args[0]).
1. Splitting fix_length_and_dec() into smaller pieces, adding:
- bool aggregate_then_and_else_arguments(THD *thd);
- bool aggregate_switch_and_when_arguments(THD *thd);
2. Splitting find_item() into smaller pieces, adding:
- Item *find_item_searched();
- Item *find_item_simple();
3. Splitting print() into smaller pieces, adding:
- void print_when_then_arguments(String *str, enum_query_type query_type,
Item **items, uint count);
- void print_else_argument(String *str, enum_query_type query_type, Item *item)
4. Moving the maybe_null handling part related to ELSE from fix_length_and_dec()
to fix_fields(), as in all other Item_func's.
5. Removing the unused String* argument from find_item().
6. Moving find_item() from public to private, as it's not needed outside.
JSON_EXTRACT behaves specifically in the comparison,
so we have to implement specific method for that in
Arg_comparator.
Conflicts:
sql/item_cmpfunc.cc
1. use Regexp_processor_pcre::set_recursion_limit() to set the
recursion limit depending on the current available stack size
2. make pcre stack frame to be estimated no less than 500 bytes.
sometimes pcre estimates it too low, even though the manual
says 500+16 bytes (it was estimated only 188 for me, actual
frame size was 512).
3. do it for embedded too
The "is null" function performs one operation which no other Item_func
does, which is to update used tables during fix_length_and_dec().
This however can not be performed before window functions have had a
chance to resolve their order by and partition by definitions, which
happens after the initial setup_fields call. Consequently, do not call
Item_func_isnull update_used_tables during fix_length_and_dec().
There was another issue detected once the crash was resolved.
Because window functions did not implement is_null() method, we would
end up returning bad results for "is null" and "is not null" functions.
Implemented is_null() method for Item_windowfunc.
This patch fills in a serious flaw in the
code that supports condition pushdown into
materialized views / derived tables.
If a predicate happened to contain a reference
to a mergeable view / derived table and it does
not depended directly on the target materialized
view / derived table then the predicate was not
considered as a subject to pusdown to this view
/ derived table.
It was possible to construct a PCRE expression that exceeded the stack.
resulting in a crash:
With fix:
MariaDB [(none)]> SELECT 1
-> FROM dual
-> WHERE ('Alpha,Bravo,Charlie,Delta,Echo,Foxtrot,StrataCentral,Golf,Hotel,India,Juliet,Kilo,Lima,Mike,StrataL3,November,Oscar,StrataL2,Sand,P3,P4SwitchTest,Arsys,Poppa,ExtensionMgr,Arp,Quebec,Romeo,StrataApiV2,PtReyes,Sierra,SandAcl,Arrow,Artools,BridgeTest,Tango,SandT,PAlaska,Namespace,Agent,Qos,PatchPanel,ProjectReport,Ark,Gimp,Agent,SliceAgent,Arnet,Bgp,Ale,Tommy,Central,AsicPktTestLib,Hsc,SandL3,Abuild,Pca9555,Standby,ControllerDut,CalSys,SandLib,Sb820,PointV2,BfnLib,Evpn,BfnSdk,Sflow,ManagementActive,AutoTest,GatedTest,Bgp,Sand,xinetd,BfnAgentLib,bf-utils,Hello,BfnState,Eos,Artest,Qos,Scd,ThermoMgr,Uniform,EosUtils,Eb,FanController,Central,BfnL3,BfnL2,tcp_wrappers,Victor,Environment,Route,Failover,Whiskey,Xray,Gimp,BfnFixed,Strata,SoCal,XApi,Msrp,XpProfile,tcpdump,PatchPanel,ArosTest,FhTest,Arbus,XpAcl,MacConc,XpApi,telnet,QosTest,Alpha2,BfnVlan,Stp,VxlanControllerTest,MplsAgent,Bravo2,Lanz,BfnMbb,Intf,XCtrl,Unicast,SandTunnel,L3Unicast,Ipsec,MplsTest,Rsvp,EthIntf,StageMgr,Sol,MplsUtils,Nat,Ira,P4NamespaceDut,Counters,Charlie2,Aqlc,Mlag,Power,OpenFlow,Lag,RestApi,BfdTest,strongs,Sfa,CEosUtils,Adt746,MaintenanceMode,MlagDut,EosImage,IpEth,MultiProtocol,Launcher,Max3179,Snmp,Acl,IpEthTest,PhyEee,bf-syslibs,tacc,XpL2,p4-ar-switch,p4-bf-switch,LdpTest,BfnPhy,Mirroring,Phy6,Ptp'
->
-> REGEXP '^((?!\b(Strata|StrataApi|StrataApiV2)\b).)*$');
Empty set, 1 warning (0.00 sec)
MariaDB [(none)]> show warnings;
+---------+------+---------------------------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------------------------+
| Warning | 1139 | Got error 'pcre_exec: Internal error (-21)' from regexp |
+---------+------+---------------------------------------------------------+
- Adding Type_handler::make_table_field() and moving pieces of the code
from Item::tmp_table_field_from_field_type() to virtual implementations
for various type handlers.
- Adding a new Type_all_attributes, to access to Item's extended
attributes, such as decimal_precision() and geometry_type().
- Adding a new class Record_addr, to pass record related information
to Type_handler methods (ptr, null_ptr and null_bit) as a single structure.
Note, later it will possibly be extended for BIT-alike field purposes,
by adding new members (bit_ptr_arg, bit_ofs_arg).
- Moving the code from Field_new_decimal::create_from_item()
to Type_handler_newdecimal::make_table_field().
- Removing Field_new_decimal() and Field_geom() helper constructor
variants that were used for temporary field creation.
- Adding Item_field::type_handler(), Field::type_handler() and
Field_blob::type_handler() to return correct type handlers for
blob variants, according to Field_blob::packlength.
- Adding Type_handler_blob_common, as a common parent for
Type_handler_tiny_blob, Type_handler_blob, Type_handler_medium_blob
and Type_handler_long_blob.
- Implementing Type_handler_blob_common::Item_hybrid_func_fix_attributes().
It's needed for cases when TEXT variants of different character sets are mixed
in LEAST, GREATEST, CASE and its abreviations (IF, IFNULL, COALESCE), e.g.:
CREATE TABLE t1 (
a TINYTEXT CHARACTER SET latin1,
b TINYTEXT CHARACTER SET utf8
);
CREATE TABLE t2 AS SELECT COALESCE(a,b) FROM t1;
Type handler aggregation returns TINYTEXT as a common data type
for the two columns. But as conversion from latin1 to utf8
happens for "a", the maximum possible length of "a" grows from 255 to 255*3.
Type_handler_blob_common::Item_hybrid_func_fix_attributes() makes sure
to update the blob type handler according to max_length.
- Adding Type_handler::blob_type_handler(uint max_octet_length).
- Adding a few m_type_aggregator_for_result.add() pairs, because
now Item_xxx::type_handler() can return pointers to type_handler_tiny_blob,
type_handler_blob, type_handler_medium_blob, type_handler_long_blob.
Before the patch only type_handler_blob was possible result of type_handler().
- Making type_handler_tiny_blob, type_handler_blob, type_handler_medium_blob,
type_handler_long_blob public.
- Removing the condition in Item_sum_avg::create_tmp_field()
checking Item_sum_avg::result_type() against DECIMAL_RESULT.
Now both REAL_RESULT and DECIMAL_RESULT are symmetrically handled
by tmp_table_field_from_field_type().
- Removing Item_geometry_func::create_field_for_create_select(),
as the inherited version perfectly works.
- Fixing Item_func_as_wkb::field_type() to return MYSQL_TYPE_LONG_BLOB
rather than MYSQL_TYPE_BLOB. It's needed to make sure that
tmp_table_field_from_field_type() creates a LONGBLOB field for AsWKB().
- Fixing Item_func_as_wkt::fix_length_and_dec() to set max_length to
UINT32_MAX rather than MAX_BLOB_WIDTH, to make sure that
tmp_table_field_from_field_type() creates a LONGTEXT field for AsWKT().
- Removing Item_func_set_user_var::create_field_for_create_select(),
as the inherited version works fine.
- Adding Item_func_get_user_var::create_field_for_create_select() to
make sure that "CREATE TABLE t1 AS SELECT @string_user variable"
always creates a field of LONGTEXT/LONGBLOB type.
- Item_func_ifnull::create_field_for_create_select()
behavior has changed. Before the patch it passed set_blob_packflag=false,
which meant to create LONGBLOB for all blob variants.
Now it takes into account max_length, which gives better column
data types for:
CREATE TABLE t2 AS SELECT IFNULL(blob_column1, blob_column2) FROM t1;
- Fixing Item_func_nullif::fix_length_and_dec() to use
set_handler(args[2]->type_handler()) instead of
set_handler_by_field_type(args[2]->field_type()).
This is needed to distinguish between BLOB variants.
- Implementing Item_blob::type_handler(), to make sure to create
proper BLOB field variant, according to max_length, for queries like:
CREATE TABLE t1 AS
SELECT some_blob_field FROM INFORMATION_SCHEMA.SOME_TABLE;
- Fixing Item_field::real_type_handler() to make sure that
the code aggregating fields for UNION gets a proper BLOB
variant type handler from fields.
- Adding a special code into Item_type_holder::make_field_by_type(),
to make sure that after aggregating field types it also properly
takes into account max_length when mixing TEXT variants of different
character sets and chooses a proper TEXT variant:
CREATE TABLE t1 (
a TINYTEXT CHARACTER SET latin1,
b TINYTEXT CHARACTER SET utf8
);
CREATE TABLE t2 AS SELECT a FROM t1 UNION SELECT b FROM t1;
- Adding tests, for better coverage of IFNULL, NULLIF, UNION.
- The fact that tmp_table_field_from_field_type() now takes
into account BLOB variants (instead of always creating LONGBLOB),
tests results for WEIGHT_STRING() and NULLIF() and UNION
have become more precise.
The patch actually fixes the old defect of the optimizer that
could not extract keys for range access from IN predicates
with row arguments.
This problem was resolved in the mysql-5.7 code. The patch
supersedes what was done there:
- it can build range access when not all components of
the first row argument are refer to the columns of the table
for which the range access is constructed.
- it can use equality predicates to build range access
to the table that is not referred to in this argument.
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
Item_func_le included Arg_comparator. Arg_comparator remembered
the current_thd during fix_fields and used that value during
execution to allocate Item_cache in get_datetime_value().
But for vcols fix_fields and val_int can happen in different threads.
Same bug for Item_func_in using in_datetime or cmp_item_datetime,
both also remembered current_thd at fix_fields() to use it later
for get_datetime_value().
As a fix, these objects no longer remember the current_thd,
and get_datetime_value() uses current_thd at run time. This
should not increase the number of current_thd calls much, as
Item_cache is created only once anyway.
Now after the patch for MDEV-11478 added a new method Type_handler::name(),
using the new method in debug output to make test results for MDEV-11514
more readable (func_debug.result).
When a condition containing NULLIF is pushed into a materialized
view/derived table the clone of the Item_func_nullif item must
be processed in a special way to guarantee that the first argument
points to the same item as the third argument.
This patch fixes a number of data type aggregation problems in IN and CASE:
- MDEV-11497 Wrong result for (int_expr IN (mixture of signed and unsigned expressions))
- MDEV-11514 IN with a mixture of TIME and DATETIME returns a wrong result
- MDEV-11554 Wrong result for CASE on a mixture of signed and unsigned expressions
- MDEV-11555 CASE with a mixture of TIME and DATETIME returns a wrong result
1. The problem reported in MDEV-11514 and MDEV-11555 was in the wrong assumption
that items having the same cmp_type() can reuse the same cmp_item instance.
So Item_func_case and Item_func_in used a static array of cmp_item*,
one element per one XXX_RESULT.
TIME and DATETIME cannot reuse the same cmp_item, because arguments of
these types are compared very differently. TIME and DATETIME must have
different instances in the cmp_item array. Reusing the same cmp_item
for TIME and DATETIME leads to unexpected result and unexpected warnings.
Note, after adding more data types soon (e.g. INET6), the problem would
become more serious, as INET6 will most likely have STRING_RESULT, but
it won't be able to reuse the same cmp_item with VARCHAR/TEXT.
This patch introduces a new class Predicant_to_list_comparator,
which maintains an array of cmp_items, one element per distinct
Type_handler rather than one element per XXX_RESULT.
2. The problem reported in MDEV-11497 and MDEV-11554 happened because
Item_func_in and Item_func_case did not take into account the fact
that UNSIGNED and SIGNED values must be compared as DECIMAL rather than INT,
because they used item_cmp_type() to aggregate the arguments.
The relevant code now resides in Predicant_to_list_comparator::add_value()
and uses Type_handler_hybrid_field_type::aggregate_for_comparison(),
like Item_func_between does.
This patch implements the task according to the description:
1. The old code from Item_func_in::fix_length_and_dec() was decomposed
into smaller methods:
- all_items_are_consts()
- compatible_types_scalar_bisection_possible()
- compatible_types_row_bisection_possible()
- fix_in_vector()
- fix_for_scalar_comparison_using_bisection()
- fix_for_scalar_comparison_using_cmp_items()
- fix_for_row_comparison_using_bisection()
- fix_for_row_comparison_using_cmp_items()
The data type dependend pieces where moved as methods to Type_handler.
2. Splits in_datetime into separate:
- in_datetime, for DATETIME and DATE,
- in_time, for TIME
to make the code more symmetric across data types.
Additionally:
- Adds a test func_debug.test to see which calculation strategy
(bisect or no bisect) is chosen to handle IN with various arguments.
- Adds a new helper method (to avoid duplicate code):
cmp_item_rows::prepare_comparators()
- Changes the propotype for cmp_item_row::alloc_comparators(),
to avoid duplicate code, and to use less current_thd.
- Changes "friend" sections in cmp_item_row and in_row from
an exact Item_func_in method to the entire class Item_func_in,
as their internals are now needed in multiple Item_func_in methods.
- Added more comments (e.g. on bisection, on the problem reported in MDEV-11511)
- Removes "Item_result Item_func_opt_neg::m_compare_type" and introduces
"Type_handler_hybrid_field_type Item_func_opt_neg::m_comparator" instead.
- Removes Item_func_between::compare_as_dates, because
the new member m_comparator now contains the precise information
about the data type that is used for comparison, which is important
for TIME vs DATETIME.
- Adds a new method:
Type_handler_hybrid_field_type::aggregate_for_comparison(const Type_handler*),
as a better replacement for item_cmp_type(), which additionally can handle
TIME vs DATE/DATETIME/TIMESTAMP correctly. Additionally, it correctly
handles TIMESTAMP which fixes the problem reported in MDEV-11482.
The old compare_as_dates/find_date_time_item() based code didn't handle
comparison between TIME and TIMESTAMP correctly and erroneously used TIME
comparison instead of DATETIME comparison.
- Adds a new method:
Type_handler_hybrid_field_type::aggregate_for_comparison(Item **, uint nitems),
as a better replacement for agg_cmp_type(), which can handle TIME.
- Splits Item_func_between::val_int() into pieces val_int_cmp_xxx(),
one new method per XXX_RESULT.
- Adds a new virtual method Type_handler::Item_func_between_val_int()
whose implementations use Item_func_between::val_int_cmp_xxx().
- Makes type_handler_longlong and type_handler_newdecimal public,
as they are now needed in item_cmpfunc.cc.
Note:
This patch does not change Item_func_in to use the new aggregation methods,
so it still uses collect_cmp_type()/item_cmp_type() based aggregation.
Item_func_in will be changed in a separate patch and item_cmp_type() will be
removed.
This patch:
- Adds a new virtual method Type_handler::Item_get_cache
- Splits moves Item_cache::get_cache() into the new method, every
"case XXX_RESULT" to the corresponding Type_handler_xxx::Item_get_cache.
- Adds Item::get_cache as a convenience wrapper, to make the caller code
shorter.
- Changes the last argument of Arg_comparator::cache_converted_constant()
from Item_result to "const Type_handler *".
- Removes subselect_engine::cmp_type, subselect_engine::res_type,
subselect_engine::res_field_type and derives subselect_engine
from Type_handler_hybrid_field_type instead.
- Makes Type_handler_varchar public, as it's now needed as the
default data type handler for subselect_engine.
This patch:
- Introduces a new virtuial method Type_handler::set_comparator_func
and moves pieces of the code from the switch in
Arg_comparator::set_compare_func into the corresponding
Type_handler_xxx::set_comparator_func.
- Adds Type_handler::get_handler_by_cmp_type()
- Moves Type_handler_hybrid_field_type::get_handler_by_result_type() to
a static method Type_handler::get_handler_by_result_type(),
for symmetry with similar methods:
* Type_handler::get_handler_by_field_type()
* Type_handler::get_handler_by_real_type()
* Type_handler::get_handler_by_cmp_type()
- Introduces Type_handler_row, to unify the code for the scalar
data types and the ROW data type (currently for comparison purposes only).
- Adds public type_handler_row, as it's now needed in item_row.h
- Makes type_handler_null public, as it's now needed in item_cmpfunc.h
Note, other type_handler_xxx will become public as well later.
- Removes the global variable Arg_comparator::comparator_matrix,
as it's not needed any more.
The function Item_func_isnull::update_used_tables() must
handle the case when the predicate is over not nullable
column in a special way.
This is actually a bug of MariaDB 5.3/5.5, but it's probably
hard to demonstrate that it can cause problems there.
The class Item_func_nop_all missed an implementation
of the virtual method get_copy.
As a result if the condition that can be pushed into
into a materialized view / derived table contained
an ANY subselect then the pushdown condition was built
incorrectly.
- Adding SHOW CREATE TABLE into all DEFAULT tests,
to cover need_parentheses_in_default() for all items
- Fixing a few items not to print parentheses in DEFAULT:
spatial function-alike predicates, IS_IPV4 and IS_IPV6 functions,
COLUMN_CHECK() and COLUMN_EXISTS().
- Force usage of () around complex DEFAULT expressions
- Give error if DEFAULT expression contains invalid characters
- Don't use const_charset_conversion for stored Item_func_sysconf expressions
as the result is not constaint over different executions
- Fixed Item_func_user() to not store calculated value in str_value
10.1 introduced a problem:
Execution time for various recursive stages
(walk, update_used_table, and propagate_equal_fields)
in NULLIF is O(recursion_level^2), because complexity is
doubled on every recursion level when we copy args[0] to args[2].
This change fixes to avoid unnecessary recursion in:
- Item_func_nullif::walk
- Item_func_nullif::update_used_tables
- Item_func_nullif::propagate_equal_fields
when possible.
This is a backport of the patch for MDEV-9653 (fixed earlier in 10.1.13).
The code in Item_func_case::fix_length_and_dec() did not
calculate max_length and decimals properly.
In case of any numeric result (DECIMAL, REAL, INT) a generic method
Item_func_case::agg_num_lengths() was called, which could erroneously result
into a DECIMAL item with max_length==0 and decimals==0, so the constructor of
Field_new_decimals tried to create a field of DECIMAL(0,0) type,
which caused a crash.
Unlike Item_func_case, the code responsible for merging attributes in
Item_func_coalesce::fix_length_and_dec() works fine: it has specific execution
branches for all distinct numeric types and correctly creates a DECIMAL(1,0)
column instead of DECIMAL(0,0) for the same set of arguments.
The fix does the following:
- Moves the attribute merging code from Item_func_coalesce::fix_length_and_dec()
to a new method Item_func_hybrid_result_type::fix_attributes()
- Removes the wrong code from Item_func_case::fix_length_and_dec()
and reuses fix_attributes() in both Item_func_coalesce::fix_length_and_dec()
and Item_func_case::fix_length_and_dec()
- Fixes count_real_length() and count_decimal_length() to get an array
of Items as an argument, instead of using Item::args directly.
This is needed for Item_func_case::fix_length_and_dec().
- Moves methods Item_func::count_xxx_length() from "public" to "protected".
- Removes Item_func_case::agg_num_length(), as it's not used any more.
- Additionally removes Item_func_case::agg_str_length(),
as it also was not used (dead code).
* only copy args[0] to args[2] after fix_fields (when all item
substitutions have already happened)
* change QT_ITEM_FUNC_NULLIF_TO_CASE (that allows to print NULLIF
as CASE) to QT_ITEM_ORIGINAL_FUNC_NULLIF (that prohibits it).
So that NULLIF-to-CASE is allowed by default and only disabled
explicitly for SHOW VIEW|FUNCTION|PROCEDURE and mysql_make_view.
By default it is allowed (in particular in error messages and
debug output, that can happen anytime before or after optimizer).
"Re-factor the code for post-join operations".
The patch mainly contains the code ported from mysql-5.6 and
created for two essential architectural changes:
1. WL#5558: Resolve ORDER BY execution method at the optimization stage
2. WL#6071: Inline tmp tables into the nested loops algorithm
The first task was implemented for mysql-5.6 by Ole John Aske.
It allows to make all decisions on ORDER BY operation at the optimization
stage.
The second task implemented for mysql-5.6 by Evgeny Potemkin adds JOIN_TAB
nodes for post-join operations that require temporary tables. It allows
to execute these operations within the nested loops algorithm that used to
be used before this task only for join queries. Besides these task moves
all planning on the execution of these operations from the execution phase
to the optimization phase.
Some other re-factoring changes of mysql-5.6 were pulled in, mainly because
it was easier to pull them in than roll them back. In particular all
changes concerning Ref_ptr_array were incorporated.
The port required some changes in the MariaDB code that concerned the
functionality of EXPLAIN and ANALYZE. This was done mainly by Sergey
Petrunia.
MDEV-9408 CREATE TABLE SELECT MAX(int_column) creates different columns for table vs view
There were three almost identical pieces of the code:
- Field *Item_func::tmp_table_field();
- Field *Item_sum::create_tmp_field();
- Field *create_tmp_field_from_item();
with a difference in very small details (hence the bugs):
Only Item_func::tmp_table_field() was correct, the other two were not.
Removing the two incorrect pieces of the redundant code.
Joining these three functions/methods into a single virtual method
Item::create_tmp_field().
Additionally, moving Item::make_string_field() and
Item::tmp_table_field_from_field_type() from the public into the
protected section of the class declaration, as they are now not
needed outside of Item.
Problem:
At the end of first execution select_lex->prep_where is pointing to
a runtime created object (temporary table field). As a result
server exits trying to access a invalid pointer during second
execution.
Analysis:
While optimizing the join conditions for the query, after the
permanent transformation, optimizer makes a copy of the new
where conditions in select_lex->prep_where. "prep_where" is what
is used as the "where condition" for the query at the start of execution.
W.r.t the query in question, "where" condition is actually pointing
to a field in the temporary table. As a result, for the second
execution the pointer is no more valid resulting in server exit.
Fix:
At the end of the first execution, select_lex->where will have the
original item of the where condition.
Make prep_where the new place where the original item of select->where
has to be rolled back.
Fixed in 5.7 with the wl#7082 - Move permanent transformations from
JOIN::optimize to JOIN::prepare
Patch for 5.5 includes the following backports from 5.6:
Bugfix for Bug12603141 - This makes the first execute statement in the testcase
pass in 5.5
However it was noted later in in Bug16163596 that the above bugfix needed to
be modified. Although Bug16163596 is reproducible only with changes done for
Bug12582849, we have decided include the fix.
Considering that Bug12582849 is related to Bug12603141, the fix is
also included here. However this results in Bug16317817, Bug16317685,
Bug16739050. So fix for the above three bugs is also part of this patch.
- Turning get_mm_tree_for_const() from a static function into
a protected method in Item.
- Adding a new class Item_bool_func2_with_rev, for the functions and operators
that have a reverse function and can use the range optimizer for
to optimize "value OP field" as "field REV_OP value". Deriving
Item_bool_rowready_func2 and Item_funt_spatial_rel from the new class.
- Removing Item_bool_func2::have_rev_func().
1. Removing the legacy code that disabled equal field propagation in cases
when comparison is done as VARBINARY. This is now correctly handled by
the new propagation code in Item_xxx::propagate_equal_fields() and
Field_str::can_be_substituted_to_equal_item (the bug fix).
2. Also, removing legacy (pre-MySQL-4.1) Arg_comparator methods
compare_binary_string() and compare_e_binary_string(), as VARBINARY
comparison is correcty handled in compare_string() and compare_e_string() by
the corresponding VARBINARY collation handler implemented in my_charset_bin.
(not really a part of the bug fix)
WHERE COALESCE(time_column)=TIME('00:00:00')
AND COALESCE(time_column)=DATE('2015-09-11')
MDEV-8814 Wrong result for WHERE datetime_column > TIME('00:00:00')
__MEMMOVE_SSSE3_BACK FROM STRING::COPY
Issue:
-----
While using row comparators, the store_value functions call
val_xxx functions in the prepare phase. This can cause
valgrind issues.
SOLUTION:
---------
Setting up of the comparators should be done by
alloc_comparators in the prepare phase. Also, make sure
store_value will be called only during execute phase.
This is a backport of the fix for Bug#17755540.
MDEV-8754 Wrong result for SELECT..WHERE year_field=2020 AND NULLIF(year_field,2010)='2020'
Problems:
1. Item_func_nullif stored a copy of args[0] in a private member m_args0_copy,
which was invisible for the inherited Item_func menthods, like
update_used_tables(). As a result, after equal field propagation
things like Item_func_nullif::const_item() could return wrong result
and a non-constant NULLIF() was erroneously treated as a constant
at optimize_cond() time.
Solution: removing m_args0_copy and storing the return value item
in args[2] instead.
2. Equal field propagation did not work well for Item_fun_nullif.
Solution: using ANY_SUBST for args[0] and args[1], as they are in
comparison, and IDENTITY_SUBST for args[2], as it's not in comparison.