result
Row subqueries producing no rows were not handled as UNKNOWN
values in row comparison expressions.
That was a result of the following two problems:
1. Item_singlerow_subselect did not mark the resulting row
value as NULL/UNKNOWN when no rows were produced.
2. Arg_comparator::compare_row() did not take into account that
a whole argument may be NULL rather than just individual scalar
values.
Before bug#34384 was fixed, the above problems were hidden because an
uninitialized cached object (i.e. one without any stored value) would
appear as NULL for scalar values in a row subquery returning an empty
result. After that fix Arg_comparator::compare_row() would try to
evaluate the uninitialized cached objects.
Fixed by addressing both problems above.
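For illustration only (t1 is a hypothetical table), a row comparison of
this shape exercises the fixed code path:
  SELECT (1, 2) = (SELECT a, b FROM t1 WHERE 1 = 0);  -- subquery returns no rows
With an empty subquery result the whole row value is UNKNOWN, so the
comparison should yield NULL rather than FALSE.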
The EXISTS transformation has additional switches to catch the known
corner cases that appear when transforming an IN predicate into EXISTS.
Guarded conditions are used, which are deactivated when a NULL value is
seen in the outer expression's row. When the inner query block supplies
NULL values, however, they are filtered out, because no distinction is
made between the guarded conditions: the guarded NOT x IS NULL
conditions in the HAVING clause that filter out NULL values cannot be
deactivated in isolation from the conditions that match values or from
the outer expression's NULLs.
The above problem is handled by making the guarded conditions remember
whether they have rejected a NULL value or not, and the index access
methods now take this into account as well.
The bug consisted of
1) Not resetting the property for every nested loop iteration on the
   inner query's result.
2) Not propagating the NULL result properly from the inner query to the
   IN optimizer.
3) A hack that may or may not have been needed at some point. According
   to a comment it was aimed at fixing #2 by returning NULL when FALSE
   was actually the result. This caused failures when #2 was properly
   fixed. The hack is now removed.
The fix resolves all three points.
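A sketch of the kind of query affected (t1, t2 and their columns are
hypothetical; this is only one possible shape of the problem):
  -- some outer rows and some inner rows contain NULLs
  SELECT a, b, (a, b) IN (SELECT x, y FROM t2) FROM t1;
When a compared value is NULL and no complete match exists, the
predicate should evaluate to UNKNOWN rather than FALSE; before the fix
the guarded HAVING conditions could filter out the inner NULLs and lose
that result.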
file .\item_subselect.cc, line 836
IN quantified predicates are never executed directly. They are rather wrapped
inside nodes called IN Optimizers (Item_in_optimizer) which take care of the
execution. However, this is not done during query preparation. Unfortunately
the LIKE predicate pre-evaluates constant right-hand side arguments even
during name resolution. Likely this is meant as an optimization.
Fixed by not pre-evaluating LIKE arguments in view prepare mode.
Incorrect handling of NULL arguments could lead to a crash in the IN
or CASE operations when NULL arguments were either passed explicitly
(IN) or implicitly generated by the WITH ROLLUP modifier (both IN and
CASE).
Item_func_case::find_item() assumed all necessary comparators to be
instantiated in fix_length_and_dec(). However, in the presence of the
WITH ROLLUP modifier, arguments could be substituted with an Item_null,
leading to an "unexpected" STRING_RESULT comparator being invoked.
In addition to the problem identical to the above,
Item_func_in::val_int() could crash even with explicitly passed NULL
arguments due to an optimization in fix_length_and_dec() that caused
NULL arguments to be ignored during comparator creation.
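Hedged examples of the affected query shapes (t1 and its columns are
hypothetical):
  -- the WITH ROLLUP summary rows replace a with NULL (CASE and IN)
  SELECT a, CASE a WHEN 1 THEN 'one' ELSE 'other' END
  FROM t1 GROUP BY a WITH ROLLUP;
  -- NULL passed explicitly to IN
  SELECT a IN (NULL, 1) FROM t1;
In both cases the NULL arguments must be handled by the comparators
instead of crashing the server.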
Problem: a flaw (dereferencing a NULL pointer) in the LIKE optimization
code may lead to a server crash in some rare cases.
Fix: check the pointer before dereferencing it.
(Original patch by Sinisa Milivojevic)
The YEAR(4) value of 2000 was equal to the "bad" YEAR(4) value of 0000.
The get_year_value() function has been modified to not adjust a bad
YEAR(4) value to 2000.
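A minimal sketch (t1 is a hypothetical table):
  CREATE TABLE t1 (y YEAR(4));
  INSERT INTO t1 VALUES (0), (2000);   -- stores 0000 and 2000
  SELECT * FROM t1 WHERE y = 2000;     -- must return only the 2000 row
Before the fix the bad 0000 value was adjusted to 2000 inside
get_year_value() and matched as well.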
SunStudio
SunStudio compilers of late warn about methods that might hide
methods in base classes due to the use of overloading combined
with overriding. SunStudio also warns about variables defined
in local scope or method arguments that have the same name as
a member attribute of the class.
This patch renames methods that might hide base class methods,
to make it easier both for humans and compilers to see what is
actually called. It also renames variables in local scope.
NULLable BIGINT and INT columns in comparison
Problem: a consequence of the fix for bug 43668.
Some Arg_comparator internal initialization was missed, which
could lead to unpredictable (wrong) comparison results.
Fix: always initialize Arg_comparator properly before use.
A few problems were found in the fix for bug 43668:
1) Comparison of a YEAR column with NULL always returned TRUE;
2) Comparison of a YEAR column with constants always returned an
   unpredictable result;
3) Unnecessary conversion warnings when comparing a non-integer
   constant with a NULL value in a YEAR column.
The problems described above have been resolved with one exception:
comparison of a zero (i.e. invalid) YEAR column value with 00 or 2000
still fails (this is not a regression), so MIN/MAX on a YEAR column
containing a zero value still fails.
Arg_comparator uses Item_cache objects to store constants being
compared when they need a type conversion. Because this cache wasn't
initialized properly, Arg_comparator might produce a wrong comparison
result.
The Arg_comparator::cache_converted_constant function now initializes
the cache prior to use.
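An illustrative sketch of the comparisons affected above (t1 is a
hypothetical table):
  CREATE TABLE t1 (y YEAR);
  INSERT INTO t1 VALUES (1999), (2005);
  SELECT * FROM t1 WHERE y = NULL;   -- must return an empty set, never TRUE
  SELECT * FROM t1 WHERE y > 2003;   -- constant comparison must be predictable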
field='const1' AND field='const2' in some cases
When building multiple equality predicates that contain a constant
which is compared as a datetime (with a field), we should take this
fact into account and compare that constant with the other possible
constants as datetimes as well.
E.g. for
SELECT ... WHERE a='2001-01-01' AND a='2001-01-01 00:00:00'
we should compare '2001-01-01' with '2001-01-01 00:00:00' as datetimes,
not as strings.
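For example, with a hypothetical DATETIME column a:
  CREATE TABLE t1 (a DATETIME);
  SELECT * FROM t1
  WHERE a = '2001-01-01' AND a = '2001-01-01 00:00:00';
Compared as datetimes the two constants are equal, so the condition
must not be reduced to an always-false predicate, as a string
comparison of the constants would suggest.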
There are actually two different bugs here.
The first one caused a crash on queries with a WHERE condition over
views that themselves contain a WHERE condition. A wrong check for the
prepared statement phase led to items for view fields being allocated
in the execution memory and freed at the end of execution. Thus the
optimized WHERE condition referred to freed memory on the second
execution and the server crashed.
The second one was caused by the Item_cond::compile function not saving
the changes it made to the item tree. Thus on the next execution the
changes weren't reverted and the server crashed on dereferencing freed
memory.
The new helper function called is_stmt_prepare_or_first_stmt_execute
is added to the Query_arena class.
The find_field_in_view function now uses
is_stmt_prepare_or_first_stmt_execute() to check whether
newly created view items should be freed at the end of the query execution.
The Item_cond::compile function now saves changes it makes to item tree.
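A sketch of the affected pattern (all names are hypothetical):
  CREATE VIEW v1 AS SELECT a FROM t1 WHERE a > 0;
  PREPARE s FROM 'SELECT * FROM v1 WHERE a < 10';
  EXECUTE s;
  EXECUTE s;   -- the second execution crashed before the fix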
The MySQL manual describes values of the YEAR(2) field type as follows:
values 00 - 69 mean the years 2000 - 2069 and values 70 - 99 mean the
years 1970 - 1999. MIN/MAX and comparison functions were comparing them
as int values, thus producing wrong results.
Now the Arg_comparator class is extended with a compare_year function
which performs correct comparison of the YEAR type.
The Item_sum_hybrid class now uses Item_cache and Arg_comparator objects to
correctly calculate its value.
To allow Arg_comparator to use the func_name() function for Item_func
and Item_sum objects, the func_name declaration is moved to the
Item_result_field class.
A helper function is_owner_equal_func is added to the Arg_comparator
class. It checks whether the Arg_comparator object's owner is the <=>
function or not.
A helper function setup is added to the Item_sum_hybrid class. It sets
up the cache item and the comparator.
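A minimal sketch (t1 is a hypothetical table):
  CREATE TABLE t1 (y YEAR(2));
  INSERT INTO t1 VALUES ('99'), ('00');   -- 1999 and 2000
  SELECT MIN(y), MAX(y) FROM t1;          -- expected: 99 (1999) and 00 (2000)
Comparing the raw two-digit values as integers would wrongly report 00
as the minimum.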
When values of different types are compared they're converted to a
type that allows correct comparison. This conversion is done for each
comparison and takes some time. When a constant is being compared it's
possible to cache the value after conversion to speed up the
comparison. In some cases (large dataset, complex WHERE condition with
many type conversions) a query might be executed 7% faster.
A test case isn't provided because all changes are internal and aren't
visible outside.
The behavior of Item_cache is changed to cache values on the first
request of the cached value rather than at the moment the item to be
cached is stored.
A flag named value_cached is added to the Item_cache class. It's set to
TRUE when the cache holds the value of the last stored item.
A function named cache_value() is added to the Item_cache class and its
derived classes. This function actually caches the value of the saved
item.
The Item_cache_xxx::store functions now only store the item to be
cached and set the value_cached flag to FALSE.
The Item_cache_xxx::val_xxx functions are changed to call the
cache_value function prior to returning the cached value if
value_cached is FALSE.
The Arg_comparator::set_cmp_func function now calls
cache_converted_constant to cache constants if they need a type
conversion.
The Item_cache::get_cache function is overloaded to allow setting of
the cache type.
The cache_converted_constant function is added to the Arg_comparator
class. It checks whether a value can and should be cached and, if so,
caches it.
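For example, in a query like the following (t1 and its index are
hypothetical), the string constant needs a conversion to DATETIME for
the comparison; with this change it is converted once, cached, and
reused for every row:
  CREATE TABLE t1 (d DATETIME, KEY (d));
  SELECT COUNT(*) FROM t1 WHERE d > '2009-01-01 00:00:00';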
The SE API requires mysql to notify the storage engine that
it's going to read certain tables at the beginning of the
statement (by calling start_stmt(), store_lock() or
external_lock()).
These are typically called by lock_tables().
However, SHOW CREATE TABLE does not pre-lock the tables
because it's not expected to access the data at all.
But for some view definitions (those that compare a
date/datetime/timestamp column to a string-returning
scalar subquery) JOIN::prepare may still access data
when materializing the non-correlated scalar subquery
in Arg_comparator::can_compare_as_dates().
Fixed by not materializing the subquery when the function
is called in a SHOW/EXPLAIN/CREATE VIEW context.
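A sketch of a view definition of the affected kind (all names are
hypothetical; d is a DATETIME column, t2.c is a string column):
  CREATE VIEW v1 AS
    SELECT * FROM t1 WHERE d > (SELECT MAX(c) FROM t2);
  SHOW CREATE TABLE v1;   -- must not read the data in t2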
values return too many records
WHERE clauses with "outer_value_list NOT IN subselect" were
handled incorrectly if the outer value list contained multiple
items where at least one of these could be NULL. The first
outer record with a NULL value was handled correctly, but if a
second record with a NULL value existed, the optimizer would
choose to reuse the result it got on the last execution of the
subselect. This is incorrect if the outer value list has
multiple items.
The fix is to make Item_in_optimizer::val_int (in
item_cmpfunc.cc) reuse the result of the latest execution
for NULL values only if all values in the outer_value_list
are NULL.
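A sketch of the affected pattern (t1, t2 and their columns are
hypothetical):
  -- two different outer rows contain a NULL in the value list
  INSERT INTO t1 (a, b) VALUES (1, NULL), (2, NULL);
  SELECT * FROM t1 WHERE (a, b) NOT IN (SELECT x, y FROM t2);
Before the fix, the result computed for the first NULL-containing
outer row could be wrongly reused for the second one even though the
non-NULL items differ.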
When a query used a DATE or DATETIME value formatted with any
separator characters other than hyphen '-' in a greater-or-equal
'>=' condition matching only the greatest value in an indexed
column, the result was empty if an index range scan was employed.
The range optimizer got a new feature between 5.1.38 and 5.1.39
that changes a greater-or-equal condition to a greater-than
condition if the value in the query is not present in the table.
But the value comparison function compared the dates as strings
instead of as dates.
The bug was fixed by splitting the function get_date_from_str in
two: the part that parses and does error checking is now a
separate function that is visible outside the module, and the old
get_date_from_str now calls it.
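A sketch of the affected pattern (t1 is hypothetical):
  CREATE TABLE t1 (d DATE, KEY (d));
  INSERT INTO t1 VALUES ('2009-09-22');
  -- '.' separators instead of '-'; the row must still be found via the index
  SELECT * FROM t1 WHERE d >= '2009.09.22';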
SELECT ... WHERE ... IN (NULL, ...) does a full table scan, even though
the same query without the NULL uses an efficient range scan.
The bugfix for bug 18360 introduced an optimization:
if
1) all right-hand arguments of the IN function are constants, and
2) the result types of all right-hand argument items are compatible
   enough to use the same single comparison function to compare all
   of them to the left argument,
then
we can convert the right-hand list of constant items to an array of
equally-typed constant values for further QUICK index access etc.
(see Item_func_in::fix_length_and_dec()).
The Item_null constant item objects have the STRING_RESULT result
type, so, since Item_func_in::fix_length_and_dec() is aware of the
NULLs in the right-hand list, this improvement efficiently optimizes
IN function calls with a mixed right-hand list of NULLs and string
constants. However, the optimization doesn't affect mixed lists of
NULLs and integers, floats etc., because there is no unique common
comparator.
A new optimization has been added to ignore the result type of NULL
constants in the static analysis of mixed right-hand lists.
This is safe, because the execution phase takes care of the presence
of NULLs anyway.
1. The collect_cmp_types() function has been modified to optionally
   ignore NULL constants in the item list.
2. The NULL-skipping code of the Item_func_in::fix_length_and_dec()
   function has been modified to work not only with in_string
   vectors but also with in_vectors of other types.
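For example, with a hypothetical indexed column a:
  SELECT * FROM t1 WHERE a IN (NULL, 1, 2);
After the change this can use the same range access as the equivalent
query with IN (1, 2) instead of falling back to a full table scan.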
with gcc 4.3.2
This patch fixes a number of GCC warnings about variables being used
before they are initialized. A new macro UNINIT_VAR() is introduced
for use in variable declarations, and LINT_INIT() usage will be
gradually deprecated. (A workaround is used for g++, pending a patch
for a g++ bug.)
GCC warnings for unused results (attribute warn_unused_result)
for a number of system calls (present at least in later
Ubuntus, where the usual void cast trick doesn't work) are
also fixed.
The problem was that creating a DECIMAL column from a decimal
value could lead to a failed assertion as decimal values can
have a higher precision than those attached to a table. The
assert could be triggered by creating a table from a decimal
with a large (> 30) scale. Also, there was a problem in
calculating the number of digits in the integral and fractional
parts if both exceeded the maximum number of digits permitted
by the new decimal type.
The solution is to ensure that the truncation procedure is executed
when deducing a DECIMAL column from a decimal value of higher
precision. If the integer part is equal to or bigger than the
maximum precision for the DECIMAL type (65), the integer part
is truncated to fit and the fractional part becomes zero. Otherwise,
the fractional part is truncated to fit into the space left
after the integer part is copied.
This patch borrows code and ideas from Martin Hansson's patch.
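A sketch of a statement that could trigger the assertion (the literal
is only an example):
  -- 33 fractional digits, more than the 30 allowed for DECIMAL columns
  CREATE TABLE t1 SELECT 0.123456789012345678901234567890123 AS c;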
Using DECIMAL constants with more than 65 digits in CREATE
TABLE ... SELECT led to bogus errors in release builds or
assertion failures in debug builds.
The problem was an inconsistency in how DECIMAL constants and
fields are handled internally. We allow arbitrarily long
DECIMAL constants, whereas DECIMAL(M,D) columns are limited to
M<=65 and D<=30. my_decimal_precision_to_length() was used in
both Item and Field code and truncated precision to
DECIMAL_MAX_PRECISION when calculating the value length without
adjusting precision and decimals. As a result, a DECIMAL
constant with more than 65 digits ended up having a length less
than its precision or decimals, which led to assertion failures.
Fixed by modifying my_decimal_precision_to_length() so that
precision is truncated to DECIMAL_MAX_PRECISION only for Field
objects, as indicated by the new 'truncate' parameter.
Another inconsistency fixed by this patch is how DECIMAL
constants and expressions are handled for CREATE ... SELECT.
create_tmp_field_from_item() (which is used for constants) was
changed as a part of the bugfix for bug #24907 to handle long
DECIMAL constants gracefully. Item_func::tmp_table_field()
(which is used for expressions) on the other hand was still
using a simplistic approach when creating a Field_new_decimal
from a DECIMAL expression.
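A sketch of the failing pattern, covering both a constant and an
expression (the literals are only examples):
  -- DECIMAL literals with more than 65 digits
  CREATE TABLE t1
  SELECT 9999999999999999999999999999999999999999999999999999999999999999999999 AS c1,
         9999999999999999999999999999999999999999999999999999999999999999999999 + 0 AS c2;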
with gcc 4.3.2
Compiling MySQL with gcc 4.3.2 and later produces a number of
warnings, many of which are new with the recent compiler
versions.
This bug will be resolved in more than one patch to limit the
size of changesets. This is the first patch, fixing a number
of the warnings, predominantly "suggest using parentheses
around && in ||", and empty for and while bodies.
HAVING
When calculating GROUP BY the server caches some expressions. It does
that by allocating a string slot (Item_copy_string) and assigning the
value of the expression to it. This effectively means that the result
type of the expression can be changed from whatever it was to a string.
As this substitution takes place after the compile-time result type
calculation for IN, but before the run-time type calculations, it
causes the type calculations done by the IN function at run time to
yield results different from what was prepared at compile time.
In the CASE ... WHEN ... THEN ... statement there was a similar problem,
and it was solved by artificially adding a STRING argument to the set
of types of the IN/CASE arguments at compile time, so that if any of
the arguments of the CASE function changes its type to a string it will
still be covered by the information prepared at compile time.
and HAVING
When calculating GROUP BY the server caches some expressions. It does
that by allocating a string slot (Item_copy_string) and assigning the
value of the expression to it. This effectively means that the result
type of the expression can be changed from whatever it was to a string.
As this substitution takes place after the compile-time result type
calculation for IN, but before the run-time type calculations, it
causes the type calculations done by the IN function at run time to
yield results different from what was prepared at compile time.
In the CASE ... WHEN ... THEN ... statement there was a similar problem,
and it was solved by artificially adding a STRING argument to the matrix
at compile time, so that if any of the arguments of the CASE function
changes its type to a string it will still be covered by the information
prepared at compile time.
Extended the CASE fix to cover the IN case.
An alternative way of fixing this problem is by caching the result type
of the arguments at compile time and using the cached information at
run time instead of re-calculating the result types.
The CASE approach was preferred for uniformity and to keep the fix
localized.
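One possible shape of a query exercising this path (purely
illustrative; names and values are hypothetical):
  -- the grouped expression may be cached as a string (Item_copy_string),
  -- while the IN list mixes numeric types
  SELECT a + 1 AS e, COUNT(*) FROM t1
  GROUP BY e
  HAVING e IN (1, 2.0);
The compile-time type analysis of IN must account for the possible
switch of the cached expression to a string result.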
In the case of a ROW item, each compared pair does not
check whether the argument collations can be aggregated, and
thus the appropriate item conversion does not happen.
The fix is to add the check and the conversion for ROW
pairs.
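A hedged illustration of a ROW comparison whose pairs need per-pair
collation aggregation (the character sets are just examples):
  SELECT (_latin1'a', _latin1'b') = (_utf8'a', _utf8'b');
Each compared pair should aggregate the argument collations and insert
the appropriate conversion, just as a scalar comparison would.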
Item_in_optimizer::is_null() evaluated "NULL IN (SELECT ...)" to NULL
regardless of whether the subquery produced any records; this was a
documented limitation.
The limitation has been removed (see bugs 8804, 24085, 24127):
Item_in_optimizer::val_int() now correctly handles all cases with
NULLs. Make Item_in_optimizer::is_null() invoke val_int() so that it
returns correct values for "NULL IN (SELECT ...)".
Execution of queries containing the CASE function over an aggregate
function, as in "SELECT ... CASE AVG(...) WHEN ...", crashed the
server.
The CASE function caches pointers to concrete comparison functions
for each pair of types of the CASE-WHEN clause parameters, i.e. for
the "CASE INT_RESULT WHEN REAL_RESULT THEN ... WHEN DECIMAL_RESULT
... END" function call it caches comparisons of INT_RESULT with
REAL_RESULT and of INT_RESULT with DECIMAL_RESULT. Usually a result
type is known after a call to the fix_fields function; however, the
setup_copy_fields function call may wrap aggregate items with
Item_copy_string, which has the STRING_RESULT result type, so
setup_copy_fields may change the argument result types of the CASE
function after the call to
Item_func_case::fix_fields/fix_length_and_dec.
Then the Item_func_case::find_item function tries to use a comparison
function for an unexpected pair of STRING_RESULT and some other type,
which caused an assertion failure or a server crash.
The Item_func_case::fix_length_and_dec function has been modified to
take into account a possible STRING_RESULT result type in the presence
of aggregate arguments of the CASE function.
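A sketch of the crashing pattern (t1 and its columns are hypothetical):
  -- AVG() has a numeric result type at fix_fields time, but may be
  -- wrapped in Item_copy_string during GROUP BY processing
  SELECT CASE AVG(a) WHEN 1 THEN 'one' ELSE 'many' END
  FROM t1 GROUP BY b;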
IF(..., CAST(longtext AS UNSIGNED), signed_val)
(was: LEFT JOIN on inline view crashes server)
A SELECT from a LONGTEXT column wrapped in an expression like
"IF(..., CAST(longtext_column AS UNSIGNED), smth_signed)" failed an
assertion or crashed the server. The IFNULL function was affected too.
A LONGTEXT column item has a maximum length of 2^32-1 bytes, which is
also the maximum possible length of any MySQL item.
CAST(longtext_column AS UNSIGNED) returns an unsigned numeric result
of length 2^32-1, so the result of an IF/IFNULL function of this
number and some other signed number would have a text length of
(2^32-1)+1=2^32 (one extra byte for the minus sign) - there is an
integer overflow, and the length becomes equal to zero. That caused
the assert/crash.
The bug has been fixed by the same solution as in the CASE
function implementation.
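A sketch of the affected expressions (t1 is hypothetical):
  CREATE TABLE t1 (lt LONGTEXT, s INT);
  SELECT IF(s > 0, CAST(lt AS UNSIGNED), -1) FROM t1;
  SELECT IFNULL(CAST(lt AS UNSIGNED), -1) FROM t1;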