The problem occurred when one had a subquery that had an equality X=Y where
Y referred to a named select list expression from the parent select. MySQL
crashed when trying to use the X=Y equality for ref-based access.
Fixed by allowing non-Item_field items in the described case.
server crash.
The filesort implementation has an optimization for subquery execution which
consists of reusing previously allocated buffers. In particular the call to
the read_buffpek_from_file function might be skipped when a big enough buffer
for buffer descriptors (buffpeks) is already allocated. Beside allocating
memory for buffpeks this function fills allocated buffer with data read from
disk. Skipping it might led to using an arbitrary memory as fields' data and
finally to a crash.
Now the read_buffpek_from_file function is always called. It allocates
new buffer only when necessary, but always fill it with correct data.
Default values of variables were not subject to upper/lower bounds
and step, while setting variables was. Bounds and step are also
applied to defaults now; defaults are corrected quietly, values
given by the user are corrected, and a correction-warning is thrown
as needed. Lastly, very large values could wrap around, starting
from 0 again. They are bounded at the maximum value for the
respective data-type now if no lower maximum is specified in the
variable's definition.
crashes MySQL 5.122
There was a difference in how UNIONs are handled
on top level and when in sub-query.
Because the rules for sub-queries were syntactically
allowing cases that are not currently supported by
the server we had crashes (this bug) or wrong results
(bug 32051).
Fixed by making the syntax rules for UNIONs match the
ones at top level.
These rules however do not support nesting UNIONs, e.g.
(SELECT a FROM t1 UNION ALL SELECT b FROM t2)
UNION
(SELECT c FROM t3 UNION ALL SELECT d FROM t4)
Supports for statements with nested UNIONs will be
added in a future version.
Index lookup does not always guarantee that we can
simply remove the relevant conditions from the WHERE
clause. Reasons can be e.g. conversion errors,
partial indexes etc.
The optimizer was removing these parts of the WHERE
condition without any further checking.
This leads to "false positives" when using indexes.
Fixed by checking the index reference conditions
(using WHERE) when using indexes with sub-queries.
only on some occasions
Referencing an element from the SELECT list in a WHERE
clause is not permitted. The namespace of the WHERE
clause is the table columns only. This was not enforced
correctly when resolving outer references in sub-queries.
Fixed by not allowing references to aliases in a
sub-query in WHERE.
This bug is actually two. The first one manifests itself on an EXPLAIN
SELECT query with nested subqueries that employs the filesort algorithm.
The whole SELECT under explain is marked as UNCACHEABLE_EXPLAIN to preserve
some temporary structures for explain. As a side-effect of this values of
nested subqueries weren't cached and subqueries were re-evaluated many
times. Each time buffer for filesort was allocated but wasn't freed because
freeing occurs at the end of topmost SELECT. Thus all available memory was
eaten up step by step and OOM event occur.
The second bug manifests itself on SELECT queries with conditions where
a subquery result is compared with a key field and the subquery itself also
has such condition. When a long chain of such nested subqueries is present
the stack overrun occur. This happens because at some point the range optimizer
temporary puts the PARAM structure on the stack. Its size if about 8K and
the stack is exhausted very fast.
Now the subselect_single_select_engine::exec function allows subquery result
caching when the UNCACHEABLE_EXPLAIN flag is set.
Now the SQL_SELECT::test_quick_select function calls the check_stack_overrun
function for stack checking purposes to prevent server crash.
After adding an index the <VARBINARY> IN (SELECT <BINARY> ...)
clause returned a wrong result: the VARBINARY value was illegally padded
with zero bytes to the length of the BINARY column for the index search.
(<VARBINARY>, ...) IN (SELECT <BINARY>, ... ) clauses are affected too.
Item_in_subselect's only externally callable method is val_bool().
However the nullability in the wrapper class (Item_in_optimizer) is
established by calling the "forbidden" method val_int().
Fixed to use the correct method (val_bool() ) to establish nullability
of Item_in_subselect in Item_in_optimizer.
query / no aggregate of subquery
The optimizer counts the aggregate functions that
appear as top level expressions (in all_fields) in
the current subquery. Later it makes a list of these
that it uses to actually execute the aggregates in
end_send_group().
That count is used in several places as a flag whether
there are aggregates functions.
While collecting the above info it must not consider
aggregates that are not aggregated in the current
context. It must treat them as normal expressions
instead. Not doing that leads to incorrect data about
the query, e.g. running a query that actually has no
aggregate functions as if it has some (and hence is
expected to return only one row).
Fixed by ignoring the aggregates that are not aggregated
in the current context.
One other smaller omission discovered and fixed in the
process : the place of aggregation was not calculated for
user defined functions. Fixed by calling
Item_sum::init_sum_func_check() and
Item_sum::check_sum_func() as it's done for the rest of
the aggregate functions.
ORDER BY and LIMIT 1.
The bug was introduced by the patch for bug 21727. The patch
erroneously skipped initialization of the array of headers
for sorted records for non-first evaluations of the subquery.
To fix the problem a new parameter has been added to the
function make_char_array that performs the initialization.
Now this function is called for any invocation of the
filesort procedure. Yet it allocates the buffer for sorted
records only if this parameter is NULL.
using a derived table over a grouping subselect.
This crash happens only when materialization of the derived tables
requires creation of auxiliary temporary table, for example when
a grouping operation is carried out with usage of a temporary table.
The crash happened because EXPLAIN EXTENDED when printing the query
expression made an attempt to use the objects created in the mem_root
of the temporary table which has been already freed by the moment
when printing is called.
This bug appeared after the method Item_field::print() had been
introduced.
Non-correlated scalar subqueries may get executed
in EXPLAIN at the optimization phase if they are
part of a right hand sargable expression.
If the scalar subquery uses a temp table to
materialize its results it will replace the
subquery structure from the parser with a simple
select from the materialization table.
As a result the EXPLAIN will crash as the
temporary materialization table is not to be shown
in EXPLAIN at all.
Fixed by preserving the original query structure
right after calling optimize() for scalar subqueries
with temp tables executed during EXPLAIN.
DATE and DATETIME can be compared either as strings or as int. Both
methods have their disadvantages. Strings can contain valid DATETIME value
but have insignificant zeros omitted thus became non-comparable with
other DATETIME strings. The comparison as int usually will require conversion
from the string representation and the automatic conversion in most cases is
carried out in a wrong way thus producing wrong comparison result. Another
problem occurs when one tries to compare DATE field with a DATETIME constant.
The constant is converted to DATE losing its precision i.e. losing time part.
This fix addresses the problems described above by adding a special
DATE/DATETIME comparator. The comparator correctly converts DATE/DATETIME
string values to int when it's necessary, adds zero time part (00:00:00)
to DATE values to compare them correctly to DATETIME values. Due to correct
conversion malformed DATETIME string values are correctly compared to other
DATE/DATETIME values.
As of this patch a DATE value equals to DATETIME value with zero time part.
For example '2001-01-01' equals to '2001-01-01 00:00:00'.
The compare_datetime() function is added to the Arg_comparator class.
It implements the correct comparator for DATE/DATETIME values.
Two supplementary functions called get_date_from_str() and get_datetime_value()
are added. The first one extracts DATE/DATETIME value from a string and the
second one retrieves the correct DATE/DATETIME value from an item.
The new Arg_comparator::can_compare_as_dates() function is added and used
to check whether two given items can be compared by the compare_datetime()
comparator.
Two caching variables were added to the Arg_comparator class to speedup the
DATE/DATETIME comparison.
One more store() method was added to the Item_cache_int class to cache int
values.
The new is_datetime() function was added to the Item class. It indicates
whether the item returns a DATE/DATETIME value.
Validity checks for nested set functions
were not taking into account that the enclosed
set function may be on a nest level that is
lower than the nest level of the enclosing set
function.
Fixed by :
- propagating max_sum_func_level
up the enclosing set functions chain.
- updating the max_sum_func_level of the
enclosing set function when the enclosed set
function is aggregated above or on the same
nest level of as the level of the enclosing
set function.
- updating the max_arg_level of the enclosing
set function on a reference that refers to
an item above or on the same nest level
as the level of the enclosing set function.
- Treating both Item_field and Item_ref as possibly
referencing items from outer nest levels.
The Item_outer_ref class based on the Item_direct_ref class was always used
to represent an outer field. But if the outer select is a grouping one and the
outer field isn't under an aggregate function which is aggregated in that
outer select an Item_ref object should be used to represent such a field.
If the outer select in which the outer field is resolved isn't grouping then
the Item_field class should be used to represent such a field.
This logic also should be used for an outer field resolved through its alias
name.
Now the Item_field::fix_outer_field() uses Item_outer_field objects to
represent aliased and non-aliased outer fields for grouping outer selects
only.
Now the fix_inner_refs() function chooses which class to use to access outer
field - the Item_ref or the Item_direct_ref. An object of the chosen class
substitutes the original field in the Item_outer_ref object.
The direct_ref and the found_in_select_list fields were added to the
Item_outer_ref class.
If a set function with a outer reference s(outer_ref) cannot be aggregated
the outer query against which the reference has been resolved then MySQL
interpretes s(outer_ref) in the same way as it would interpret s(const).
Hovever the standard requires throwing an error in this situation.
Added some code to support this requirement in ansi mode.
Corrected another minor bug in Item_sum::check_sum_func.
context was used as an argument of GROUP_CONCAT.
Ensured correct setting of the depended_from field in references
generated for set functions aggregated in outer selects.
A wrong value of this field resulted in wrong maps returned by
used_tables() for these references.
Made sure that a temporary table field is added for any set function
aggregated in outer context when creation of a temporary table is
needed to execute the inner subquery.
aggregated in outer context returned wrong results.
This happened only if the subquery did not contain any references
to outer fields.
As there were no references to outer fields the subquery erroneously
was taken for non-correlated one.
Now any set function aggregated in outer context makes the subquery
correlated.
when the column is to be read from a derived table column which
was specified as a concatenation of string literals.
The bug happened because the Item_string::append did not adjust the
value of Item_string::max_length. As a result of it the temporary
table column defined to store the concatenation of literals was
not wide enough to hold the whole value.
away.
During optimization stage the WHERE conditions can be changed or even
be removed at all if they know for sure to be true of false. Thus they aren't
showed in the EXPLAIN EXTENDED which prints conditions after optimization.
Now if all elements of an Item_cond were removed this Item_cond is substituted
for an Item_int with the int value of the Item_cond.
If there were conditions that were totally optimized away then values of the
saved cond_value and having_value will be printed instead.
created for sorting.
Any outer reference in a subquery was represented by an Item_field object.
If the outer select employs a temporary table all such fields should be
replaced with fields from that temporary table in order to point to the
actual data. This replacement wasn't done and that resulted in a wrong
subquery evaluation and a wrong result of the whole query.
Now any outer field is represented by two objects - Item_field placed in the
outer select and Item_outer_ref in the subquery. Item_field object is
processed as a normal field and the reference to it is saved in the
ref_pointer_array. Thus the Item_outer_ref is always references the correct
field. The original field is substituted for a reference in the
Item_field::fix_outer_field() function.
New function called fix_inner_refs() is added to fix fields referenced from
inner selects and to fix references (Item_ref objects) to these fields.
The new Item_outer_ref class is a descendant of the Item_direct_ref class.
It additionally stores a reference to the original field and designed to
behave more like a field.
Before this fix, a IN predicate of the form: "IN (( subselect ))", with two
parenthesis, would be evaluated as a single row subselect: if the subselect
returns more that 1 row, the statement would fail.
The SQL:2003 standard defines a special exception in the specification,
and mandates that this particular form of IN predicate shall be equivalent
to "IN ( subselect )", which involves a table subquery and works with more
than 1 row.
This fix implements "IN (( subselect ))", "IN ((( subselect )))" etc
as per the SQL:2003 requirement.
All the details related to the implementation of this change have been
commented in the code, and the relevant sections of the SQL:2003 spec
are given for reference, so they are not repeated here.
Having access to the spec is a requirement to review in depth this patch.
The bug report has demonstrated the following two problems.
1. If an ORDER/GROUP BY list includes a constant expression being
optimized away and, at the same time, containing single-row
subselects that return more that one row, no error is reported.
Strictly speaking the standard allows to ignore error in this case.
Yet, now a corresponding fatal error is reported in this case.
2. If a query requires sorting by expressions containing single-row
subselects that, however, return more than one row, then the execution
of the query may cause a server crash.
To fix this some code has been added that blocks execution of a subselect
item in case of a fatal error in the method Item_subselect::exec.