created for sorting.
Any outer reference in a subquery was represented by an Item_field object.
If the outer select employs a temporary table all such fields should be
replaced with fields from that temporary table in order to point to the
actual data. This replacement wasn't done and that resulted in a wrong
subquery evaluation and a wrong result of the whole query.
Now any outer field is represented by two objects - Item_field placed in the
outer select and Item_outer_ref in the subquery. Item_field object is
processed as a normal field and the reference to it is saved in the
ref_pointer_array. Thus the Item_outer_ref is always references the correct
field. The original field is substituted for a reference in the
Item_field::fix_outer_field() function.
New function called fix_inner_refs() is added to fix fields referenced from
inner selects and to fix references (Item_ref objects) to these fields.
The new Item_outer_ref class is a descendant of the Item_direct_ref class.
It additionally stores a reference to the original field and designed to
behave more like a field.
Before this fix, a IN predicate of the form: "IN (( subselect ))", with two
parenthesis, would be evaluated as a single row subselect: if the subselect
returns more that 1 row, the statement would fail.
The SQL:2003 standard defines a special exception in the specification,
and mandates that this particular form of IN predicate shall be equivalent
to "IN ( subselect )", which involves a table subquery and works with more
than 1 row.
This fix implements "IN (( subselect ))", "IN ((( subselect )))" etc
as per the SQL:2003 requirement.
All the details related to the implementation of this change have been
commented in the code, and the relevant sections of the SQL:2003 spec
are given for reference, so they are not repeated here.
Having access to the spec is a requirement to review in depth this patch.
The bug report has demonstrated the following two problems.
1. If an ORDER/GROUP BY list includes a constant expression being
optimized away and, at the same time, containing single-row
subselects that return more that one row, no error is reported.
Strictly speaking the standard allows to ignore error in this case.
Yet, now a corresponding fatal error is reported in this case.
2. If a query requires sorting by expressions containing single-row
subselects that, however, return more than one row, then the execution
of the query may cause a server crash.
To fix this some code has been added that blocks execution of a subselect
item in case of a fatal error in the method Item_subselect::exec.
UNION over correlated and uncorrelated SELECTS.
In such subqueries each uncorrelated SELECT should be considered as
uncacheable. Otherwise join_free is called for it and in many cases
it causes some problems.
when they contain the '!' operator.
Added an implementation for the method Item_func_not::print.
The method encloses any NOT expression into extra parentheses to avoid
incorrect stored representations of views that use the '!' operators.
Without this change when a view was created that contained
the expression !0*5 its stored representation contained not this
expression but rather the expression not(0)*5 .
The operator '!' is of a higher precedence than '*', while NOT is
of a lower precedence than '*'. That's why the expression !0*5
is interpreted as not(0)*5, while the expression not(0)*5 is interpreted
as not((0)*5) unless sql_mode is set to HIGH_NOT_PRECEDENCE.
Now we translate !0*5 into (not(0))*5.
- Make the code produce correct result: use an array of triggers to turn on/off equalities for each
compared column. Also turn on/off optimizations based on those equalities.
- Make EXPLAIN output show "Full scan on NULL key" for tables for which we switch between
ref/unique_subquery/index_subquery and ALL access.
- index_subquery engine now has HAVING clause when it is needed, and it is
displayed in EXPLAIN EXTENDED
- Fix incorrect presense of "Using index" for index/unique-based subqueries (BUG#22930)
// bk trigger note: this commit refers to BUG#24127
When transforming "oe IN (SELECT ie ...)" wrap the pushed-down predicates
iff "oe can be null", not "ie can be null".
The fix doesn't cover row-based subqueries, those will be fixed in #24127.
We create Item_cache_* object for each operand for each left operand of
a subquery predicate. We also create Item_func_conv_charset for each string
constant that needs charset conversion. So here we have Item_cache wrapped
into Item_func_conv_charset.
When Item_func_conv_charset wraps an constant Item it gets it's value
in constructor. The problem is that Item_cache is ready to be used only
at execution time, which is too late.
The fix makes Item_cache wrapping constant to get ready at fix_fields() time.
- When returning metadata for scalar subqueries the actual type of the
column was calculated based on the value type, which limits the actual
type of a scalar subselect to the set of (currently) 3 basic types :
integer, double precision or string. This is the reason that columns
of types other then the basic ones (e.g. date/time) are reported as
being of the corresponding basic type.
Fixed by storing/returning information for the column type in addition
to the result type.
This is a performance issue for queries with subqueries evaluation
of which requires filesort.
Allocation of memory for the sort buffer at each evaluation of a
subquery may take a significant amount of time if the buffer is rather big.
With the fix we allocate the buffer at the first evaluation of the
subquery and reuse it at each subsequent evaluation.
Evaluate "NULL IN (SELECT ...)" in a special way: Disable pushed-down
conditions and their "consequences":
= Do full table scans instead of unique_[index_subquery] lookups.
= Change appropriate "ref_or_null" accesses to full table scans in
subquery's joins.
Also cache value of NULL IN (SELECT ...) if the SELECT is not correlated
wrt any upper select.
If elements a not top-level IN subquery were accessed by an index and
the subquery result set included a NULL value then the quantified
predicate that contained the subquery was evaluated to NULL when
it should return a non-null value.
list using a function
When executing dependent subqueries they are re-inited and re-exec() for
each row of the outer context.
The cause for the bug is that during subquery reinitialization/re-execution,
the optimizer reallocates JOIN::join_tab, JOIN::table in make_simple_join()
and the local variable in 'sortorder' in create_sort_index(), which is
allocated by make_unireg_sortorder().
Care must be taken not to allocate anything into the thread's memory pool
while re-initializing query plan structures between subquery re-executions.
All such items mush be cached and reused because the thread's memory pool
is freed at the end of the whole query.
Note that they must be cached and reused even for queries that are not
otherwise cacheable because otherwise it will grow the thread's memory
pool every time a cacheable query is re-executed.
We provide additional members to the JOIN structure to store references
to the items that need to be cached.
strings
MySQL is setting the flag HA_END_SPACE_KEYS for all the keys that reference
text or varchar columns with collation different than binary.
This was done to handle correctly the situation where a lookup on such a key
may return more than 1 row because of the presence of many rows that differ
only by the amount of trailing space in the table's string column.
Inserting such values however appears to violate the unique checks on
INSERT/UPDATE. Thus that flag must not be set as it will prevent the optimizer
from choosing a faster access method.
This fix removes the setting of the HA_END_SPACE_KEYS flag.