Made the parser to support parenthesis around UNION branches.
This is done by amending the rules of the parser so it generates the correct
structure.
Currently it supports arbitrary subquery/join/parenthesis operations in the
EXISTS clause.
In the IN/scalar subquery case it will allow adding nested parenthesis only
if there is an UNION clause after the parenthesis. Otherwise it will just
treat the multiple nested parenthesis as a scalar expression.
It adds extra lex level for ((SELECT ...) UNION ...) to accommodate for the
UNION clause.
Treat queries with no FROM and aggregate functions as normal queries,
so the aggregate function get correctly calculated as if there is 1 row.
This means that they will be considered to have one row, so COUNT(*) will return
1 instead of 0. Other aggregates will behave in compatible manner.
- Make the range-et-al optimizer produce E(#table records after table
condition is applied),
- Make the join optimizer use this value,
- Add "filtered" column to EXPLAIN EXTENDED to show
fraction of records left after table condition is applied
- Adjust test results, add comments
The problem was in that opt_sum_query() replaced MIN/MAX functions
with the corresponding constant found in a key, but due to imprecise
representation of float numbers, when evaluating the where clause,
this comparison failed.
When MIN/MAX optimization detects that all tables can be removed,
also remove all conjuncts in a where clause that refer to these
tables. As a result of this fix, these conditions are not evaluated
twice, and in the case of float number comparisons we do not discard
result rows due to imprecise float representation.
As a side-effect this fix also corrects an unnoticed problem in
bug 12882.
The bug caused a crash of the server if a subquery with
ORDER BY DESC used the range access method.
The bug happened because the method QUICK_SELECT_DESC::reset
was not reworked after MRR interface had been introduced.
The bug was due to a loss happened during a refactoring made
on May 30 2005 that modified the function JOIN::reinit.
As a result of it for any subquery the value of offset_limit_cnt
was not restored for the following executions. Yet the first
execution of the subquery made it equal to 0.
The fix restores this value in the function JOIN::reinit.
DESCRIBE returned the type BIGINT for a column of a view if the column
was specified by an expression over values of the type INT.
E.g. for the view defined as follows:
CREATE VIEW v1 SELECT COALESCE(f1,f2) FROM t1
DESCRIBE returned type BIGINT for the only column of the view if f1,f2 are
columns of the INT type.
At the same time DESCRIBE returned type INT for the only column of the table
defined by the statement:
CREATE TABLE t2 SELECT COALESCE(f1,f2) FROM t1.
This inconsistency was removed by the patch.
Now the code chooses between INT/BIGINT depending on the
precision of the aggregated column type.
Thus both DESCRIBE commands above returns type INT for v1 and t2.
may return a wrong result.
An Item_sum_hybrid object has the was_values flag which indicates whether any
values were added to the sum function. By default it is set to true and reset
to false on any no_rows_in_result() call. This method is called only in
return_zero_rows() function. An ALL/ANY subquery can be optimized by MIN/MAX
optimization. The was_values flag is used to indicate whether the subquery
has returned at least one row. This bug occurs because return_zero_rows() is
called only when we know that the select will return zero rows before
starting any scans but often such information is not known.
In the reported case the return_zero_rows() function is not called and
the was_values flag is not reset to false and yet the subquery return no rows
Item_func_not_all and Item_func_nop_all functions return a wrong
comparison result.
The end_send_group() function now calls no_rows_in_result() for each item
in the fields_list if there is no rows were found for the (sub)query.
The ALL/ANY subqueries are the subject of MIN/MAX optimization. The matter
of this optimization is to embed MIN() or MAX() function into the subquery
in order to get only one row by which we can tell whether the expression
with ALL/ANY subquery is true or false.
But when it is applied to a subquery like 'select a_constant' the reported bug
occurs. As no tables are specified in the subquery the do_select() function
isn't called for the optimized subquery and thus no values have been added
to a MIN()/MAX() function and it returns NULL instead of a_constant.
This leads to a wrong query result.
For the subquery like 'select a_constant' there is no reason to apply
MIN/MAX optimization because the subquery anyway will return at most one row.
Thus the Item_maxmin_subselect class is more appropriate for handling such
subqueries.
The Item_in_subselect::single_value_transformer() function now checks
whether tables are specified for the subquery. If no then this subselect is
handled like a UNION using an Item_maxmin_subselect object.
The problem was that we restored SQL_CACHE, SQL_NO_CACHE flags in SELECT
statement from internal structures based on value set later at runtime, not
the original value set by the user.
The solution is to remember that original value.
The convert_constant_item() function converts constant items to ints on
prepare phase to optimize execution speed. In this case it tries to evaluate
subselect which contains a derived table and is contained in a derived table.
All derived tables are filled only after all derived tables are prepared.
So evaluation of subselect with derived table at the prepare phase will
return a wrong result.
A new flag with_subselect is added to the Item class. It indicates that
expression which this item represents is a subselect or contains a subselect.
It is set to 0 by default. It is set to 1 in the Item_subselect constructor
for subselects.
For Item_func and Item_cond derived classes it is set after fixing any argument
in Item_func::fix_fields() and Item_cond::fix_fields accordingly.
The convert_constant_item() function now doesn't convert a constant item
if the with_subselect flag set in it.
When a view statement is compiled on CREATE VIEW time, most of the
optimizations should not be done. Finding the right optimization
for a subquery is one of them.
Unfortunately the optimizer is resolving the column references of
the left expression of IN subqueries in the process of deciding
witch optimization to use (if needed). So there should be a
special case in Item_in_subselect::fix_fields() : check the
validity of the left expression of IN subqueries in CREATE VIEW
mode and then proceed as normal.
Re-work best_access_path() and find_best() to reuse E(#rows(range access)) as
E(#rows(ref[_or_null](const) access) only when it is appropriate.
[This is the final cumulative patch]
This performance degradation was due to the fact that some
cost evaluation code added into 4.1 in the function find_best was
not merged into the code of the function best_access_path added
together with other code for greedy optimizer.
Added a parameter to the function print_plan. The parameter contains
accumulated cost for a given partial join.
The patch does not include a special test case since this performance
degradation is hard to reproduse with a simple example.
TODO: make the function find_best use the function best_access_path
in order to remove duplication of code which might result in incomplete
merges in the future.