Before this patch the crash occured when a single row dataset is used and
Item::remove_eq_conds() is called for HAVING. This function is not supposed
to be called after the elimination of multiple equalities.
To fix this problem instead of Item::remove_eq_conds() Item::val_int() is
used. In this case the optimizer tries to evaluate the condition for the
single row dataset and discovers impossible HAVING immediately. So, the
execution phase is skipped.
Approved by Igor Babaev <igor@maridb.com>
When resolving a column from the HAVING clause, a new Item_field
object may be created inside Item_ref::fix_fields().
But the object is created with an empty name resolution context,
which then leads to debug assertion failure during
Item_field::fix_fields().
The solution is to pass the correct name resolution context
when creating the Item_field object.
Reviewer: Oleksandr Byelkin (sanja@mariadb.com)
The cause of regression was handling for ROWNUM() function.
For queries like
SELECT ROWNUM() FROM ... ORDER BY ...
ROWNUM() should be computed before the ORDER BY.
The computation was moved to be before the ORDER BY for any entries in
the select list that had RAND_TABLE_BIT set.
This had a negative impact on queries in form:
SELECT sp_func() FROM t1 ORDER BY ... LIMIT n
where sp_func() is NOT declared as DETERMINISTIC (and so has
RAND_TABLE_BIT set).
The fix is to require evaluation for sorting only for the ROWNUM()
function. Functions that just have RAND_TABLE_BIT() can be computed
after ORDER BY ... LIMIT is applied.
(think about a possible index that satisfies the ORDER BY clause. In
that case, the the rows would be read in the needed order and we would
stop after reading LIMIT rows, achieving the same effect).
The ROWNUM() function is for SELECT mapped to JOIN->accepted_rows, which is
incremented for each accepted rows.
For Filesort, update, insert, delete and load data, we map ROWNUM() to
internal variables incremented when the table is changed.
The connection between the row counter and Item_func_rownum is done
in sql_select.cc::fix_items_after_optimize() and
sql_insert.cc::fix_rownum_pointers()
When ROWNUM() is used anywhere in query, the optimization to ignore ORDER
BY in sub queries are disabled. This was done to get the following common
Oracle query to work:
select * from (select * from t1 order by a desc) as t where rownum() <= 2;
MDEV-3926 "Wrong result with GROUP BY ... WITH ROLLUP" contains a discussion
about this topic.
LIMIT optimization is enabled when in a top level WHERE clause comparing
ROWNUM() with a numerical constant using any of the following expressions:
- ROWNUM() < #
- ROWNUM() <= #
- ROWNUM() = 1
ROWNUM() can be also be the right argument to the comparison function.
LIMIT optimization is done in two cases:
- For the current sub query when the ROWNUM comparison is done on the top
level:
SELECT * from t1 WHERE rownum() <= 2 AND t1.a > 0
- For an inner sub query, when the upper level has only a ROWNUM comparison
in the WHERE clause:
SELECT * from (select * from t1) as t WHERE rownum() <= 2
In Oracle mode, one can also use ROWNUM without parentheses.
Other things:
- Fixed bug where the optimizer tries to optimize away sub queries
with RAND_TABLE_BIT set (non-deterministic queries). Now these
sub queries will not be converted to joins. This bug fix was also
needed to get rownum() working inside subqueries.
- In remove_const() remove setting simple_order to FALSE if ROLLUP is
USED. This code was disable a long time ago because of wrong assignment
in the following code. Instead we set simple_order to false if
RAND_TABLE_BIT was used in the SELECT list. This ensures that
we don't delete ORDER BY if the result set is not deterministic, like
in 'SELECT RAND() AS 'r' FROM t1 ORDER BY r';
- Updated parameters for Sort_param::init_for_filesort() to be able
to provide filesort with information where the number of accepted
rows should be stored
- Reordered fields in class Filesort to optimize storage layout
- Added new error messsage to tell that a function can't be used in HAVING
- Added field 'with_rownum' to THD to mark that ROWNUM() is used in the
query.
Co-author: Oleksandr Byelkin <sanja@mariadb.com>
LIMIT optimization for sub query
Fixes also:
MDEV-24942 Server crashes in _ma_rec_pack... with DEFAULT() on BLOB
This was caused by two different bugs, both related to that the default
value for the blob was not calculated before it was used:
- There where now Item_default_value::..result() wrappers, which is
needed as item in HAVING uses these. This causes crashes when
using a reference to a DEFAULT(blob_field) in HAVING. It also
caused wrong results when used with other fields with default value
expressions that are not constants.
- create_tmp_field() did not take into account that blob fields with
default expressions are not yet initialized. Fixed by treating
Item_default_value(blob) like a normal item expression.
This bug is caused by pushdown from HAVING into WHERE.
It appears because condition that is pushed wasn't fixed.
It is also discovered that condition pushdown from HAVING into
WHERE is done wrong. There is no need to build clones for some
conditions that can be pushed. They can be simply moved from HAVING
into WHERE without cloning.
build_pushable_cond_for_having_pushdown(),
remove_pushed_top_conjuncts_for_having() methods are changed.
It is found that there is no transformation made for fields of
pushed condition.
field_transformer_for_having_pushdown transformer is added.
New tests are added. Some comments are changed.
Optimized the code that removed multiple equalities pushed from HAVING
into WHERE. Now this removal is postponed until all multiple equalities
are eliminated in substitute_for_best_equal_field().
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
How the pushdown is performed on the example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.