WINDOW is a reserved keyword according to the SQL:2016 standard. However, we
can make the grammar slightly more flexible by allowing the WINDOW keyword
everywhere except in table aliases. Change the yacc grammar to separate the
general keywords from the table_alias keywords.
When creating an Item_direct_view_ref we were setting the
Name_resolution_context based on TABLE_LIST::view member variable.
However, for derived tables this member is NULL. To avoid setting a wrong
context, use THD::lex instead when TABLE_LIST::view is NULL.
Item_cond did not correctly propagate the with_window_func flag in the Item
tree. Without it, we would not call Item::split_sum_func correctly and the
window function's result_field would remain NULL.
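
The flag propagation described above can be pictured with a small model.
This is only a sketch: the Item class below is a hypothetical stand-in
written for illustration, not the server's Item hierarchy, and fix_fields()
here does nothing except compute the flag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """Toy stand-in for an expression tree node (not the server's Item)."""
    name: str
    children: List["Item"] = field(default_factory=list)
    is_window_func: bool = False
    with_window_func: bool = False

    def fix_fields(self) -> None:
        # Resolve the children first, then inherit the flag from any of them.
        for child in self.children:
            child.fix_fields()
        self.with_window_func = self.is_window_func or any(
            child.with_window_func for child in self.children
        )


# (rank() over (...) > 1) AND (a = b)
rank = Item("rank() over (...)", is_window_func=True)
cond = Item("AND", [Item(">", [rank, Item("1")]), Item("a = b")])
cond.fix_fields()
```

After fix_fields(), the top-level AND node carries the flag because one of
its branches contains a window function, while the "a = b" branch does not.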
This is a backport of 25ad623d64 for 10.2.
The issue is similar to the one from MDEV-13240. Item::save_in_field()
returns an error during tmp table population in a CREATE TABLE ... SELECT
query, as we try to save an empty string as a date value when
force_return_blank is set to true for window functions.
MDEV-13240 Wrong warning with MAX(datetime_field) OVER (...)
The problem resided in Item_window_func implementation,
and it was revealed by bb-10.2-ext specific changes:
Item_window_func::save_in_field() works differently in bb-10.2-ext vs 10.2:
- 10.2 goes through val_str()
- bb-10.2-ext goes through get_date(), due to Type_handler related changes.
get_date() tries to convert empty string to DATETIME, hence the warning.
During a discussion with Vicentiu, it was decided to fix
Item_window_func::val_xxx() to return NULL
(instead of an "empty" value, such as 0 for numbers and '' for strings)
when force_return_blank is set.
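
A rough model of why returning NULL avoids the warning. Everything here is
hypothetical and for illustration only: ToyWindowFunc, val_str_old(),
val_str_new() and save_as_datetime() are invented names, and the warning
text is not the server's.

```python
from datetime import datetime
from typing import List, Optional


class ToyWindowFunc:
    """Hypothetical model of a window function item holding a string result."""

    def __init__(self, value: str) -> None:
        self.value = value
        self.force_return_blank = True  # the result has not been computed yet

    def val_str_old(self) -> Optional[str]:
        # Old behaviour: hand back an "empty" value while blank.
        return "" if self.force_return_blank else self.value

    def val_str_new(self) -> Optional[str]:
        # Fixed behaviour: report NULL (None here) while blank.
        return None if self.force_return_blank else self.value


def save_as_datetime(text: Optional[str], warnings: List[str]) -> Optional[datetime]:
    """Store a value in a toy DATETIME column, recording conversion warnings."""
    if text is None:
        return None  # NULL is stored as NULL, there is nothing to convert
    try:
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        warnings.append(f"Incorrect datetime value: {text!r}")
        return None


item = ToyWindowFunc("2017-07-05 10:00:00")
old_warnings: List[str] = []
new_warnings: List[str] = []
save_as_datetime(item.val_str_old(), old_warnings)  # converts '' and warns
save_as_datetime(item.val_str_new(), new_warnings)  # stores NULL silently
```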
Make st_select_lex::set_explain_type() take into account that the JOIN_TABs
it is traversing may also be post-join aggregation JOIN_TABs (which
have pos_in_table_list=NULL, etc).
Do not run the window function computation step when the select
produces no rows (zero_result_cause!=NULL).
Running it in that case may cause reads from uninitialized memory.
We still need to run the window function computation step when
the output includes just one row (for example
SELECT MAX(col), RANK() OVER (...) FROM t1 WHERE 1=0).
This fix also resolves an issue with queries with window functions
producing an output row where there should be none, like in
SELECT ROW_NUMBER() FROM t1 WHERE 1=0.
Updated a few test results in the existing tests to reflect this.
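
The two cases above (no output rows versus a single row produced by
implicit grouping) can be sketched as follows. run_select() is a
hypothetical helper invented for this illustration and does not mirror the
executor's code.

```python
from typing import Dict, List, Optional


def run_select(values: List[int], has_aggregate: bool) -> List[Dict[str, Optional[int]]]:
    """Toy SELECT over a single column with an optional MAX() and a row number."""
    if has_aggregate:
        # Implicit grouping always yields exactly one row, even for empty input.
        out: List[Dict[str, Optional[int]]] = [{"max": max(values, default=None)}]
    else:
        out = [{"col": v} for v in values]

    # Window function step: run it only when there are output rows.
    if out:
        for number, row in enumerate(out, start=1):
            row["row_number"] = number
    return out


with_aggregate = run_select([], has_aggregate=True)
without_aggregate = run_select([], has_aggregate=False)
```

With an empty input, the aggregate query still returns one row and the
window step runs over it; the plain query returns nothing and the step is
skipped.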
Interval function makes use of Item_row. Item_row did not correctly mark
with_window_func flag according to its arguments. Fix it by making
Item_row aware of this flag.
The "is null" function performs one operation which no other Item_func
does, which is to update used tables during fix_length_and_dec().
This however can not be performed before window functions have had a
chance to resolve their order by and partition by definitions, which
happens after the initial setup_fields call. Consequently, do not call
Item_func_isnull update_used_tables during fix_length_and_dec().
Another issue was detected once the crash was resolved.
Because window functions did not implement the is_null() method, we would
end up returning wrong results for "is null" and "is not null" functions.
Implement the is_null() method for Item_window_func.
During statement preparation st_order::item gets set to a value in
ref_ptr_array. During statement execution we were overriding that value,
causing subsequent checks for window functions to return true.
Whenever we do any setting from ref_ptr_array, make sure to always
store the value in all_fields as well.
For function items containing window functions, as MDEV-12336 has
discovered, we don't need to create a separate Item_direct_ref or
Item_aggregate_ref as they will be computed directly from the top-level
item once the window function argument columns are computed.
The use of window functions when all tables were optimized away
by the min/max optimization was not supported. As a result,
queries that used window functions with min/max aggregation
over the whole table returned wrong result sets.
The patch fixes this problem.
The function Item::split_sum_func2() incorrectly processed function
items that contained window functions but were not window functions
themselves and were used as arguments of other functions.
The bug was not visible in current HEAD. Introduce a test case to catch
regressions. Also improve error messages regarding DISTINCT usage in
window functions.
Window functions need to be computed after applying the HAVING clause.
An optimization that we have for regular, non-window-function cases is
to apply HAVING only while sending the rows to the client. This
allows rows that should be filtered out to be stored in the temporary
table used to store aggregation results.
This behaviour is undesirable for window functions, as we have to
compute window functions on the result set after HAVING is applied.
Storing extra rows in the table leads to wrong values, as the frame
bounds might capture rows that are to be filtered out afterwards.
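
A small model of the ordering problem, assuming a query shaped like
SELECT grp, SUM(val), SUM(SUM(val)) OVER (ORDER BY grp) ... GROUP BY grp
HAVING SUM(val) > 2. The data, the helper names and the threshold are made
up for illustration.

```python
from collections import defaultdict
from itertools import accumulate
from typing import Dict, List, Tuple

rows = [("a", 5), ("b", 1), ("c", 7)]  # (grp, val)

# GROUP BY grp with SUM(val): this is what lands in the temporary table.
sums: Dict[str, int] = defaultdict(int)
for grp, val in rows:
    sums[grp] += val
grouped: List[Tuple[str, int]] = sorted(sums.items())


def having(row: Tuple[str, int]) -> bool:
    """HAVING SUM(val) > 2"""
    return row[1] > 2


def running_total(table: List[Tuple[str, int]]) -> List[int]:
    """SUM(SUM(val)) OVER (ORDER BY grp) as a running total."""
    return list(accumulate(total for _, total in table))


# Correct order: filter with HAVING first, then compute the window function.
kept = [row for row in grouped if having(row)]
correct = list(zip(kept, running_total(kept)))

# Wrong order: the window function sees group "b", which is filtered later.
wrong = [
    (row, total)
    for row, total in zip(grouped, running_total(grouped))
    if having(row)
]
```

Here the running total for group "c" is 12 when HAVING is applied first and
13 when the filtered-out group "b" is still in the table.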
These are different bugs, but the same code fixes them:
if window functions are used over implicit grouping, then
execution now follows the general path and calls
the function set in JOIN::first_select.
Due to this bug, many queries that contained a window function
with MIN/MAX aggregation returned wrong results.
Calculation of a MIN/MAX aggregate function uses cache objects
and a comparator object that are created and set up in
Item_sum_hybrid::fix_fields() by a call to Item_sum_hybrid::setup_hybrid().
The latter binds the objects to the first argument of the
MIN/MAX function. Meanwhile, window functions perform aggregation
over fields of a temporary table, so the binding must be done to
these fields instead. The earliest moment at which the objects used in
MIN/MAX functions can be set up is after all calls of the method
split_sum_func().
This patch introduces this late setup, but only for aggregate
functions used in window functions.
It probably makes sense to use this late setup for all MIN/MAX
objects.
This patch complements the patch for bug 11138.
Without this patch some table-less queries with window functions
could cause crashes due to a memory overwrite.
The method Item_sum::print did not print the opening '(' after the name
of simple window functions (like rank, dense_rank etc).
As a result, view definitions with such window functions
were written to .frm files in an invalid form.
Using window functions over results of implicit groupings
required special handling in JOIN::make_aggr_tables_info.
The patch made sure that the result of implicit grouping
was written into a temporary table properly.
If a window function with aggregation is over the result
set of a grouping query then the argument of the aggregate
function from the window function is allowed to be an
aggregate function itself.
This bug happens due to a conflict in the window_spec construct
(win_ref conflicts with the non-reserved keyword ROWS).
The SQL:2003 standard says that ROWS is a reserved keyword.
Made this keyword reserved in our grammar and removed
the conflict.
There was no implementation of the virtual method print()
for the Item_window_func class. As a result, for a view
containing a window function an invalid view definition could
be written to the frm file. When a query that referred to
this view was executed, a syntax error was reported.
Fix window function expressions such as win_func() <operator> expr.
The problem was found in 2 places.
First, when we have complex expressions containing window functions, we
can only compute their final value _after_ we have computed the window
function's values. These values must be stored within the temporary
table that we are using, before sending them off.
This is done by performing an extra copy_funcs call before the final
end_send() call.
Second, such expressions need to have their inner arguments changed
so that the references within those arguments point to fields within
the temporary table.
Ex: sum(t.a) over (order by t.b) + sum(t.a) over (order by t.b)
Before this fix, t.a pointed to the original table's a field. In order
to compute the sum function's value correctly, it needs to point to the
copy of this field inside the temp table.
This is done by calling split_sum_func for each argument in the
expression in turn.
The win.test results have also been updated as they contained wrong
values for such a use case.
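
The two steps can be sketched on the example expression
sum(t.a) over (order by t.b) + sum(t.a) over (order by t.b). This is a toy
model with made-up data: the list of dicts stands in for the temporary
table, and the final loop stands in for the extra copy_funcs call.

```python
from itertools import accumulate

base_table = [{"a": 2, "b": 20}, {"a": 1, "b": 10}, {"a": 3, "b": 30}]

# Copy the argument columns into the temporary table, sorted for the window.
tmp_table = sorted(
    ({"a": row["a"], "b": row["b"]} for row in base_table),
    key=lambda row: row["b"],
)

# Compute sum(a) over (order by b) from the temp table's own copy of a.
# (b is unique here, so the default frame gives a plain running total.)
window_values = list(accumulate(row["a"] for row in tmp_table))
for row, value in zip(tmp_table, window_values):
    row["win_sum"] = value

# Only now can the enclosing expression be evaluated and stored, before the
# rows are sent off.
for row in tmp_table:
    row["expr"] = row["win_sum"] + row["win_sum"]

result = [row["expr"] for row in tmp_table]
```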
When a RANGE type frame exceeds the partition size for both the top and
bottom cursors, we end up removing more rows from the aggregate function
than were added to it. This happens because our TOP range cursor,
which removes values from the aggregate function, would be allowed to breach
partition boundaries, while the BOTTOM range cursor would not.
To prevent this from happening, force the TOP range cursor to only move
within the current partition, as does the BOTTOM range cursor.
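
A simplified model of the intended behaviour, with both cursors confined to
the current partition. range_sum() and its parameters are hypothetical and
only illustrate a SUM over RANGE BETWEEN n PRECEDING AND m FOLLOWING; the
server's cursor classes are not reproduced here.

```python
from typing import List, Tuple

Row = Tuple[str, int, int]  # (partition, order_key, value)


def range_sum(rows: List[Row], preceding: int, following: int) -> List[int]:
    """Incremental SUM(value) per row; rows are sorted by (partition, order_key)."""
    out: List[int] = []
    start = 0
    while start < len(rows):
        # Locate the end of the current partition.
        end = start
        while end < len(rows) and rows[end][0] == rows[start][0]:
            end += 1

        total = 0
        top = start      # next row to remove from the aggregate
        bottom = start   # next row to add to the aggregate
        for current in range(start, end):
            key = rows[current][1]
            # BOTTOM cursor: add rows up to key + following, within the partition.
            while bottom < end and rows[bottom][1] <= key + following:
                total += rows[bottom][2]
                bottom += 1
            # TOP cursor: remove rows below key - preceding, within the partition.
            while top < end and rows[top][1] < key - preceding:
                total -= rows[top][2]
                top += 1
            out.append(total)
        start = end
    return out


rows = [("p1", 1, 10), ("p1", 2, 20), ("p1", 3, 30), ("p2", 1, 1), ("p2", 100, 2)]
whole_partition = range_sum(rows, preceding=1000, following=1000)
narrow_frame = range_sum(rows, preceding=1, following=0)
```

Because the TOP cursor stops at the partition end just like the BOTTOM
cursor, a frame wider than the partition yields the partition total for
every row.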