The problematic query exposed a bug in the window function sorting
optimization. When multiple window functions are present in a query,
we order their sorting keys (as defined by PARTITION BY and ORDER BY)
from generic to specific:
SELECT RANK() OVER (ORDER BY const_col) as r1,
RANK() OVER (ORDER BY const_col, a) as r2,
RANK() OVER (PARTITION BY c) as r3,
RANK() OVER (PARTITION BY c ORDER BY b) as r4
FROM table;
For these functions, the sorts we need to perform for the window
function computations are: [(const_col), (const_col, a)] and [(c), (c, b)].
Instead of doing 4 different sorts, the sorts grouped within [] are
compatible, and we can use the most *specific* sort to cover both window
functions in a group.
The bug was caused by incorrectly flagging which sort is the most
specific one for a compatible group of functions. In our specific test
case, instead of picking (const_col, a) as the most specific sort, we
would only sort by (const_col), which led to wrong results for the rank
function.
By ensuring that we pick the last sort key before an "incompatible sort"
flag is met in our ordered array of sorting specifications, we
guarantee correct results.
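A minimal self-contained sketch of that selection rule, with illustrative
types that are not the actual server code:

  #include <cstddef>
  #include <vector>

  struct Sort_spec
  {
    // true when this sort cannot reuse the previous one's ordering,
    // i.e. a new compatibility group starts here
    bool incompatible_with_prev;
  };

  // Given sorting specifications ordered from generic to specific, return
  // the index of the most specific member of each compatible group: the
  // last one before an "incompatible sort" flag (or before the end).
  static std::vector<std::size_t>
  pick_covering_sorts(const std::vector<Sort_spec> &specs)
  {
    std::vector<std::size_t> picked;
    for (std::size_t i= 0; i < specs.size(); i++)
      if (i + 1 == specs.size() || specs[i + 1].incompatible_with_prev)
        picked.push_back(i);
    return picked;
  }

For the query above, this picks (const_col, a) for the first group and
(c, b) for the second, so two sorts cover all four window functions.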
If a procedure is changed in one connection while another connection has
already called the initial version of the procedure, a query to
INFORMATION_SCHEMA.PARAMETERS would use obsolete information from the sp
cache of that connection. That happens because the cache invalidating
method only increments the cache version and does not flush (all) the
cache(s), while changing a procedure only invalidates the cache and
removes the procedure's cache entry from the local thread cache only.
The fix adds a check that the sp info obtained from the cache for forming
the result of the I_S query is not obsolete, and does not use it if
it is.
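A minimal sketch of such a staleness check, assuming a global version
counter and a per-entry cached version; the names are illustrative, not
the actual sp cache API:

  #include <atomic>

  static std::atomic<unsigned long> sp_cache_version{0};

  struct Sp_entry
  {
    unsigned long cached_version;   // version at the time of caching
    /* ... parsed routine data ... */
  };

  // Invalidation only bumps the global version; entries are not flushed.
  static void invalidate_sp_caches() { ++sp_cache_version; }

  // Before the cached info is used to form the I_S.PARAMETERS result,
  // verify that the entry is still current.
  static bool entry_is_obsolete(const Sp_entry &e)
  {
    return e.cached_version != sp_cache_version.load();
  }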
A test has been added to main.information_schema. It changes the SP in
one connection and ensures that the change is seen by a query to
I_S.PARAMETERS in another connection that had already called the
procedure before the change.
"mtr --view-protocol func_math" failed because of a too long
column names imlicitly generated for the underlying expressions.
With --view-protocol they were replaced to "Name_exp_1".
Adding column aliases for these expressions.
UBSAN: negation of -9223372036854775808 cannot be represented in type
'long long int'; cast to an unsigned type to negate this value
to itself in Item_func_mul::int_op and Item_func_round::int_op
Problems:
The code in multiple places in the following methods:
- Item_func_mul::int_op()
- Item_func_int_div::val_int()
- Item_func_mod::int_op()
- Item_func_round::int_op()
did not properly check for the corner values LONGLONG_MIN
and (LONGLONG_MAX+1) before doing negation.
This caused UBSAN to complain about undefined behavior.
Fix summary:
- Adding helper classes ULonglong, ULonglong_null, ULonglong_hybrid
(in addition to their signed counterparts in sql/sql_type_int.h).
- Moving the code performing multiplication of ulonglong numbers
from Item_func_mul::int_op() to ULonglong_hybrid::ullmul().
- Moving the code responsible for extracting absolute values
from negative numbers to Longlong::abs().
It makes sure to perform negation without undefined behavior:
LONGLONG_MIN is handled in a special way (see the sketch after
this summary).
- Moving negation related code to ULonglong::operator-().
It makes sure to perform negation without undefined behavior:
(LONGLONG_MAX + 1) is handled in a special way.
- Moving signed<=>unsigned conversion code to
Longlong_hybrid::val_int() and ULonglong_hybrid::val_int().
- Reusing old and new sql_type_int.h classes in multiple
places in Item_func_xxx::int_op().
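As referenced above, a minimal sketch of the UB-free negation idea,
assuming plain C++ types rather than the actual sql_type_int.h classes:

  #include <cstdint>

  // Negating INT64_MIN directly is undefined behavior in signed
  // arithmetic. Going through uint64_t is well defined, since unsigned
  // arithmetic wraps: for v == INT64_MIN, 0 - (uint64_t) v yields
  // 9223372036854775808, the correct absolute value, with no signed
  // overflow involved.
  static uint64_t ull_abs(int64_t v)
  {
    return v < 0 ? 0ULL - (uint64_t) v : (uint64_t) v;
  }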
Fix details (explaining how the sql_type_int.h classes are reused):
- Instead of straight negation of negative "longlong" arguments
*before* performing unsigned multiplication,
Item_func_mul::int_op() now calls ULonglong_null::ullmul()
using Longlong_hybrid_null::abs() to pass arguments.
This fixes undefined behavior N1.
- Instead of straight negation of "ulonglong" result
*after* performing unsigned multiplication,
Item_func_mul::int_op() now calls ULonglong_hybrid::val_int(),
which recursively calls ULonglong::operator-().
This fixes undefined behavior N2.
- Removing duplicate negating code from Item_func_mod::int_op().
Using ULonglong_hybrid::val_int() instead.
This fixes undefined behavior N3.
- Removing literal "longlong" negation from Item_func_round::int_op().
Using Longlong::abs() instead, which correctly handles LONGLONG_MIN.
This fixes undefined behavior N4.
- Removing the duplicate (negation related) code from
Item_func_int_div::val_int(). Reusing class ULonglong_hybrid.
There was no undefined behavior in here.
However, this change revealed a bug in
"-9223372036854775808 DIV 1".
The removed negation code turned out to be incorrect when
negating +9223372036854775808: it returned the "out of range" error.
ULonglong_hybrid::operator-() now handles all values correctly
and returns +9223372036854775808 as the negation of -9223372036854775808
(see the sketch after this list).
Re-recording the old, wrong results for
SELECT -9223372036854775808 DIV 1;
Now, instead of "out of range", it returns -9223372036854775808,
which is the smallest possible value for the expression's data type,
(signed) BIGINT.
- Removing "no UBSAN" branch from Item_func_splus::int_opt()
and Item_func_minus::int_opt(), as it made UBSAN happy but
in RelWithDebInfo some MTR tests started to fail.
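A minimal sketch of the sign restoration step described above, using
plain C++ types instead of the actual ULonglong_hybrid code:

  #include <cstdint>

  // Convert an unsigned magnitude plus a sign flag back to int64_t.
  // Returns true on "out of range". A negative magnitude of
  // 9223372036854775808 (INT64_MAX + 1) is valid and maps to INT64_MIN,
  // which is exactly the "-9223372036854775808 DIV 1" case fixed above.
  static bool magnitude_to_int64(uint64_t magnitude, bool negative,
                                 int64_t *to)
  {
    if (negative)
    {
      if (magnitude > (uint64_t) INT64_MAX + 1)
        return true;                        // out of range
      *to= (int64_t) (0ULL - magnitude);    // negation via unsigned math
      return false;
    }
    if (magnitude > (uint64_t) INT64_MAX)
      return true;                          // out of range
    *to= (int64_t) magnitude;
    return false;
  }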
The code in choose_best_splitting() assumed that the join prefix is
in join->positions[].
This is not necessarily the case: this function might also be called
when the join prefix is in join->best_positions[].
Follow the approach of best_access_path(), which calls this function:
pass the current join prefix as an argument,
"const POSITION *join_positions", and use that.
This bug could affect queries containing a subquery over splittable derived
tables and having outer references in its WHERE clause. If such a subquery
contained an equality condition whose left part was a reference to a column
of the derived table and the right part referred only to outer columns,
then the server crashed in the function st_join_table::choose_best_splitting().
The crashing code was added in the commit ce7ffe61d8
that made the code of the function sensitive to the presence of the flag
OUTER_REF_TABLE_BIT in the KEYUSE_EXT::needed_in_prefix fields.
The field needed_in_prefix of the KEYUSE_EXT structure should not contain
table maps with OUTER_REF_TABLE_BIT or RAND_TABLE_BIT.
Note that this fix is quite conservative: for affected queries it just
returns the query plans that were used before the above-mentioned commit.
In fact, the equalities causing the crashes should be pushed into derived
tables without any usage of the split optimization.
Approved by Sergei Petrunia <sergey@mariadb.com>
EXPLAIN EXTENDED should always print the field item used in the left part
of an equality expression from the SET clause of an update statement as a
reference to a table column.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The problem was that JOIN_CACHE::alloc_buffer() did not check whether the
given join_buffer_value was less than what the query requires.
Added a check for this, and disabled the join cache if it cannot be used.
The problem was trying to access JOIN_TAB::select, which is set to NULL
when filesort is used. The correct way is to access either
JOIN_TAB::select or JOIN_TAB::filesort->select, depending on whether
filesort is used.
This commit introduces the member function JOIN_TAB::get_sql_select(),
encapsulating that check so that the code duplication is eliminated.
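Based on that description, the new accessor is presumably equivalent to
this sketch, with minimal stand-in types rather than the real definitions:

  struct SQL_SELECT;                        // opaque here
  struct Filesort { SQL_SELECT *select; };
  struct JOIN_TAB
  {
    SQL_SELECT *select;
    Filesort   *filesort;
    // filesort carries its own SQL_SELECT while it is in use;
    // otherwise fall back to the JOIN_TAB's own select.
    SQL_SELECT *get_sql_select()
    {
      return filesort ? filesort->select : select;
    }
  };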
The new condition (s->table->quick_keys.is_set(best_key->key))
was added to best_access_path() to eliminate a Valgrind error.
The cause of that error was using TRASH_ALLOC(quick_key_parts)
instead of bzero(quick_key_parts); hence, accessing
s->table->quick_key_parts[best_key->key] without prior checking
for quick_keys.is_set() might have caused reading "dirty" memory.
Fix by Alexey Botchkov
The 'value_len' was calculated incorrectly for multibyte charsets. In the
read_strn() function we get the length of the string including the final
'"' character, so we have to subtract its length from the value_len. And
a length of 1 isn't correct for the ucs2 charset (it must be 2).
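A minimal sketch of the corrected arithmetic, with illustrative names
(the real code works on the charset structure):

  #include <cstddef>

  // read_strn() reports the string length including the closing '"'.
  // The size of that character is charset dependent: 1 byte in latin1
  // or utf8, but 2 bytes in ucs2, so a hard-coded 1 is wrong there.
  static std::size_t value_len_without_quote(std::size_t len_with_quote,
                                             std::size_t quote_char_octets)
  {
    return len_with_quote - quote_char_octets;
  }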
ROW variables did not get assigned from subselects in these contexts:
BEGIN
DECLARE r ROW TYPE OF t1;
SET r=(SELECT * FROM t1 WHERE a=1);
END;
BEGIN
DECLARE r ROW TYPE OF t1 DEFAULT (SELECT * FROM t1 WHERE a=1);
END;
All fields of the ROW variable remained NULL.
lower_case_table_names=2 means "table names and database names are
stored as declared, but they are compared in lowercase".
But the names of objects in grants are stored in lowercase for any value
of lower_case_table_names. This caused an error when checking grants
for objects containing uppercase letters, since table_hash_search()
didn't take the lower_case_table_names value into account.
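A minimal sketch of the idea, with illustrative names (this is not the
real table_hash_search() code): since grant names are always stored in
lowercase, the probe key must be lowercased whenever name comparison is
case-insensitive:

  #include <algorithm>
  #include <cctype>
  #include <string>

  static std::string grant_hash_key(const std::string &name,
                                    int lower_case_table_names)
  {
    std::string key= name;
    // Grant objects are stored lowercased for any value of
    // lower_case_table_names, so lowercase the probe key whenever
    // comparisons are case-insensitive (values 1 and 2).
    if (lower_case_table_names)
      std::transform(key.begin(), key.end(), key.begin(),
                     [](unsigned char c) { return (char) std::tolower(c); });
    return key;
  }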
This bug affected the EXPLAIN EXTENDED command for single-table DELETE
statements that used an IN subquery in their WHERE clause. A crash happened
if the optimizer chose to employ index_subquery or unique_subquery access
when processing such a command.
The crash happened when the command tried to print the transformed query.
In the current code of 10.4, for single-table DELETE statements the output
of any explain command is produced after the join structures of all used
subqueries have been destroyed. JOIN::destroy() sets the field tab of the
JOIN_TAB structures created for subquery tables to NULL. As a result,
subselect_indexsubquery_engine::print() cannot use this field to get the
alias name of the joined table.
This patch suggests using the field TABLE_LIST::TAB, which can be accessed
via JOIN_TAB::tab_list, to get the alias name of the joined table.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
- main.selectivity failed because one test produced a different result with
the embedded server (missing feature). Fixed by moving the failing part to
selectivity_notembedded.
- Disabled maria.encrypt-no-key for embedded, as embedded does not support
encryption.
- Moved a test from join_cache to join_cache_notasan that tried to alloc()
a buffer bigger than the available memory.
The old code set max_records to either the number of rows
(partial_join_cardinality) or the memory size (join_buffer_space_limit),
which did not make sense.
Fixed by setting max_records to the number of rows that fit into
join_buffer_size (see the sketch below).
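A minimal sketch of that sizing rule, with illustrative names:

  #include <cstdint>

  // Cap the number of cached records by what actually fits into
  // join_buffer_size, instead of deriving it from the cardinality or
  // from join_buffer_space_limit.
  static uint64_t max_records_for_buffer(uint64_t join_buffer_size,
                                         uint64_t avg_record_length)
  {
    return avg_record_length ? join_buffer_size / avg_record_length : 0;
  }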
Other things:
- Initialize buffer cache values in JOIN_CACHE constructors (safety)
Reviewer: Sergei Petrunia <sergey@mariadb.com>
The problem, introduced in the patch for MDEV-26301:
when check_join_cache_usage() decides not to use a join buffer, it must
adjust the access method accordingly. For BNL-H joins this means switching
from pseudo-"ref access" (with index=MAX_KEY) to some other access method.
Failing to do this causes assertions down the line, when code that is
not aware of BNL-H tries to initialize index use for ref access with
index=MAX_KEY.
The fix is to follow the regular code path for disabling the join buffer
for the join_tab ("goto no_join_cache") instead of just returning from
check_join_cache_usage().
The problem was that join_buffer_size conflicted with
join_buffer_space_limit, which caused the query to be run without a join
buffer. However, this caused wrong results, as the optimizer assumed
that the hash + join buffer would ensure that the equi-join condition
was satisfied, and didn't check it itself.
Fixed by not using join_buffer_space_limit when
optimize_join_buffer_size=off. This matches the documentation at
https://mariadb.com/kb/en/block-based-join-algorithms
Other things:
- Removed the unused variable JOIN_TAB::join_buffer_size_limit.
- Give an error if we cannot allocate a join buffer. This can
only happen if the join_buffer variables are wrongly configured or
we are running out of memory.
In the future, instead of returning an error, we could properly
convert a query plan that uses BNL-H join into one that doesn't
use join buffering:
make sure the equi-join condition is checked where appropriate.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
select_insert::store_values() must reset the has_value_set bitmap
before every row, just like mysql_insert() does, because
ON DUPLICATE KEY UPDATE and triggers modify it.
This patch optimizes the number of refills of the lateral derived table
to which a materialized derived table subject to split optimization is
converted. This optimized number of refills is now used as the expected
number of refills of the materialized derived table when searching for
the best possible splitting of the table.
When a query does implicit grouping and the join operation produces an
empty result set, a NULL-complemented row combination is generated.
However, constant table fields still show non-NULL values.
What happens is that end_send_group() is called with a const row but
without any rows matching the WHERE clause.
This last part is shown by 'join->first_record' not being set.
This causes item->no_rows_in_result() to be called for all items, to reset
all sum functions to their initial state. However, fields are not set
to NULL.
The fix used here is to produce NULL-complemented records for constant
tables as well, and to reset the constant table's records back in case
we're in a subquery which may get re-executed.
An alternative fix would have been to make item->no_rows_in_result() also
work with Item_field objects.
There were some other issues with the code:
- join->no_rows_in_result_called was used but never set.
- Tables that are used with group functions were not properly marked as
maybe_null, which is required if the table rows should be regarded as
NULL-complemented (not existing).
- The code that tries to detect whether mixed_implicit_grouping should be
set didn't take into account all usage of fields and sum functions.
- Item_func::restore_to_before_no_rows_in_result() called the wrong
function.
- join->clear() did not pass a table_map argument to clear_tables(),
which caused it to ignore constant tables.
- unclear_tables() did not correctly restore the status to what it
was before clear_tables().
The main bug fix was to always pass a table_map argument to clear_tables()
and to always use join->clear() and clear_tables() together with
unclear_tables().
Other fixes:
- Fixed Item_func::restore_to_before_no_rows_in_result().
- Set 'join->no_rows_in_result_called' when no_rows_in_result_set()
is called.
- Removed an unused argument from setup_end_select_func().
- More code comments.
- Ensure that end_send_group() modifies the same fields as are in the
result set.
- Changed return_zero_rows() to use pointers instead of references,
similar to the rest of the code.
Variant #2.
When Histogram::point_selectivity() sees that the point value of interest
falls into one bucket, it tries to guess whether the bucket has many
different (unpopular) values or a few popular values. (The number of
rows is fixed, as it's a height-balanced histogram.)
The basis for this guess is the "width" of the value range the bucket
covers. Buckets covering wider value ranges are assumed to contain
values with proportionally lower frequencies.
This is just [brave] guesswork. For a very narrow bucket, it may
produce an estimate that's larger than the total number of rows in the
bucket, or even in the whole table.
Remove the guesswork and replace it with basic logic: return either the
per-table average selectivity of col=const or the selectivity of one
bucket, whichever is lower.
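A minimal sketch of the replacement logic, with illustrative names (the
real code computes these quantities from the histogram):

  #include <algorithm>

  // Both inputs are selectivities, i.e. fractions of the table's rows.
  // Return the per-table average selectivity of col=const or the
  // selectivity of one bucket, whichever is lower.
  static double point_selectivity(double avg_sel_of_equality,
                                  double one_bucket_selectivity)
  {
    return std::min(avg_sel_of_equality, one_bucket_selectivity);
  }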
Fix-up for commit 476b24d084
Author: Monty
Date: Thu Feb 16 14:19:33 2023 +0200
MDEV-20057 Distinct SUM on CROSS JOIN and grouped returns wrong result
which missed initializing sorder->suffix_length.
In this commit the initialization is implemented by passing the
MY_ZEROFILL flag to the allocation of the SORT_FIELD elements.
This bug could manifest itself at the first execution of a prepared
statement created for queries using a materialized view defined as a union.
A crash was certain to happen if the query contained a condition pushable
into the view and this condition was over a column defined via a complex
string expression requiring implicit conversion from one charset to another
for some of its sub-expressions. The bug could also cause crashes when
executing PS for some other queries whose optimization needed building
clones of such expressions.
This bug was introduced in the patch for MDEV-29988 where the class
Item_direct_ref_to_item was added. The implementations of the virtual
methods get_copy() and build_clone() were invalid for this class, and this
could cause crashes after the method build_clone() was called for
expressions containing objects of the Item_direct_ref_to_item type.
Approved by Sergei Golubchik <serg@mariadb.com>
This bug caused a server crash when processing a multi-update statement
that used views, if optimizer tracing was enabled.
The bug was introduced in the patch for MDEV-30539, which could incorrectly
detect the top-level selects of queries when views were used in them.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This bug could affect multi-update statements, as well as single-table
update statements processed as multi-updates, whose WHERE condition
contained a range condition over a non-indexed varchar column. The
optimizer calculates the selectivity of such range conditions using
histograms.
For each range, the buckets containing the endpoints of the range are
determined by a procedure that stores the values of the endpoints in the
space of the record buffer where the values of the columns are usually
stored. For a range over a varchar column, the value of an endpoint may
exceed the size of the buffer, in which case the value is stored with
truncation. This truncation cannot affect the result of the calculation
of the range selectivity, as the calculation employs only the beginning
of the value string. However, it can trigger an unexpected error about
this truncation if an update statement is being processed.
This patch suppresses truncation messages when the selectivity of a range
condition is calculated for a non-indexed column.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
- Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars()
and a new flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES.
The flag defines whether strnncollsp_nchars() should emulate trailing
spaces which were possibly trimmed earlier (e.g. in InnoDB CHAR
compression). This is important for NOPAD collations.
For example, with this input:
- str1= 'a ' (Latin letter a followed by one space)
- str2= 'a  ' (Latin letter a followed by two spaces)
- nchars= 3
if the flag is given, strnncollsp_nchars() will virtually restore
one trailing space to str1 up to nchars (3) characters and compare the
two strings as equal:
- str1= 'a  ' (one extra trailing space emulated)
- str2= 'a  ' (as is)
If the flag is not given, strnncollsp_nchars() does not add virtual
trailing spaces, so in case of a NOPAD collation str1 will be compared
as less than str2 because it is shorter (see the sketch at the end of
this commit message).
- Field_string::cmp_prefix() now passes the new flag.
Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do
not pass the new flag.
- The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc
(which handles the CHAR data type) now also passes the new flag.
- Fixing UCA collations to respect the new flag.
Other collations are possibly also affected; however,
I had no success in making an SQL script demonstrating the problem.
Other collations will be extended to respect this flag in a separate
patch later.
- Changing the meaning of the last parameter of Field::cmp_prefix()
from "number of bytes" (internal length)
to "number of characters" (user visible length).
The code calling cmp_prefix() from handler.cc was wrong.
After this change, the call in handler.cc became correct.
The code calling cmp_prefix() from key_rec_cmp() in key.cc
was adjusted according to this change.
- The old strnncollsp_nchars() related tests in unittest/strings/strings-t.c
now pass the new flag.
A few new tests were also added, without the flag.
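As referenced above, a self-contained illustration of the flag's semantics
for a single-byte charset; the names are illustrative and this is not the
real collation code:

  #include <algorithm>
  #include <cassert>
  #include <cstddef>
  #include <string>

  // Compare at most nchars characters of two strings, NOPAD style.
  // With emulate_trimmed_spaces, trailing spaces that were possibly
  // trimmed earlier are virtually restored by padding up to nchars.
  static int cmp_nchars(std::string s1, std::string s2, std::size_t nchars,
                        bool emulate_trimmed_spaces)
  {
    s1.resize(std::min(s1.size(), nchars));   // clamp to the nchars prefix
    s2.resize(std::min(s2.size(), nchars));
    if (emulate_trimmed_spaces)
    {
      s1.resize(nchars, ' ');                 // restore virtual spaces
      s2.resize(nchars, ' ');
    }
    return s1.compare(s2);                    // NOPAD: shorter sorts first
  }

  int main()
  {
    // str1= 'a ', str2= 'a  ', nchars= 3 (the example above):
    assert(cmp_nchars("a ", "a  ", 3, true) == 0);  // equal with the flag
    assert(cmp_nchars("a ", "a  ", 3, false) < 0);  // str1 shorter: less
    return 0;
  }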