After MDEV-21580 the truncation of SORT_FIELD::length
set_if_smaller(sortorder->length, thd->variables.max_sort_length)
became conditional:
if (is_variable_sized())
set_if_smaller(length, thd->variables.max_sort_length)
To provide correct functioning of is_variable_sized() SORT_FIELD::type
must be set properly. This commit adds the necessary initialization
of SORT_FIELD::type to JOIN_TAB::remove_duplicates() as it is done
in filesort's sortlength() function.
DBUG_ASSERT is added to sortlength() just in case to prevent
a possible uint32 overflow
SELECT DISTINCT did not work with expressions with sum functions.
Distinct was only done on the values stored in the intermediate temporary
tables, which only stored the value of each sum function.
In other words:
SELECT DISTINCT sum(a),sum(b),avg(c) ... worked.
SELECT DISTINCT sum(a),sum(b) > 2,sum(c)+sum(d) would not work.
The later query would do ONLY apply distinct on the sum(a) part.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
This was fixed by extending remove_dup_with_hash_index() and
remove_dup_with_compare() to take into account the columns in the result
list that where not stored in the temporary table.
Note that in many cases the above dup removal functions are not used as
the optimizer may be able to either remove duplicates early or it will
discover that duplicate remove is not needed. The later happens for
example if the group by fields is part of the result.
Other things:
- Backported from 11.0 the change of Sort_param.tmp_buffer from char* to
String.
- Changed Type_handler::make_sort_key() to take String as a parameter
instead of Sort_param. This was done to allow make_sort_key() functions
to be reused by distinct elimination functions.
This makes Type_handler_string_result::make_sort_key() similar to code
in 11.0
- Simplied error handling in remove_dup_with_compare() to remove code
duplication.
In case when filesort does not use addon field packing (because of
too small potential savings) and uses fixed width addon fields instead,
the field->pack() call can store less bytes when the field maximum
possible field length, e.g. in case of VARCHAR().
The memory between the packed length and addonf->length (the maximum length)
stayed uninitialized, which was reported by Valgrind/MSAN.
The problem was introduced by f52bf92014 in 10.5,
which removed the tail initialization (probably unintentionally).
Restoring the bzero() in the fixed length branch,
so in case when pack() stores less bytes than addonf->length says,
the trailing bytes gets initialized.
Note, before f52bf92014, the bzero()
was under HAVE_valgrind conditional compilation. Now it's being added
unconditionally:
- MSAN also reported the problem, so it's not only Valgrind specific.
- As Serg proposed, conditional initialization is bad - it can have
potentional security problems as the non-initialized memory fragments
can store various pieces of essential information, e.g. passwords.
ha_partition stores records in array of m_ordered_rec_buffer and uses
it for prio queue in ordered index scan. When the records are restored
from the array the blob buffers may be already freed or rewritten.
The solution is to take temporary ownership of cached blob buffers via
String::swap(). When the record is restored from m_ordered_rec_buffer
the ownership is returned to table fields.
Cleanups:
init_record_priority_queue(): removed needless !m_ordered_rec_buffer
check as there is same assertion few lines before.
dbug_print_row() for arbitrary row pointer
The issue here was the system variable max_sort_length was being applied
to decimals and it was truncating the value for decimals to the number
of bytes set by max_sort_length.
This was leading to a buffer overflow as the values were written
to the buffer without truncation and then we moved the offset to
the number of bytes(set by max_sort_length), that are needed for comparison.
The fix is to not apply max_sort_length for fixed size types like INT,
DECIMALS and only apply max_sort_length for CHAR, VARCHARS, TEXT and
BLOBS.
For a correlated subquery filesort is executed multiple times.
During each execution, sortlength() computed total sort key length in
Sort_keys::sort_length, without resetting it first.
Eventually Sort_keys::sort_length got larger than @@sort_buffer_size, which
caused filesort() to be aborted with error.
Fixed by making sortlength() to compute lengths only during the first
invocation. Subsequent invocations return pre-computed values.
and
MDEV-23414 Assertion `res->charset() == item->collation.collation' failed in Type_handler_string_result::make_packed_sort_key_part
pack_sort_string() *must* take a collation from the Item, not from the
String value. Because when casting a string to _binary the original
String is not copied for performance reasons, it's reused but its
collation does not match Item's collation anymore.
Note, that String's collation cannot be simply changed to _binary,
because for an Item_string literal the original String must stay
unchanged for the duration of the query.
this partially reverts 61c15ebe32
An overflow was happening with LONGTEXT columns, when the length was converted to the length
in the strxfrm form (mem-comparable keys).
Introduced a function to truncate the length to the max_sort_length before calculating
the length of the strxfrm form.
- Better to use 'String *' directly.
- Added String::get_value(LEX_STRING*) for the few cases where we want to
convert a String to LEX_CSTRING.
Other things:
- Use StringBuffer for some functions to avoid mallocs
Make sure that the sort_buffer that is allocated has atleast space for MERGEBUFF2 keys.
The issue here was that the record length is quite high and sort buffer size is very small,
due to which we end up with zero number of keys in the sort buffer. The Sort_param::max_keys_per_buffer
was zero in such a case, due to which we were flushing empty sort_buffer to the disk.
The issue here is for a DEPENDENT subquery that has an aggregate function in the ORDER BY clause,
is wrapped inside an Item_aggregate_ref. For computation of ORDER BY we need to refer to the
temp table field corresponding to this item. But in the function make_sortorder, we were
explicitly casting Item_aggrgate_ref to Item_sum, which leads to us not getting the temp
table field corresponding to the item.
In the merge_buffers phase for sorting, the sort buffer size is divided between the number of chunks.
The chunks have a start and end position (m_buffer_start and m_buffer_end).
Then we read the as many records that fit in this buffer for a chunk of the file.
The issue here was we were resetting the end of buffer(m_buffer_end) to the number of bytes that was
read, this was causing a problem because with dynamic size of sort keys it is possible that later
we would not be able to accommodate even one key inside a chunk of file.
So the fix was to not reset the end of buffer for a chunk of file.
For character sets and collation where character to weight mapping > 1,
there we need to make sure while creating a sort key,
a temporary buffer is created to store the value of the item by val_str function
and then copy that value back to the sort buffer.
In this case when using a priority queue Sort_param::tmp_buffer was not allocated.
Minor refactoring:
Changed Sort_param::tmp_buffer from char* to String
This task deals with packing the sort key inside the sort buffer, which would
lead to efficient usage of the memory allocated for the sort buffer.
The changes brought by this feature are
1) Sort buffers would have sort keys of variable length
2) The format for sort keys inside the sort buffer would look like
|<sort_length><null_byte><key_part1><null_byte><key_part2>.......|
sort_length is the extra bytes that are required to store the variable
length of a sort key.
3) When packing of sort key is done we store the ORIGINAL VALUES inside
the sort buffer and not the STRXFRM form (mem-comparable sort keys).
4) Special comparison function packed_keys_comparison() is introduced
to compare 2 sort keys.
This patch also contains contributions from Sergei Petrunia.
This task deals with packing the non-sorted fields (or addon fields).
This would lead to efficient usage of the memory allocated for the sort buffer.
The changes brought by this feature are
1) Sort buffers would have records of variable length
2) Each record in the sort buffer would be stored like
<sort_key1><sort_key2>....<addon_length><null_bytes><field1><field2>....
addon_length is the extra bytes that are required to store the variable
length of addon field across different records.
3) Changes in rr_unpack_from_buffer and rr_from_tempfile to take into account
the variable length of records.
Ported WL#1509 Pack values of non-sorted fields in the sort buffer from
MySQL by Tor Didriksen