Final added to:
- All reasonable classes inhereted from Field
- All classes inhereted from Protocol
- Almost all Handler classes
- Some important Item classes
The stripped size of mariadbd is just 4K smaller, but several object files
showed notable improvements in common execution paths.
- Checked field.o and item_sum.o
Other things:
- Added 'override' to a few class functions touched by this patch.
- Removed 'virtual' from a new class functions that had/got 'override'
- Changed Protocol_discard to inherit from Protocol instad of Protocol_text
The problem here is similar to the case with DISTINCT, the tree used for ORDER BY
needs to also hold the null bytes of the record. This was not done for GROUP_CONCAT
as NULLS are rejected by GROUP_CONCAT.
Also introduced a comparator function for the order by tree to handle null
values with JSON_ARRAYAGG.
For DISTINCT to be handled with JSON_ARRAYAGG, we need to make sure
that the Unique tree also holds the NULL bytes of a table record
inside the node of the tree. This behaviour for JSON_ARRAYAGG is
different from GROUP_CONCAT because in GROUP_CONCAT we just reject
NULL values for columns.
Also introduced a comparator function for the unique tree to handle null
values for distinct inside JSON_ARRAYAGG.
Backported from MYSQL
Bug #25331425: DISTINCT CLAUSE DOES NOT WORK IN GROUP_CONCAT
Issue:
------
The problem occurs when:
1) GROUP_CONCAT (DISTINCT ....) is used in the query.
2) Data size greater than value of system variable:
tmp_table_size.
The result would contain values that are non-unique.
Root cause:
-----------
An in-memory structure is used to filter out non-unique
values. When the data size exceeds tmp_table_size, the
overflow is written to disk as a separate file. The
expectation here is that when all such files are merged,
the full set of unique values can be obtained.
But the Item_func_group_concat::add function is in a bit of
hurry. Even as it is adding values to the tree, it wants to
decide if a value is unique and write it to the result
buffer. This works fine if the configured maximum size is
greater than the size of the data. But since tmp_table_size
is set to a low value, the size of the tree is smaller and
hence requires the creation of multiple copies on disk.
Item_func_group_concat currently has no mechanism to merge
all the copies on disk and then generate the result. This
results in duplicate values.
Solution:
---------
In case of the DISTINCT clause, don't write to the result
buffer immediately. Do the merge and only then put the
unique values in the result buffer. This has be done in
Item_func_group_concat::val_str.
Note regarding result file changes:
-----------------------------------
Earlier when a unique value was seen in
Item_func_group_concat::add, it was dumped to the output.
So result is in the order stored in SE. But with this fix,
we wait until all the data is read and the final set of
unique values are written to output buffer. So the data
appears in the sorted order.
This only fixes the cases when we have DISTINCT without ORDER BY clause
in GROUP_CONCAT.
Backported from MYSQL
Bug #25331425: DISTINCT CLAUSE DOES NOT WORK IN GROUP_CONCAT
Issue:
------
The problem occurs when:
1) GROUP_CONCAT (DISTINCT ....) is used in the query.
2) Data size greater than value of system variable:
tmp_table_size.
The result would contain values that are non-unique.
Root cause:
-----------
An in-memory structure is used to filter out non-unique
values. When the data size exceeds tmp_table_size, the
overflow is written to disk as a separate file. The
expectation here is that when all such files are merged,
the full set of unique values can be obtained.
But the Item_func_group_concat::add function is in a bit of
hurry. Even as it is adding values to the tree, it wants to
decide if a value is unique and write it to the result
buffer. This works fine if the configured maximum size is
greater than the size of the data. But since tmp_table_size
is set to a low value, the size of the tree is smaller and
hence requires the creation of multiple copies on disk.
Item_func_group_concat currently has no mechanism to merge
all the copies on disk and then generate the result. This
results in duplicate values.
Solution:
---------
In case of the DISTINCT clause, don't write to the result
buffer immediately. Do the merge and only then put the
unique values in the result buffer. This has be done in
Item_func_group_concat::val_str.
Note regarding result file changes:
-----------------------------------
Earlier when a unique value was seen in
Item_func_group_concat::add, it was dumped to the output.
So result is in the order stored in SE. But with this fix,
we wait until all the data is read and the final set of
unique values are written to output buffer. So the data
appears in the sorted order.
This only fixes the cases when we have DISTINCT without ORDER BY clause
in GROUP_CONCAT.
We have to include NULL in the result which the GOUP_CONCAT doesn't
always do. Also converting should be done into another String instance
as these can be same.
Item_sum_sp did not override val_native(). So the reported script
crashed in the default implementation in Item::val_native() on DBUG_ASSERT().
Implementing a correct Item_sum_sp::val_native().
e.g.
- dont -> don't
- occurence -> occurrence
- succesfully -> successfully
- easyly -> easily
Also remove trailing space in selected files.
These changes span:
- server core
- Connect and Innobase storage engine code
- OQgraph, Sphinx and TokuDB storage engines
Related to MDEV-21769.
The JSON_ARRAYAGG function extends the GROUP_CONCAT function and provides
a method of aggregating JSON results. The current implementation supports
DISTINCT and LIMIT but not ORDER BY (Oracle supports GROUP BY).
Adding GROUP BY support is possible but it requires some extra work as the
grouping appears to be done inside a temporary table that complicates
matters.
Added test cases that covert aggregation of all JSON types and JSON
validation for the generated results.
group concat tree is allocated in a memroot, so the only way to free
memory is to copy a part of the tree into a new memroot.
track the accumilated length of the result, and when it crosses
the threshold - copy the result into a new tree, free the old one.
Added support for usual agreggate UDF (UDAF)
Added remove() call support for more efficient window function processing
Added example of aggregate UDF with efficient windows function support
MDEV-17625 Different warnings when comparing a garbage to DATETIME vs TIME
- Splitting processes of data type conversion (to TIME/DATE,DATETIME)
and warning generation.
Warning are now only get collected during conversion (in an "int" variable),
and are pushed in the very end of conversion (not in parallel).
Warnings generated by the low level routines str_to_xxx() and number_to_xxx()
can now be changed at the end, when TIME_FUZZY_DATES is applied,
from "Invalid value" to "Truncated invalid value".
Now "Illegal value" is issued only when the low level routine returned
an error and TIME_FUZZY_DATES was not set. Otherwise, if the low level
routine returned "false" (success), or if NULL was converted to a zero
datetime by TIME_FUZZY_DATES, then "Truncated illegal value"
is issued. This gives better warnings.
- Methods Type_handler::Item_get_date() and
Type_handler::Item_func_hybrid_field_type_get_date() now only
convert and collect warning information, but do not push warnings.
- Changing the return data type for Type_handler::Item_get_date()
and Type_handler::Item_func_hybrid_field_type_get_date() from
"bool" to "void". The conversion result (success vs error) can be
checked by testing ltime->time_type. MYSQL_TIME_{NONE|ERROR}
mean mean error, other values mean success.
- Adding new wrapper methods Type_handler::Item_get_date_with_warn() and
Type_handler::Item_func_hybrid_field_type_get_date_with_warn()
to do conversion followed by raising warnings, and changing
the code to call new Type_handler::***_with_warn() methods.
- Adding a helper class Temporal::Status, a wrapper
for MYSQL_TIME_STATUS with automatic initialization.
- Adding a helper class Temporal::Warn, to collect warnings
but without actually raising them. Moving a part of ErrConv
into a separate class ErrBuff, and deriving both Temporal::Warn
and ErrConv from ErrBuff. The ErrBuff part of Temporal::Warn
is used to collect textual representation of the input data.
- Adding a helper class Temporal::Warn_push. It's used
to collect warning information during conversion, and
automatically pushes warnings to the diagnostics area
on its destructor time (in case of non-zero warning).
- Moving more code from various functions inside class Temporal.
- Adding more Temporal_hybrid constructors and
protected Temporal methods make_from_xxx(),
which convert and only collect warning information, but do not
actually raise warnings.
- Now the low level functions str_to_datetime() and str_to_time()
always set status->warning if the return value is "true" (error).
- Now the low level functions number_to_time() and number_to_datetime()
set the "*was_cut" argument if the return value is "true" (error).
- Adding a few DBUG_ASSERTs to make sure that str_to_xxx() and
number_to_xxx() always set warnings on error.
- Adding new warning flags MYSQL_TIME_WARN_EDOM and MYSQL_TIME_WARN_ZERO_DATE
for the code symmetry. Before this change there was a special
code path for (rc==true && was_cut==0) which was treated by
Field_temporal::store_invalid_with_warning as "zero date violation".
Now was_cut==0 always means that there are no any error/warnings/notes
to be raised, not matter what rc is.
- Using new Temporal_hybrid constructors in combination with
Temporal::Warn_push inside str_to_datetime_with_warn(),
double_to_datetime_with_warn(), int_to_datetime_with_warn(),
Field::get_date(), Item::get_date_from_string(), and a few other places.
- Removing methods Dec_ptr::to_datetime_with_warn(),
Year::to_time_with_warn(), my_decimal::to_datetime_with_warn(),
Dec_ptr::to_datetime_with_warn().
Fixing Sec6::to_time() and Sec6::to_datetime() to
convert and only collect warnings, without raising warnings.
Now warning raising functionality resides in Temporal::Warn_push.
- Adding classes Longlong_hybrid_null and Double_null, to
return both value and the "IS NULL" flag. Adding methods
Item::to_double_null(), to_longlong_hybrid_null(),
Item_func_hybrid_field_type::to_longlong_hybrid_null_op(),
Item_func_hybrid_field_type::to_double_null_op().
Removing separate classes VInt and VInt_op, as they
have been replaced by a single class Longlong_hybrid_null.
- Adding a helper method Temporal::type_name_by_timestamp_type(),
moving a part of make_truncated_value_warning() into it,
and reusing in Temporal::Warn::push_conversion_warnings().
- Removing Item::make_zero_date() and
Item_func_hybrid_field_type::make_zero_mysql_time().
They provided duplicate functionality.
Now this code resides in Temporal::make_fuzzy_date().
The latter is now called for all Item types when data type
conversion (to DATE/TIME/DATETIME) is involved, including
Item_field and Item_direct_view_ref.
This fixes MDEV-17563: Item_direct_view_ref now correctly converts
NULL to a zero date when TIME_FUZZY_DATES says so.