This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".
rename to stress that is a specific hack for Item_func_nextval
and should not be used for other items.
If a vcol uses Item_func_nextval, a corresponding table for the sequence
should be added to the prelocking list (in that sense NEXTVAL is not
simply a function, but more like a subquery), see add_internal_tables()
in DML_prelocking_strategy::handle_table(). At the moment it is only
implemented for DEFAULT, not for GENERATED ALWAYS AS, thus the
VCOL_NEXTVAL hack.
it's not "non deterministic", it's completely defined
by @@rand_seed1 and @@rand_seed2. And as a session func it needs
to be re-fixed at the beginning of every statement.
Problem:
When calculatung MIN() and MAX() in a query with GROUP BY, like this:
SELECT MIN(time_expr), MAX(time_expr) FROM t1 GROUP BY i;
the code in Item_sum_min_max::update_field() erroneosly used
string format comparison, therefore '100:20:30' was considered as
smaller than '10:20:30'.
Fix:
1. Implementing low level "native" related methods in class Time:
Time::Time(const Native &native) - convert native to Time
Time::to_native(Native *to, uint decimals) - convert Time to native
The "native" binary representation for TIME is equal to
the binary data format of Field_timef, which is used to
store TIME when mysql56_temporal_format is ON (default).
2. Implementing Type_handler_time_common "native" related methods:
Type_handler_time_common::cmp_native()
Type_handler_time_common::Item_val_native_with_conversion()
Type_handler_time_common::Item_val_native_with_conversion_result()
Type_handler_time_common::Item_param_val_native()
3. Implementing missing "native representation" related methods
in Field_time and Field_timef:
Field_time::store_native()
Field_time::val_native()
Field_timef::store_native()
Field_timef::val_native()
4. Implementing missing "native" related methods in all Items
that can have the TIME data type:
Item_timefunc::val_native()
Item_name_const::val_native()
Item_time_literal::val_native()
Item_cache_time::val_native()
Item_handled_func::val_native()
5. Marking Type_handler_time_common as "native ready".
So now Item_sum_min_max::update_field() calculates
values using min_max_update_native_field(),
which uses native binary representation rather than string representation.
Before this change, only the TIMESTAMP data type used native
representation to calculate MIN() and MAX().
Benchmarks (see more details in MDEV):
This change not only fixes the wrong result, but also
makes a "SELECT .. MAX.. GROUP BY .." query faster:
# TIME(0)
CREATE TABLE t1 (id INT, time_col TIME) ENGINE=HEAP;
INSERT INTO t1 VALUES (1,'10:10:10'); -- repeat this 1m times
SELECT id, MAX(time_col) FROM t1 GROUP BY id;
MySQL80: 0.159 sec
10.3: 0.108 sec
10.4: 0.094 sec (fixed)
# TIME(6):
CREATE TABLE t1 (id INT, time_col TIME(6)) ENGINE=HEAP;
INSERT INTO t1 VALUES (1,'10:10:10.999999'); -- repeat this 1m times
SELECT id, MAX(time_col) FROM t1 GROUP BY id;
My80: 0.154
10.3: 0.135
10.4: 0.093 (fixed)
The code in Item_func_int_val::fix_length_and_dec_int_or_decimal()
calculated badly the result data type for FLOOR()/CEIL(), so for example
the decimal(38,10) input created a decimal(28,0) result.
That was not correct, because one extra integer digit is needed.
floor(-9.9) -> -10
ceil(9.9) -> 10
Rewritting the code in a more straightforward way.
Additional changes:
- FLOOR() now takes into account the presence of the UNSIGNED
flag of the argument: FLOOR(unsigned decimal) does not need an extra digits.
- FLOOR()/CEILING() now preserve the unsigned flag in the result
data type is decimal.
These changes give nicer data types.
Changing that in case of *INT and hex hybrid input:
- ROUND(x,NULL) creates a column with the same type as x.
The old code created a DOUBLE column, which was not relevant at all.
This change simplifies the code a lot.
- ROUND(x,non_constant) creates a column of the INT, BIGINT or DECIMAL
data type (depending on the exact type of x).
The old code created a column of the DOUBLE data type,
which lead to precision loss. Hence MDEV-23366.
- ROUND(bigint_30,negative_constant) creates a column of the DECIMAL(30,0)
data type. The old code created DECIMAL(29,0), which looked strange:
the data type promoted to a higher one, but max length reduced.
Now the length attribute is preserved.
Item_func_round::fix_arg_int() did not take into account cases
when the result of ROUND(bigint_subject,negative_precision)
could go outside of the BIGINT range. The old code only incremented
max_length, but did not extend change the data type.
Fixing to extend the data type (together with max_length increment).
Fixing ROUND(date,0), TRUNCATE(date,x), FLOOR(date), CEILING(date)
to return the `int(8) unsigned` data type.
Details:
1. Cleanup: moving virtual implementations
- Type_handler_temporal_result::Item_func_int_val_fix_length_and_dec()
- Type_handler_temporal_result::Item_func_round_fix_length_and_dec()
to Type_handler_date_common. Other temporal data type handlers
override these methods anyway. So they were only DATE specific.
This change makes the code clearer.
2. Backporting DTCollation_numeric from 10.5, to reuse the code easier.
3. Adding the `preferred_attrs` argument to Item_func_round::fix_arg_int(). Now
Type_handler_xxx::Item_func_round_val_fix_length_and_dec() work as follows:
- The INT-alike and YEAR handlers copy preferred_attrs from args[0].
- The DATE handler passes explicit attributes, to get `int(8) unsigned`.
- The hex hybrid handler passes NULL, so fix_arg_int() calculates attributes.
4. Type_handler_date_common::Item_func_int_val_fix_length_and_dec()
now sets the type handler and attributes to get `int(8) unsigned`.
1. Fixing ROUND(x) and TRUNCATE(x,0) with TINYINT, SMALLINT, MEDIUMINT, BIGINT
input to preserve the exact data type of the argument when it's possible.
2. Fixing FLOOR(x) and CEILING(x) with TINYINT, SMALLINT, MEDIUMINT, BIGINT
to preserve the exact data type of the argument.
3. Adding dedicated Type_handler_year::Item_func_round_fix_length_and_dec()
to easier handle ROUND(x) and TRUNCATE(x,y) for the YEAR(2) and YEAR(4)
input. They still return INT(2) UNSIGNED and INT(4) UNSIGNED correspondingly,
as before.
Implementing dedicated fixing methods:
- Type_handler_bit::Item_func_round_fix_length_and_dec()
- Type_handler_bit::Item_func_int_val_fix_length_and_dec()
- Type_handler_typelib::Item_func_round_fix_length_and_dec()
because the inherited methods did not work well.
Fixing:
- Type_handler_typelib::Item_func_int_val_fix_length_and_dec
It did not work well, because it used args[0]->max_length to
calculate the result data type. In case of ENUM and SET it was
not correct, because in FLOOR() and CEILING() context
ENUM and SET return not more than 5 digits (65535 is the biggest
possible value).
Misc:
- Changing the API of
Type_handler_bit::Bit_decimal_notation_int_digits(const Item *item)
to a more generic form:
Type_handler_bit::Bit_decimal_notation_int_digits_by_nbits(uint nbits)
- Fixing Type_handler_bit::Bit_decimal_notation_int_digits_by_nbits() to
return the exact number of decimal digits for all nbits 1..64.
The old implementation was approximate.
This change gives better (more precise) data types.
Item_func_div::fix_length_and_dec_temporal() set the return data type to
integer in case of @div_precision_increment==0 for temporal input with FSP=0.
This caused Item_func_div to call int_op(), which is not implemented,
so a crash on DBUG_ASSERT(0) happened.
Fixing fix_length_and_dec_temporal() to set the result type to DECIMAL.
Bit operators (~ ^ | & << >>) and the function BIT_COUNT()
always called val_int() for their arguments.
It worked correctly only for INT type arguments.
In case of DECIMAL and DOUBLE arguments it did not work well:
the argument values were truncated to the maximum SIGNED BIGINT value
of 9223372036854775807.
Fixing the code as follows:
- If the argument if of an integer data type,
it works using val_int() as before.
- If the argument if of some other data type, it gets the argument value
using val_decimal(), to avoid truncation, and then converts the result
to ulonglong.
Using Item_handled_func to switch between the two approaches easier.
As an additional advantage, with Item_handled_func it will be easier
to implement overloading in the future, so data type plugings will be able
to define their own behavioir of bit operators and BIT_COUNT().
Moving the code from the former val_int() implementations
as methods to Longlong_null, to avoid code duplication in the
INT and DECIMAL branches.
The patch for `MDEV-20795 CAST(inet6 AS BINARY) returns wrong result`
unintentionally changed what Item_char_typecast::type_handler()
returns. This broke UNIONs with the BINARY() function, as the Aria
engine started to get columns of unexpected data types.
Restoring previous behaviour, to return
Type_handler::string_type_handler(max_length).
The prototype for Item_handed_func::return_type_handler() has changed
from:
const Type_handler *return_type_handler() const
to:
const Type_handler *return_type_handler(const Item_handled_func *) const
These two methods:
- Item_result_field::create_tmp_field_ex()
- Item_func_user_var::create_tmp_field_ex()
had duplicate code, except that they used a different type handler.
Adding a protected method Item_result_field::create_tmp_field_ex_from_handler()
with a "const Type_handler*" parameter, and reusing it from the
two mentioned methods.
This change takes into account a column's GENERATED ALWAYS AS
expression dependcy on sql_mode's PAD_CHAR_TO_FULL_LENGTH and
NO_UNSIGNED_SUBTRACTION flags.
Indexed virtual columns as well as persistent generated columns are
now not allowed to have such dependencies to avoid inconsistent data
or index files on sql_mode changes.
So an error is now returned in cases like this:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v VARCHAR(5) AS (a) PERSISTENT -- CHAR->VARCHAR or CHAR->TEXT = ERROR
);
Functions RPAD() and RTRIM() can now remove dependency on
PAD_CHAR_TO_FULL_LENGTH. So this can be used instead:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v VARCHAR(5) AS (RTRIM(a)) PERSISTENT
);
Note, unlike CHAR->VARCHAR and CHAR->TEXT this still works,
not RPAD(a) is needed:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v CHAR(5) AS (a) PERSISTENT -- CHAR->CHAR is OK
);
More sql_mode flags may affect values of generated columns.
They will be addressed separately.
See comments in sql_mode.h for implementation details.
This patch introduces the optimization that allows range optimizer to
consider index range scans that are built employing NOT NULL predicates
inferred from WHERE conditions and ON expressions.
The patch adds a new optimizer switch not_null_range_scan.