This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".
The problem was that federated engine does not support comparable rowids
which was not taken into account by semijoin code.
Fixed by checking that we don't use semijoin with tables that does not
support comparable rowids.
Other things:
- Fixed some typos in the code comments
The cause of regression was handling for ROWNUM() function.
For queries like
SELECT ROWNUM() FROM ... ORDER BY ...
ROWNUM() should be computed before the ORDER BY.
The computation was moved to be before the ORDER BY for any entries in
the select list that had RAND_TABLE_BIT set.
This had a negative impact on queries in form:
SELECT sp_func() FROM t1 ORDER BY ... LIMIT n
where sp_func() is NOT declared as DETERMINISTIC (and so has
RAND_TABLE_BIT set).
The fix is to require evaluation for sorting only for the ROWNUM()
function. Functions that just have RAND_TABLE_BIT() can be computed
after ORDER BY ... LIMIT is applied.
(think about a possible index that satisfies the ORDER BY clause. In
that case, the the rows would be read in the needed order and we would
stop after reading LIMIT rows, achieving the same effect).
Deallocation of TABLE_LIST::dt_handler and TABLE_LIST::pushdown_derived
was performed in multiple places if code. This not only made the code
more difficult to maintain but also led to memory leaks and
ASAN heap-use-after-free errors.
This commit puts deallocation of TABLE_LIST::dt_handler and
TABLE_LIST::pushdown_derived to the single point - JOIN::cleanup()
If all elements in the list of 'IN' or 'NOT IN' clause are equal
and there are no NULLs then clause
- "a IN (e1,..,en)" can be converted to "a = e1"
- "a NOT IN (e1,..,en)" can be converted to "a <> e1".
This means an object of Item_func_in can be replaced with an object
of Item_func_eq for IN (e1,..,en) clause and Item_func_ne for
NOT IN (e1,...,en). Such a replacement allows the optimizer to choose
a better execution plan
The problem was caused by use of COLLATION(AVG('x')). This is an
item whose value is a constant.
Name Resolution code called convert_const_to_int() which removed AVG('x').
However, the item representing COLLATION(...) still had with_sum_func=1.
This inconsistent state confused the code that handles grouping and
DISTINCT: JOIN::get_best_combination() decided to use one temporary
table and allocated one JOIN_TAB for it, but then
JOIN::make_aggr_tables_info() attempted to use two and made writes
beyond the end of the JOIN::join_tab array.
The fix:
- Do not replace constant expressions which contain aggregate functions.
- Add JOIN::dbug_join_tab_array_size to catch attempts to use more
JOIN_TAB objects than we've allocated.
1. For INSERT..SELECT statements: don't include table/view the data
is inserted into in the list of leaf tables
2. Remove duplicated and dead code related to table_count
Part of:
MDEV-28073 Slow query performance in MariaDB when using many tables
s->key_dependent has a list of tables that are compared with key fields
in the current table. However it does not take into account if a key
field could be resolved by another table.
This is because MariaDB expands 'join_tab->keyuse' to include all generated
comparisons.
For example:
SELECT * from t1,t2,t3 where t1.key=t2.key and t2.key=t3.key
In this case keyuse for t1 includes t2.key and t3.key and key_dependent
contains 't2.map | t3.map'
If we in best_extension_by_limited_search() consider t2,t1 then t1's
key is fully defined, but we cannot do any prune of plans as
s->key_dependent indicates that t3 is still needed.
Fixed by calculating in best_access_patch the current key_dependent map
of tables that is needed to satisfy all keys. This allows us to prune
more bad plans earlier as soon as all keys can be used.
We also set key_dependent to 0 if we found an EQ_REF key, as this an
optimal key for the table and there is no reason to check more keys.
(Try 2)
The code that updates semi-join optimization state for a join order prefix
had several bugs. The visible effect was bad optimization for FirstMatch or
LooseScan strategies: they either weren't considered when they should have
been, or considered when they shouldn't have been.
In order to hit the bug, the optimizer needs to consider several different
join prefixes in a certain order. Queries with "obvious" query plans which
prune all join orders except one are not affected.
Internally, the bugs in updates of semi-join state were:
1. restore_prev_sj_state() assumed that
"we assume remaining_tables doesnt contain @tab"
which wasn't true.
2. Another bug in this function: it did remove bits from
join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to update
the semi-join optimization state. (It does update nested outer join
state, see this call:
check_interleaving_with_nj(best_table)
but there's no matching call to update the semi-join state.
(This wasn't visible because most of the state is in the POSITION
structure which is updated. But there is also state in JOIN, too)
The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
= Introduces update_sj_state() which ideally should have been called
"advance_sj_state" but I didn't reuse the name to not create confusion.
(Try 2) (Cherry-pick back into 10.3)
The code that updates semi-join optimization state for a join order prefix
had several bugs. The visible effect was bad optimization for FirstMatch or
LooseScan strategies: they either weren't considered when they should have
been, or considered when they shouldn't have been.
In order to hit the bug, the optimizer needs to consider several different
join prefixes in a certain order. Queries with "obvious" query plans which
prune all join orders except one are not affected.
Internally, the bugs in updates of semi-join state were:
1. restore_prev_sj_state() assumed that
"we assume remaining_tables doesnt contain @tab"
which wasn't true.
2. Another bug in this function: it did remove bits from
join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to update
the semi-join optimization state. (It does update nested outer join
state, see this call:
check_interleaving_with_nj(best_table)
but there's no matching call to update the semi-join state.
(This wasn't visible because most of the state is in the POSITION
structure which is updated. But there is also state in JOIN, too)
The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
= Introduces update_sj_state() which ideally should have been called
"advance_sj_state" but I didn't reuse the name to not create confusion.
The issue was that best_extension_by_limited_search() had to go through
too many plans with the same cost as there where many EQ_REF tables.
Fixed by shortcutting EQ_REF (AND REF) when the result only contains one
row. This got the optimization time down from hours to sub seconds.
The only known downside with this patch is that in some cases a table
with ref and 1 record may be used before on EQ_REF table. The faster
optimzation phase should compensate for this.