This bug affected queries with nested left joins having the same last inner
table such that not_exists optimization could be applied to the most inner
outer join when optimizer chose to use join buffers. The bug could lead to
producing wrong a result set.
If the WHERE condition a query contains a conjunctive IS NULL predicate
over a non-nullable column of an inner table of a not nested outer join
then not_exists optimization can be applied to tho the outer join. With
this optimization when looking for matches for a certain record from the
outer table of the join the records of the inner table can be ignored
right after the first match satisfying the ON condition is found.
In the case of nested outer joins having the same last inner table this
optimization still can be applied but only if all ON conditions of the
embedding outer joins are satisfied. Such check was missing in the code
that tried to apply not_exists optimization when join buffers were used
for outer join operations.
This problem has been already fixed in the patch for bug MDEV-7992. Yet
there it was resolved only for the cases when join buffers were not used
for outer joins.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
When a range rowid filter was used with an index ref access the cost of
accessing the index entries for the records rejected by the filter was not
taken into account. For a ref access by an index with big average number
of records per key this led to poor execution plans if selectivity of the
used filter was high.
The patch resolves this problem. It also introduces a minor optimization
that skips look-ups into a filter that turns out to be empty.
With this patch the output of ANALYZE stmt reports the number of look-ups
into used rowid filters.
The patch also back-ports from 10.5 the code that properly sets the field
TABLE::file::table for opened temporary tables.
The test cases that were supposed to use rowid filters have been adjusted
in order to use similar execution plans after this fix.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Part of:
MDEV-28073 Slow query performance in MariaDB when using many tables
s->key_dependent has a list of tables that are compared with key fields
in the current table. However it does not take into account if a key
field could be resolved by another table.
This is because MariaDB expands 'join_tab->keyuse' to include all generated
comparisons.
For example:
SELECT * from t1,t2,t3 where t1.key=t2.key and t2.key=t3.key
In this case keyuse for t1 includes t2.key and t3.key and key_dependent
contains 't2.map | t3.map'
If we in best_extension_by_limited_search() consider t2,t1 then t1's
key is fully defined, but we cannot do any prune of plans as
s->key_dependent indicates that t3 is still needed.
Fixed by calculating in best_access_patch the current key_dependent map
of tables that is needed to satisfy all keys. This allows us to prune
more bad plans earlier as soon as all keys can be used.
We also set key_dependent to 0 if we found an EQ_REF key, as this an
optimal key for the table and there is no reason to check more keys.
best_extension_by_limited_search() assumes that tables should be sorted
according to size to be able to quickly disregard bad plans. However the
current usage of swap_variables() will change the table order to a not
sorted one for the next recursive call. This breaks the assumtion and
causes performance issues when using many tables (we have to examine
many more plans).
This patch fixes this by ensuring that the original table order is kept
for the not yet used tables when best_extension_by_limited_search() is
called.
This was done by always calling swap_variables() for each table and
restoring the original table order at exit.
Some test changed:
- In a majority of the test the change was that two "identical tables"
where swapped and the optimzer is now using the first/smaller table
- In few test the table order was changed. The new plan looks identical
or slighly better than the original.
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
not 1.0. In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed bug when using test_if_cheaper_ordering where we didn't use
keyread if index was changed
- Fixed a bug where we didn't use index only read when using order-by-index
- Added keyread_time() to HEAP.
The default keyread_time() was optimized for blocks and not suitable for
HEAP. The effect was the HEAP prefered table scans over ranges for btree
indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have same cost for simple ranges
Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
we favior ref for range for simple queries.
- Fixed that matching_candidates_in_table() uses same number of records
as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
This was to ensure that handler::keyread_time() doesn't give
higher cost for heap tables than for normal tables. One effect of
this is that heap and derived tables stored in heap will prefer
key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
multi_range_read_info_const(). All index cost calculation is now
done trough one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
so that for '=' ranges, 'ref' is prefered over 'range'.
- scan_time() now takes avg_io_costs() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
This patch contains a full implementation of the optimization
that allows to use in-memory rowid / primary filters built for range
conditions over indexes. In many cases usage of such filters reduce
the number of disk seeks spent for fetching table rows.
In this implementation the choice of what possible filter to be applied
(if any) is made purely on cost-based considerations.
This implementation re-achitectured the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.
Besides this patch contains a better implementation of the generic
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.
This patch supports the feature for InnoDB and MyISAM.