Commit graph

31 commits

Author SHA1 Message Date
Sergei Golubchik
7b53672c63 Merge branch '10.5' into 10.6 2024-05-08 20:06:00 +02:00
Galina Shalygina
4bc1860eb4 MDEV-23878 Wrong result with semi-join and splittable derived table
Due to this bug a wrong result might be expected from queries with
an IN subquery predicate in the WHERE clause and a derived table in the
FROM clause to which split optimization could be applied.

The function JOIN::fix_all_splittings_in_plan() used the value of the
bitmap JOIN::sjm_lookup_tables() such as it had been left after the
search for the best plan for the select containing the splittable
derived table. That value could not be guaranteed to be correct. So the
recalculation of this bitmap is needed to exclude the plans with key
accesses from SJM lookup tables.

Approved by Igor Babaev <igor@maridb.com>
2024-05-07 12:21:35 +02:00
Marko Mäkelä
2b01e5103d Merge 10.5 into 10.6 2023-12-19 18:41:42 +02:00
Marko Mäkelä
4ae105a37d Merge 10.4 into 10.5 2023-12-18 08:59:07 +02:00
Rex
b4712242dd MDEV-31279 Crash when lateral derived is guaranteed to return no rows
Consider this query
SELECT t1.* FROM t1, (SELECT t2.b FROM t2 WHERE NOT EXISTS
(SELECT 1 FROM t3) GROUP BY b) sq where sq.b = t1.a;

If SELECT 1 FROM t3 is expensive, for example t3 has >
thd->variables.expensive_subquery_limit, first evaluation is deferred to
mysql_derived_fill().  There it is noted that, in the above case
 NOT EXISTS (SELECT 1 FROM t3) is constant and false.

This causes the join variable zero_result_cause to be set to
"Impossible WHERE noticed after reading const tables" and the handler
for this join is never "opened" via handler::ha_open.

When mysql_derived_fill() is called for the next group of results, this
unopened handler is not taken into account.

reviewed by Igor Babaev (igor@mariadb.com)
2023-12-14 06:14:24 +12:00
Sergei Golubchik
a42a6fa99b Merge branch 'bb-10.5-release' into bb-10.6-release 2023-06-05 18:53:02 +02:00
Sergei Golubchik
bed70468ea Merge branch 'bb-10.4-release' into bb-10.5-release 2023-06-05 17:50:51 +02:00
Sergei Petrunia
928012a27a MDEV-31403: Server crashes in st_join_table::choose_best_splitting
The code in choose_best_splitting() assumed that the join prefix is
in join->positions[].

This is not necessarily the case. This function might be called when
the join prefix is in join->best_positions[], too.
Follow the approach from best_access_path(), which calls this function:
pass the current join prefix as an argument,
"const POSITION *join_positions" and use that.
2023-06-05 18:24:39 +03:00
Oleksandr Byelkin
1c39479598 Merge branch '10.5' into 10.6 2023-05-05 11:09:46 +02:00
Oleksandr Byelkin
b735ca4773 Merge branch '10.4' into 10.5 2023-05-05 10:50:02 +02:00
Sergei Petrunia
2594da7a33 MDEV-31194: Server crash or assertion failure with join_cache_level=4
The problem, introduced in patch for MDEV-26301:

When check_join_cache_usage() decides not to use join buffer, it must
adjust the access method accordingly. For BNL-H joins this means switching
from pseudo-"ref access"(with index=MAX_KEY) to some other access method.

Failing to do this will cause assertions down the line when code that is
not aware of BNL-H will try to initialize index use for ref access with
index=MAX_KEY.

The fix is to follow the regular code path to disable the join buffer for
the join_tab ("goto no_join_cache") instead of just returning from
check_join_cache_usage().
2023-05-05 11:16:23 +03:00
Oleksandr Byelkin
652d54bf00 Merge branch '10.5' into 10.6 2023-05-04 07:36:37 +02:00
Oleksandr Byelkin
e87440b79e Merge branch '10.4' into 10.5 2023-05-03 15:53:14 +02:00
Igor Babaev
ce7ffe61d8 MDEV-26301 Split optimization refills temporary table too many times
This patch optimizes the number of refills for the lateral derived table
to which a materialized derived table subject to split optimization is
is converted. This optimized number of refills is now considered as the
expected number of refills of the materialized derived table when searching
for the best possible splitting of the table.
2023-05-03 14:11:11 +02:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Sergei Petrunia
c04adce8ac MDEV-26337: subquery with groupby and ROLLUP returns incorrect results on LEFT JOIN on INDEXED values
Disable LATERAL DERIVED optimization for subqueries that have WITH ROLLUP.

This bug could affect queries with grouping derived tables / views / CTEs
with ROLLUP. The bug could manifest itself if the corresponding
materialized derived tables are subject to split optimization.

The current implementation of the split optimization produces rows
from the derived table in an arbitrary order. So these rows must be
accumulated in another temporary table and sorted according to the
used GROUP BY clause in order to be able to generate the additional
ROLLUP rows.

This patch prohibits to use split optimization for grouping derived
tables / views / CTEs with ROLLUP.
2022-01-13 16:49:45 +03:00
Marko Mäkelä
9608773f75 MDEV-4750 follow-up: Reduce disabling innodb_stats_persistent
This essentially reverts commit 4e89ec6692
and only disables InnoDB persistent statistics for tests where it is
desirable. By design, InnoDB persistent statistics will not be updated
except by ANALYZE TABLE or by STATS_AUTO_RECALC.

The internal transactions that update persistent InnoDB statistics
in background tasks (with innodb_stats_auto_recalc=ON) may cause
nondeterministic query plans or interfere with some tests that deal
with other InnoDB internals, such as the purge of transaction history.
2021-08-31 13:55:02 +03:00
Marko Mäkelä
3c97097f11 Merge 10.4 into 10.5 2021-06-04 10:07:29 +03:00
Igor Babaev
385f4316c3 Corrected the test case of MDEV-25714 in order to have the same EXPLAIN output as in 10.3 2021-06-03 20:43:04 -07:00
Igor Babaev
0b797130c6 MDEV-25714 Join using derived with aggregation returns incorrect results
If a join query uses a derived table (view / CTE) with GROUP BY clause then
the execution plan for such join may employ split optimization. When this
optimization is employed the derived table is not materialized. Rather only
some partitions of the derived table are subject to grouping. Split
optimization can be applied only if:
- there are some indexes over the tables used in the join specifying the
  derived table whose prefixes partially cover the field items used in the
  GROUP BY list (such indexes are called splitting indexes)
- the WHERE condition of the join query contains conjunctive equalities
  between columns of the derived table that comprise major parts of
  splitting indexes and columns of the other join tables.

When the optimizer evaluates extending of a partial join by the rows of the
derived table it always considers a possibility of using split optimization.
Different splitting indexes can be used depending on the extended partial
join. At some rare conditions, for example, when there is a non-splitting
covering index for a table joined in the join specifying the derived table
usage of a splitting index to produce rows needed for grouping may be still
less beneficial than usage of such covering index without any splitting
technique. The function JOIN_TAB::choose_best_splitting() must take this
into account.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2021-06-03 14:44:03 -07:00
Igor Babaev
663bc849b5 MDEV-25714 Join using derived with aggregation returns incorrect results
If a join query uses a derived table (view / CTE) with GROUP BY clause then
the execution plan for such join may employ split optimization. When this
optimization is employed the derived table is not materialized. Rather only
some partitions of the derived table are subject to grouping. Split
optimization can be applied only if:
- there are some indexes over the tables used in the join specifying the
  derived table whose prefixes partially cover the field items used in the
  GROUP BY list (such indexes are called splitting indexes)
- the WHERE condition of the join query contains conjunctive equalities
  between columns of the derived table that comprise major parts of
  splitting indexes and columns of the other join tables.

When the optimizer evaluates extending of a partial join by the rows of the
derived table it always considers a possibility of using split optimization.
Different splitting indexes can be used depending on the extended partial
join. At some rare conditions, for example, when there is a non-splitting
covering index for a table joined in the join specifying the derived table
usage of a splitting index to produce rows needed for grouping may be still
less beneficial than usage of such covering index without any splitting
technique. The function JOIN_TAB::choose_best_splitting() must take this
into account.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2021-06-03 12:16:59 -07:00
Nikita Malyavin
3f55c56951 Merge branch bb-10.4-release into bb-10.5-release 2021-05-05 23:57:11 +03:00
Sergei Petrunia
2820f30dde MDEV-23723: Crash when test_if_skip_sort_order() is checked for derived ...
The problem was caused by the following scenario:

Subquery's table has two indexes, KEY a(a), KEY a_b(a,b)

- LATERAL DERIVED optimization decides to use index a.
  = The subquery uses ref access over key a.
- test_if_skip_sort_order() sees that KEY a_b satisfies the
  subquery's GROUP BY clause, and attempts to switch to it.
  = It fails to do so, because KEYUSE objects for index a_b
    are switched off.

Fixed by disallowing to change the ref access key if it uses KEYUSE
objects injected by LATERAL DERIVED optimization.
2021-04-30 21:42:14 +03:00
Monty
eb483c5181 Updated optimizer costs in multi_range_read_info_const() and sql_select.cc
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
  not 1.0.  In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
  TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed bug when using test_if_cheaper_ordering where we didn't use
  keyread if index was changed
- Fixed a bug where we didn't use index only read when using order-by-index
- Added keyread_time() to HEAP.
  The default keyread_time() was optimized for blocks and not suitable for
  HEAP. The effect was the HEAP prefered table scans over ranges for btree
  indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have same cost for simple ranges
  Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
  we favior ref for range for simple queries.
- Fixed that matching_candidates_in_table() uses same number of records
  as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
  HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
  This was to ensure that handler::keyread_time() doesn't give
  higher cost for heap tables than for normal tables. One effect of
  this is that heap and derived tables stored in heap will prefer
  key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
  multi_range_read_info_const(). All index cost calculation is now
  done trough one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
  so that for '=' ranges, 'ref' is prefered over 'range'.
- scan_time() now takes avg_io_costs() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
2020-03-27 03:58:32 +02:00
Monty
a071e0e029 Merge branch '10.2' into 10.3 2019-09-03 13:17:32 +03:00
Igor Babaev
8f4de38f65 MDEV-18467 Server crashes in fix_semijoin_strategies_for_picked_join_order
If a splittable materialized derived table / view T is used in a inner nest
of an outer join with impossible ON condition then T is marked as a
constant table. Yet the execution plan to build T is still searched for
in spite of the fact that is not needed. So it should be set.
2019-03-04 23:11:18 -08:00
Igor Babaev
8595361768 MDEV-17381 Wrong query result with LATERAL DERIVED optimization
and join_cache_level=6

This bug was fixed by the patch for mdev-17382 applied to 5.5.
2018-10-08 06:55:48 -07:00
Igor Babaev
5ec144cfab MDEV-17211 Server crash on query
The function JOIN_TAB::choose_best_splitting() did not take into account
that for some tables whose fields were used in the GROUP BY list of
the specification of a splittable materialized derived there might exist
no elements in the array ext_keyuses_for_splitting.
2018-09-17 18:50:21 -07:00
Igor Babaev
c5a9a63293 MDEV-16917 Index affects query results
The optimizer erroneously allowed to use join cache when joining a
splittable materialized table together with splitting optimization.
As a consequence in some rare cases the server returned wrong result
sets for queries with materialized derived.

This patch allows to use either join cache without usage of splitting
technique for materialization of a splittable derived table or splitting
without usage of join cache when joining such table. The costs the these
alternatives are compared and the best variant is chosen.
2018-09-15 14:28:33 -07:00