Commit graph

47 commits

Author SHA1 Message Date
Oleksandr Byelkin
f52954ef42 Merge commit '10.4' into 10.5 2023-07-20 11:54:52 +02:00
Oleksandr Byelkin
78b1831c9f Merge branch '10.4' into 10.4.30 2023-06-07 15:08:29 +02:00
Sergei Golubchik
bed70468ea Merge branch 'bb-10.4-release' into bb-10.5-release 2023-06-05 17:50:51 +02:00
Sergei Petrunia
928012a27a MDEV-31403: Server crashes in st_join_table::choose_best_splitting
The code in choose_best_splitting() assumed that the join prefix is
in join->positions[].

This is not necessarily the case. This function might be called when
the join prefix is in join->best_positions[], too.
Follow the approach from best_access_path(), which calls this function:
pass the current join prefix as an argument,
"const POSITION *join_positions" and use that.
2023-06-05 18:24:39 +03:00
Igor Babaev
8f3bf593d2 MDEV-31240 Crash with condition pushable into derived and containing outer reference
This bug could affect queries containing a subquery over splittable derived
tables and having an outer references in its WHERE clause. If such subquery
contained an equality condition whose left part was a reference to a column
of the derived table and the right part referred only to outer columns
then the server crashed in the function st_join_table::choose_best_splitting()
The crashing code was added in the commit ce7ffe61d8
that made the code of the function sensitive to presence of the flag
OUTER_REF_TABLE_BIT in the KEYUSE_EXT::needed_in_prefix fields.

The field needed_in_prefix of the KEYUSE_EXT structure should not contain
table maps with OUTER_REF_TABLE_BIT or RAND_TABLE_BIT.

Note that this fix is quite conservative: for affected queries it just
returns the query plans that were used before the above mentioned commit.
In fact the equalities causing crashes should be pushed into derived tables
without any usage of split optimization.

Approved by Sergei Petrunia <sergey@mariadb.com>
2023-06-03 10:39:34 +02:00
Daniel Black
2771890bab MDEV-31301 sql/opt_split.cc:1043:5: warning: ‘best_param_tables’ may be used uninitialized
The warning is true, it needs to be initialized.

Caused by: MDEV-26301 / ce7ffe61d8
2023-06-01 08:37:04 +01:00
Igor Babaev
0474466bc2 MDEV-31240 Crash with condition pushable into derived and containing outer reference
This bug could affect queries containing a subquery over splittable derived
tables and having an outer references in its WHERE clause. If such subquery
contained an equality condition whose left part was a reference to a column
of the derived table and the right part referred only to outer columns
then the server crashed in the function st_join_table::choose_best_splitting()
The crashing code was added in the commit ce7ffe61d8
that made the code of the function sensitive to presence of the flag
OUTER_REF_TABLE_BIT in the KEYUSE_EXT::needed_in_prefix fields.

The field needed_in_prefix of the KEYUSE_EXT structure should not contain
table maps with OUTER_REF_TABLE_BIT or RAND_TABLE_BIT.

Note that this fix is quite conservative: for affected queries it just
returns the query plans that were used before the above mentioned commit.
In fact the equalities causing crashes should be pushed into derived tables
without any usage of split optimization.

Approved by Sergei Petrunia <sergey@mariadb.com>
2023-05-12 10:34:22 +03:00
Oleksandr Byelkin
e87440b79e Merge branch '10.4' into 10.5 2023-05-03 15:53:14 +02:00
Sergei Petrunia
ed3e6f66a2 MDEV-26301: Split optimization refills: Optimizer Trace coverage
Add Optimizer Trace printouts.
2023-05-03 14:11:25 +02:00
Igor Babaev
ce7ffe61d8 MDEV-26301 Split optimization refills temporary table too many times
This patch optimizes the number of refills for the lateral derived table
to which a materialized derived table subject to split optimization is
is converted. This optimized number of refills is now considered as the
expected number of refills of the materialized derived table when searching
for the best possible splitting of the table.
2023-05-03 14:11:11 +02:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Igor Babaev
0041265671 MDEV-27510 Query returns wrong result when using split optimization
This bug may affect the queries that uses a grouping derived table with
grouping list containing references to columns from different tables if
the optimizer decides to employ the split optimization for the derived
table. In some very specific cases it may affect queries with a grouping
derived table that refers only one base table.
This bug was caused by an improper fix for the bug MDEV-25128. The fix
tried to get rid of the equality conditions pushed into the where clause
of the grouping derived table T to which the split optimization had been
applied. The fix erroneously assumed that only those pushed equalities
that were used for ref access of the tables referenced by T were needed.
In fact the function remove_const() that figures out what columns from the
group list can be removed if the split optimization is applied can uses
other pushed equalities as well.
This patch actually provides a proper fix for MDEV-25128. Rather than
trying to remove invalid pushed equalities referencing the fields of SJM
tables with a look-up access the patch attempts not to push such equalities.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2022-01-25 17:12:37 -08:00
Igor Babaev
97425f740f MDEV-27132 Wrong result from query when using split optimization
This bug could affect queries with a grouping derived table containing
equalities in the where clause of its specification if the optimizer
chose to apply split optimization to access the derived table. In such
cases wrong results could be returned from the queries.
When the optimizer considers a possibility of using split optimization
to a derived table it injects equalities joining the derived table with
other tables into the where condition of the derived table. After the join
order for the execution using split optimization has been chosen as the
cheapest the injected equalities that are not used to access the derived
table are removed from the where condition of the derived table.
For this removal the optimizer looks through the conjuncts of the where
condition of the derived table, fetches the equalities and checks whether
they belong to the list of injected equalities.
As the injection of the list was performed just by the insertion of it
into the list of top level AND condition of the where condition some extra
conjuncts from the where condition could be automatically attached to the
end of the list of injected equalities. If such attached conjunct happened
to be an equality predicate it was removed from the where condition of the
derived table and thus lost for checking at the execution phase.
The bug has been fixed by injecting of a shallow copy of the list of the
pushed equalities rather than the list itself leaving the latter intact.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2022-01-17 23:04:39 -08:00
Sergei Petrunia
c04adce8ac MDEV-26337: subquery with groupby and ROLLUP returns incorrect results on LEFT JOIN on INDEXED values
Disable LATERAL DERIVED optimization for subqueries that have WITH ROLLUP.

This bug could affect queries with grouping derived tables / views / CTEs
with ROLLUP. The bug could manifest itself if the corresponding
materialized derived tables are subject to split optimization.

The current implementation of the split optimization produces rows
from the derived table in an arbitrary order. So these rows must be
accumulated in another temporary table and sorted according to the
used GROUP BY clause in order to be able to generate the additional
ROLLUP rows.

This patch prohibits to use split optimization for grouping derived
tables / views / CTEs with ROLLUP.
2022-01-13 16:49:45 +03:00
Marko Mäkelä
3c97097f11 Merge 10.4 into 10.5 2021-06-04 10:07:29 +03:00
Igor Babaev
0b797130c6 MDEV-25714 Join using derived with aggregation returns incorrect results
If a join query uses a derived table (view / CTE) with GROUP BY clause then
the execution plan for such join may employ split optimization. When this
optimization is employed the derived table is not materialized. Rather only
some partitions of the derived table are subject to grouping. Split
optimization can be applied only if:
- there are some indexes over the tables used in the join specifying the
  derived table whose prefixes partially cover the field items used in the
  GROUP BY list (such indexes are called splitting indexes)
- the WHERE condition of the join query contains conjunctive equalities
  between columns of the derived table that comprise major parts of
  splitting indexes and columns of the other join tables.

When the optimizer evaluates extending of a partial join by the rows of the
derived table it always considers a possibility of using split optimization.
Different splitting indexes can be used depending on the extended partial
join. At some rare conditions, for example, when there is a non-splitting
covering index for a table joined in the join specifying the derived table
usage of a splitting index to produce rows needed for grouping may be still
less beneficial than usage of such covering index without any splitting
technique. The function JOIN_TAB::choose_best_splitting() must take this
into account.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2021-06-03 14:44:03 -07:00
Igor Babaev
663bc849b5 MDEV-25714 Join using derived with aggregation returns incorrect results
If a join query uses a derived table (view / CTE) with GROUP BY clause then
the execution plan for such join may employ split optimization. When this
optimization is employed the derived table is not materialized. Rather only
some partitions of the derived table are subject to grouping. Split
optimization can be applied only if:
- there are some indexes over the tables used in the join specifying the
  derived table whose prefixes partially cover the field items used in the
  GROUP BY list (such indexes are called splitting indexes)
- the WHERE condition of the join query contains conjunctive equalities
  between columns of the derived table that comprise major parts of
  splitting indexes and columns of the other join tables.

When the optimizer evaluates extending of a partial join by the rows of the
derived table it always considers a possibility of using split optimization.
Different splitting indexes can be used depending on the extended partial
join. At some rare conditions, for example, when there is a non-splitting
covering index for a table joined in the join specifying the derived table
usage of a splitting index to produce rows needed for grouping may be still
less beneficial than usage of such covering index without any splitting
technique. The function JOIN_TAB::choose_best_splitting() must take this
into account.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2021-06-03 12:16:59 -07:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Monty
031f11717d Fix all warnings given by UBSAN
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.

The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
  complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
  memory access of integers.  Fixed by using byte_order_generic.h when
  compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
  disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
  suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
  safe to have overflows (two cases, in item_func.cc).

Things fixed:
- Don't left shift signed values
  (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
  constructors.  This was needed as UBSAN checks that these types has
  correct values when one copies an object.
  (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
  deleted objects.
  (events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
  on Query_arena object.
- Fixed several cast of objects to an incompatible class!
  (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
   sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
  This includes also ++ and -- of integers.
  (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
  value_type is initialized to this instead of to -1, which is not a valid
  enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
  instead of a null string (safer as it ensures we do not do arithmetic
  on null strings).

Other things:

- Changed struct st_position to an OBJECT and added an initialization
  function to it to ensure that we do not copy or use uninitialized
  members. The change to a class was also motived that we used "struct
  st_position" and POSITION randomly trough the code which was
  confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
  the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
  avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr.  (This variable was before
  only in 10.5 and up).  It can now have one of two values:
  ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
  it virtual. This was an effort to get UBSAN to work with loaded storage
  engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
  in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
  server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
  to integer arithmetic.

Changes that should not be needed but had to be done to suppress warnings
from UBSAN:

- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
  compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
  some compile time warnings.

Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
2021-04-20 12:30:09 +03:00
Sergei Petrunia
5b678d9ea4 MDEV-25251: main.derived_split_innodb fails on ICC release binary
The code compares two query plans with identical costs, the plan
with lateral is the same as one without. Introduce a small difference
to cost numbers to prefer non-lateral plan in this case.
2021-03-31 09:52:28 +03:00
Marko Mäkelä
80459bcbd4 Merge 10.4 into 10.5 2021-03-27 17:37:42 +02:00
Igor Babaev
480a06718d MDEV-25128 Wrong result from join with materialized semi-join and
splittable derived

If one of joined tables of the processed query is a materialized derived
table (or view or CTE) with GROUP BY clause then under some conditions it
can be subject to split optimization. With this optimization new equalities
are injected into the WHERE condition of the SELECT that specifies this
derived table. The injected equalities are generated for all join orders
with which the split optimization can employed. After the best join order
has been chosen only certain of this equalities are really needed. The
others can be safely removed. If it's not done and some of injected
equalities involve expressions over semi-joins with look-up access then
the query may return a wrong result set.
This patch effectively removes equalities injected for split optimization
that are needed only at the optimization stage and not needed for execution.

Approved by serg@mariadb.com
2021-03-23 20:54:54 -07:00
Oleksandr Byelkin
02e7bff882 Merge commit '10.4' into 10.5 2021-01-06 10:53:00 +01:00
Varun Gupta
e5d88e03be MDEV-22740: UBSAN: sql/opt_split.cc:1150:28: runtime error: shift exponent 61 is too large for 32-bit type 'int' (on optimized builds)
Use a ulonglong instead of uint when left shifting to calculate the table map for all the tables in a query
2020-12-14 16:51:30 +05:30
Marko Mäkelä
133b4b46fe Merge 10.4 into 10.5 2020-11-03 16:24:47 +02:00
Marko Mäkelä
c7f322c91f Merge 10.2 into 10.3 2020-11-02 15:48:47 +02:00
Marko Mäkelä
37c14690fc Merge 10.4 into 10.5 2020-03-30 19:07:25 +03:00
Igor Babaev
caf110fa52 MDEV-21883 Server crashes when joining a subselect with 32 tables and GROUP BY
This bug could cause a crash for any query that used a derived table/view/CTE
whose specification was a SELECT with a GROUP BY clause and a FROM list
containing 32 or more table references.
The problem appeared only in the cases when the splitting optimization
could be applied to such derived table/view/CTE.
2020-03-23 19:21:57 -07:00
Sergei Golubchik
c1c5222cae cleanup: PSI key is *always* the first argument 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Igor Babaev
8d7462ec49 MDEV-21614 Wrong query results with optimizer_switch="split_materialized=on"
Do not materialize a semi-join nest if it contains a materialized derived
table /view that potentially can be subject to the split optimization.
Splitting of materialization of such nest would help, but currently there
is no code to support this technique.
2020-02-07 19:48:35 -08:00
Igor Babaev
8f4de38f65 MDEV-18467 Server crashes in fix_semijoin_strategies_for_picked_join_order
If a splittable materialized derived table / view T is used in a inner nest
of an outer join with impossible ON condition then T is marked as a
constant table. Yet the execution plan to build T is still searched for
in spite of the fact that is not needed. So it should be set.
2019-03-04 23:11:18 -08:00
Igor Babaev
5ec144cfab MDEV-17211 Server crash on query
The function JOIN_TAB::choose_best_splitting() did not take into account
that for some tables whose fields were used in the GROUP BY list of
the specification of a splittable materialized derived there might exist
no elements in the array ext_keyuses_for_splitting.
2018-09-17 18:50:21 -07:00
Igor Babaev
c5a9a63293 MDEV-16917 Index affects query results
The optimizer erroneously allowed to use join cache when joining a
splittable materialized table together with splitting optimization.
As a consequence in some rare cases the server returned wrong result
sets for queries with materialized derived.

This patch allows to use either join cache without usage of splitting
technique for materialization of a splittable derived table or splitting
without usage of join cache when joining such table. The costs the these
alternatives are compared and the best variant is chosen.
2018-09-15 14:28:33 -07:00
Igor Babaev
d453374fc4 MDEV-16801 seg_fault on a query
The bug was in the in the code of JOIN::check_for_splittable_materialized()
where the structures describing the fields of a materialized derived
table that potentially could be used in split optimization were build.
As a result of this bug some fields that were not usable for splitting
were detected as usable. This could trigger crashes further in
st_join_table::choose_best_splitting().
2018-08-03 15:09:49 -07:00
Varun Gupta
724ab9a1cb MDEV-16057: Using optimization Splitting with Group BY we see an unnecessary attached condition
t1.pk IS NOT NULL where pk is a PRIMARY KEY

For equalites in the WHERE clause we create a keyuse array that contains the set of all equalities.
For each KEYUSE inside the keyuse array we have a field "null_rejecting"
which tells that the equality will not hold if either the left or right
hand side of the equality is NULL.
If the equality is NULL rejecting then we accordingly add a NOT NULL condition for the field present in
the item val(present in the KEYUSE struct) when we are doing ref access.

For the optimization of splitting with GROUP BY we always set the null_rejecting to TRUE and we are doing ref access on
the GROUP by field. This does create a problem when the equality is NOT NULL rejecting. This happens in this case as
in the equality we have the right hand side as t1.pk where pk is a PRIMARY KEY , hence it is NOT NULLABLE. So we should have
null rejecting set to FALSE for such a case.
2018-05-06 23:05:37 +05:30
Igor Babaev
cff60be7fe MDEV-15899 Server crashes in st_join_table::is_inner_table_of_outer_join
The crash happened because JOIN::check_for_splittable_materialized()
called by mistake the function JOIN_TAB::is_inner_table_of_outer_join()
instead of the function TABLE_LIST::is_inner_table_of_outer_join().
The former cannot be called before the call of make_outerjoin_info().
2018-04-17 23:39:40 -07:00
Michael Widenius
ddc5764303 Remove compiler warnings
- Remove unused variables
- Mark variables unused
- Fix wrong types
- Add no-strict-aliasing to BUILD scripts
2018-04-16 20:16:43 +03:00
Monty
75dd94c7ce Fixed compiler warnings
Remove compiler warnings in sphinx, item_sum.cc and opt_split.cc
2018-03-29 14:19:48 +03:00
Vladislav Vaintroub
6c279ad6a7 MDEV-15091 : Windows, 64bit: reenable and fix warning C4267 (conversion from 'size_t' to 'type', possible loss of data)
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.

This fix excludes rocksdb, spider,spider, sphinx and connect for now.
2018-02-06 12:55:58 +00:00
Igor Babaev
7a9611aee2 Fixed MDEV-14994 Assertion `join->best_read < double(1.79...15e+308L)' or
server crash in JOIN::fix_all_splittings_in_plan

Cost formulas must take into account the case when a splittable table
has now rows.
2018-01-30 21:12:11 -08:00
Igor Babaev
775aa5542d Fixed mdev-15017 Server crashes in in st_join_table::fix_splitting
Do not apply splitting for constant tables.
2018-01-29 23:56:28 -08:00
Igor Babaev
c5ac1f953b Fixed mdev-14880: Assertion `inj_cond_list.elements' failed
in JOIN::inject_best_splitting_cond

The value of SplM_opt_info::last_plan should be set to NULL
before any search for a splitting plan for a splittable
materialized table.
2018-01-08 15:22:17 -08:00
Vladislav Vaintroub
7a01e64c3a Fix warnings 2018-01-06 22:11:42 +00:00
Igor Babaev
86cf60a615 Fixed mdev-14845 Server crashes in st_join_table::is_inner_table_of_outer_join
Do not try to apply the splitting optimization to a materialized
derived if it's specified by a select with impossible where
or if all joined tables of this select are constant.
2018-01-02 14:09:16 -08:00
Igor Babaev
4f0299f8b3 This is a full cost-based implementation of the optimization that employs
splitting technique for equi-joins of materialized derived tables/views/CTEs.
(see mdev-13369 and mdev-13389).
2017-12-30 12:29:09 -08:00