Commit graph

67 commits

Author SHA1 Message Date
unknown
da5214831d Fix for bug lp:944706, task MDEV-193
The patch enables back constant subquery execution during
query optimization after it was disabled during the development
of MWL#89 (cost-based choice of IN-TO-EXISTS vs MATERIALIZATION).

The main idea is that constant subqueries are allowed to be executed
during optimization if their execution is not expensive.

The approach is as follows:
- Constant subqueries are recursively optimized in the beginning of
  JOIN::optimize of the outer query. This is done by the new method
  JOIN::optimize_constant_subqueries(). This is done so that the cost
  of executing these queries can be estimated.
- Optimization of the outer query proceeds normally. During this phase
  the optimizer may request execution of non-expensive constant subqueries.
  Each place where the optimizer may potentially execute an expensive
  expression is guarded with the predicate Item::is_expensive().
- The implementation of Item_subselect::is_expensive has been extended
  to use the number of examined rows (estimated by the optimizer) as a
  way to determine whether the subquery is expensive or not.
- The new system variable "expensive_subquery_limit" controls how many
  examined rows are considered to be not expensive. The default is 100.

In addition, multiple changes were needed to make this solution work
in the light of the changes made by MWL#89. These changes were needed
to fix various crashes and wrong results, and legacy bugs discovered
during development.
2012-05-17 13:46:05 +03:00
Sergei Golubchik
f860b2aad4 merge 2012-04-07 15:58:46 +02:00
Sergey Petrunya
2a16e7674b BUG#913030: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
- When doing join optimization, pre-sort the tables so that they mimic the execution
  order we've had with 'semijoin=off'. 
- That way, we will not get regressions when there are two query plans (the old and the
  new) that have indentical costs but different execution times (because of factors that
  the optimizer was not able to take into account).
2012-04-02 21:41:54 +04:00
Sergei Golubchik
25609313ff 5.3.4 merge 2012-02-15 18:08:08 +01:00
Sergey Petrunya
12bd3dfe29 BUG#923246: Loosescan reports different result than other semijoin methods
- If LooseScan is used with quick select, require that quick select produces 
  data in key order (this disables use of MRR, which can return data in arbitrary order).
2012-01-30 20:34:47 +04:00
unknown
69327e2987 Adjust test results after Monty's push of the new
handler counter Handler_read_rnd_deleted.
2012-01-18 12:53:50 +02:00
Sergei Golubchik
4f435bddfd 5.3 merge 2012-01-13 15:50:02 +01:00
Igor Babaev
2b1f0b8757 Back-ported the patch of the mysql-5.6 code line that
fixed several defects in the greedy optimization:

1) The greedy optimizer calculated the 'compare-cost' (CPU-cost)
   for iterating over the partial plan result at each level in
   the query plan as 'record_count / (double) TIME_FOR_COMPARE'

   This cost was only used locally for 'best' calculation at each
   level, and *not* accumulated into the total cost for the query plan.

   This fix added the 'CPU-cost' of processing 'current_record_count'
   records at each level to 'current_read_time' *before* it is used as
   'accumulated cost' argument to recursive 
   best_extension_by_limited_search() calls. This ensured that the
   cost of a huge join-fanout early in the QEP was correctly
   reflected in the cost of the final QEP.

   To get identical cost for a 'best' optimized query and a
   straight_join with the same join order, the same change was also
   applied to optimize_straight_join() and get_partial_join_cost()

2) Furthermore to get equal cost for 'best' optimized query and a
   straight_join the new code substrcated the same '0.001' in
   optimize_straight_join() as it had been already done in
   best_extension_by_limited_search()

3) When best_extension_by_limited_search() aggregated the 'best' plan a
   plan was 'best' by the check :

   'if ((search_depth == 1) || (current_read_time < join->best_read))'

   The term '(search_depth == 1' incorrectly caused a new best plan to be
   collected whenever the specified 'search_depth' was reached - even if
   this partial query plan was more expensive than what we had already
   found.
2011-12-24 08:55:10 -08:00
Igor Babaev
919f19110f Merge 2011-12-15 15:55:00 -08:00
Sergey Petrunya
04e9004fa3 BUG#901399: Wrong result (extra row) with semijoin=ON, materialization=OFF, optimizer_prune_level=0
- Correctly handle plan refinement stage for LooseScan plans: run create_ref_for_key() if LooseScan 
  plan includes a ref access, and if we don't have any fixed key components, switch to a full index scan.
2011-12-16 03:44:25 +04:00
Igor Babaev
a910e8ef5b Made join_cache_level == 2 by default. 2011-12-15 14:26:59 -08:00
Sergei Golubchik
2ccf247e93 after merge changes:
* rename all debugging related command-line options
  and variables to start from "debug-", and made them all
  OFF by default.
* replace "MySQL" with "MariaDB" in error messages
* "Cast ... converted ... integer to it's ... complement"
  is now a note, not a warning
* @@query_cache_strip_comments now has a session scope,
  not global.
2011-12-12 23:58:40 +01:00
Sergey Petrunya
ae480437ce Small semi-join optimization improvement:
- if we're considering FirstMatch access with one inner table, and 
  @@optimizer_switch has semijoin_with_cache flag, calculate costs
  as if we used join cache (because we will be able to do so)
2011-12-08 04:22:38 +04:00
Sergey Petrunya
255fd6c929 Make subquery Materialization, as well as semi-join Materialization be shown
in EXPLAIN as select_type==MATERIALIZED. 

Before, we had select_type==SUBQUERY and it was difficult to tell materialized
subqueries from uncorrelated scalar-context subqueries.
2011-12-05 01:31:42 +04:00
Sergei Golubchik
effed09bd7 5.3->5.5 merge 2011-11-27 17:46:20 +01:00
Igor Babaev
17b4e4a194 Set new default values for the optimizer switch flags 'derived_merge'
and 'derived_with_keys'. Now they are set on by default.
2011-11-26 14:23:00 -08:00
Sergey Petrunya
3a9edc5f77 Merge 2011-11-25 14:28:43 +04:00
Sergey Petrunya
f84dbf4b20 Semi-join optimizations code cleanup part 2:
- Make EXPLAIN display "Start temporary" at the start of the fanout (it used to display
  at the first table whose rowid gets into temp. table which is not that useful for
  the user)
- Updated test results (all checked)
2011-11-25 05:56:58 +04:00
Sergei Golubchik
d2755a2c9c 5.3->5.5 merge 2011-11-22 18:04:38 +01:00
unknown
f0d9908fc3 Merge enabling of materialization=on by default with main tree. 2011-11-21 16:56:32 +02:00
Igor Babaev
4d358f48c9 Merge. 2011-11-15 14:35:36 -08:00
Igor Babaev
b4b7d941fe Fixed LP bug #889750.
If the optimizer switch 'semijoin_with_cache' is set to 'off' then 
join cache cannot be used to join inner tables of a semijoin.

Also fixed a bug in the function check_join_cache_usage() that led
to wrong output of the EXPLAIN commands for some test cases.
2011-11-15 13:03:00 -08:00
unknown
f76bfc40ea Fix for LP BUG#824425: Prohibiting subqueries in rows for left part of IN/ALL/ANY
Fix for walk() method of subqueries: always call the method on the subquery.
2011-11-13 12:02:13 +02:00
unknown
511459bd14 Enable subquery materialization=ON by default. 2011-11-09 15:36:25 +02:00
Sergei Golubchik
76f0b94bb0 merge with 5.3
sql/sql_insert.cc:
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
  ******
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
  small cleanup
  ******
  small cleanup
2011-10-19 21:45:18 +02:00
unknown
54caeee5d6 Making subquery cache on by default. 2011-10-05 18:18:00 +03:00
Igor Babaev
3f82e2edb8 The previous correction of the cost estimate to access a joined table
in the function best_access_path revealed another bug: currently 
table scans on NULL keys used for NOT IN subqueries cannot work 
together with employment of join caches for inner tables of these 
subqueries. Otherwise the result can be wrong as it could be seen 
with the result of the test case constructed for bug #37894 
in the file subselect3_jcl6.result.
2011-09-30 21:53:59 -07:00
Igor Babaev
715dc5f99d Fixed a cost estimation bug introduced into in the function best_access_path
of the 5.3 code line after a merge with 5.2 on 2010-10-28
in order not to allow the cost to access a joined table to be equal
to 0 ever.

Expanded data sets for many test cases to get the same execution plans
as before.
2011-09-30 18:55:02 -07:00
Igor Babaev
63abf00a62 Made the optimizer switches 'derived_merge' and 'derived_with_keys'
off by default.
2011-07-21 14:23:08 -07:00
unknown
c1b6eb1490 Merge of subquery cache off by default. 2011-07-15 12:16:46 +03:00
unknown
af284b55f0 Make subquery cache off by default.
mysql-test/r/subselect_scache.result:
  Test with subquery cache on.
mysql-test/t/subselect_scache.test:
  Test with subquery cache on.
2011-07-15 11:36:36 +03:00
Igor Babaev
03081bc1fd Changed the default setting of the optimizer switch 'optimize_join_buffer_size'.
Made it 'off' by default.
2011-07-14 22:24:59 -07:00
Sergey Petrunya
1492de8563 Set the default to be mrr=off,mrr_sort_keys=off:
- Set the default
- Adjust the testcases so that 'new' tests are run with optimizations turned on.
- Pull out relevant tests from "irrelevant" tests and run them with optimizations on.
- Run range.test and innodb.test with both mrr=on and mrr=off
2011-07-08 18:46:47 +04:00
Sergey Petrunya
c1de6f8b77 Change the default @@optimizer_switch setting from
semijoin=on,firstmatch=on,loosescan=on
to
  semijoin=off,firstmatch=off,loosescan=off
Adjust the testcases:
- Modify subselect*.test and join_cache.test so that all tests
  use the same execution paths as before (i.e. optimizations that
  are being tested are enabled)
- Let all other test files run with the new default settings (i.e.
  with new optimizations disabled)
- Copy subquery testcases from these files into t/subselect_extra.test
  which will run them with new optimizations enabled.
2011-07-05 01:44:15 +04:00
Sergei Golubchik
b4a0b2c2f8 post-merge fixes.
most tests pass.
5.3 merge is next
2011-07-02 22:12:12 +02:00
Igor Babaev
704f97035f Merged the code of MWL#106 into 5.3
Resolved all conflicts, bad merges and fixed a few minor bugs in the code.
Commented out the queries from multi_update, view, subselect_sj, func_str,
derived_view, view_grant that failed either with crashes in ps-protocol or
with wrong results.
The failures are clear indications of some bugs in the code and these bugs
are to be fixed.
2011-05-16 22:39:43 -07:00
unknown
5dc11616b2 MWL#89
Merge with 5.3
2011-05-02 21:59:16 +03:00
unknown
0f4236659c Fix LP BUG#718593
Analysis:
Build_equal_items_for_cond() rewrites the WHERE clause in such a way,
that it may merge the list join->cond_equal->current_level with the
list of child Items in an AND condition of the WHERE clause.

The place where this is done is:
static COND *build_equal_items_for_cond(THD *thd, COND *cond,
                                        COND_EQUAL *inherited)
{
  ...
      if (and_level)
    {
      args->concat(&eq_list);
      args->concat((List<Item> *)&cond_equal.current_level);
    }
  ...
}

As a result, later transformations on the WHERE clause may change the
structure of the list join->cond_equal->current_level without knowing this.

Specifically in this bug, Item_in_subselect::inject_in_to_exists_cond
creates a new AND of the old WHERE clause and the IN->EXISTS conditions.
It then calls fix_fields() for the new AND. Among other things, fix_fields
flattens all nested ANDs into one by merging the AND argument lists.

When there is a cond_equal for the JOIN, its list of Item_equal objects
is attached to the end of the original AND. When a lower-level AND is
merged into the top-level one, the argument list of the lower-level AND
is concatenated to the list of multiple equalities in the upper-level AND.

As a result, when substitute_for_best_equal_field processes the 
multiple equalities, it turns out that the multiple equality list contains
the Items from the lower-level AND which were concatenated to the end of
the join->cond_equal->current_level list. This results in a crash because
this list must not contain any other Items except for the previously found
Item_equal ones.

Solution:
When performing IN->EXIST predicate injection, and the where clause is an
AND, detach the list of Item_equal objects before calling fix_fields on
the injected where clause.

After fix_fields is done, reattach back the multiple equalities list to
the end of the argument list of the new AND.
2011-04-28 17:15:05 +03:00
Sergei Golubchik
0accbd0364 lots of post-merge changes 2011-04-25 17:22:25 +02:00
unknown
43acceeb47 Fix LP BUG#715069
Analysis:
The wrong result is a consquence of sorting the subquery
result and then selecting only the first row due to the
artificial LIMIT 1 introduced by the fix_fields phase.
Normally, if there is an ORDER BY in a subquery, the ORDER
is removed (Item_in_subselect::select_in_like_transformer),
however if a GROUP BY is transformed into ORDER, this happens
later, after the removal of the ORDER clause of subqueries, so
we end up with a subquery with an ORDER clause, and an artificially
added LIMIT 1.

The reason why the same works in the main 5.3 without MWL#89, is
that the 5.3 performs all subquery transformations, including
IN->EXISTS before JOIN::optimize(). The beginning of JOIN::optimize
does:
  if (having || (select_options & OPTION_FOUND_ROWS))
    select_limit= HA_POS_ERROR;
which sets the limit back to infinity, thus 5.3 sorts the whole
subquery result, and IN performs the lookup into all subquery result
rows.

Solution:
Sorting of subqueries without LIMIT is meaningless. Since LIMIT in
subqueries is not supported, the patch removes sorting by setting
  join->skip_sort_order= true
for each subquery JOIN object. This improves a number of execution
plans to not perform unnecessary sorting at all.
2011-04-20 18:36:55 +03:00
Sergey Petrunya
acc161d363 BUG#752992: Wrong results for a subquery with 'semijoin=on'
- Let advance_sj_state() save the value of JOIN::cur_dups_producing_tables
  in POSITION::prefix_dups_producing_tables, and restore_sj_state() restore
  it.
2011-04-08 02:12:03 +04:00
Sergey Petrunya
997445bc8e Make EXPLAIN better at displaying MRR/BKA:
- "Using MRR" is no longer shown with range access.
- Instead, both range and BKA accesses will show one of the following:
  = "Rowid-ordered scan"
  = "Key-ordered scan"
  = "Key-ordered Rowid-ordered scan"
depending on whether DS-MRR implementation will do scan keys in order, rowids in order,
or both.
- The patch also introduces a way for other storage engines/MRR implementations to
  pass information to EXPLAIN output about the properties of employed MRR scans.
2011-04-02 14:04:45 +04:00
unknown
71e9d94895 MWL#89
Merge 5.3 into 5.3-mwl89.
2011-03-01 15:54:21 +02:00
unknown
7895c35874 MWL#89
Merge MWL#89 with 5.3.
2011-03-01 14:16:28 +02:00
Igor Babaev
272e5e6212 BNLH algorithm always used a full table scan over the joined table
even in the cases when there existed range/index-merge scans that
were cheaper than the full table scan.
This was a defect/bug of the implementation of mwl #128. 
Now hash join can work not only with full table scan of the joined
table, but also with full index scan, range and index-merge scans.
Accordingly, in the cases when hash join is used the column 'type'
in the EXPLAINs can contain now 'hash_ALL', 'hash_index', 'hash_range'
and 'hash_index_merge'. If hash join is coupled with a range/index_merge
scan then the columns 'key' and 'key_len' contain info not only on
the used hash index, but also on the indexes used for the scan.
2011-02-23 22:23:12 -08:00
unknown
648e604615 MWL#89
Adjusted test cases in accordance with the implementation.
2011-02-03 17:00:28 +02:00
Igor Babaev
ec368ab9fa Merge 2011-01-21 22:48:28 -08:00
unknown
b0be3e2c68 Merge MWL#89 into 5.3 main. 2011-01-11 14:04:08 +02:00
Igor Babaev
af800fd92f The patch adds the code that allows to use equi-join conditions
for hash join in the cases when there are no suitable indexes
for these conditions.
2011-01-04 21:59:41 -08:00
unknown
bc7369b74b MWL#89: Cost-based choice between Materialization and IN->EXISTS transformation
Merge 5.3-mwl89 into 5.3 main.

There is one remaining test failure in this merge:
innodb_mysql_lock2. All other tests have been checked to
deliver the same results/explains as 5.3-mwl89, including
the few remaining wrong results.
2010-11-05 14:42:58 +02:00