Commit graph

27 commits

Author SHA1 Message Date
unknown
941018f8d1 Patch for mdev-287: CHEAP SQ: A query with subquery in SELECT list, EXISTS, inner joins takes hundreds times longer
Analysis:

The fix for lp:944706 introduces early subquery optimization.
While a subquery is being optimized some of its predicates may be
removed. In the test case, the EXISTS subquery is constant, and is
evaluated to TRUE. As a result the whole OR is TRUE, and thus the
correlated condition "b = alias1.b" is optimized away. The subquery
becomes non-correlated.

The subquery cache is designed to work only for correlated subqueries.
If constant subquery optimization is disallowed, then the constant
subquery is not evaluated, the subquery remains correlated, and its
execution is cached. As a result execution is fast.

However, when the constant subquery was optimized away, it was neither
cached by the subquery cache, nor it was cached by the internal subquery
caching. The latter was due to the fact that the subquery still appeared
as correlated to the subselect_XYZ_engine::exec methods, and they
re-executed the subquery on each call to Item_subselect::exec.

Solution:

The solution is to update the correlated status of the subquery after it has
been optimized. This status consists of:
- st_select_lex::is_correlated
- Item_subselect::is_correlated
- SELECT_LEX::uncacheable
- SELECT_LEX_UNIT::uncacheable
The status is updated by st_select_lex::update_correlated_cache(), and its
caller st_select_lex::optimize_unflattened_subqueries. The solution relies
on the fact that the optimizer already called
st_select_lex::update_used_tables() for each subquery. This allows to
efficiently update the correlated status of each subquery without walking
the whole subquery tree.

Notice that his patch is an improvement over MySQL 5.6 and older, where
subqueries are not pre-optimized, and the above analysis is not possible.
2012-05-30 00:18:53 +03:00
Igor Babaev
8b469eb515 Merge 5.3->5.5. 2012-03-01 14:22:22 -08:00
Igor Babaev
841a74a4d6 Fixed LP bug #939009.
The result of materialization of the right part of an IN subquery predicate
is placed into a temporary table. Each row of the materialized table is
distinct. A unique key over all fields of the temporary table is defined and
created. It allows to perform key look-ups into the table.
The table created for a materialized subquery can be accessed by key as
any other table. The function best_access-path search for the best access
to join a table to a given partial join. With some where conditions this
function considers a possibility of a ref_or_null access. If such access
employs the unique key on the temporary table then when estimating
the cost this access the function tries to use the array rec_per_key. Yet,
such array is not built for this unique key. This causes a crash of the server.

Rows returned by the subquery that contain nulls don't have to be placed
into temporary table, as they cannot be match any row produced by the
left part of the subquery predicate. So all fields of the temporary table
can be defined as non-nullable. In this case any ref_or_null access
to the temporary table does not make any sense and it does not make sense
to estimate such an access.

The fix makes sure that the temporary table for a materialized IN subquery
is defined with columns that are all non-nullable. The also ensures that 
any row with nulls returned by the subquery is not placed into the
temporary table.
2012-02-24 16:50:22 -08:00
Sergei Golubchik
4f435bddfd 5.3 merge 2012-01-13 15:50:02 +01:00
Igor Babaev
2b1f0b8757 Back-ported the patch of the mysql-5.6 code line that
fixed several defects in the greedy optimization:

1) The greedy optimizer calculated the 'compare-cost' (CPU-cost)
   for iterating over the partial plan result at each level in
   the query plan as 'record_count / (double) TIME_FOR_COMPARE'

   This cost was only used locally for 'best' calculation at each
   level, and *not* accumulated into the total cost for the query plan.

   This fix added the 'CPU-cost' of processing 'current_record_count'
   records at each level to 'current_read_time' *before* it is used as
   'accumulated cost' argument to recursive 
   best_extension_by_limited_search() calls. This ensured that the
   cost of a huge join-fanout early in the QEP was correctly
   reflected in the cost of the final QEP.

   To get identical cost for a 'best' optimized query and a
   straight_join with the same join order, the same change was also
   applied to optimize_straight_join() and get_partial_join_cost()

2) Furthermore to get equal cost for 'best' optimized query and a
   straight_join the new code substrcated the same '0.001' in
   optimize_straight_join() as it had been already done in
   best_extension_by_limited_search()

3) When best_extension_by_limited_search() aggregated the 'best' plan a
   plan was 'best' by the check :

   'if ((search_depth == 1) || (current_read_time < join->best_read))'

   The term '(search_depth == 1' incorrectly caused a new best plan to be
   collected whenever the specified 'search_depth' was reached - even if
   this partial query plan was more expensive than what we had already
   found.
2011-12-24 08:55:10 -08:00
Igor Babaev
7a1406f229 Fixed LP bug #904832.
Do not perform index condition pushdown for conditions containing subqueries
and stored functions.
2011-12-18 23:38:37 -08:00
Sergey Petrunya
255fd6c929 Make subquery Materialization, as well as semi-join Materialization be shown
in EXPLAIN as select_type==MATERIALIZED. 

Before, we had select_type==SUBQUERY and it was difficult to tell materialized
subqueries from uncorrelated scalar-context subqueries.
2011-12-05 01:31:42 +04:00
Sergei Golubchik
d2755a2c9c 5.3->5.5 merge 2011-11-22 18:04:38 +01:00
Igor Babaev
e0c1b3f242 Fixed LP bug #886145.
The bug happened because in some cases the function JOIN::exec
did not save the value of TABLE::pre_idx_push_select_cond in
TABLE::select->pre_idx_push_select_cond for the sort table.

Noticed and fixed a bug in the function make_cond_remainder
that builds the remainder condition after extraction of an index
pushdown condition from the where condition. The code
erroneously assumed that the function make_cond_for_table left
the value of ICP_COND_USES_INDEX_ONLY in sub-condition markers.
Adjusted many result files from the regression test suite
after this fix .
2011-11-06 01:23:03 -07:00
Igor Babaev
e1194ad63b Merge. 2011-10-03 15:50:42 -07:00
Igor Babaev
715dc5f99d Fixed a cost estimation bug introduced into in the function best_access_path
of the 5.3 code line after a merge with 5.2 on 2010-10-28
in order not to allow the cost to access a joined table to be equal
to 0 ever.

Expanded data sets for many test cases to get the same execution plans
as before.
2011-09-30 18:55:02 -07:00
Michael Widenius
f0c6ce9ade Fixed issue with slow query logging where examined rows where wrong
Reset 'examined_rows_count' in union to not count same rows twice

mysql-test/r/subselect_mat_cost.result:
  Test also slow query logging
mysql-test/t/subselect_mat_cost.test:
  Test also slow query logging
sql/sql_union.cc:
  Reset 'examined_rows_count' in union to not count same rows twice
2011-09-23 13:55:01 +03:00
Sergey Petrunya
1492de8563 Set the default to be mrr=off,mrr_sort_keys=off:
- Set the default
- Adjust the testcases so that 'new' tests are run with optimizations turned on.
- Pull out relevant tests from "irrelevant" tests and run them with optimizations on.
- Run range.test and innodb.test with both mrr=on and mrr=off
2011-07-08 18:46:47 +04:00
unknown
e1d734f383 MWL#89
Removed forgotten EXPLAIN EXTENDED from the test file.
2011-06-21 23:01:01 +03:00
unknown
a02682abcc MWL#89
- Added regression test with queries over the WORLD database.
- Discovered and fixed several bugs in the related cost calculation
  functionality both in the semijoin and non-semijon subquery code.
- Added DBUG printing of the cost variables used to decide between
  IN-EXISTS and MATERIALIZATION.
2011-06-21 15:50:07 +03:00
unknown
759d71eba1 MWL#89
Split the tests for MWL#89 into two parts - one for bugs
(currently active), and one for functionality tets
(currently in progress, and thus disabled).

Disable the test for LP BUG#718593.
2011-02-22 12:44:58 +02:00
unknown
96efe1cab3 Fix for LP BUG#714808 and LP BUG#719280.
The patch also adjusts several instable test results
to order the result.

Analysis:

The function prev_record_reads() may skip (jump over)
some query plan nodes where record_count < 1. At the
same time, even though get_partial_join_cost() uses
all first N plan nodes after the last constant table,
it may produce a smaller record_count than
prev_record_reads(), because the record count for
some plan nodes may be < 1, and these nodes may not
participate in prev_record_reads.

Solution:
The current solution is to treat the result of
get_partial_join_cost() as the upper bound for the
total number of unique lookup keys.
2011-02-15 22:17:18 +02:00
unknown
6c45d903f0 Fix LP BUG#718578
This bug extends the fix for LP BUG#715027 to cover one
more case of an Item being transformed, and its property
Item::with_subselect not being updated because
quick_fix_fields doesn't recalculate any properties.
2011-02-14 16:50:10 +02:00
unknown
2aeb4170a0 Fix LP BUG#715027
Analysis:
Before calling:
  write_record= (select->skip_record(thd) > 0);
the function find_all_keys needs to restore the original read/write
sets of the table that is sorted if the condition select->cond
contains a subquery.

This didn't happen in this test case because the flag "with_subselect"
was not set properly for select->cond.

The reason for the flag not being set properly, was that this condition
was rewritten by add_cond_and_fix() inside make_join_select() by:

      /* Add conditions added by add_not_null_conds(). */
      if (tab->select_cond)
        add_cond_and_fix(thd, &tmp, tab->select_cond);

However, the function add_cond_and_fix() called the shortcut method
Item::quick_fix_field() that didn't update the "with_subselect"
property.

Solution:
Call the complete Item::fix_fields() to update all Item properties,
including "with_subselect".
2011-02-14 00:11:46 +02:00
unknown
cd34946657 Fix LP BUG#715034
Analysis:
The failed assert is a result of calling Item_sum_distinct::clear()
on an incomplete object for which Item_sum_distinct::setup() was
not yet called.

The reason is that JOIN::exec for the outer query calls JOIN::reinit()
for all its subqueries, which in turn calls clear() for all aggregate
functions of the subqueries. The call stack is:
mysql_explain_union -> mysql_select -> JOIN::exec -> select_desribe ->
mysql_explain_union -> mysql_select -> JOIN::reinit

This assert doesn't fail in the main 5.3 because constant subqueries
are being executed during the optimize phase of the outer query,
thus the Unique object is created before calling JOIN::exec for the
outer query, and Item_sum_distinct::clear() actually cleans the
Unique object.

Solution:
The best solution is the obvious one - substitute the assert with
a test whether Item_sum_distinct::tree is NULL.
2011-02-11 18:46:31 +02:00
unknown
ac6653aacc Fix LP BUG#714999
Analysis:
The crash in EXPLAIN resulted from an attempt to print the
name of the internal temporary table created to compute
distinct for the innermost subquery of the test case.
Such tables do not have a corresponding TABLE_LIST (table
reference), hence the crash. The reason for this was that
the subquery was executed as part of constant condition
evaluation before EXPLAIN attempts to print the table name.
During the subquery execution, the subquery JOIN_TAB and
its table are substituted by a temporary table in
make_simple_join.

Solution:
Similar to the analogous case for other Items than the
IS NULL function, do not evaluate expensive constant
conditions.
2011-02-10 22:53:30 +02:00
unknown
6a66bf3182 MWL#89
Fixed LP BUG#714808 Assertion `outer_lookup_keys <= outer_record_count'

Analysis:

The function best_access_path() computes the number or records as
follows:

...
      if (rec < MATCHING_ROWS_IN_OTHER_TABLE)
        rec= MATCHING_ROWS_IN_OTHER_TABLE; // Fix for small tables
...
              if (table->quick_keys.is_set(key))
                records= (double) table->quick_rows[key];
              else
              {
                /* quick_range couldn't use key! */
                records= (double) s->records/rec;
              }

Above MATCHING_ROWS_IN_OTHER_TABLE == 10, and s->records == 1,
thus we get an estimated 0.1 records. As a result JOIN::get_partial_join_cost()
for the outer query computes outer_record_count == 0.1 records, which is
meaningless in this context.

Solution:
Round row count estimates that are < 1 to 1.
2011-02-10 16:23:59 +02:00
unknown
acd46e32aa Test case for LP BUG#641245
The bug itself got fixed after merging with the main 5.3.
2010-11-05 17:10:48 +02:00
unknown
9f2bddbd80 Fixed LP BUG#652727 and LP BUG#643424.
The fixes for #643424 was part of the fix for #652727, that's why both
fixes are pushed together.

- The cause for #643424 was the improper use of get_partial_join_cost(),
  which assumed that the 'n_tables' parameter was the upper bound for
  query plan node indexes.
  Fixed by generalizing get_partial_join_cost() as a method that computes
  the cost of any partial join.

- The cause of #652727 was that JOIN::choose_subquery_plan() incorrectly
  deleted the contents of the old keyuse array in the cases when an injected
  plan would not provide more key accesses, and reoptimization was not actually
  performed.
2010-11-02 15:27:01 +02:00
unknown
f670b6d22f MWL#89: Cost-based choice between Materialization and IN->EXISTS transformation
Added missing logic to handle the case when subquery tables are optimized
away early during optimization.
2010-10-23 21:28:58 +03:00
unknown
e85a4cb6b5 MWL#89: Cost-based choice between Materialization and IN->EXISTS transformation
- Added more tests to the MWL#89 specific test, and made the test more modular.
- Updated test files.
- Fixed a memory leak.
- More comments.

mysql-test/r/subselect_mat.result:
  - Updated the test file to reflect the new optimizer switches related to
    materialized subquery execution.
  - Added one extra test to test all cases that expose BUG#40037 (this is an old bug from 5.x).
  - Updated the test result with correct results that expose BUG#40037.
mysql-test/t/subselect_mat.test:
  - Updated the test file to reflect the new optimizer switches related to
    materialized subquery execution.
  - Added one extra test to test all cases that expose BUG#40037 (this is an old bug from 5.x).
  - Updated the test result with correct results that expose BUG#40037.
sql/sql_select.cc:
  Fixed a memory leak reported by Valgrind.
2010-10-20 15:43:55 +03:00
unknown
addd57828d MWL#89: Cost-based choice between Materialization and IN->EXISTS transformation
- Changed the default optimizer switches to provide 5.1/5.2 compatible behavior
- Added a regression test file to test consistently all cases covered by MWL#89
- Added/corrected/improved comments.
2010-10-09 17:48:05 +03:00