Commit graph

71 commits

Author SHA1 Message Date
Sergey Petrunya
84a53543c5 BUG#965872: Server crashes in embedding_sjm on a simple 1-table select with AND and OR
- This is a regession introduced by fix for BUG#951937
- The problem was that there were scenarios where check_simple_equality() would create an
  Item_equal object but would not call item_equal->set_context_field() on it. 
- The fix was to add the missing calls.
2012-03-27 14:43:26 +04:00
Sergey Petrunya
ff731f39ca Merge 2012-03-26 21:38:24 +04:00
Sergey Petrunya
f2947f97a6 BUG#951283: Wrong result (missing rows) with semijoin+firstmatch, IN/ANY subquery
- The problem was with execution strategy for cases where FirstMatch's inner tables
  were interleaved with outer-uncorrelated tables.
- I was unable to find any cases where such join orders would be practically useful,
  so fixed it by disabling them.
2012-03-26 21:34:24 +04:00
Michael Widenius
4e96c579ae Sorted some test results that can be different on different machines
mysql-test/t/subselect_sj2.test:
  Added --sorted_result
2012-03-26 15:05:50 +03:00
Sergey Petrunya
e2554b50cd BUG#951937: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view
- Fix equality propagation to work with SJM nests and OR clauses (full descirption of problem and
  solution in the comment in the patch)
(The second commit with post-review fixes)
2012-03-26 13:47:00 +04:00
Sergey Petrunya
d054709827 BUG#962667: Assertion `0' failed in QUICK_INDEX_SORT_SELECT::need_sorted_output()
- The problem was that
  = we've picked a LooseScan that used full index scan (tab->type==JT_ALL) on certain index.
  = there was also a quick select (tab->quick!=NULL), that used other indexes.
  = some old code assumes that (tab->type==JT_ALL && tab->quick) -> means that the
    quick select should be used, which is not true.
Fixed by discarding the quick select as soon as we know we're using LooseScan
without using the quick select.
2012-03-25 18:31:35 +04:00
Igor Babaev
841a74a4d6 Fixed LP bug #939009.
The result of materialization of the right part of an IN subquery predicate
is placed into a temporary table. Each row of the materialized table is
distinct. A unique key over all fields of the temporary table is defined and
created. It allows to perform key look-ups into the table.
The table created for a materialized subquery can be accessed by key as
any other table. The function best_access-path search for the best access
to join a table to a given partial join. With some where conditions this
function considers a possibility of a ref_or_null access. If such access
employs the unique key on the temporary table then when estimating
the cost this access the function tries to use the array rec_per_key. Yet,
such array is not built for this unique key. This causes a crash of the server.

Rows returned by the subquery that contain nulls don't have to be placed
into temporary table, as they cannot be match any row produced by the
left part of the subquery predicate. So all fields of the temporary table
can be defined as non-nullable. In this case any ref_or_null access
to the temporary table does not make any sense and it does not make sense
to estimate such an access.

The fix makes sure that the temporary table for a materialized IN subquery
is defined with columns that are all non-nullable. The also ensures that 
any row with nulls returned by the subquery is not placed into the
temporary table.
2012-02-24 16:50:22 -08:00
Sergey Petrunya
8bedf1ea1c BUG#912538: Wrong result (missing rows) with semijoin=on, firstmatch=on, ...
- setup_semijoin_dups_elimination() would incorrectly set join_tab->do_firstmatch 
  when the join order had outer tables interleaved with inner.
2012-01-19 23:44:43 +04:00
Igor Babaev
2b1f0b8757 Back-ported the patch of the mysql-5.6 code line that
fixed several defects in the greedy optimization:

1) The greedy optimizer calculated the 'compare-cost' (CPU-cost)
   for iterating over the partial plan result at each level in
   the query plan as 'record_count / (double) TIME_FOR_COMPARE'

   This cost was only used locally for 'best' calculation at each
   level, and *not* accumulated into the total cost for the query plan.

   This fix added the 'CPU-cost' of processing 'current_record_count'
   records at each level to 'current_read_time' *before* it is used as
   'accumulated cost' argument to recursive 
   best_extension_by_limited_search() calls. This ensured that the
   cost of a huge join-fanout early in the QEP was correctly
   reflected in the cost of the final QEP.

   To get identical cost for a 'best' optimized query and a
   straight_join with the same join order, the same change was also
   applied to optimize_straight_join() and get_partial_join_cost()

2) Furthermore to get equal cost for 'best' optimized query and a
   straight_join the new code substrcated the same '0.001' in
   optimize_straight_join() as it had been already done in
   best_extension_by_limited_search()

3) When best_extension_by_limited_search() aggregated the 'best' plan a
   plan was 'best' by the check :

   'if ((search_depth == 1) || (current_read_time < join->best_read))'

   The term '(search_depth == 1' incorrectly caused a new best plan to be
   collected whenever the specified 'search_depth' was reached - even if
   this partial query plan was more expensive than what we had already
   found.
2011-12-24 08:55:10 -08:00
unknown
072073c09e Backport of WL#5953 from MySQL 5.6
The patch differs from the original MySQL patch as follows:
- All test case differences have been reviewed one by one, and
  care has been taken to restore the original plan so that each
  test case executes the code path it was designed for.
- A bug was found and fixed in MariaDB 5.3 in
  Item_allany_subselect::cleanup().
- ORDER BY is not removed because we are unsure of all effects,
  and it would prevent enabling ORDER BY ... LIMIT subqueries.
- ref_pointer_array.m_size is not adjusted because we don't do
  array bounds checking, and because it looks risky.

Original comment by Jorgen Loland:
-------------------------------------------------------------
WL#5953 - Optimize away useless subquery clauses
      
For IN/ALL/ANY/SOME/EXISTS subqueries, the following clauses are 
meaningless:
      
* ORDER BY (since we don't support LIMIT in these subqueries)
* DISTINCT
* GROUP BY if there is no HAVING clause and no aggregate 
  functions
      
This WL detects and optimizes away these useless parts of the
query during JOIN::prepare()
2011-12-19 23:05:44 +02:00
Igor Babaev
a910e8ef5b Made join_cache_level == 2 by default. 2011-12-15 14:26:59 -08:00
Igor Babaev
f5dac20f38 Made the optimizer switch flags 'outer_join_with_cache', 'semijoin_with_cache'
set to 'on' by default.
2011-12-15 00:21:15 -08:00
Sergey Petrunya
ae480437ce Small semi-join optimization improvement:
- if we're considering FirstMatch access with one inner table, and 
  @@optimizer_switch has semijoin_with_cache flag, calculate costs
  as if we used join cache (because we will be able to do so)
2011-12-08 04:22:38 +04:00
Sergey Petrunya
136408b1cf Bug #899962: materialized subquery with join_cache_level=3
- Make create_tmp_table() set KEY_PART_INFO attributes for the keys it creates.
  This wasn't needed before but is needed now, when temp. tables that are 
  results of SJ-Materialization are being used for joins.
  This particular bug depended on HA_VAR_LENGTH_PART being set,
  but also added code to set HA_BLOB_PART and HA_NULL_PART when appropriate.
2011-12-06 01:04:27 +04:00
Sergey Petrunya
255fd6c929 Make subquery Materialization, as well as semi-join Materialization be shown
in EXPLAIN as select_type==MATERIALIZED. 

Before, we had select_type==SUBQUERY and it was difficult to tell materialized
subqueries from uncorrelated scalar-context subqueries.
2011-12-05 01:31:42 +04:00
Igor Babaev
b5a05df61e Fixed LP bug #899696.
If has been decided that the first match strategy is to be used to join table T
from a semi-join nest while no buffer can be employed to join this table
then no join buffer can be used to join any table in the join sequence between
the first one belonging to the semi-join nest and table T.
2011-12-04 07:43:33 -08:00
Igor Babaev
2f9734172f Fixed LP bug #898073.
The tables from the same semi-join or outer join nest cannot use
join buffers if in the join sequence of the query execution plan
they are separated by a table that is planned to be joined without
usage of a join buffer.
2011-11-30 10:22:53 -08:00
Sergey Petrunya
3a9edc5f77 Merge 2011-11-25 14:28:43 +04:00
Sergey Petrunya
f84dbf4b20 Semi-join optimizations code cleanup part 2:
- Make EXPLAIN display "Start temporary" at the start of the fanout (it used to display
  at the first table whose rowid gets into temp. table which is not that useful for
  the user)
- Updated test results (all checked)
2011-11-25 05:56:58 +04:00
Sergey Petrunya
694ce95557 Semi-join optimizations code cleanup:
- Break down POSITION/advance_sj_state() into four classes 
  representing potential semi-join strategies.

- Treat all strategies uniformly (before, DuplicateWeedout 
  was special as it was the catch-all strategy. Now, we're 
  still relying on it to be the catch-all, but are able to 
  function,e.g. with firstmatch=on,duplicate_weedout=off.

- Update test results (checked)
2011-11-23 04:25:52 +04:00
unknown
511459bd14 Enable subquery materialization=ON by default. 2011-11-09 15:36:25 +02:00
Igor Babaev
a8f7c03c1e Fixed LP bug #881318.
If a materialized derived table / view is empty then for this table
the value of file->ref is 0. This was not taken into account by
the function JOIN_CACHE::write_record_data. As a result a query
using an empty materialized derived tables as inner tables of outer
joins and IN subqueries in WHERE conditions could cause server crashes
when the optimizer employed join caches and duplicate elimination for
semi-joins.
2011-10-25 14:18:19 -07:00
Sergey Petrunya
2160a25adc BUG#869001: Wrong result with semijoin + materialization + firstmatch + multipart key
- Make advance_sj_state() not to attempt building duplicate removal strategies
  when we're doing optimization of an SJM-nest.
2011-10-12 13:19:37 +04:00
Sergey Petrunya
44f4cf6f40 Fix testcases from the previous push: add missing DROP TABLE statement 2011-10-11 23:48:50 +04:00
Sergey Petrunya
039da95e3d BUG#869012: Wrong result with semijoin + materialization + AND in WHERE
- in make_join_select(), use the correct condition to check whether the current table is a SJM nest (the previous 
  condition used to be correct before, but then sj-materialization temp table creation was moved to happen before
  make_join_select was called)
2011-10-11 21:34:00 +04:00
Igor Babaev
e1194ad63b Merge. 2011-10-03 15:50:42 -07:00
Igor Babaev
715dc5f99d Fixed a cost estimation bug introduced into in the function best_access_path
of the 5.3 code line after a merge with 5.2 on 2010-10-28
in order not to allow the cost to access a joined table to be equal
to 0 ever.

Expanded data sets for many test cases to get the same execution plans
as before.
2011-09-30 18:55:02 -07:00
Sergey Petrunya
4908d27b57 BUG#858732: Wrong result with semijoin + loosescan + comma join
- Fix wrong loop bounds in setup_semijoin_dups_elimination()
2011-09-26 13:56:09 +04:00
Sergey Petrunya
f0323a40d8 BUG#849763: Wrong result with second execution of prepared statement with semijoin + view
- The problem was that Item_direct_view_ref and its embedded Item_field were getting incorrect
  value of item->used_tables() after fix_fields() in the second and subsequent EXECUTE.
- Made relevant fixes in Item_field::fix_fields() and find_field_in_tables(), so that the 
  Item_field gets the correct attributes.
2011-09-20 20:40:07 +04:00
Sergey Petrunya
265b51df73 Merge 2011-07-19 11:45:46 +04:00
Igor Babaev
03081bc1fd Changed the default setting of the optimizer switch 'optimize_join_buffer_size'.
Made it 'off' by default.
2011-07-14 22:24:59 -07:00
Sergey Petrunya
56a23357ae BUG#803457: Wrong result with semijoin + view + outer join in maria-5.3-subqueries-mwl90
(This is not a real fix for this bug, even though it makes it to no longer repeat)
- Semi-join subquery predicates, i.e. ... WHERE outer_expr IN (SELECT ...) may have null-rejecting properties,
  may allow to convert outer joins into inner.
- When convert_subq_to_sj() injected IN-equality into parent's WHERE/ON clause, it didn't call 
  $new_cond->top_level_item(), which would cause null-rejecting properties to be lost.
- Fixed, now the mentioned outer-to-inner conversion will really take place.
2011-07-15 02:58:34 +04:00
Sergey Petrunya
85571ea76c Disable LooseScan and FirstMatch when outer joins are present. 2011-07-14 01:53:05 +04:00
Sergey Petrunya
1492de8563 Set the default to be mrr=off,mrr_sort_keys=off:
- Set the default
- Adjust the testcases so that 'new' tests are run with optimizations turned on.
- Pull out relevant tests from "irrelevant" tests and run them with optimizations on.
- Run range.test and innodb.test with both mrr=on and mrr=off
2011-07-08 18:46:47 +04:00
Sergey Petrunya
9e95a54793 Merge fix for BUG#803365 2011-07-05 21:48:50 +04:00
Sergey Petrunya
05d54b121c BUG#803365: Crash in pull_out_semijoin_tables with outer join + semijoin + derived tables in maria-5.3 with WL#106
- Don't perform table pullout out of semi-join nests that have nested outer joins.
2011-07-05 21:22:13 +04:00
Sergey Petrunya
c1de6f8b77 Change the default @@optimizer_switch setting from
semijoin=on,firstmatch=on,loosescan=on
to
  semijoin=off,firstmatch=off,loosescan=off
Adjust the testcases:
- Modify subselect*.test and join_cache.test so that all tests
  use the same execution paths as before (i.e. optimizations that
  are being tested are enabled)
- Let all other test files run with the new default settings (i.e.
  with new optimizations disabled)
- Copy subquery testcases from these files into t/subselect_extra.test
  which will run them with new optimizations enabled.
2011-07-05 01:44:15 +04:00
Sergey Petrunya
b0f423c5d3 BUG#761598: InnoDB: Error: row_search_for_mysql() is called without ha_innobase::external_lock() in maria-5.3
- Testcase
2011-06-15 15:32:24 +04:00
Sergey Petrunya
5cd18326c2 Merge 5.3->main -> 5.3-mwl90 2011-05-29 12:58:44 +04:00
Sergey Petrunya
77b3b960b1 Merge MWL#90 with 5.3-main 2011-05-25 19:31:13 +04:00
Sergey Petrunya
4d7c06b399 Stabilize a testcase after fix for BUG#784723 2011-05-20 14:15:22 +04:00
Sergey Petrunya
23cfe1261d Merge fix for BUG#784723 2011-05-20 10:13:02 +04:00
Sergey Petrunya
d1138283b9 BUG#784723: Wrong result with semijoin + nested subqueries in maria-5.3
- in advance_sj_state(), remember join->cur_dups_producing_tables in 
  pos->prefix_dups_producing_tables *before* we modify it, so that 
  restore_prev_sj_state() restores cur_dups_producing_tables in all cases.
- Updated test results in subselect_sj2[_jcl6].result (the original EXPLAIN
  was invalid there)
2011-05-20 01:05:06 +04:00
unknown
5dc11616b2 MWL#89
Merge with 5.3
2011-05-02 21:59:16 +03:00
Sergey Petrunya
3098c21b8f Merge MWL#90 into 5.3-main 2011-04-30 04:59:05 +04:00
Sergey Petrunya
997445bc8e Make EXPLAIN better at displaying MRR/BKA:
- "Using MRR" is no longer shown with range access.
- Instead, both range and BKA accesses will show one of the following:
  = "Rowid-ordered scan"
  = "Key-ordered scan"
  = "Key-ordered Rowid-ordered scan"
depending on whether DS-MRR implementation will do scan keys in order, rowids in order,
or both.
- The patch also introduces a way for other storage engines/MRR implementations to
  pass information to EXPLAIN output about the properties of employed MRR scans.
2011-04-02 14:04:45 +04:00
unknown
71e9d94895 MWL#89
Merge 5.3 into 5.3-mwl89.
2011-03-01 15:54:21 +02:00
unknown
7895c35874 MWL#89
Merge MWL#89 with 5.3.
2011-03-01 14:16:28 +02:00
Sergey Petrunya
cb147b3965 Merge 5.3 -> 5.3-subqueries-mwl90 2011-03-01 13:21:48 +03:00
Igor Babaev
272e5e6212 BNLH algorithm always used a full table scan over the joined table
even in the cases when there existed range/index-merge scans that
were cheaper than the full table scan.
This was a defect/bug of the implementation of mwl #128. 
Now hash join can work not only with full table scan of the joined
table, but also with full index scan, range and index-merge scans.
Accordingly, in the cases when hash join is used the column 'type'
in the EXPLAINs can contain now 'hash_ALL', 'hash_index', 'hash_range'
and 'hash_index_merge'. If hash join is coupled with a range/index_merge
scan then the columns 'key' and 'key_len' contain info not only on
the used hash index, but also on the indexes used for the scan.
2011-02-23 22:23:12 -08:00