Commit graph

891 commits

Author SHA1 Message Date
Sergey Petrunya
78497dbf5a Merge 2012-08-28 15:20:37 +04:00
Sergey Petrunya
2d99ea454f MDEV-430: Server crashes in select_describe on EXPLAIN with materialization+semijoin, etc
- Don't do early cleanup of uncorrelated subqueries if we're running an EXPLAIN.
2012-08-28 15:15:05 +04:00
Sergei Golubchik
9a64d0794c 5.3 merge 2012-08-27 18:13:17 +02:00
unknown
4d2b05b7d7 fix for MDEV-367
The problem was that was_null and null_value variables was reset in each reexecution of IN subquery, but engine rerun only for non-constant subqueries.

Fixed checking constant in Item_equal sort.
Fix constant reporting in Item_subselect.
2012-08-25 09:15:57 +03:00
unknown
80b3f74705 Fix bug mdev-447: Wrong output from the EXPLAIN command of the test case for lp bug #714999
The fix backports from MWL#182: Explain running statements the logic that
saves the original JOIN_TAB array of a query plan after optimization. This
array is later used during EXPLAIN to iterate over the original JOIN plan
nodes in the cases when this plan could be changed by early subquery
execution during the optimization phase of the outer query.
2012-08-21 15:24:43 +03:00
Sergey Petrunya
55597a4869 MDEV-410: EXPLAIN shows type=range, while SHOW EXPLAIN and userstat show full table scan is used
- Make Item_subselect::fix_fields() ignore UNCACHEABLE_EXPLAIN flag when deciding whether 
  the subquery item should be marked as constant.
2012-07-25 20:41:48 +04:00
unknown
0b93b444b6 Merged the fix for bug lp:944706, mdev-193 2012-06-19 15:06:45 +03:00
Sergey Petrunya
28f2c5641d 5.3->5.5 merge 2012-06-18 16:50:16 +04:00
unknown
db6dbadb5a Fix bug lp:1008686
Analysis:
The fix for bug lp:985667 implements the method Item_subselect::no_rows_in_result()
for all main kinds of subqueries. The purpose of this method is to be called from
return_zero_rows() and set Items to some default value in the case when a query
returns no rows. Aggregates and subqueries require special treatment in this case.

Every implementation of Item_subselect::no_rows_in_result() called
Item_subselect::make_const() to set the subquery predicate to its default value
irrespective of where the predicate was located in the query. Once the predicate
was set to a constant it was never executed.

At the same time, the JOIN object of the fake select for UNIONs (the one used for
the final result of the UNION), was set after all subqueries in the union were
executed. Since we set the subquery as constant, it was never executed, and the
corresponding JOIN was never created.

In order to decide whether the result of NOT IN is NULL or FALSE, Item_in_optimizer
needs to check if the subquery result was empty or not. This is where we got the
crash, because subselect_union_engine::no_rows() checks for
unit->fake_select_lex->join->send_records, and the join object was NULL.

Solution:
If a subquery is in the HAVING clause it must be evaluated in order to know its
result, so that we can properly filter the result records. Once subqueries in the
HAVING clause are executed even in the case of no result rows, this specific
crash will be solved, because the UNION will be executed, and its JOIN will be
constructed. Therefore the fix for this crash is to narrow the fix for lp:985667,
and to apply Item_subselect::no_rows_in_result() only when the subquery predicate
is in the SELECT clause.
2012-06-15 11:33:24 +03:00
unknown
c2677de7ac Merge the fix for lp:944706, mdev-193 2012-06-06 22:26:40 +03:00
unknown
8efc63ba5d Merge 2012-06-06 16:19:48 +03:00
unknown
f1ab00891a Fixed bug lp:1000649
Analysis:

When the method JOIN::choose_subquery_plan() decided to apply
the IN-TO-EXISTS strategy, it set the unit and select_lex
uncacheable flag to UNCACHEABLE_DEPENDENT_INJECTED unconditionally.
As result, even if IN-TO-EXISTS injected non-correlated predicates,
the subquery was still treated as correlated.

Solution:
Set the subquery as correlated only if the injected predicate(s) depend
on the outer query.
2012-06-05 17:25:10 +03:00
Sergei Golubchik
3e3606d21d merge with 5.3.
Take only test cases from MDEV-136 Non-blocking "set read_only"
2012-06-04 17:26:11 +02:00
unknown
7ddd5418d0 Fixed bug MDEV-288
CHEAP SQ: Valgrind warnings "Memory lost" with IN and EXISTS nested subquery, materialization+semijoin

Analysis:
The memory leak was a result of the interaction of semi-join optimization
with early optimization of constant subqueries. The function:
setup_jtbm_semi_joins() created a dummy temporary table "dummy_table"
in order to make some JOIN_TAB objects complete. Normally, such temporary
tables are freed inside JOIN_TAB::cleanup.

However, the inner-most subquery is pre-optimized, which allows the
optimization fo the MAX subquery to determine that its WHERE is TRUE,
and thus to compute the result of the MAX during optimization. This
ultimately allows the optimize phase of the outer query to find that
it WHERE clause is FALSE. Once JOIN::optimize finds that the result
set is empty, it sets zero_result_cause, and returns *before* it ever
reached make_join_statistics(). As a result the query plan has no
JOIN_TABs at all. Since the temporary table is supposed to be cleanup
via JOIN_TAB::cleanup, this never happens because there is no JOIN_TAB
for this table. Hence we get a memory leak.

Solution:
Whenever there are no JOIN_TABs, iterate over all table reference in
JOIN::join_list, and free the ones that contain semi-join temporary
tables.
2012-06-01 14:10:15 +03:00
unknown
941018f8d1 Patch for mdev-287: CHEAP SQ: A query with subquery in SELECT list, EXISTS, inner joins takes hundreds times longer
Analysis:

The fix for lp:944706 introduces early subquery optimization.
While a subquery is being optimized some of its predicates may be
removed. In the test case, the EXISTS subquery is constant, and is
evaluated to TRUE. As a result the whole OR is TRUE, and thus the
correlated condition "b = alias1.b" is optimized away. The subquery
becomes non-correlated.

The subquery cache is designed to work only for correlated subqueries.
If constant subquery optimization is disallowed, then the constant
subquery is not evaluated, the subquery remains correlated, and its
execution is cached. As a result execution is fast.

However, when the constant subquery was optimized away, it was neither
cached by the subquery cache, nor it was cached by the internal subquery
caching. The latter was due to the fact that the subquery still appeared
as correlated to the subselect_XYZ_engine::exec methods, and they
re-executed the subquery on each call to Item_subselect::exec.

Solution:

The solution is to update the correlated status of the subquery after it has
been optimized. This status consists of:
- st_select_lex::is_correlated
- Item_subselect::is_correlated
- SELECT_LEX::uncacheable
- SELECT_LEX_UNIT::uncacheable
The status is updated by st_select_lex::update_correlated_cache(), and its
caller st_select_lex::optimize_unflattened_subqueries. The solution relies
on the fact that the optimizer already called
st_select_lex::update_used_tables() for each subquery. This allows to
efficiently update the correlated status of each subquery without walking
the whole subquery tree.

Notice that his patch is an improvement over MySQL 5.6 and older, where
subqueries are not pre-optimized, and the above analysis is not possible.
2012-05-30 00:18:53 +03:00
Sergey Petrunya
1d3ba8a791 BUG#1000051: Query with simple join and ORDER BY takes thousands times longer when run with ICP
- Disable IndexConditionPushdown for reverse scans.
2012-05-23 11:46:40 +04:00
unknown
02bdc608b5 Fix bug lp:1002079
Analysis:
  The optimizer detects an empty result through constant table optimization.
  Then it calls return_zero_rows(), which in turns calls inderctly
  Item_maxmin_subselect::no_rows_in_result(). The latter method set "value=0",
  however "value" is pointer to Item_cache, and not just an integer value.
  
  All of the Item_[maxmin | singlerow]_subselect::val_XXX methods does:
    if (forced_const)
      return value->val_real();
  which of course crashes when value is a NULL pointer.
  
  Solution:
  When the optimizer discovers an empty result set, set
  Item_singlerow_subselect::value to a FALSE constant Item instead of NULL.
2012-05-22 15:22:55 +03:00
Sergei Golubchik
7f6f53a8df 5.2 merge 2012-05-20 14:57:29 +02:00
Sergei Golubchik
280fcf0808 5.1 merge 2012-05-18 14:23:05 +02:00
unknown
e5bca74bfb Fixed bug mdev-277 as part of the fix for lp:944706
The cause for this bug is that the method JOIN::get_examined_rows iterates over all
JOIN_TABs of the join assuming they are just a sequence. In the query above, the
innermost subquery is merged into its parent query. When we call
JOIN::get_examined_rows for the second-level subquery, the iteration that
assumes sequential order of join tabs goes outside the join_tab array and calls
the method JOIN_TAB::get_examined_rows on uninitialized memory. 

The fix is to iterate over JOIN_TABs in a way that takes into account the nested
semi-join structure of JOIN_TABs. In particular iterate as select_describe.
2012-05-18 14:52:01 +03:00
unknown
da5214831d Fix for bug lp:944706, task MDEV-193
The patch enables back constant subquery execution during
query optimization after it was disabled during the development
of MWL#89 (cost-based choice of IN-TO-EXISTS vs MATERIALIZATION).

The main idea is that constant subqueries are allowed to be executed
during optimization if their execution is not expensive.

The approach is as follows:
- Constant subqueries are recursively optimized in the beginning of
  JOIN::optimize of the outer query. This is done by the new method
  JOIN::optimize_constant_subqueries(). This is done so that the cost
  of executing these queries can be estimated.
- Optimization of the outer query proceeds normally. During this phase
  the optimizer may request execution of non-expensive constant subqueries.
  Each place where the optimizer may potentially execute an expensive
  expression is guarded with the predicate Item::is_expensive().
- The implementation of Item_subselect::is_expensive has been extended
  to use the number of examined rows (estimated by the optimizer) as a
  way to determine whether the subquery is expensive or not.
- The new system variable "expensive_subquery_limit" controls how many
  examined rows are considered to be not expensive. The default is 100.

In addition, multiple changes were needed to make this solution work
in the light of the changes made by MWL#89. These changes were needed
to fix various crashes and wrong results, and legacy bugs discovered
during development.
2012-05-17 13:46:05 +03:00
Sergei Golubchik
0a8c9b98f6 merge with mysql-5.1.63 2012-05-17 12:12:33 +02:00
Sergei Golubchik
431e042b5d c 2012-05-21 15:30:25 +02:00
Sergei Golubchik
44cf9ee5f7 5.3 merge 2012-05-04 07:16:38 +02:00
unknown
c04786d3e3 Fix bug lp:985667, MDEV-229
Analysis:

The reason for the wrong result is the interaction between constant
optimization (in this case 1-row table) and subquery optimization.

- First the outer query is optimized, and 'make_join_statistics' finds that
table t2 has one row, reads that row, and marks the whole table as constant.
This also means that all fields of t2 are constant.

- Next, we optimize the subquery in the end of the outer 'make_join_statistics'.
The field 'f2' is considered constant, with value '3'. The subquery predicate
is rewritten as the constant TRUE.

- The outer query execution detects early that the whole query result is empty
and calls 'return_zero_rows'. Since the query is with implicit grouping, we
have to produce one row with special values for the aggregates (depending on
each aggregate function), and NULL values for all non-aggregate fields.  This
function calls 'no_rows_in_result' to set each aggregate function to the
default value when it aggregates over an empty result, and then calls
'send_data', which in turn evaluates each Item in the SELECT list.

- When evaluation reaches the subquery predicate, it executes the subquery
with field 'f2' having a constant value '3', and the subquery produces the
incorrect result '7'.

Solution:

Implement Item::no_rows_in_result for all subquery predicates. In order to
make this work, it is also needed to make all val_* methods of all subquery
predicates respect the Item_subselect::forced_const flag. Otherwise subqueries
are executed anyways, and override the default value set by no_rows_in_result
with whatever result is produced from the subquery evaluation.
2012-04-27 12:59:17 +03:00
Georgi Kodinov
7fa28bcf56 merge mysql-5.5->mysql-5.5-security 2012-04-10 14:23:17 +03:00
Sergei Golubchik
16c5c53fc2 mysql 5.5.23 merge 2012-04-10 08:28:13 +02:00
Sergei Golubchik
f860b2aad4 merge 2012-04-07 15:58:46 +02:00
Sergey Petrunya
2a16e7674b BUG#913030: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
- When doing join optimization, pre-sort the tables so that they mimic the execution
  order we've had with 'semijoin=off'. 
- That way, we will not get regressions when there are two query plans (the old and the
  new) that have indentical costs but different execution times (because of factors that
  the optimizer was not able to take into account).
2012-04-02 21:41:54 +04:00
Tor Didriksen
daf4107355 merge 5.1 => 5.5 2012-03-27 14:55:29 +02:00
Tor Didriksen
10120d363d Backport of fix for Bug#12763207 - ASSERT IN SUBSELECT::SINGLE_VALUE_TRANSFORMER 2012-03-27 14:39:27 +02:00
Tor Didriksen
13053fbe54 Bug#13721076 CRASH WITH TIME TYPE/TIMESTAMP() AND WARNINGS IN SUBQUERY
The table contains one time value: '00:00:32'
This value is converted to timestamp by a subquery.

In convert_constant_item we call (*item)->is_null()
which triggers execution of the Item_singlerow_subselect subquery,
and the string "0000-00-00 00:00:32" is cached
by Item_cache_datetime.
We continue execution and call update_null_value, which calls val_int()
on the cached item, which converts the time value to ((longlong) 32)
Then we continue to do (*item)->save_in_field()
which ends up in Item_cache_datetime::val_str() which fails,
since (32 < 101) in number_to_datetime, and val_str() returns NULL.

Item_singlerow_subselect::val_str isnt prepared for this:
if exec() succeeds, and return !null_value, then val_str()
*must* succeed.

Solution: refuse to cache strings like "0000-00-00 00:00:32"
in Item_cache_datetime::cache_value, and return NULL instead.

This is similar to the solution for 
Bug#11766860 - 60085: CRASH IN ITEM::SAVE_IN_FIELD() WITH TIME DATA TYPE

This patch is for 5.5 only.
The issue is not present after WL#946, since a time value
will be converted to a proper timestamp, with the current date
rather than "0000-00-00"


mysql-test/r/subselect.result:
  New test case.
mysql-test/t/subselect.test:
  New test case.
sql/item.cc:
  Verify proper date format before caching timestamps.
sql/item_timefunc.cc:
  Use named constant for readability.
2012-03-14 13:25:14 +01:00
Sergei Golubchik
18c51eee35 5.3 merge 2012-03-06 20:46:07 +01:00
unknown
8a5940c477 Fix for LP BUG#944504
Problem is that subquery execution can't be called during prepare/optimize phase.

Also small fix for subquery test suite.
2012-03-05 15:48:12 +02:00
Igor Babaev
8b469eb515 Merge 5.3->5.5. 2012-03-01 14:22:22 -08:00
Igor Babaev
841a74a4d6 Fixed LP bug #939009.
The result of materialization of the right part of an IN subquery predicate
is placed into a temporary table. Each row of the materialized table is
distinct. A unique key over all fields of the temporary table is defined and
created. It allows to perform key look-ups into the table.
The table created for a materialized subquery can be accessed by key as
any other table. The function best_access-path search for the best access
to join a table to a given partial join. With some where conditions this
function considers a possibility of a ref_or_null access. If such access
employs the unique key on the temporary table then when estimating
the cost this access the function tries to use the array rec_per_key. Yet,
such array is not built for this unique key. This causes a crash of the server.

Rows returned by the subquery that contain nulls don't have to be placed
into temporary table, as they cannot be match any row produced by the
left part of the subquery predicate. So all fields of the temporary table
can be defined as non-nullable. In this case any ref_or_null access
to the temporary table does not make any sense and it does not make sense
to estimate such an access.

The fix makes sure that the temporary table for a materialized IN subquery
is defined with columns that are all non-nullable. The also ensures that 
any row with nulls returned by the subquery is not placed into the
temporary table.
2012-02-24 16:50:22 -08:00
unknown
f6cdddf51f Test case for bug lp:905353
The bug itself is fixed by the patch for bug lp:908269.
2012-02-09 23:35:26 +02:00
Sergei Golubchik
25609313ff 5.3.4 merge 2012-02-15 18:08:08 +01:00
Igor Babaev
7b79d8a33f Merge 5.2->5.3 in preparation for the release of mariadb-5.3.4-rc. 2012-02-01 15:48:02 -08:00
Igor Babaev
bb4053afc3 Fixed LP bug #919427.
The function subselect_uniquesubquery_engine::copy_ref_key has to take into
account that when EXPLAIN is processed the array of store_key object created
for any TABLE_REF may contain elements for constant items. These items should
be ignored by thefunction.
2012-01-20 23:54:43 -08:00
Sergei Golubchik
4f435bddfd 5.3 merge 2012-01-13 15:50:02 +01:00
unknown
cf31ccc33c Fix for LP BUG#908269 Wrong result with subquery in select list, EXISTS, constant MyISAM/Aria table.
Problem: When building the condition for JOIN::outer_ref_cond the optimizer forgot to take into account
that this condition could depend on constant tables as well.
2012-01-10 23:26:00 +02:00
Igor Babaev
4b7919368e Back-ported the test case for bug #12616253 from mariadb-5.3 that
was actually a duplicate of LP bug #888456 fixed in mariadb-5.2.
2012-01-14 00:02:02 -08:00
Igor Babaev
4de7978a3f Back-ported the fix and the test case for bug #50257 from mariadb-5.3 code line.
Adjusted results for a few test cases.
2012-01-13 19:00:50 -08:00
Igor Babaev
6dfe0956d6 Back-ported the test cases for bug #12763207 from mysql-5.6 code line into 5.2
Completed the fix for this bug.
Note: in 5.3 the affected 'if' statement in Item_in_subselect::single_value_transformer()
starting with the  condition (thd->variables.sql_mode & MODE_ONLY_FULL_GROUP_BY)
should be removed altogether. The change from table.cc is not needed either.
This is because in 5.3
 - min/max transformation for subqueries are done at the optimization phase
 - evaluation of the expensive subqueries is done at the execution phase.

Added an EXPLAIN EXTENDED to the test case for bug #12329653.
2012-01-13 12:23:19 -08:00
Igor Babaev
c9259f166b Fixed LP bug #904345.
The MIN/MAX optimizer code from the function opt_sum_query erroneously
did not take into account conjunctive conditions that did not depend on
any table, yet were not identified as constant items. These could be
items containing rand() or PS/SP parameters. These items are supposed
to be evaluated at the execution phase. That's why if such conditions
can be extracted from the WHERE condition the MIN/MAX optimization is
not applied as currently it is always done at the optimization phase.

(In 5.3 expensive subqueries are also evaluated only at the execution
phase. So, if a constant condition with such subquery can be extracted
from the WHERE clause the MIN/MAX optimization should not be applied 
in 5.3.)

IF an IN/ALL/SOME predicate with a constant left part is transformed
into an EXISTS subquery the resulting subquery should not be considered
uncacheable if the right part of the predicate is not uncacheable.

Backported the function dbug_print_item() from 5.3. The function is used
only for debugging.
2011-12-27 13:19:13 -08:00
Igor Babaev
2b1f0b8757 Back-ported the patch of the mysql-5.6 code line that
fixed several defects in the greedy optimization:

1) The greedy optimizer calculated the 'compare-cost' (CPU-cost)
   for iterating over the partial plan result at each level in
   the query plan as 'record_count / (double) TIME_FOR_COMPARE'

   This cost was only used locally for 'best' calculation at each
   level, and *not* accumulated into the total cost for the query plan.

   This fix added the 'CPU-cost' of processing 'current_record_count'
   records at each level to 'current_read_time' *before* it is used as
   'accumulated cost' argument to recursive 
   best_extension_by_limited_search() calls. This ensured that the
   cost of a huge join-fanout early in the QEP was correctly
   reflected in the cost of the final QEP.

   To get identical cost for a 'best' optimized query and a
   straight_join with the same join order, the same change was also
   applied to optimize_straight_join() and get_partial_join_cost()

2) Furthermore to get equal cost for 'best' optimized query and a
   straight_join the new code substrcated the same '0.001' in
   optimize_straight_join() as it had been already done in
   best_extension_by_limited_search()

3) When best_extension_by_limited_search() aggregated the 'best' plan a
   plan was 'best' by the check :

   'if ((search_depth == 1) || (current_read_time < join->best_read))'

   The term '(search_depth == 1' incorrectly caused a new best plan to be
   collected whenever the specified 'search_depth' was reached - even if
   this partial query plan was more expensive than what we had already
   found.
2011-12-24 08:55:10 -08:00
Igor Babaev
bad3e4179c Fixed LP bug #906322.
If the sorted table belongs to a dependent subquery then the function
create_sort_index() should not clear TABLE:: select and TABLE::select
for this table after the sort of the table has been performed, because
these members are needed for the second execution of the subquery.
2011-12-19 14:55:30 -08:00
unknown
072073c09e Backport of WL#5953 from MySQL 5.6
The patch differs from the original MySQL patch as follows:
- All test case differences have been reviewed one by one, and
  care has been taken to restore the original plan so that each
  test case executes the code path it was designed for.
- A bug was found and fixed in MariaDB 5.3 in
  Item_allany_subselect::cleanup().
- ORDER BY is not removed because we are unsure of all effects,
  and it would prevent enabling ORDER BY ... LIMIT subqueries.
- ref_pointer_array.m_size is not adjusted because we don't do
  array bounds checking, and because it looks risky.

Original comment by Jorgen Loland:
-------------------------------------------------------------
WL#5953 - Optimize away useless subquery clauses
      
For IN/ALL/ANY/SOME/EXISTS subqueries, the following clauses are 
meaningless:
      
* ORDER BY (since we don't support LIMIT in these subqueries)
* DISTINCT
* GROUP BY if there is no HAVING clause and no aggregate 
  functions
      
This WL detects and optimizes away these useless parts of the
query during JOIN::prepare()
2011-12-19 23:05:44 +02:00
Igor Babaev
7a1406f229 Fixed LP bug #904832.
Do not perform index condition pushdown for conditions containing subqueries
and stored functions.
2011-12-18 23:38:37 -08:00