When the definition of the index used for hash join was created
in create_hj_key_for_table() it could cause memory overwrite
due to a bug that led to an underestimation of the number of
the index component.
In the function JOIN::shrink_join_buffers the iteration over joined
tables was organized in a wrong way. This could cause a crash if
the optimizer chose to materialize a semi-join that used join caches
for which the sizes must be adjusted.
The guilty part of the test checks for performance degradation on
a query with numerous joins on an empty table. The test expects
the query to take less than 1 second, and fails if it is not so
(which can happen on very slow builders).
The solution is to add more JOINs to the query. On a fixed server,
it should not have any noticeable impact on the query execution,
while on the unfixed version the query would take several times
longer (e.g. 6.5 sec vs 1.5 sec). Thus, we can increase the margin
for the error, and make the test fail when the query takes longer
than 5 seconds.
Redefine FT_KEYPART in a way that it does not conflict with Hash Join.
Hash join stores field->field_index in KEYUSE::keypart, so we must
use a value of FT_KEYPART that's greater than MAX_FIELDS.
Avoided exponential recursive calls of JOIN_CACHE::join_records() in the case
of non-nested outer joins.
A different solution is required to resolve this performance problem for
nested outer joins.
The method JOIN_CACHE::init may fail (return 1) if some conditions on the
used join buffer is not satisfied. For example it fails if join_buffer_size
is greater than join_buffer_space_limit. The conditions should be checked
when running the EXPLAIN command for the query. That's why the method
JOIN_CACHE::init has to be called for EXPLAIN commands as well.
- make_join_readinfo() has the code that forces use of "Using temporary;
Using filesort" when join buffering is in use.
That code didn't handle all cases, in particular it didn't hande the case
where ORDER BY originally has tables from multiple columns, but the
optimizer eventually figures out that doing filesort() on one table
will be sufficient. Adjusted the code to handle that case.
BNL and BNLH joins pre-filter the records from a joined table via JOIN_TAB::cache_select->cond.
There is no need to re-evaluate the same conditions via JOIN_TAB::select_cond. This patch removes
the duplicated conditions from the top-level conjuncts of each pushed condition.
The added "Using where" in few EXPLAINs is due to taking into account tab->cache_select->cond
in addition to tab->select_cond in JOIN::save_explain_data_intern.
In some rare cases when the value of the system variable join_buffer_size
was set to a number less than 256 the function JOIN_CACHE::set_constants
determined the size of an offset in the join buffer equal to 1 though
the minimal join buffer required more than 256 bytes. This could cause
a crash of the server when records from the join buffer were read.
If the flag 'optimize_join_buffer_size' is set to 'off' and the value
of the system variable 'join_buffer_size' is greater than the value of
the system variable 'join_buffer_space_limit' than no join cache can
be employed to join tables of the executed query.
A bug in the function JOIN_CACHE::alloc_buffer allowed to use join
buffer even in this case while another bug in the function
revise_cache_usage could cause a crash of the server in this case if the
chosen execution plan for the query contained outer join or semi-join
operation.
of mysql-5.6 code line. The bugs could not be reproduced in the latest release
of mariadb-5.3 as they were fixed either when the code of subquery optimization
was back-ported from mysql-6.0 or later when some other bugs were fixed.
If the duplicate elimination strategy is used for a semi-join and potentially
one of the block-based join algorithms can be employed to join the inner
tables of the semi-join then sorting of the head (first non-constant) table
for a query with ORDER BY / GROUP BY cannot be used.
The execution plan cannot use sorting on the first table from the
sequence of the joined tables if it plans to employ the block-based
hash join algorithm.
KEYUSE elements for a possible hash join key are not sorted by field
numbers of the second table T of the hash join operation. Besides
some of these KEYUSE elements cannot be used to build any key as their
key expressions depend on the tables that are planned to be accessed
after the table T.
The code before the patch did not take this into account and, as a result,
execition of a query the employing block-based hash join algorithm could
cause a crash or return a wrong result set.
The function setup_semijoin_dups_elimination erroneously assumed
that if join_cache_level is set to 3 or 4 then the type of the
access to a table cannot be JT_REF or JT_EQ_REF. This could lead
to wrong query result sets.
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
******
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
******
small cleanup
of the 5.3 code line after a merge with 5.2 on 2010-10-28
in order not to allow the cost to access a joined table to be equal
to 0 ever.
Expanded data sets for many test cases to get the same execution plans
as before.
- Set the default
- Adjust the testcases so that 'new' tests are run with optimizations turned on.
- Pull out relevant tests from "irrelevant" tests and run them with optimizations on.
- Run range.test and innodb.test with both mrr=on and mrr=off
semijoin=on,firstmatch=on,loosescan=on
to
semijoin=off,firstmatch=off,loosescan=off
Adjust the testcases:
- Modify subselect*.test and join_cache.test so that all tests
use the same execution paths as before (i.e. optimizations that
are being tested are enabled)
- Let all other test files run with the new default settings (i.e.
with new optimizations disabled)
- Copy subquery testcases from these files into t/subselect_extra.test
which will run them with new optimizations enabled.
This crashing bug could manifest itself at execution of join queries
over materialized derived tables with IN subquery predicates in the
where clause. If for such a query the optimizer chose to use duplicate
weed-out with duplicates in a materialized derived table and chose to
employ join cache the the execution could cause a crash of the server.
It happened because the JOIN_CACHE::init method assumed that the value
of TABLE::file::ref is set at the moment when the method was called
for the employed join cache. It's true for regular tables, but it's
not true for materialized derived tables that are filled now at the
first access to them, i.e. after the JOIN_CACHE::init has done its job.
To fix this problem for any ROWID field of materialized derived table
the procedure that copies fields from record buffers into the employed
join buffer first checks whether the value of TABLE::file::ref has
been set for the table, and if it's not so the procedure sets this value.