This allows one to run the test suite even if any of the following
options are changed:
- character-set-server
- collation-server
- join-cache-level
- log-basename
- max-allowed-packet
- optimizer-switch
- query-cache-size and query-cache-type
- skip-name-resolve
- table-definition-cache
- table-open-cache
- Some innodb options
etc
Changes:
- Don't print out the value of system variables as one can't depend on
them to being constants.
- Don't set global variables to 'default' as the default may not
be the same as the test was started with if there was an additional
option file. Instead save original value and reset it at end of test.
- Test that depends on the latin1 character set should include
default_charset.inc or set the character set to latin1
- Test that depends on the original optimizer switch, should include
default_optimizer_switch.inc
- Test that depends on the value of a specific system variable should
set it in the test (like optimizer_use_condition_selectivity)
- Split subselect3.test into subselect3.test and subselect3.inc to
make it easier to set and reset system variables.
- Added .opt files for test that required specfic options that could
be changed by external configuration files.
- Fixed result files in rockdsb & tokudb that had not been updated for
a while.
In this case we are accessing incorrect memory when we have mergeable semi-joins.
In the case when we have mergeable semi joins parent select will have a table count
of all the tables in that select plus all the tables involved in the IN-subquery.
But this table count does not include the "sjm table" (only includes the inner and outer tables)
denotes as <subquery#> in explain.
For non-semi-join subquery optimization we do a cost based decision between
Materialisation and IN -> EXIST transformation. The issue in this case is that for IN->EXIST transformation
we run JOIN::reoptimize with the IN->EXISt conditions and we come up with a new query plan. But when we compare
the cost with Materialization, we make the decision to chose Materialization so we need to restore the query plan
for Materilization.
The saving and restoring for keyuse array and join_tab keyuse is only done when we have atleast
one element in the keyuse_array , we are now changing to do it even for 0 elements to main the generality.
This is actually a legacy bug:
SQL_SELECT::test_quick_select() was called
with SQL_SELECT::head not set.
It looks like that this problem can be
reproduced only on queries with ORDER BY
that use IN predicates converted to semi-joins.
Significantly reduce the amount of InnoDB, XtraDB and Mariabackup
code changes by defining pfs_os_file_t as something that is
transparently compatible with os_file_t.
When an IN subquery predicate was converted to a semi-join that were
materialized and the result of the materialization happened to be
the last in the execution plan then any conjunctive condition with RAND()
turned out to be lost.
Fixed by attaching this condition to the last top base table.
Also fixes:
MDEV-9391 InnoDB does not produce warnings when doing WHERE int_column=varchar_column
MDEV-9337 ALTER from DECIMAL and INT to DATETIME returns a wrong result
MDEV-9340 Copying from INT/DOUBLE to ENUM is inconsistent
MDEV-9392 Copying from DECIMAL to YEAR is not consistent about warnings
- Make semi-join optimizer not to choose LooseScan
when 1) the index is not covered and 2) full index
scan will be required.
- Make sure that the code in make_join_select() that may change
full index scan into a range scan is not invoked when the table
uses full scan.
In original code, sometimes one got an automatic DEFAULT value in some cases, in other cases not.
For example:
create table t1 (a int primary key) - No default
create table t2 (a int, primary key(a)) - DEFAULT 0
create table t1 SELECT .... - Default for all fields, even if they where defined as NOT NULL
ALTER TABLE ... MODIFY could sometimes add an unexpected DEFAULT value.
The patch is quite big because we had some many test cases that used
CREATE ... SELECT or CREATE ... (...PRIMARY KEY(xxx)) which doesn't have an automatic DEFAULT anymore.
Other things:
- Removed warnings from InnoDB when waiting from semaphore (got this when testing things with --big)
JOIN::cur_dups_producing_tables was not maintained correctly in
the cases of greedy optimization (search_depth < n_tables).
Moved it to POSITION structure where it will be maintained automatically.
Removed POSITION::prefix_dups_producing_tables since its value can now
be calculated.
- When the optimizer chose LooseScan, make_join_readinfo() should
use the index that was chosen for LooseScan, and should not try
to find a better (shortest) index.
This patch almost totally revised the patch for bug mdev-4177.
The latter had too many defects. In particular, it did not
propagate multiple equalities formed when merging a degenerate
disjunct into underlying AND formula.
This bug was the result of incompleteness of the patch for bug mdev-4177.
When an OR condition is simplified to a single conjunct it is merged
into the embedding AND condition. Multiple equalities are also merged,
and any field item involved in those equality should acquire a pointer
to a the multiple equality formed by this merge.
The function remove_eq_cond removes the parts of a disjunction
for which it has been proved that they are always true. In the
result of this removal the disjunction may be converted into a
formula without OR that must be merged into the the AND formula
that contains the disjunction.
The merging of two AND conditions must take into account the
multiple equalities that may be part of each of them.
These multiple equality must be merged and become part of the
and object built as the result of the merge of the AND conditions.
Erroneously the function remove_eq_cond lacked the code that
would merge multiple equalities of the merged AND conditions.
This could lead to confusing situations when at the same AND
level there were two multiple equalities with common members
and the list of equal items contained only some of these
multiple equalities.
This, in its turn, could lead to an incorrect work of the
function substitute_for_best_equal_field when it tried to optimize
ref accesses. This resulted in forming invalid TABLE_REF objects
that were used to build look-up keys when materialized subqueries
were exploited.
This bug in the legacy code could manifest itself in queries with
semi-join materialized subqueries.
When a subquery is materialized all conditions that are imposed
only on the columns belonging to the tables from the subquery
are taken into account.The code responsible for subquery optimizations
that employes subquery materialization makes sure to remove these
conditions from the WHERE conditions of the query obtained after
it has transformed the original query into a query with a semi-join.
If the condition to be removed is an equality condition it could
be added to ON expressions and/or conditions from disjunctive branches
(parts of OR conditions) in an attempt to generate better access keys
to the tables of the query. Such equalities are supposed to be removed
later from all the formulas where they have been added to.
However, erroneously, this was not done in some cases when an ON
expression and/or a disjunctive part of the OR condition could
be converted into one multiple equality. As a result some equality
predicates over columns belonging to the tables of the materialized
subquery remained in the ON condition and/or the a disjunctive
part of the OR condition, and the excuter later, when trying to
evaluate them, returned wrong answers as the values of the fields
from these equalities were not valid.
This happened because any standalone multiple equality (a multiple
equality that are not ANDed with any other predicates) lacked
the information about equality predicates inherited from upper
levels (in particular, inherited from the WHERE condition).
The fix adds a reference to such information to any standalone
multiple equality.
The wrong result set returned by the left join query from
the bug test case happened due to several inconsistencies
and bugs of the legacy mysql code.
The bug test case uses an execution plan that employs a scan
of a materialized IN subquery from the WHERE condition.
When materializing such an IN- subquery the optimizer injects
additional equalities into the WHERE clause. These equalities
express the constraints imposed by the subquery predicate.
The injected equality of the query in the test case happens
to belong to the same equality class, and a new equality
imposing a condition on the rows of the materialized subquery
is inferred from this class. Simultaneously the multiple
equality is added to the ON expression of the LEFT JOIN
used in the main query.
The inferred equality of the form f1=f2 is taken into account
when optimizing the scan of the rows the temporary table
that is the result of the subquery materialization: only the
values of the field f1 are read from the table into the record
buffer. Meanwhile the inferred equality is removed from the
WHERE conditions altogether as a constraint on the fields
of the temporary table that has been used when filling this table.
This equality is supposed to be removed from the ON expression
when the multiple equalities of the ON expression are converted
into an optimal set of equality predicates. It supposed to be
removed from the ON expression as an equality inferred from only
equalities of the WHERE condition. Yet, it did not happened
due to the following bug in the code.
Erroneously the code tried to build multiple equality for ON
expression twice: the first time, when it called optimize_cond()
for the WHERE condition, the second time, when it called
this function for the HAVING condition. When executing
optimize_con() for the WHERE condition a reference
to the multiple equality of the WHERE condition is set
in the multiple equality of the ON expression. This reference
would allow later to convert multiple equalities of the
ON expression into equality predicates. However the
the second call of build_equal_items() for the ON expression
that happened when optimize_cond() was called for the
HAVING condition reset this reference to NULL.
This bug fix blocks calling build_equal_items() for ON
expressions for the second time. In general, it will be
beneficial for many queries as it removes from ON
expressions any equalities that are to be checked for the
WHERE condition.
The patch also fixes two bugs in the list manipulation
operations and a bug in the function
substitute_for_best_equal_field() that resulted
in passing wrong reference to the multiple equalities
of where conditions when processing multiple
equalities of ON expressions.
The code of substitute_for_best_equal_field() and
the code the helper function eliminate_item_equal()
were also streamlined and cleaned up.
Now the conversion of the multiple equalities into
an optimal set of equality predicates first produces
the sequence of the all equalities processing multiple
equalities one by one, and, only after this, it inserts
the equalities at the beginning of the other conditions.
The multiple changes in the output of EXPLAIN
EXTENDED are mainly the result of this streamlining,
but in some cases is the result of the removal of
unneeded equalities from ON expressions. In
some test cases this removal were reflected in the
output of EXPLAIN resulted in disappearance of
“Using where” in some rows of the execution plans.
This bug happened because the executor tried to use a wrong
TABLE REF object when building access keys. It constructed
keys from fields of a materialized table from a ref object
created to construct keys from the fields of the underlying
base table. This could happen only when materialized table
was created for a non-correlated IN subquery and only
when the materialized table used for lookups.
In this case we are guaranteed to be able to construct the
keys from the fields of tables that would be outer tables
for the tables of the IN subquery.
The patch makes sure that no ref objects constructed from
fields of materialized lookup tables are to be used.
- When doing join optimization, pre-sort the tables so that they mimic the execution
order we've had with 'semijoin=off'.
- That way, we will not get regressions when there are two query plans (the old and the
new) that have indentical costs but different execution times (because of factors that
the optimizer was not able to take into account).