The function st_select_lex_unit::exec_recursive() missed resetting of
select_limit_cnt and offset_limit_cnt before execution of union parts.
As a result recursive CTEs specified by UNIONs whose SELECTs contained
LIMIT/OFFSET could return wrong sets of records.
The problem was that the original alias was replaced with a new allocated
string, but constraint item's are still pointing to the original alias.
Fixed by storing the original alias used when printing constraint in the
tables mem_root.
This patch fills a serious flaw in the implementation of common table
expressions. Before this patch an attempt to prepare a statement from
a query with a parameter marker in a CTE that was used more than once
in the query ended up with a bogus error message. Similarly if a statement
in a stored procedure contained a CTE whose specification used a
local variables and this CTE was referred to more than once in the
statement then the server failed to execute the stored procedure returning
a bogus error message on a non-existing field.
The problems appeared due to incorrect handling of parameter markers /
local variables in CTEs that were referred more than once.
This patch fixes the problems by differentiating between the original
occurrences of a parameter marker / local variable used in the
specification of a CTE and the corresponding occurrences used
in copies of this specification. These copies are substituted
instead of non-first references to the CTE.
The idea of the fix and even some code were taken from the MySQL
implementation of the common table expressions.
This problem manifested itself when a join query used two or more
materialized CTE such that each of them employed the same recursive CTE.
The bug caused a crash. The crash happened because the cleanup()
function was performed premature for recursive CTE. This clean up was
induced by the cleanup of the first CTE referenced the recusrsive CTE.
This cleanup destroyed the structures that would allow to read from the
temporary table containing the rows of the recursive CTE and an attempt to read
these rows for the second CTE referencing the recursive CTE triggered a
crash.
The clean up for a recursive CTE R should be performed after the cleanup
of the last materialized CTE that uses R.
The problem was that join_columns creation was not finished due to error of notfound column in USING, but next execution tried to use join_columns lists.
Solution is cleanup the lists on error. It can eat memory in statement MEM_ROOT but it is an error and error will be fixed or statement/procedure removed/altered.
Field_iterator_table_ref::set_field_iterator
Several functions that processed different prepare statements missed
the DT_INIT flag in last parameter of the open_normal_and_derived_tables()
calls. It made context analysis of derived tables dependent on the order in
which the derived tables were processed by mysql_handle_derived(). This
order was induced by the order of SELECTs in all_select_list.
In 10.4 the order of SELECTs in all_select_list became different and lack
of the DT_INIT flags in some open_normal_and_derived_tables() call became
critical as some derived tables were not identified as such.
optimizer_use_condition_selectivity>=3
Selectivity analysis should be disabled for Geometrical columns
for the case like geometric_field= string_constant.
Currently for selectivity calculation we perform range analysis for a column even when we don't have any statistics(EITS).
This makes less sense but is used to catch contradiction for WHERE condition.
So the solution is to not perform range analysis for selectivity calculation for columns that do not have statistics.
The bug appears because of the Item_func_in::build_clone() method.
The 'array' field for the Item_func_in item that can be pushed into
the materialized view/derived table was built in the wrong way.
It becomes lame after the pushdown of the condition into the first
SELECT that defines that view/derived table. The server crashes in
the pushdown into the next SELECT while trying to use already lame
'array' field.
To fix it Item_func_in::build_clone() was changed.
and use_stat_tables= PREFERABLY
Currently the code that calculates selectivity for a table does not take into account the case when
we can have GROUP BY optimization (looses index scan).
The bug appears because of the wrong pushdown into the WHERE clause of the
materialized derived table/view work. For the excl_dep_on_grouping_fields()
method that checks if the condition can be pushed into the WHERE clause
the case when Item_cond is used is missing. For Item_cond elements this
method always returns positive result (that condition can be pushed).
So this condition is pushed even if is shouldn't be pushed.
To fix it new Item_cond::excl_dep_on_grouping_fields() method is added.
using INSERT INTO
This patch allows condition pushdown into a materialized derived / view when
this table is used in INSERT SELECT, multi-table UPDATE and multi-table DELETE.
This patch introduces support for the system variable eq_range_index_dive_limit
that existed in MySQL starting from 5.6. The variable sets a limit for
index dives into equality ranges. Index dives are performed by optimizer
to estimate the number of rows in range scans. Index dives usually provide
good estimate but they are pretty expensive. To estimate the number of rows
in equality ranges statistical data on indexes can be employed. Its usage gives
not so good estimates but it's cheap. So if the number of equality dives
required by an index scan exceeds the set limit no dives for equality
ranges are performed by the optimizer for this index.
As the new system variable is introduced in a stable version the default
value for it is set to a special value meaning there is no limit for the number
of index dives performed by the optimizer.
The patch partially uses the MySQL code for WL 5957
'Statistics-based Range optimization for many ranges'.
Item_subselect::is_expensive() used to return FALSE (Inexpensive) whenever
it saw that one of SELECTs in the Subquery's UNION is degenerate. It
ignored the fact that other parts of the UNION might not be inexpensive,
including the case where pther parts of the UNION have no query plan yet.
For a subquery in form col >= ANY (SELECT 'foo' UNION SELECT 'bar')
this would cause the query to be considered inexpensive when there is
no query plan for the second part of the UNION, which in turn would
cause the SELECT 'foo' to compute and free itself while still inside
JOIN::optimize for that SELECT (See MDEV comment for full description).
truncate incorrect values in convert_period_to_month() so that
PERIOD_DIFF never returns a value outside of 2^23 range.
And, for safety, increase buffer sizes for int10_to_str
to be sufficienly big for any int10_to_str result.
After the commit b76b69cd5f
loose index scan for queries with DISTINCT stopped working.
That is why that commit has to be reverted.
Additionally this patch fixes the problem of MDEV-10880.
We do not accept:
1. We did not have this problem (fixed earlier and better)
d982e717ab Bug#27510150: MYSQLDUMP FAILS FOR SPECIFIC --WHERE CLAUSES
2. We do not have such options (an DBUG_ASSERT put just in case)
bbc2e37fe4 Bug#27759871: BACKRONYM ISSUE IS STILL IN MYSQL 5.7
3. Serg fixed it in other way in this release:
e48d775c6f Bug#27980823: HEAP OVERFLOW VULNERABILITIES IN MYSQL CLIENT LIBRARY
The problem happened in the derived condition pushdown code:
- When Item_func_regex::build_clone() was called, it created a copy of
the original Item_func_regex, and this copy got registered in free_list.
Class specific additional dynamic members (such as "re") made
a shallow copy, rather than a deep copy, in the cloned Item_func_regex.
As a result, the Regexp_processor_pcre::m_pcre of the cloned Item_func_regex
and of the original Item_func_regex pointed to the same compiled regular
expression.
- On cleanup_items(), both original and cloned copies of Item_func_regex
called re.cleanup(), which called pcre_free(m_pcre). So the same compiled
regular expression was freed two times, which was noticed by ASAN.
The same problem was repeatable for Item_func_regexp_instr.
A similar problem happened for Item_func_sp, for the sp_result_field member.
Both original and cloned copies of Item_func_sp pointed the same Field instance
and both deleted it on cleanup().
A possible solution would be to fix build_clone() to create deep
(instead of shallow) copies for the dynamic members of the affected classes
(Item_func_regex, Item_func_regexp_instr, Item_func sp).
However, this would be too complex.
As agreed with Galina and Igor, this patch disallows using using these
affected classes in derived condition pushdown by overriding get_clone()
to return NULL.
Assertion `used_tables_cache == 0' failed
This bug manifested itself when executing queries
over materialized derived tables /vies and with
conjunctive always true predicates containing
inexpensive single-row subqueries.
This bug disappeared after the patch mdev-15035
had been applied.
This patch fixes another problem introduced by the patch for mdev-4817.
The latter changed Item_cond::fix_fields() in such a way that it could
call the virtual method is_expensive(). With the first its call
the method saves the result in Item::is_expensive_cache. For all next
calls the method returns the result from this cache. So if the item
once was determined as expensive the method always returns true.
For subqueries it's not good, because non-optimized subqueries always
is considered as expensive.
It means that the cache should be invalidated after the call of
optimize_constant_subqueries().
In this case we are setting the field Item_func_eq::in_eqaulity_no for the semi-join equalities.
This helps us to remove these equalites as the inner tables are not available during parent select execution
while the outer tables are not available during materialization phase.
We only have it set for the equalites for the fields involved with the IN subquery
and reset it for the equalities which do not belong to the IN subquery.
For example in case of nested IN subqueries:
SELECT t1.a FROM t1 WHERE t1.a IN
(SELECT t2.a FROM t2 where t2.b IN
(select t3.b from t3 where t3.c=27 ))
there are two equalites involving the fields of the IN subquery
1) t2.b = t3.b : the field Item_func_eq::in_eqaulity_no is set when we merge the grandchild select into the child select
2) t1.a = t2.a : the field Item_func_eq::in_eqaulity_no is set when we merge the child select into the parent select
But when we perform case 2) we should ensure that we reset the equalities in the child's WHERE clause.
with join_cache_level>2
During muliple equality propagation for a query in which we have an IN subquery, the items in the select list of the
subquery may not be part of the multiple equality because there might be another occurence of the same field in the
where clause of the subquery.
So we keyuse_is_valid_for_access_in_chosen_plan function which expects the items in the select list of the subquery to
be same to the ones in the multiple equality (through these multiple equalities we create keyuse array).
The solution would be that we expect the same field not the same Item because when we have SEMI JOIN MATERIALIZATION SCAN,
we use copy back technique to copies back the materialised table fields to the original fields of the base tables.
This patch fixes another problem introduced by the patch for mdev-4817.
The latter changed Item_cond::fix_fields() in such a way that it could
call the virtual method is_expensive(). With the first its call
the method saves the result in Item::is_expensive_cache. For all next
calls the method returns the result from this cache. So if the item
once was determined as expensive the method always returns true.
For subqueries it's not good, because non-optimized subqueries always
is considered as expensive.
It means that the cache should be invalidated after the call of
optimize_constant_subqueries().
This patch fixes another problem introduced by the patch for mdev-4817.
The latter changed Item_cond::fix_fields() in such a way that it could
call the virtual method is_expensive(). With the first its call
the method saves the result in Item::is_expensive_cache. For all next
calls the method returns the result from this cache. So if the item
once was determined as expensive the method always returns true.
For subqueries it's not good, because non-optimized subqueries always
is considered as expensive.
It means that the cache should be invalidated after the call of
optimize_constant_subqueries().