The query causing the issue here has implicit grouping, for which
we have to produce one row with special values for the aggregates
(depending on each aggregate function) and NULL values for all
non-aggregate fields.
For the subselect item where the implicit grouping was being done,
null_value was not being set for
the case when the implicit grouping produces NULL values
for the items in the select list of the subquery.
This was leading to the crash.
The fix would be to set the null_value when all the values
for the row column have NULL values.
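For illustration, a (hypothetical) query of roughly this shape makes the
subquery use implicit grouping and return a single row where all values,
including the non-aggregate column, are NULL:

  SELECT * FROM t1
  WHERE (t1.a, t1.b) = (SELECT MAX(t2.a), t2.b FROM t2 WHERE 1 = 0);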
Further changes are:
1) Setting null_value for Item_singlerow_subselect only
after the val_* functions have been called.
2) Introduced a parameter null_value_inside to Item_cache that
is set to TRUE if any of the arguments of the
Item_cache are null.
Reviewed and co-authored by Monty
The issue here was that the read_set bitmap was not set for a field which
was used as a reference in an inner select.
We need to make sure that if we are in an inner select and we have
references from outer select then we update the table bitmaps for
such references.
Introduced a function in the class Item_subselect that updates
the table bitmaps for the references within a
subquery that are defined in outer selects.
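A sketch of the affected shape (hypothetical names): the inner select
references a field of the outer select, so the bitmap of the outer table
has to be updated for that reference:

  SELECT t1.a,
         (SELECT COUNT(*) FROM t2 WHERE t2.b = t1.a) AS cnt
  FROM t1;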
Follow-up fix to commit 26f5033 (MDEV-23449)
The GROUP BY clause inside IN/ALL/ANY subquery is removed
when there is no aggregate function or HAVING clause in the subquery.
When the GROUP BY clause is removed, a subquery can also be removed
if it is part of the GROUP BY clause. This is done inside the function
remove_redundant_subquery_clauses. Here we walk over the GROUP BY list
and remove a subselect from its unit via the callback function
eliminate_subselect_processor.
The issue here was that when the query was being re-executed it was trying
to reinitialize the select that was removed as stated above.
This is not required, so the fix would be to remove select_lex
both from tree lex structure and the global list of nodes so that
we don't do the reinitialization again.
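For illustration only (assumed tables), re-executing a prepared statement
of this shape used to try to re-initialize the select that had been removed
together with the redundant GROUP BY:

  PREPARE stmt FROM
    'SELECT * FROM t1
     WHERE t1.a IN (SELECT t2.a FROM t2
                    GROUP BY (SELECT t3.a FROM t3 LIMIT 1))';
  EXECUTE stmt;
  EXECUTE stmt;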
of two table value constructors
This bug affected queries with a [NOT] IN/ANY/ALL subquery whose top level
unit contained several table value constructors.
The problem appeared because the code of the function
Item_subselect::fix_fields() that was responsible for wrapping table
value constructors encountered at the top level unit of a [NOT] IN/ANY/ALL
subquery did not take into account that the chain of the select objects
comprising the unit was not immutable.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Allow materialization strategy when collations on the
inner and outer sides of an IN subquery are the same and the
character set of the inner side is a proper subset of the character
set on the outer side.
This allows conversion from utf8mb3 to utf8mb4
as the former is a subset of the latter.
This is only allowed when the IN predicate is converted to an IN subquery.
Backported part of the patch (d6a00d9b18) of MDEV-17905.
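A hedged example of a predicate this change enables materialization for
(hypothetical column t1.a declared as utf8mb4; the utf8mb3 literals form
the inner side once the long IN list is rewritten into an IN subquery):

  SELECT * FROM t1
  WHERE t1.a IN ('x', 'y', 'z' /* , ... assuming the list is long enough
                                  to trigger the IN-to-subquery rewrite */);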
fix printing precedence for BETWEEN, LIKE/ESCAPE, REGEXP, IN
don't use precedence for printing CASE/WHEN/THEN/ELSE/END
fix parsing precedence of BETWEEN, LIKE/ESCAPE, REGEXP, IN
support predicate arguments for IN, BETWEEN, SOUNDS LIKE, LIKE/ESCAPE,
REGEXP
use %nonassoc for unary operators
fix parsing of IS TRUE/FALSE/UNKNOWN/NULL
remove parser_precedence test as superseded by the precedence test
When duplicates are removed from a table using a hash, a record that is a duplicate is marked
as deleted. The handler API checks if the record is deleted and sends the error flag HA_ERR_RECORD_DELETED.
When we scan over the table, if the thread is not killed then we skip the
records marked with HA_ERR_RECORD_DELETED.
The issue here is that when a query is aborted by the user (this happens when the LIMIT for ROWS EXAMINED
is exceeded), the scan over the table does not skip the records for which HA_ERR_RECORD_DELETED is sent.
It just returns the error flag HA_ERR_ABORTED_BY_USER.
This error flag is not checked at the upper level and hence we hit the assert.
If the query is aborted by the user we should just skip reading rows and return
control to the upper levels of execution.
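A sketch of the kind of statement that can take this path (hypothetical
table, arbitrary limit): duplicate elimination marks rows in the internal
temporary table as deleted, and the scan is aborted once the examined-rows
limit is exceeded:

  SELECT DISTINCT a FROM t1
  UNION
  SELECT a FROM t1
  LIMIT ROWS EXAMINED 20;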
* Fix the crash: if the IN-to-EXISTS rewrite causes an error (and so
  JOIN::optimize() fails with an error, too), don't call
  update_used_tables(). Terminate the query execution instead.
* Fix the cause of the error in the IN-to-EXISTS rewrite: don't do
the rewrite if doing it will cause an error of this kind:
This version of MariaDB doesn't yet support 'SUBQUERY in ROW in left
expression of IN/ALL/ANY'
* Fix another issue exposed by this testcase:
JOIN::setup_subquery_caches() may be invoked before any select has
saved its query plan, and will crash because none of the SELECTs
has called create_explain_query_if_not_exists() to create the Explain
Data Structure for this SELECT.
TODO: When merging this to 10.2, remove the poorly-placed call to
create_explain_query_if_not_exists made by the fix for MDEV-16153
- Some of the bug fixes are backports from 10.5!
- The fix in innobase/fil/fil0fil.cc is just a backport to get less
error messages in mysqld.1.err when running with valgrind.
- Renamed HAVE_valgrind_or_MSAN to HAVE_valgrind
The issue here was that the left expr and right expr of the ANY subquery
had different character sets, so we were converting the left expr to the utf8 character set.
When this conversion was happening we were actually converting the item inside the cache,
so it looked like <cache>(convert(t1.l1 using utf8)), which is incorrect.
To fix this problem we store a reference to the left expr and convert that
to the utf8 character set, so it looks like convert(<cache>(`test`.`t1`.`l1`) using utf8).
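Illustrative query shape (hypothetical tables; t1.l1 and t2.r1 have
different character sets), matching the item trees shown above:

  SELECT * FROM t1
  WHERE t1.l1 > ANY (SELECT t2.r1 FROM t2);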
CREATE PROCEDURE did not detect unknown SP variables in assignments like this:
SET var=a_long_var_name_with_a_typo;
The error happened only at SP execution time, and only if the control
flow reached the erroneous statement.
Fixing most expressions to detect unknown identifiers.
This includes simple subqueries without tables:
- Query specification: SELECT list, WHERE,
HAVING (inside aggregate functions) clauses, e.g.
SET var= (SELECT unknown_ident+1);
SET var= (SELECT 1 WHERE unknown_identifier);
SET var= (SELECT 1 HAVING SUM(unknown_identifier));
- Table value constructor: VALUES clause, e.g.:
SET var= (VALUES(unknown_ident));
Note, in some more complex subquery cases unknown variables are still not detected
(this will be fixed separately):
- Derived tables:
SET a=(SELECT unknown_ident FROM (SELECT 1 AS alias) t1);
SET res=(SELECT * FROM t1 LEFT OUTER JOIN (SELECT unknown_ident) t2 USING (c1));
- CTE:
SET a=(WITH cte1 (a) AS (SELECT unknown_ident) SELECT * FROM cte1);
SET a=(WITH cte1 (a,b) AS (VALUES (unknown,2),(3,4)) SELECT * FROM cte1);
SET a=(WITH cte1 (a,b) AS (VALUES (1,2),(3,4)) SELECT unknown_ident FROM cte1);
- SELECT .. GROUP BY unknown_identifier
- SELECT .. ORDER BY unknown_identifier
- HAVING with an unknown identifier outside of any aggregate functions:
SELECT .. HAVING unknown_identifier;
When my_vsnprintf() is patched, the code disabled with
'WAITING_FOR_BUGFIX_TO_VSPRINTF' should be enabled again. Also, all %b
formats in this patch should be reverted to %s again.
For the case when the optimizer does the IN-EXISTS transformation,
the equality condition is injected into the WHERE or HAVING clause of
the subquery. If the select list of the subquery has a reference to
the parent select make sure to use the reference and not the original
item.
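The affected shape is roughly the following (hypothetical tables): the
select list of the subquery references the parent select, and the injected
equality has to use that reference rather than the original item:

  SELECT * FROM t1
  WHERE t1.a IN (SELECT t1.b FROM t2 WHERE t2.c > 0);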
because internally setup_wild() adjusts select_lex->with_wild directly
anyway, so there is no reason to pretend that the number of '*' may be
anything else but select_lex->with_wild.
And don't update select_lex->item_list, because fields can come
from anywhere and don't necessarily have to be copied into select_lex.
Shift-Reduce conflicts prevented parsing some queries with subqueries that
used set operations when the subqueries occurred in expressions or in IN
predicands.
The grammar rules for query expression were transformed in order to avoid
these conflicts. New grammar rules employ an idea taken from MySQL 8.0.
Step #2: "[ORDER BY ...] LIMIT n" should not prevent EXISTS-to-IN
conversion, as long as
- the LIMIT clause doesn't have OFFSET
- the LIMIT is not "LIMIT 0".
Step 1: Removal of ORDER BY [LIMIT] from the subquery should be done
earlier and for a broader class of subqueries.
The rewrite was done in Item_in_subselect::select_in_like_transformer(),
but this had problems:
- It didn't cover EXISTS subqueries
- It covered IN-subqueries, but was done after the semi-join transformation
was considered inapplicable, because ORDER BY was present.
Remaining issue:
- EXISTS->IN transformation happens before
check_and_do_in_subquery_rewrites() is called, so it is still prevented
by the present ORDER BY.
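For example (assumed schema), after Step #2 an EXISTS subquery like the
one below no longer blocks the conversion, while a LIMIT with an OFFSET or
a LIMIT 0 still does:

  SELECT * FROM t1
  WHERE EXISTS (SELECT t2.a FROM t2
                WHERE t2.a = t1.a
                ORDER BY t2.b LIMIT 1);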
query with VALUES()
A table value constructor can be used in all contexts where a select
can be used. In particular an ORDER BY clause or a LIMIT clause or both
of them can be attached to a table value constructor to produce a new
query. Unfortunately execution of such queries was not supported.
This patch fixes the problem.
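A minimal sketch of such a query (the values are arbitrary):

  VALUES (2), (7), (1) ORDER BY 1 LIMIT 2;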
`sel->quick' failure in JOIN::make_range_rowid_filters upon query
with rowid_filter=ON
Index ranges can be defined using conditions with inexpensive subqueries.
Such a subquery is evaluated when some representation of a possible range
sequence is built. After the evaluation the JOIN structure of the subquery is destroyed.
Any attempt to build the above representation may fail because the
function that checks whether a subquery is inexpensive in some cases uses
the join structure of the subquery.
When a range rowid filter is built by a range sequence constructed out of
a range condition that uses an inexpensive subquery the representation of
the sequence is built twice. Building the second representation fails
due to the described problem with the execution of Item_subselect::is_expensive().
The function was corrected to return the result of its last invocation
if the Item_subselect object has already been evaluated.
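A sketch of the affected query shape (hypothetical schema with suitable
indexes): the range condition uses an inexpensive subquery, and with
rowid_filter=ON the range sequence is built a second time for the filter:

  SET optimizer_switch='rowid_filter=on';
  SELECT * FROM t1, t2
  WHERE t1.a = t2.a
    AND t1.b <= (SELECT MAX(t3.b) FROM t3)
    AND t2.c = 'foo';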
The problem described in the bug report happened because the code
did not test check_cols(1) after fix_fields() in a few places.
Additionally, fix_fields() could be called multiple times for SP variables,
because they are all fixed at an early stage in append_for_log().
Solution:
1. Adding a few helper methods
- fix_fields_if_needed()
- fix_fields_if_needed_for_scalar()
- fix_fields_if_needed_for_bool()
- fix_fields_if_needed_for_order_by()
and using them in many cases instead of fix_fields() where
the "fixed" status is not definitely known to be "false".
2. Adding DBUG_ASSERT(!fixed) into Item_splocal*::fix_fields()
to catch double execution.
3. Adding tests.
As a good side effect, the patch removes a lot of duplicate code (~60 lines):
if (!item->fixed &&
item->fix_fields(..) &&
item->check_cols(1))
return true;
set the pointer to NULL to avoid double-free
when the item is cleaned up many times
(once in JOIN_TAB::cleanup(): tmp->jtbm_subselect->cleanup()
and once at the end of the query, with all other items)
Make sure that SELECT_LEX_UNIT::derived, behaves as documented
(points to the "TABLE_LIST representing this union in the
embedding select"). For recursive CTE this was not necessarily
the case, it could've pointed to the TABLE_LIST inside the CTE,
not in the embedding select.
To fix:
* don't update unit->derived in mysql_derived_prepare(), pass derived
as an argument to st_select_lex_unit::prepare()
* prefer to set unit->derived in TABLE_LIST::init_derived()
to the TABLE_LIST in the embedding select, not to the recursive
reference. Fail if there are many TABLE_LISTs in the embedding
select with conflicting FOR SYSTEM_TIME clauses.
cleanup:
* remove redundant THD* argument from st_select_lex_unit::prepare()
For a query having an IN subquery with no tables, we were converting the subquery into an expression
comparing the left part with the select list of the subquery. This can give incorrect results when we have a condition
in the subquery with a dual table (as this is treated as having no tables).
The fix is that we don't do this conversion when we have conditions in the subquery with a dual table.
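For example (values chosen only for illustration), without the fix the
conversion would reduce the subquery to a plain equality and drop the
condition on the dual table, producing a wrong result:

  SELECT 2 IN (SELECT 2 FROM DUAL WHERE 1 = 0);  -- must return 0, not 1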
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. Change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider, sphinx and connect for now.
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
caused an error
The function subselect_single_select_engine::print() did not print
the WITH clause attached to a subselect with single select engine.
As a result views using subqueries with attached WITH clauses lost
these clauses when saved in frm files.
Differentiate between pullout for merge and pullout of an outer field during the exists2in transformation.
In the latter case the field was outer, so we can safely start from the name resolution context of the SELECT from which it was pulled.
The old behavior led to an inconsistency between the list of tables and the outer name resolution context (which skips one SELECT for merge purposes), which created problems for name resolution.
As a result of this merge the code for the following tasks appears in 10.3:
- MDEV-12172 Implement tables specified by table value constructors
- MDEV-12176 Transform [NOT] IN predicate with long list of values INTO
[NOT] IN subquery.
- Fix win64 pointer truncation warnings
(usually coming from misusing 0x%lx and long cast in DBUG)
- Also fix printf-format warnings
Make the above mentioned warnings fatal.
- fix pthread_join on Windows to set return value.
- Simplified use_trans_cache() to return at once if is_transactional is set
- Indentation and spelling errors fixed
- Don't call signal_update() if update_binlog_end_pos() is called as the
function already calls signal_update()
- Removed not used function wait_for_update_bin_log(), which would cause
errors if ever used.
- Simplified handler::clone() by always allocating 'ref' in ha_open(). To do
this I added an optional MEM_ROOT argument to ha_open() to be used when
allocating 'ref'
- Changed arguments to get_system_var() from LEX_CSTRING to LEX_CSTRING*
- Added THD as argument to create_select_for_variable(). Changed also char*
argument to LEX_CSTRING to avoid strlen() call.
- Change calls to append() to use LEX_CSTRING
- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
Item_in_subselect::pushed_cond_guards[] array is allocated only when
left_expr->maybe_null. And it is used (for row expressions) when
left_expr->element_index(i)->maybe_null.
For left_expr being a multi-column subquery, its maybe_null is
always false when the subquery doesn't use tables (see
Item_singlerow_subselect::fix_length_and_dec()
and subselect_single_select_engine::fix_length_and_dec()),
otherwise it's always true.
But row elements can be NULL regardless, so let's always allocate
pushed_cond_guards for multi-column subqueries, no matter whether
its maybe_null was forced to true or false.
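Roughly, the shape in question (hypothetical values and table): the left
operand is a multi-column subquery without tables, so its maybe_null is
false, yet one of its row elements is still NULL:

  SELECT (SELECT 1, NULL) NOT IN (SELECT a, b FROM t1);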
Significantly reduce the amount of InnoDB, XtraDB and Mariabackup
code changes by defining pfs_os_file_t as something that is
transparently compatible with os_file_t.
In some rare cases queries with UNION ALL
using a derived table specified by
a grouping select with a subquery in WHERE and
impossible HAVING detected after constant row
substitution could hang.
The cause was an improper return from the
function subselect_single_select_engine::exec()
in the case when the subquery was not optimized
beforehand and the optimization performed
in this function requested a change of the
subquery engine. This was fixed.
Also a change was applied that avoided execution
of a subquery if an impossible HAVING was detected
for the main query at the optimization stage.
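A rough sketch of the query pattern described above (hypothetical tables;
the HAVING clause becomes impossible):

  SELECT 1
  UNION ALL
  SELECT dt.a
  FROM (SELECT t1.a FROM t1
        WHERE t1.a IN (SELECT t2.a FROM t2)
        GROUP BY t1.a
        HAVING t1.a = 1 AND t1.a = 2) AS dt;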
Under some conditions the function opt_sum_query() can apply MIN/MAX
optimizations to Item_sum objects of a select. These optimizations
become invalid if this select is the subquery of an IN subquery
predicate that is converted to an EXISTS subquery. Thus, in this case
the MIN/MAX optimizations that have been applied in opt_sum_query()
must be rolled back.
This bug appeared in 5.3 when the code for the cost-based choice between
materialization and the in-to-exists transformation of non-correlated
IN subqueries was introduced. Before this code, in-to-exists
transformations were always performed before the call of opt_sum_query().
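For example (assumed schema with an index on t2.b), the subquery below may
get the MIN/MAX shortcut from opt_sum_query(); if the IN predicate is later
converted to EXISTS, that shortcut has to be undone:

  SELECT * FROM t1
  WHERE t1.a IN (SELECT MAX(t2.b) FROM t2);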
The code that blocked conversion of an IN subselect predicate to a semi-join
if it occurred in the ON expression of an outer join did not do it correctly.
As a result, the conversion was blocked for IN subselect predicates
encountered in ON expressions of INNER joins or in WHERE conditions
of mergeable views / derived tables. This patch fixes this problem.
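For illustration (hypothetical tables), an IN predicate like the one below,
occurring in the ON expression of an INNER join, is again eligible for the
semi-join conversion:

  SELECT *
  FROM t1 JOIN t2
       ON t1.a = t2.a AND t1.b IN (SELECT t3.b FROM t3);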
Here's what started happening after the patch that fixed
the bug mdev-10454, with the query reported for the bug:
SELECT * FROM t t1 right JOIN t t2 ON (t2.pk = t1.pk)
WHERE (t2.i, t2.pk) NOT IN ( SELECT t3.i, t3.i FROM t t3, t t4 )
AND t1.c = 'foo';
The patch added an implementation of propagate_equal_fields() for
the class Item_row and thus opened the possibility of equal fields
substitutions.
At the prepare stage, after setup_conds() called for the WHERE condition
had completed, the maybe_null flag of the Item_row object created
for (t2.i, t2.pk) was set to false, because the maybe_null flags of
both elements were set to false. However, the maybe_null flag for
t1.pk from the ON condition was set to true, because t1 was an inner
table of an outer join.
At the optimization stage the outer join was converted to inner join,
but the maybe_null flags were not corrected and remained the same.
So after the substitution t2.pk/t1.pk, the maybe_null flag for the
row remained false while the maybe_null flag for the second element of
the row was true. As a result, when the in-to-exists transformation
was performed for the NOT IN predicate, the guard variables were
not created for the elements of the row, but a guard object for
the second element was created. The object was not valid because
it referred to NULL as a guard variable. This ultimately caused
a crash when the expression with the guard was evaluated at the
execution stage.
The patch made sure that guard objects are not created without
guard variables.
Yet it does not resolve the problem of inconsistent maybe_null flags,
and it might be that the problem will pop up in other pieces of code.
The resolution of this problem is not easy, but the problem should
be resolved in future versions.
Benefits of this patch:
- Removed a lot of calls to strlen(), especially for field_string
- Strings generated by parser are now const strings, less chance of
accidentally changing a string
- Removed a lot of calls with LEX_STRING as parameter (changed to pointer)
- More uniform code
- Item::name_length was not kept up to date. Now fixed
- Several bugs found and fixed (Access to null pointers,
access of freed memory, wrong arguments to printf like functions)
- Removed a lot of casts from (const char*) to (char*)
Changes:
- This caused some ABI changes
- lex_string_set now uses LEX_CSTRING
- Some functions are now taking const char* instead of char*
- Create_field::change and after changed to LEX_CSTRING
- handler::connect_string, comment and engine_name() changed to LEX_CSTRING
- Checked printf() related calls to find bugs. Found and fixed several
errors in old code.
- A lot of changes from LEX_STRING to LEX_CSTRING, especially related to
parsing and events.
- Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING*
- Some changes for char* to const char*
- Added printf argument checking for my_snprintf()
- Introduced null_clex_str, star_clex_string, temp_lex_str to simplify
code
- Added item_empty_name and item_used_name to be able to distinguish between
items that were given an empty name and items that were not given a name.
This is used in sql_yacc.yy to know when to give an item a name.
- select table_name."*" is no longer the same as table_name.*
- removed not used function Item::rename()
- Added comparison of item->name_length before some calls to
my_strcasecmp() to speed up comparison
- Moved Item_sp_variable::make_field() from item.h to item.cc
- Some minimal code changes to avoid copying to const char *
- Fixed wrong error message in wsrep_mysql_parse()
- Fixed wrong code in find_field_in_natural_join() where real_item() was
set when it shouldn't
- ER_ERROR_ON_RENAME was used with extra arguments.
- Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already
give the error.
TODO:
- Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c
- Change code to not modify LEX_CSTRING for database name
(as part of lower_case_table_names)
This patch fixed some problems that occurred with subqueries that
contained directly or indirectly recursive references to recursive CTEs.
1. A [NOT] IN predicate with a constant left operand and a non-correlated
subquery as the right operand used in the specification of a recursive CTE
was considered as a constant predicate and was evaluated only once.
Now such a predicate is re-evaluated after every iteration of the process
that produces the records of the recursive CTE.
2. The Exists-To-IN transformation could be applied to [NOT] IN predicates
with recursive references. This opened a possibility of materialization
for the subqueries used as right operands. Yet, materialization
is prohibited for the subqueries if they contain a recursive reference.
Now the Exists-To-IN transformation cannot be applied for subquery
predicates with recursive references.
The function st_select_lex::check_subqueries_with_recursive_references()
is called now only for the first execution of the SELECT.
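A minimal sketch of case 1 (hypothetical data; whether this exact query is
accepted depends on the server's recursive-CTE restrictions): the right
operand of NOT IN reads the recursive CTE itself, so the predicate has to be
re-evaluated on every iteration instead of being treated as constant:

  WITH RECURSIVE r AS (
    SELECT 1 AS a
    UNION
    SELECT r.a + 1 FROM r
    WHERE 3 NOT IN (SELECT a FROM r)
  )
  SELECT * FROM r;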
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
Also, implement MDEV-11027 a little differently from 5.5:
recv_sys_t::report(ib_time_t): Determine whether progress should
be reported.
recv_apply_hashed_log_recs(): Rename the parameter to last_batch.
As the function Item_subselect::fix_fields() does, the function
Item_subselect::update_used_tables() must ignore UNCACHEABLE_EXPLAIN
when deciding whether the subquery item should be considered
a constant item.
This patch:
- Adds a new virtual method Type_handler::Item_get_cache
- Splits Item_cache::get_cache() into the new method, moving every
  "case XXX_RESULT" to the corresponding Type_handler_xxx::Item_get_cache.
- Adds Item::get_cache as a convenience wrapper, to make the caller code
shorter.
- Changes the last argument of Arg_comparator::cache_converted_constant()
from Item_result to "const Type_handler *".
- Removes subselect_engine::cmp_type, subselect_engine::res_type,
subselect_engine::res_field_type and derives subselect_engine
from Type_handler_hybrid_field_type instead.
- Makes Type_handler_varchar public, as it's now needed as the
default data type handler for subselect_engine.
The idea of this fix was taken from the patch by Roy Lyseng
for mysql-5.6 Bug#14740889: "Wrong result for aggregate
functions when executing query through cursor".
Here's Roy's comment for his patch:
"
The problem was that a grouped query did not behave properly when
executed using a cursor. On further inspection, the query used one
intermediate temporary table for the grouping.
Then, Select_materialize::send_result_set_metadata created a temporary
table for storing the query result. Notice that get_unit_column_types()
is used to retrieve column meta-data for the query. The items contained
in this list are later modified so that their result_field points to
the row buffer of the materialized temporary table for the cursor.
But prior to this, these result_field objects have been prepared for
use in the grouping operation, by JOIN::make_tmp_tables_info(), hence
the grouping operation operates on wrong column buffers.
The problem is solved by using the list JOIN::fields when copying data
to the materialized table. This list is set by JOIN::make_tmp_tables_info()
and points to the columns of the last intermediate temporary table of
the executed query. For a UNION, it points to the temporary table
that is the result of the UNION query.
Notice that we have to assign a value to ::fields early in JOIN::optimize()
in case the optimization shortcuts due to a const plan detection.
A more optimal solution might be to avoid creating the final temporary
table when the query result is already stored in a temporary table.
"
The patch does not contain a test case, but the description of the
problem corresponds exactly to what could be observed in the test
case for mdev-11081.
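The shape of the affected statements is roughly the following (hypothetical
procedure; the cursor materializes a grouped query):

  DELIMITER //
  CREATE PROCEDURE p()
  BEGIN
    DECLARE v INT;
    DECLARE c CURSOR FOR SELECT SUM(b) FROM t1 GROUP BY a;
    OPEN c;
    FETCH c INTO v;
    CLOSE c;
  END//
  DELIMITER ;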
SUBSELECT_UNION_ENGINE::NO_ROWS
This patch is specific for mysql-5.5
ISSUE: When max_join_size is used and a union query
results in the evaluation of more tuples than
max_join_size, the join object is not created
and is set to NULL.
However, this join object is further dereferenced
by the union logic to determine whether the query
returned any rows.
Since the object is NULL, the program
terminates abnormally.
SOLUTION: Added a check to verify whether the join object is created.
If the join object is created, it will be used to
determine whether the query returned any rows.
Otherwise, when the join object is not created, we return
'false', indicating that there were no rows for the
query.
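A minimal sketch of the failing scenario (hypothetical tables; the limit
value is arbitrary):

  SET SESSION max_join_size = 2;
  SELECT 1 FROM t1
  WHERE t1.a IN (SELECT t2.a FROM t2, t3
                 UNION
                 SELECT t2.a FROM t2, t3);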