Make mysqltest to use --ps-protocol more
use prepared statements for everything that server supports
with the exception of CALL (for now).
Fix discovered test failures and bugs.
tests:
* PROCESSLIST shows Execute state, not Query
* SHOW STATUS increments status variables more than in text protocol
* multi-statements should be avoided (see tests with a wrong delimiter)
* performance_schema events have different names in --ps-protocol
* --enable_prepare_warnings
mysqltest.cc:
* make sure run_query_stmt() doesn't crash if there's
no active connection (in wait_until_connected_again.inc)
* prepare all statements that server supports
protocol.h
* Protocol_discard::send_result_set_metadata() should not send
anything to the client.
sql_acl.cc:
* extract the functionality of getting the user for SHOW GRANTS
from check_show_access(), so that mysql_test_show_grants() could
generate the correct column names in the prepare step
sql_class.cc:
* result->prepare() can fail, don't ignore its return value
* use correct number of decimals for EXPLAIN columns
sql_parse.cc:
* discard profiling for SHOW PROFILE. In text protocol it's done in
prepare_schema_table(), but in --ps it is called on prepare only,
so nothing was discarding profiling during execute.
* move the permission checking code for SHOW CREATE VIEW to
mysqld_show_create_get_fields(), so that it would be called during
prepare step too.
* only set sel_result when it was created here and needs to be
destroyed in the same block. Avoid destroying lex->result.
* use the correct number of tables in check_show_access(). Saying
"as many as possible" doesn't work when first_not_own_table isn't
set yet.
sql_prepare.cc:
* use correct user name for SHOW GRANTS columns
* don't ignore verbose flag for SHOW SLAVE STATUS
* support preparing REVOKE ALL and ROLLBACK TO SAVEPOINT
* don't ignore errors from thd->prepare_explain_fields()
* use select_send result for sending ANALYZE and EXPLAIN, but don't
overwrite lex->result, because it might be needed to issue execute-time
errors (select_dumpvar - too many rows)
sql_show.cc:
* check grants for SHOW CREATE VIEW here, not in mysql_execute_command
sql_view.cc:
* use the correct function to check privileges. Old code was doing
check_access() for thd->security_ctx, which is invoker's sctx,
not definer's sctx. Hide various view related errors from the invoker.
sql_yacc.yy:
* initialize lex->select_lex for LOAD, otherwise it'll contain garbage
data that happen to fail tests with views in --ps (but not otherwise).
In this case we are setting the field Item_func_eq::in_eqaulity_no for the semi-join equalities.
This helps us to remove these equalites as the inner tables are not available during parent select execution
while the outer tables are not available during materialization phase.
We only have it set for the equalites for the fields involved with the IN subquery
and reset it for the equalities which do not belong to the IN subquery.
For example in case of nested IN subqueries:
SELECT t1.a FROM t1 WHERE t1.a IN
(SELECT t2.a FROM t2 where t2.b IN
(select t3.b from t3 where t3.c=27 ))
there are two equalites involving the fields of the IN subquery
1) t2.b = t3.b : the field Item_func_eq::in_eqaulity_no is set when we merge the grandchild select into the child select
2) t1.a = t2.a : the field Item_func_eq::in_eqaulity_no is set when we merge the child select into the parent select
But when we perform case 2) we should ensure that we reset the equalities in the child's WHERE clause.
with join_cache_level>2
During muliple equality propagation for a query in which we have an IN subquery, the items in the select list of the
subquery may not be part of the multiple equality because there might be another occurence of the same field in the
where clause of the subquery.
So we keyuse_is_valid_for_access_in_chosen_plan function which expects the items in the select list of the subquery to
be same to the ones in the multiple equality (through these multiple equalities we create keyuse array).
The solution would be that we expect the same field not the same Item because when we have SEMI JOIN MATERIALIZATION SCAN,
we use copy back technique to copies back the materialised table fields to the original fields of the base tables.
If the optimizer chose an execution plan where
a semi-join nest were materialized and the
result of materialization was scanned to access
other tables by ref access it could build a key
over columns of the tables from the nest that
were actually inaccessible.
The patch performs a proper check whether a key
that uses columns of the tables from a materialized
semi-join nest can be employed to access outer tables.
The implementation of the walk method for the class Item_in_subselect
was missing. As a result the method never traversed the left operand
of any IN subquery predicate.
Item_exists_subselect::exists2in_processor() that performs the
Exist-To-In transformation calls the walk method to collect info
on outer references. As the walk method did not traverse the
left operands of the IN subqueries the outer references there
were not taken into account and some subqueries that were actually
correlated were marked as uncorrelated. It could lead to an
attempt of the materialization of such a subquery.
Also added a cleanup for some test cases merged from 5.5.
Also fixed a wrong result for a test case for mdev-7691
(the alternative one).
The test cases for all these bug have materialized semi-joins used
inside dependent sub-queries.
The patch actually reverts the change inroduced by Monty in 2003.
It looks like this change is not valid anymore after the implementation
of semi-joins.
Adjusted output from EXPLAIN for many other test cases.
The problem was caused by a merged semi-join, which contained a non-merged
semi-join, which used references to the top-level query in the left_expr.
When moving non-merged semi-join from the subquery to its parent, do not
forget to call fix_after_pullout for its Item_subselect. We need to do
that specifically, because non-merged semi-joins do not have their
IN-equality in the WHERE clause at this stage.
Alternative fix that doesn't cause view.test crash in --ps:
Remember when Item_ref was fixed right in the constructor
and did not have a full Item_ref::fix_fields() call. Later
in PS/SP, after Item_ref::cleanup, we use this knowledge
to avoid doing full fix_fields() for items that were never
supposed to be fix_field'ed.
Simplify the test case.
[Attempt #] Make the code that handles "Prepare" phase for multi-table
UPDATE statements handle non-merged semijoins. It can encounter them when
a prepared statement is executed for the second time.
SELECT ... WHERE XX IN (SELECT YY)
this was transformed to something like:
SELECT ... WHERE IF_EXISTS(SELECT ... HAVING XX=YY)
The bug was that for normal execution XX was fixed in the original outer SELECT context while in PS it was fixed in the sub query context and this confused the optimizer.
Fixed by ensuring that XX is always fixed in the outer context.
- When traversing JOIN_TABs with first_linear_tab/next_linear_tab(), don't forget
to enter the semi-join nest when it is the first table in the join order.
Failure to do so could cause e.g. I_S tables not to be filled.
- With big_tables=ON, materialized table will use Aria (or MyISAM) SE, which
allows prefix key reads. However, the temp.table has rec_per_key=NULL which
causes the optimizer to crash when attempting to read index statistics for a
prefix index read.
- Fixed by providing a rec_per_key array with zeros (i.e. "no statistics data")
- convert_subq_to_sj() must connect child select's tables into
parent select's TABLE_LIST::next_local chain.
- The problem was that it took child's leaf_tables.head() which
is different. This could cause certain tables (in this bug's case,
child select's non-merged semi-join) not to be present in
TABLE_LIST::next_local chain. Which would cause non-merged semi-join
not to be initialized in setup_tables(), which would lead to
NULL pointer dereference.
- convert_subq_to_sj() must connect child select's tables into
parent select's TABLE_LIST::next_local chain.
- The problem was that it took child's leaf_tables.head() which
is different. This could cause certain tables (in this bug's case,
child select's non-merged semi-join) not to be present in
TABLE_LIST::next_local chain. Which would cause non-merged semi-join
not to be initialized in setup_tables(), which would lead to
NULL pointer dereference.
Apply fix suggested by Igor:
- When eliminate_item_equal() generates pair-wise equalities from a
multi-equality, do generate a "bridge" equality between the first
field inside SJM nest and the field that's first in the overall multi-equality.
- make multi_delete::initialize_tables() take into account that the JOIN structure may have
semi-join nests (which are not fully initialized when this function is called, they have
tab->table=NULL which caused the crash)
- Also checked multi_update::initialize_tables(): it has a different logic and needed no fixing.
- make make_cond_after_sjm() correctly handle OR clauses where one branch refers to the semi-join table
while the other branch refers to the non-semijoin table.
- The bug would show up
- when using PS (so that we get re-execution)
- the left_expr of the subquery is a reference to viewname.column_name, so that it crashes
when one tries to use it without having called fix_fields for it.
- when using SJ-Materialization, which makes use of sj_subq_pred->left_expr expression
- The fix is to have setup_conds() fix sj_subq_pred->left_expr for semi-join nests it finds.
The function create_hj_key_for_table() that builds the descriptor of
the hash join key to access a table of a materialized subquery must
ignore any equi-join predicate depending on the tables not belonging
to the subquery.
The result of materialization of the right part of an IN subquery predicate
is placed into a temporary table. Each row of the materialized table is
distinct. A unique key over all fields of the temporary table is defined and
created. It allows to perform key look-ups into the table.
The table created for a materialized subquery can be accessed by key as
any other table. The function best_access-path search for the best access
to join a table to a given partial join. With some where conditions this
function considers a possibility of a ref_or_null access. If such access
employs the unique key on the temporary table then when estimating
the cost this access the function tries to use the array rec_per_key. Yet,
such array is not built for this unique key. This causes a crash of the server.
Rows returned by the subquery that contain nulls don't have to be placed
into temporary table, as they cannot be match any row produced by the
left part of the subquery predicate. So all fields of the temporary table
can be defined as non-nullable. In this case any ref_or_null access
to the temporary table does not make any sense and it does not make sense
to estimate such an access.
The fix makes sure that the temporary table for a materialized IN subquery
is defined with columns that are all non-nullable. The also ensures that
any row with nulls returned by the subquery is not placed into the
temporary table.
- In return_zero_rows(), don't call mark_as_null_row() for semi-join
materialized tables, because 1) they may have been already freed, and
2)there is no real need to call mark_as_null_row() for them.
Fixed Item* Item_equal::get_first(JOIN_TAB *context, Item *field_item) to work correctly in the case where:
- context!= NO_PARTICULAR_TAB, it points to a table within SJ-Materialization nest
- field_item points to an item_equal that has a constant Item_field but does not have any fields
from tables that are within semi-join nests.
The patch differs from the original MySQL patch as follows:
- All test case differences have been reviewed one by one, and
care has been taken to restore the original plan so that each
test case executes the code path it was designed for.
- A bug was found and fixed in MariaDB 5.3 in
Item_allany_subselect::cleanup().
- ORDER BY is not removed because we are unsure of all effects,
and it would prevent enabling ORDER BY ... LIMIT subqueries.
- ref_pointer_array.m_size is not adjusted because we don't do
array bounds checking, and because it looks risky.
Original comment by Jorgen Loland:
-------------------------------------------------------------
WL#5953 - Optimize away useless subquery clauses
For IN/ALL/ANY/SOME/EXISTS subqueries, the following clauses are
meaningless:
* ORDER BY (since we don't support LIMIT in these subqueries)
* DISTINCT
* GROUP BY if there is no HAVING clause and no aggregate
functions
This WL detects and optimizes away these useless parts of the
query during JOIN::prepare()
- Let JTBM optimization code handle the case where the subquery is degenerate and doesn't have a
join query plan. Regular materialization would fall back to IN->EXISTS for such cases. Semi-Join
materialization does not have such option, instead we introduce and use "constant JTBM join tabs".
- Do a "more thorough" cleanup of SJ-Materialization join tab in JOIN_TAB::cleanup. The bug
was due to the fact that JOIN_TAB::cleanup() may be called multiple times for the same tab
if the join has grouping.