This patch adds for "--ps-protocol" second execution
of queries "SELECT".
Also in this patch it is added ability to disable/enable
(--disable_ps2_protocol/--enable_ps2_protocol) second
execution for "--ps-prototocol" in testcases.
This bug caused server crash when processing a multi-update statement that
used views if optimizer tracing was enabled.
The bug was introduced in the patch for MDEV-30539 that could incorrectly
detect the most top level selects of queries if views were used in them.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
MDEV-28073 Slow query performance in MariaDB when using many table
The idea is to prefer and chain EQ_REF tables (tables that uses an
unique key to find a row) when searching for the best table combination.
This significantly reduces row combinations that has to be examined.
This is optimization is enabled when setting optimizer_prune_level=2
(which is now default).
Implementation:
- optimizer_prune_level has a new level, 2, which enables EQ_REF
optimization in addition to the pruning done by level 1.
Level 2 is now default.
- Added JOIN::eq_ref_tables that contains bits of tables that could use
potentially use EQ_REF access in the query. This is calculated
in sort_and_filter_keyuse()
Under optimizer_prune_level=2:
- When the greedy_optimizer notices that the preceding table was an
EQ_REF table, it tries to add an EQ_REF table next. If an EQ_REF
table exists, only this one will be considered at this level.
We also collect all EQ_REF tables chained by the next levels and these
are ignored on the starting level as we have already examined these.
If no EQ_REF table exists, we continue as normal.
This optimization speeds up the greedy_optimizer combination test with
~25%
Other things:
- I ported the changes in MySQL 5.7 to greedy_optimizer.test to MariaDB
to be able to ensure we can handle all cases that MySQL can do.
- I have run all tests with --mysqld=--optimizer_prune_level=1 to verify that
there where no test changes.
MDEV-28073 Slow query performance in MariaDB when using many tables
The faster we can find a good query plan, the more options we have for
finding and pruning (ignoring) bad plans.
This patch adds sorting of plans to best_extension_by_limited_search().
The plans, from best_access_path() are sorted according to the numbers
of found rows. This allows us to faster find 'good tables' and we are
thus able to eliminate 'bad plans' faster.
One side effect of this patch is that if two tables have equal cost,
the table that which was used earlier in the query is preferred.
This allows users to improve plans by reordering eq_ref tables in the
order they would like them to be uses.
Result changes caused by the patch:
- Traces are different as now we print the cost for using tables before
we start considering them in the plan.
- Table order are changed for some plans. In most cases this is because
the plans are equal and tables are in this case sorted according to
their usage in the original query.
- A few plans was changed as the optimizer was able to find a better
plan (that was pruned by the original code).
Other things:
- Added a new statistic variable: "optimizer_join_prefixes_check_calls",
which counts number of calls to best_extension_by_limited_search().
This can be used to check the prune efficiency in greedy_search().
- Added variable "JOIN_TAB::embedded_dependent" to be able to handle
XX IN (SELECT..) in the greedy_optimizer. The idea is that we
should prune a table if any of the tables in embedded_dependent is
not yet read.
- When using many tables in a query, there will be some additional
memory usage as we need to pre-allocate table of
table_count*table_count*sizeof(POSITION) objects (POSITION is 312
bytes for now) to hold the pre-calculated best_access_path()
information. This memory usage is offset by the expected
performance improvement when using many tables in a query.
- Removed the code from an earlier patch to keep the table order in
join->best_ref in the original order. This is not needed anymore as we
are now sorting the tables for each best_extension_by_limited_search()
call.
make_join_select() calls const_cond->val_int(). There are edge cases
where const_cond may have a not-yet optimized subquery.
(The subquery will have used_tables() covered by join->const_tables. It
will still have const_item()==false, so other parts of the optimizer
will not try to evaluate it. We should probably mark such subqueries
as constant but that is outside the scope of this MDEV)
In mysql_execute_command(), move optimizer trace initialization to be
after run_set_statement_if_requested() call.
Unfortunately, mysql_execute_command() code uses "goto error" a lot, and
this means optimizer trace code cannot use RAII objects. Work this around
by:
- Make Opt_trace_start a non-RAII object, add init() method.
- Move the code that writes the top-level object and array into
Opt_trace_start::init().
This enables optimizer_trace output for the next SQL command.
Identical as if one would have done:
- Store value of @@optimizer_trace
- Set @optimizer_trace="enabled=on"
- Run query
- SELECT * from OPTIMIZER_TRACE
- Restore value of @@optimizer_trace
This is a great time saver when one wants to quickly check the optimizer
trace for a query in a mtr test.
Print this piece when we've just made the choice to convert to semi-join.
Also, print it when we've already made that choice before:
transformation": {
"select_id": 2,
"from": "IN (SELECT)",
"to": "semijoin",
"chosen": true
}
In case the test main.opt_trace is run with the option --ps-protocol
it fails since querying from the table INFORMATION_SCHEMA.OPTIMIZER_TRACE
produces an output that differed from the expected one in the following way:
@@ -2829,14 +2829,6 @@
}
},
{
- "transformation": {
- "select_id": 2,
- "from": "IN (SELECT)",
- "to": "semijoin",
- "chosen": true
- }
- },
- {
"expanded_query": "/* select#2 */ select t10.pk from t10"
}
The table INFORMATION_SCHEMA.OPTIMIZER_TRACE is filled when optimizer_trace is on.
The reason of missing above mentioned pieces in query result set is that
the C++ macros
OPT_TRACE_TRANSFORM(thd, trace_wrapper, trace_transform,
select_lex->select_number,
"IN (SELECT)", "semijoin");
located in the standalone function check_and_do_in_subquery_rewrites()
is executed twice in case the statement
explain extended select * from t1 where a in (select pk from t10);
is run in PS mode. The first time it is executed on PREPARE phase and
the second time on EXECUTE phase. The output produced by this macros on
EXECUTE phase rewrites the output produced on PREPARE phase.
In result test failed in case it was run in PS mode.
To make test output uniform regardless the test is run in PS or normal
mode the operator '--source include/protocol.inc' has been added to
the file opt_trace.test and extra opt_trace,ps.rdiff file has been added.
Additionally, added operators
--enable_prepared_warnings/--disable_prepared_warnings
in order to store warnings in result file that received on PREPARE phase
during running the statemement 'SELECT INTO'.
if a query used no fields from an I_S table, we were creating a temp
table with one, first, field (as a table cannot have zero fields),
with its length truncated to 1.
Now - force also this dummy field to be a normal field, not a BLOB