First UPDATE under START TRANSACTION does nothing (nstate= nstate),
but anyway generates history. Since update vector is empty we get into
(!uvect->n_fields) branch which only adds history row, but does not do
update. After that we get current row with wrong (old) row_start value
and because of that second UPDATE tries to insert history row again
because it sees trx->id != row_start which is the guard to avoid
inserting multiple trx_id-based history rows under same transaction
(because we have same trx_id and we get duplicate error and this bug
demostrates that). But this try anyway fails because PK is based on
row_end which is constant under same transaction, so PK didn't change.
The fix moves vers_make_update() to an earlier stage of
calc_row_difference(). Therefore it prepares update vector before
(!uvect->n_fields) check and never gets into that branch, hence no
need to handle versioning inside that condition anymore.
Now trx->id and row_start are equal after first UPDATE and we don't
try to insert second history row.
== Cleanups and improvements ==
ha_innobase::update_row():
vers_set_fields and vers_ins_row are cleaned up into direct condition
check. SQLCOM_ALTER_TABLE check now is not used as this is dead code,
assertion is done instead.
upd_node->is_delete is set in calc_row_difference() just to keep
versioning code as much in one place as possible. vers_make_delete()
is still located in row_update_for_mysql() as this is required for
ha_innodbase::delete_row() as well.
row_ins_duplicate_error_in_clust():
Restrict DB_FOREIGN_DUPLICATE_KEY to the better conditions.
VERSIONED_DELETE is used specifically to help lower stack to
understand what caused current insert. Related to MDEV-29813.
The new statistics is enabled by adding the "engine", "innodb" or "full"
option to --log-slow-verbosity
Example output:
# Pages_accessed: 184 Pages_read: 95 Pages_updated: 0 Old_rows_read: 1
# Pages_read_time: 17.0204 Engine_time: 248.1297
Page_read_time is time doing physical reads inside a storage engine.
(Writes cannot be tracked as these are usually done in the background).
Engine_time is the time spent inside the storage engine for the full
duration of the read/write/update calls. It uses the same code as
'analyze statement' for calculating the time spent.
The engine statistics is done with a generic interface that should be
easy for any engine to use. It can also easily be extended to provide
even more statistics.
Currently only InnoDB has counters for Pages_% and Undo_% status.
Engine_time works for all engines.
Implementation details:
class ha_handler_stats holds all engine stats. This class is included
in handler and THD classes.
While a query is running, all statistics is updated in the handler. In
close_thread_tables() the statistics is added to the THD.
handler::handler_stats is a pointer to where statistics should be
collected. This is set to point to handler::active_handler_stats if
stats are requested. If not, it is set to 0.
handler_stats has also an element, 'active' that is 1 if stats are
requested. This is to allow engines to avoid doing any 'if's while
updating the statistics.
Cloned or partition tables have the pointer set to the base table if
status are requested.
There is a small performance impact when using --log-slow-verbosity=engine:
- All engine calls in 'select' will be timed.
- IO calls for InnoDB reads will be timed.
- Incrementation of counters are done on local variables and accesses
are inline, so these should have very little impact.
- Statistics has to be reset for each statement for the THD and each
used handler. This is only 40 bytes, which should be neglectable.
- For partition tables we have to loop over all partitions to update
the handler_status as part of table_init(). Can be optimized in the
future to only do this is log-slow-verbosity changes. For this to work
we have to update handler_status for all opened partitions and
also for all partitions opened in the future.
Other things:
- Added options 'engine' and 'full' to log-slow-verbosity.
- Some of the new files in the test suite comes from Percona server, which
has similar status information.
- buf_page_optimistic_get(): Do not increment any counter, since we are
only validating a pointer, not performing any buf_pool.page_hash lookup.
- Added THD argument to save_explain_data_intern().
- Switched arguments for save_explain_.*_data() to have
always THD first (generates better code as other functions also have THD
first).
EXPLAIN EXTENDED for an UPDATE/DELETE/INSERT/REPLACE statement did not
produce the warning containing the text representation of the query
obtained after the optimization phase. Such warning was produced for
SELECT statements, but not for DML statements.
The patch fixes this defect of EXPLAIN EXTENDED for DML statements.
Remove table_count from Query_tables_list (not used, moved to MYSQL_LOCK).
Rename table_count from LEX to avoid mixing it with other counters of tables.
Cause: a copy of the joined TABLE_LIST is created during multi_update::prepare
and TABLE::pos_in_table_list of the tables are set to point to the new
TABLE_LIST object. This prevents some optimization steps to perform correctly.
Solution: do not update pos_in_table_list during multi_update::prepare
not every index-using plan sets bits in table->quick_keys.
QUICK_ROR_INTERSECT_SELECT, for example, doesn't.
Use the fact that select->quick is set instead.
Also allow EXPLAIN to work.
- Handle stored function conditions correctly, with the same logic as with UDFs.
- When running queries on Spider SE, by default, we do not push down WHERE conditions containing usage of UDFs/stored functions to remote data nodes, unless the user demands (by setting spider_use_pushdown_udf).
- Disable direct update/delete when a udf condition is skipped.
The columns that are part of DEFAULT expression were not read-marked
in statements like UPDATE...SET b=DEFAULT.
The problem is `F(DEFAULT)` expression depends of the left-hand side of an
assignment. However, setup_fields accepts only right-hand side value.
Neither Item::fix_fields does.
Suchwise, b=DEFAULT(b) works fine, because Item_default_field has
information on what field it is default of:
if (thd->mark_used_columns != MARK_COLUMNS_NONE)
def_field->default_value->expr->update_used_tables();
in Item_default_value::fix_fields().
It is not reasonable to pass a left-hand side to Item:fix_fields, because
the case is rare, so the rewrite
b= F(DEFAULT) -> b= F(DEFAULT(b))
is made instead.
Both UPDATE and multi-UPDATE are affected, however any form of INSERT
is not: it marks all the fields in DEFAULT expressions for read in
TABLE::mark_default_fields_for_write().
Reformulate mark_columns_used_by_index* function family in a more laconic
way:
mark_columns_used_by_index -> mark_index_columns
mark_columns_used_by_index_for_read_no_reset -> mark_index_columns_for_read
mark_columns_used_by_index_no_reset -> mark_index_columns_no_reset
static mark_index_columns -> do_mark_index_columns
Both EXPLAIN and EXPLAIN EXTENDED statements produce different results set
in case it is run in normal way and in PS mode for the statements
UPDATE/DELETE with subquery.
The use case below reproduces the issue:
MariaDB [test]> CREATE TABLE t1 (c1 INT KEY) ENGINE=MyISAM;
Query OK, 0 rows affected (0,128 sec)
MariaDB [test]> CREATE TABLE t2 (c2 INT) ENGINE=MyISAM;
Query OK, 0 rows affected (0,023 sec)
MariaDB [test]> CREATE TABLE t3 (c3 INT) ENGINE=MyISAM;
Query OK, 0 rows affected (0,021 sec)
MariaDB [test]> EXPLAIN EXTENDED UPDATE t3 SET c3 =
-> ( SELECT COUNT(d1.c1) FROM ( SELECT a11.c1 FROM t1 AS a11
-> STRAIGHT_JOIN t2 AS a21 ON a21.c2 = a11.c1 JOIN t1 AS a12
-> ON a12.c1 = a11.c1 ) d1 );
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 0 | 100.00 | |
| 2 | SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE noticed after reading const tables
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
2 rows in set (0,002 sec)
MariaDB [test]> PREPARE stmt FROM
-> EXPLAIN EXTENDED UPDATE t3 SET c3 =
-> ( SELECT COUNT(d1.c1) FROM ( SELECT a11.c1 FROM t1 AS a11
-> STRAIGHT_JOIN t2 AS a21 ON a21.c2 = a11.c1 JOIN t1 AS a12
-> ON a12.c1 = a11.c1 ) d1 );
Query OK, 0 rows affected (0,000 sec)
Statement prepared
MariaDB [test]> EXECUTE stmt;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 0 | 100.00 | |
| 2 | SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | no matching row in const table |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
2 rows in set (0,000 sec)
The reason by that different result sets are produced is that on execution
of the statement 'EXECUTE stmt' the flag SELECT_DESCRIBE not set
in the data member SELECT_LEX::options for instances of SELECT_LEX that
correspond to subqueries used in the UPDTAE/DELETE statements.
Initially, these flags were set on parsing the statement
PREPARE stmt FROM "EXPLAIN EXTENDED UPDATE t3 SET ..."
but latter they were reset before starting real execution of
the parsed query during handling the statement 'EXECUTE stmt';
So, to fix the issue the functions mysql_update()/mysql_delete()
have been modified to set the flag SELECT_DESCRIBE forcibly
in the data member SELECT_LEX::options for the primary SELECT_LEX
of the UPDATE/DELETE statement.
The ROWNUM() function is for SELECT mapped to JOIN->accepted_rows, which is
incremented for each accepted rows.
For Filesort, update, insert, delete and load data, we map ROWNUM() to
internal variables incremented when the table is changed.
The connection between the row counter and Item_func_rownum is done
in sql_select.cc::fix_items_after_optimize() and
sql_insert.cc::fix_rownum_pointers()
When ROWNUM() is used anywhere in query, the optimization to ignore ORDER
BY in sub queries are disabled. This was done to get the following common
Oracle query to work:
select * from (select * from t1 order by a desc) as t where rownum() <= 2;
MDEV-3926 "Wrong result with GROUP BY ... WITH ROLLUP" contains a discussion
about this topic.
LIMIT optimization is enabled when in a top level WHERE clause comparing
ROWNUM() with a numerical constant using any of the following expressions:
- ROWNUM() < #
- ROWNUM() <= #
- ROWNUM() = 1
ROWNUM() can be also be the right argument to the comparison function.
LIMIT optimization is done in two cases:
- For the current sub query when the ROWNUM comparison is done on the top
level:
SELECT * from t1 WHERE rownum() <= 2 AND t1.a > 0
- For an inner sub query, when the upper level has only a ROWNUM comparison
in the WHERE clause:
SELECT * from (select * from t1) as t WHERE rownum() <= 2
In Oracle mode, one can also use ROWNUM without parentheses.
Other things:
- Fixed bug where the optimizer tries to optimize away sub queries
with RAND_TABLE_BIT set (non-deterministic queries). Now these
sub queries will not be converted to joins. This bug fix was also
needed to get rownum() working inside subqueries.
- In remove_const() remove setting simple_order to FALSE if ROLLUP is
USED. This code was disable a long time ago because of wrong assignment
in the following code. Instead we set simple_order to false if
RAND_TABLE_BIT was used in the SELECT list. This ensures that
we don't delete ORDER BY if the result set is not deterministic, like
in 'SELECT RAND() AS 'r' FROM t1 ORDER BY r';
- Updated parameters for Sort_param::init_for_filesort() to be able
to provide filesort with information where the number of accepted
rows should be stored
- Reordered fields in class Filesort to optimize storage layout
- Added new error messsage to tell that a function can't be used in HAVING
- Added field 'with_rownum' to THD to mark that ROWNUM() is used in the
query.
Co-author: Oleksandr Byelkin <sanja@mariadb.com>
LIMIT optimization for sub query
This commits replaces the call of the function setup_tables() with
a call of the function setup_tables_and_check_access() in the method
Multiupdate_prelocking_strategy::handle_end().
There is no known bug that would require this change. However the change
aligns this piece of code with the code existed before the patch for
MDEV-24823.
Before this patch mergeable derived tables / view used in a multi-table
update / delete were merged before the preparation stage.
When the merge of a derived table / view is performed the on expression
attached to it is fixed and ANDed with the where condition of the select S
containing this derived table / view. It happens after the specification of
the derived table / view has been merged into S. If the ON expression refers
to a non existing field an error is reported and some other mergeable derived
tables / views remain unmerged. It's not a problem if the multi-table
update / delete statement is standalone. Yet if it is used in a stored
procedure the select with incompletely merged derived tables / views may
cause a problem for the second call of the procedure. This does not happen
for select queries using derived tables / views, because in this case their
specifications are merged after the preparation stage at which all ON
expressions are fixed.
This patch makes sure that merging of the derived tables / views used in a
multi-table update / delete statement is performed after the preparation
stage.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Before this patch mergeable derived tables / view used in a multi-table
update / delete were merged before the preparation stage.
When the merge of a derived table / view is performed the on expression
attached to it is fixed and ANDed with the where condition of the select S
containing this derived table / view. It happens after the specification of
the derived table / view has been merged into S. If the ON expression refers
to a non existing field an error is reported and some other mergeable derived
tables / views remain unmerged. It's not a problem if the multi-table
update / delete statement is standalone. Yet if it is used in a stored
procedure the select with incompletely merged derived tables / views may
cause a problem for the second call of the procedure. This does not happen
for select queries using derived tables / views, because in this case their
specifications are merged after the preparation stage at which all ON
expressions are fixed.
This patch makes sure that merging of the derived tables / views used in a
multi-table update / delete statement is performed after the preparation
stage.
Approved by Oleksandr Byelkin <sanja@mariadb.com>