Before this patch the crash occurred when a single-row dataset was used and
Item::remove_eq_conds() was called for HAVING. This function is not supposed
to be called after the elimination of multiple equalities.
To fix this problem, Item::val_int() is used instead of Item::remove_eq_conds().
In this case the optimizer evaluates the condition for the single-row dataset
and discovers an impossible HAVING immediately, so the execution phase is
skipped.
Approved by Igor Babaev <igor@maridb.com>
The new runtime diagnostic introduced with MDEV-34490 has detected
that `Item_int_with_ref` incorrectly returns an instance of its ancestor
class `Item_int`. This commit fixes that.
In addition, this commit reverts a part of the diagnostic related
to `clone_item()` checks. As it turned out, `clone_item()` is not required
to return an object of the same class as the cloned one. For example,
look at `Item_param::clone_item()`: it can return objects of `Item_null`,
`Item_int`, `Item_string`, etc., depending on the object state.
So the runtime type diagnostic is not applicable to `clone_item()` and
is disabled with this commit.
As similar diagnostic failures are expected to appear again
in the future, this commit introduces a new test file in the main suite,
item_types.test, to which new test cases may be added.
Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
The `Item` class methods `get_copy()`, `build_clone()`, and `clone_item()`
face an issue where they may be defined in a descendant class
(e.g., `Item_func`) but not in a further descendant (e.g., `Item_func_child`).
This can lead to scenarios where `build_clone()`, when operating on an
instance of `Item_func_child` with a pointer to the base class (`Item`),
returns an instance of `Item_func` instead of `Item_func_child`.
Since this limitation cannot be resolved at compile time, this commit
introduces runtime type checks for the copy/clone operations.
A debug assertion will now trigger in case of a type mismatch.
`get_copy()`, `build_clone()`, and `clone_item()` are no longer virtual;
instead, virtual `do_get_copy()`, `do_build_clone()`, and `do_clone_item()`
have been added to the protected section of the class `Item`.
Additionally, const qualifiers have been added to certain methods
to enhance code reliability.
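A minimal sketch of the resulting pattern (simplified placeholder code, not
the actual server classes), assuming a runtime check based on typeid:

  #include <cassert>
  #include <typeinfo>

  struct MEM_ROOT;                     // placeholder for the server allocator

  class Item
  {
  public:
    // Non-virtual wrapper: asserts that the virtual implementation returned
    // an object of the same dynamic type as *this, i.e. that no descendant
    // forgot to override do_get_copy().
    Item *get_copy(MEM_ROOT *root) const
    {
      Item *copy= do_get_copy(root);
      assert(copy == nullptr || typeid(*copy) == typeid(*this));
      return copy;
    }
    virtual ~Item()= default;
  protected:
    virtual Item *do_get_copy(MEM_ROOT *root) const= 0;
  };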
Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
This commit adds the "materialization" block to the output of
EXPLAIN/ANALYZE FORMAT=JSON when materialized subqueries are involved
in processing. In the case of ANALYZE, additional runtime information
is displayed, such as:
- chosen strategy of materialization
- number of partial match/index lookup loops
- sizes of partial match buffers
Improve performance of queries like
SELECT * FROM t1 WHERE field = NAME_CONST('a', 4);
by, in this example, replacing the WHERE clause with field = 4
in the case of ref access.
The rewrite is done during fix_fields and we disambiguate this
case from other cases of NAME_CONST by inspecting where we are
in parsing. We rely on THD::where to accomplish this. To
improve performance there, we change the type of THD::where to
be an enumeration, so we can avoid string comparisons during
Item_name_const::fix_fields. Consequently, this patch also
changes all usages of THD::where accordingly.
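A minimal sketch of the idea, with hypothetical enumerator names (the actual
values used for THD::where are not reproduced here):

  // Replacing a string-valued "where are we in parsing" member with an
  // enumeration turns strcmp() checks into integer comparisons.
  enum class thd_where
  {
    DEFAULT_WHERE,
    FIELD_LIST,      // e.g. the select list, where NAME_CONST may be rewritten
    ON_CLAUSE,
    HAVING_CLAUSE
  };

  struct THD_sketch
  {
    thd_where where= thd_where::DEFAULT_WHERE;
  };

  // Item_name_const::fix_fields() can now branch on an integer value
  // instead of comparing strings such as "field list".
  static bool in_field_list(const THD_sketch *thd)
  {
    return thd->where == thd_where::FIELD_LIST;
  }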
The memory leak happened on the second execution of a prepared statement
that runs an UPDATE statement with a correlated subquery on the right-hand
side of the SET clause. In this case, the invocation of the method
table->stat_records()
could return zero, which results in taking the 'if' branch that handles an
impossible WHERE condition. The issue is that this branch missed saving the
leaf tables, which has to be performed as the first condition optimization
activity. Later, the PS statement memory root is marked as read-only on
finishing the first execution of the prepared statement. The next time the
same statement is executed, it hits an assertion on an attempt to allocate
memory on the PS memory root marked as read-only.
This memory allocation takes place through the following sequence of
invocations:
Prepared_statement::execute
mysql_execute_command
Sql_cmd_dml::execute
Sql_cmd_update::execute_inner
Sql_cmd_update::update_single_table
st_select_lex::save_leaf_tables
List<TABLE_LIST>::push_back
To fix the issue, the flag SELECT_LEX::leaf_tables_saved is added to control
whether the method SELECT_LEX::save_leaf_tables() still has to be called or
has already been invoked and no further invocation is required.
A similar issue could take place when running a DELETE statement with a
LIMIT clause in PS/SP mode. The reason for the memory leak is the same as in
the UPDATE case, and it is fixed in the same way.
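A conceptual sketch of the guard described above (simplified stand-in types,
not the exact server code):

  #include <vector>

  struct TABLE_LIST {};               // stand-in for a leaf table entry

  struct SELECT_LEX_sketch            // simplified stand-in for st_select_lex
  {
    bool leaf_tables_saved= false;
    std::vector<TABLE_LIST *> saved_leaf_tables;

    void save_leaf_tables(const std::vector<TABLE_LIST *> &leaf_tables)
    {
      // Runs at most once per prepared statement: later executions must not
      // allocate on the statement memory root, which is read-only by then.
      if (leaf_tables_saved)
        return;
      saved_leaf_tables= leaf_tables;
      leaf_tables_saved= true;
    }
  };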
The optimizer deals with Rowid Filters this way:
1. First, the range optimizer is invoked. It saves information
about all potential range accesses.
2. A query plan is chosen. Suppose, it uses a Rowid Filter on
index $IDX.
3. JOIN::make_range_rowid_filters() calls the range optimizer
again to create a quick select on index $IDX which will be used
to populate the rowid filter.
The problem: a KILL command catches the query in step #3. The quick
select is not created, which causes a crash.
Fixed by checking whether the query was killed. Note: the problem also
affects 10.6, even though the error handling for
SQL_SELECT::test_quick_select is different there.
(Variant for 10.6: return an error code from SQL_SELECT::test_quick_select.)
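A simplified sketch of the added check, with hypothetical names:

  struct QUICK_SELECT {};             // result of re-running the range optimizer

  // Step 3 above: the quick select that should populate the rowid filter
  // may be missing because a KILL arrived while it was being built;
  // bail out cleanly instead of dereferencing a missing quick select.
  static bool make_range_rowid_filter(bool thd_killed, QUICK_SELECT *quick)
  {
    if (thd_killed || quick == nullptr)
      return true;                    // abort the statement, no crash
    /* ... populate the rowid filter from 'quick' ... */
    return false;
  }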
Rowid Filter cannot be used with reverse-ordered scans, for the same
reason that Index Condition Pushdown (ICP) cannot be.
test_if_skip_sort_order() already has logic to disable ICP when
setting up a reverse-ordered scan. Added logic to also disable
Rowid Filter in this case, factored out the code into
prepare_for_reverse_ordered_access(), and added a comment describing
the cause of this limitation.
Changes:
- Fixed MyISAM and Aria parallel repair to work with the tmp file limit.
This required adding current_thd to all parallel workers and adding
protection in my_malloc_size_cb_func() and temp_file_size_cb_func() so
that they can handle shared THDs. I removed the old code in MyISAM that
set current_thd(), as it only worked with virtual indexed columns and I
wanted to keep the Aria and MyISAM code identical.
Other things:
- Improved error messages from Aria parallel repair and
create_internal_tmp_table_from_heap().
Two new variables added:
- max_tmp_space_usage: Limits the temporary space allowance per user
- max_total_tmp_space_usage: Limits the temporary space allowance for
all users.
New status variables: tmp_space_used & max_tmp_space_used
New field in information_schema.process_list: TMP_SPACE_USED
The temporary space is counted for:
- All SQL-level temporary files. This includes files for filesort,
transaction temporary space, analyze, binlog_stmt_cache, etc.
It does not include engine-internal temporary files used for repair,
alter table, index pre-sorting, etc.
- All internal on-disk temporary tables created as part of resolving a
SELECT, multi-source update, etc.
Special cases:
- When doing a commit, the last flush of the binlog_stmt_cache
will not cause an error even if the temporary space limit is exceeded.
This is to avoid giving errors on commit. This means that a user
can temporarily go over the limit by up to binlog_stmt_cache_size.
Noteworthy issue:
- One has to be careful when using small values for max_tmp_space_usage
together with binary logging and with non-transactional tables.
If the binary log entry for the query is bigger than
binlog_stmt_cache_size and one hits the limit of max_tmp_space_usage
when flushing the entry to disk, the query will abort and the
binary log will not contain the last changes to the table.
This will also stop the slave!
This is also true for all Aria tables, as Aria cannot do rollback
(except in case of crashes)!
One way to avoid this is to use @@binlog_format=statement for
queries that update a lot of rows.
Implementation:
- All writes to temporary files or internal temporary tables that
increase the file size are routed through temp_file_size_cb_func(),
which updates and checks the temp space usage (see the sketch after
this list).
- Most of the temporary file monitoring is done inside IO_CACHE;
for Aria, temporary file monitoring is done inside the engine itself.
- MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for init_io_cache().
MY_TRACK means that we track the file usage. MY_TRACK_WITH_LIMIT means
that we track the file usage and give an error if the limit is
breached. This is used to avoid giving an error on commit when the
binlog_stmt_cache is flushed.
- global_tmp_space_used contains the total tmp space used so far.
This is needed to quickly check against max_total_tmp_space_usage.
- Temporary space errors use EE_LOCAL_TMP_SPACE_FULL and
handler errors use HA_ERR_LOCAL_TMP_SPACE_FULL.
This is needed until we move general errors to their own error space
so that they cannot conflict with system error numbers.
- The return value of my_chsize() and mysql_file_chsize() has changed
so that -1 is returned in the case where my_chsize() could not decrease
the file size (very unlikely; will not happen on modern systems).
All calls to _chsize() are updated to check for > 0 as the error
condition.
- At the destruction of THD we check that THD::tmp_file_space == 0
- At server end we check that global_tmp_space_used == 0
- As a precaution against errors in the tmp_space_used code, one can set
max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable
the tmp space quota errors.
- truncate_io_cache() function added.
- Aria tables using static or dynamic row length are registered in 8K
increments to avoid some calls to update_tmp_file_size().
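A conceptual sketch of the accounting such a callback performs, with
simplified stand-in types and hypothetical limits (the real code works with
THD, the MY_TRACK_WITH_LIMIT flag and the EE_LOCAL_TMP_SPACE_FULL /
HA_ERR_LOCAL_TMP_SPACE_FULL errors described above):

  #include <atomic>
  #include <cstdint>

  static std::atomic<std::int64_t> global_tmp_space_used{0};       // all users
  static const std::int64_t max_total_tmp_space_usage= 1LL << 30;  // hypothetical

  struct Session_sketch
  {
    std::int64_t tmp_file_space= 0;                 // this user's current usage
    std::int64_t max_tmp_space_usage= 128LL << 20;  // hypothetical per-user limit
  };

  // Called whenever a tracked temporary file or on-disk temporary table grows
  // by 'delta' bytes. 'with_limit' is false for files tracked without a quota,
  // such as the final binlog_stmt_cache flush on commit.
  static bool update_tmp_space(Session_sketch *thd, std::int64_t delta,
                               bool with_limit)
  {
    thd->tmp_file_space+= delta;
    const std::int64_t total= global_tmp_space_used.fetch_add(delta) + delta;
    if (!with_limit)
      return false;                                 // track only, never fail
    return thd->tmp_file_space > thd->max_tmp_space_usage ||
           total > max_total_tmp_space_usage;       // true => tmp space full error
  }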
Other things:
- Ensure that all handler errors are registered. Before, some engine
errors could be printed as "Unknown error".
- Fixed bug in filesort() that caused an assert if there was an error
when writing to the temporary file.
- Fixed that compute_window_func() now takes into account write errors.
- In case of parallel replication, rpl_group_info::cleanup_context()
could call trans_rollback() with thd->error set, which would cause
an assert. Fixed by resetting the error before calling trans_rollback().
- Fixed bug in subselect3.inc which caused the following test to use
heap tables with a low value for max_heap_table_size.
- Fixed bug in sql_expression_cache where it did not overflow a
heap table to an Aria table.
- Added Max_tmp_disk_space_used to slow query log.
- Fixed some bugs in log_slow_innodb.test
This is to update the plugin to be compatible with Percona's
query_response_time plugin, with some additions.
Some of the tests are taken from Percona Server.
- Added plugins QUERY_RESPONSE_TIME_READ, QUERY_RESPONSE_TIME_WRITE and
QUERY_RESPONSE_TIME_READ_WRITE.
- Added option query_response_time_session_stats, with possible values
GLOBAL, ON or OFF, to the query_response_time plugin.
Notes:
- All modules depend on QUERY_RESPONSE_READ_TIME. This must always
be enabled if any of the other modules are used.
This will be auto-enabled in the near future.
- Accounting is done per statement. Stored functions are regarded
as part of the original statement.
- For stored procedures the accounting is done per statement executed
in the stored procedure. The CALL itself will not be accounted because of this.
- FLUSH commands will not be accounted for. This is to ensure that
FLUSH QUERY_RESPONSE_TIME is not part of the statistics.
(This helps when testing with mtr and otherwise).
- FLUSH QUERY_RESPONSE_TIME_READ and FLUSH QUERY_RESPONSE_TIME_WRITE
only reset the corresponding status.
- FLUSH QUERY_RESPONSE_TIME, FLUSH QUERY_RESPONSE_TIME_READ_WRITE, or
changing the value of query_response_time_range_base followed by
any FLUSH of QUERY_RESPONSE_TIME resets all status values.
Alternative operator name keywords like `and`, `or`, `xor`, etc., are
uncommon in MariaDB and can cause obscure build errors when the GCC
flag `-fno-operator-names` is applied.
Description of `-fno-operator-names`:
https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html
> Do not treat the operator name keywords `and`, `bitand`, `bitor`,
> `compl`, `not`, `or` and `xor` as synonyms as keywords.
Part of the build errors:
/local/p4clients/pkgbuild-LdLa_/workspace/src/RDSMariaDB/sql/sql_select.cc:11171:28: error: expected ‘)’ before ‘and’
11171 | DBUG_ASSERT(sel >= 0.0 and sel <= 1.00001);
| ^~~
/local/p4clients/pkgbuild-LdLa_/workspace/src/RDSMariaDB/include/my_global.h:372:44: note: in definition of macro ‘unlikely’
372 | #define unlikely(x) __builtin_expect(((x) != 0),0)
| ^
...
The build failure is caused by use of the alternative operator name keyword
`and`, introduced in commit b66cdbd1e.
Replace the `and` keyword with `&&`, targeting the MariaDB 11.0+ branches
which include that commit.
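For illustration, the failing assertion from the error output above with the
alternative operator name replaced (plain assert() is used here to keep the
snippet self-contained; the server code uses DBUG_ASSERT()):

  #include <cassert>

  void check_selectivity(double sel)
  {
    // Before (rejected by GCC when -fno-operator-names is in effect):
    //   DBUG_ASSERT(sel >= 0.0 and sel <= 1.00001);
    // After, with the alternative operator name replaced by '&&':
    assert(sel >= 0.0 && sel <= 1.00001);
  }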
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
A memory leak happens on the second execution of a query that is run in PS
mode and uses the function ROWNUM().
The leak takes place on allocation of an instance of the class Item_int
for storing a limit value, which is performed in the function
set_limit_for_unit, indirectly called from JOIN::optimize_inner. A typical
trace to the place where the memory leak occurred is below:
JOIN::optimize_inner
optimize_rownum
process_direct_rownum_comparison
set_limit_for_unit
new (thd->mem_root) Item_int(thd, lim, MAX_BIGINT_WIDTH);
To fix this memory leak, the function optimize_rownum() has to be called
only once, on the first execution, and never called after that.
To control this, the new data member
first_rownum_optimization
is added to the structure st_select_lex.