This patch corrects the earlier fix for bug MDEV-30248, which resolved
the problem of resolving references to CTEs only partially. In some cases,
when such a reference had the same table name as one of the CTEs containing
it, the reference could be resolved incorrectly, producing an invalid
select tree in which units could become mutually dependent. This in turn
could lead to an infinite sequence of recursive calls or to infinite loops.
The patch also removes LEX::resolve_references_to_cte_in_hanging_cte():
with the new code for resolving CTE references, calling this function is
no longer needed.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The user XA commit execution branch turned out not to have been covered
by the MDEV-21953 fixes.
The deadlock involving XA is now resolved by applying the pattern of the
earlier fixes.
Along with the fixes, the following changes have been implemented:
- MDL lock attribute correction
- dissociation of the externally completed XA from the current
thread's xid_state in the error branches
- cleanup_context() preserves the prepared XA
- wait_for_prior_commit() is relocated to satisfy both
the binlog ON (log-slave-updates and skip-log-bin)
and OFF slave execution branches.
(Initial patch by Varun Gupta. Amended and comments added.)
When the query has both
1. Aggregate functions that require sorting data by group, and
2. Window functions
we need to use two temporary tables. The first temp. table holds the
join output. It is then passed to filesort(), and reading it in sorted
order allows computing the aggregate functions.
Their values then need to be written into the second temp. table, which
the window function computation step can pass to filesort() and read
in the order it needs.
Failure to create the second temp. table would cause an assertion
failure: the window function code would not find where to get the values
of the aggregate functions.
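A minimal, self-contained sketch of this two-pass idea (plain C++ with
made-up row structures, not server code):

  // Toy pipeline: sort by group key, compute the per-group aggregate,
  // store the result in a second buffer, then sort that buffer again in
  // the order the "window function" (a running total here) needs.
  #include <algorithm>
  #include <cstdio>
  #include <vector>

  struct Row    { int grp; double val; };
  struct AggRow { int grp; double sum; double running; };

  int main()
  {
    std::vector<Row> join_output= {{2, 1.0}, {1, 3.0}, {2, 4.0}, {1, 2.0}};

    // "Temp table 1" + filesort: order by group key so the aggregate can
    // be computed group by group.
    std::sort(join_output.begin(), join_output.end(),
              [](const Row &a, const Row &b) { return a.grp < b.grp; });

    // Compute SUM(val) per group and write the rows of "temp table 2".
    std::vector<AggRow> agg;
    for (const Row &r : join_output)
    {
      if (agg.empty() || agg.back().grp != r.grp)
        agg.push_back({r.grp, 0.0, 0.0});
      agg.back().sum+= r.val;
    }

    // Second filesort pass in the order the window step needs (here:
    // descending SUM), then the running total over that order.
    std::sort(agg.begin(), agg.end(),
              [](const AggRow &a, const AggRow &b) { return a.sum > b.sum; });
    double total= 0.0;
    for (AggRow &r : agg)
    {
      total+= r.sum;
      r.running= total;
      std::printf("grp=%d sum=%.1f running=%.1f\n", r.grp, r.sum, r.running);
    }
    return 0;
  }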
- InnoDB fails to clear the freed ranges during truncation of an InnoDB
undo log tablespace. During shutdown, InnoDB flushes the freed page
ranges and throws an out-of-bounds error.
mtr_t::commit_shrink(): clear the freed ranges while doing undo
tablespace truncation.
Disable the bulk insert optimization if long uniques are used, because
they need to read the table (index_read) after every inserted row, and
the bulk insert optimization might disable indexes.
Bulk insert is already disabled in other cases when there is a chance
that the table will be read during the bulk insert.
plugin_vars_free_values() was walking only the plugin's system variables
and thus did not free the memory of PLUGIN_VAR_NOSYSVAR plugin variables.
* change it to walk all plugin vars
* add the pluginname_ prefix to NOSYSVARS var names too,
so that plugin_vars_free_values() would be able to find their
bookmarks
The MariaDB code base uses strcat() and strcpy() in several
places. These are known to have memory safety issues and their usage is
discouraged. Common security scanners like Flawfinder flag them. In MariaDB
we should start using modern and safer variants of these functions.
This is similar to the memory issue fixes in 19af1890b5
and 9de9f105b5, but now the uses of strcat()
and strcpy() are replaced with the safer strncat() and strncpy().
However, a '\0' is added explicitly to make sure the resulting string is
correct, since these two functions do not guarantee that the new string
will be null-terminated.
Example:
  size_t dest_len = sizeof(g->Message);
  /* strncpy() may leave the destination without a terminating '\0' */
  strncpy(g->Message, "Null json tree", dest_len);
  strncat(g->Message, ":", sizeof(g->Message) - strlen(g->Message));
  size_t wrote_sz = strlen(g->Message);
  /* force null termination even if the buffer was filled completely */
  size_t cur_len = wrote_sz >= dest_len ? dest_len - 1 : wrote_sz;
  g->Message[cur_len] = '\0';
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services
-- Reviewer and co-author Vicențiu Ciorbaru <vicentiu@mariadb.org>
-- Reviewer additions:
* The initial function implementation was flawed. Replaced with a simpler
and also correct version.
* Simplified code by making use of snprintf instead of chaining strcat
(see the sketch after this list).
* Simplified code by removing dynamic string construction in the first
place and using static strings if possible. See connect storage engine
changes.
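As a rough illustration of that snprintf() simplification (a sketch only,
not the actual connect engine change; g->Message is reused from the
example above):

  #include <stdio.h>   /* snprintf */

  /* One call replaces the strncpy()/strncat() chain and always
     null-terminates, truncating if the buffer is too small. */
  snprintf(g->Message, sizeof(g->Message), "%s:", "Null json tree");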
Item_singlerow_subselect may be converted to Item_cond during
optimization. So there is a possibility of constructing nested
Item_cond_and or Item_cond_or which is not allowed (such
conditions must be flattened).
This commit checks whether this kind of optimization has been applied
and flattens the condition if needed.
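A minimal sketch of what flattening means here (generic C++ for
illustration; the Cond struct below is made up and is not the server's
Item_cond class): nested AND nodes are merged into one AND with a flat
argument list, e.g. AND(a, AND(b, c)) becomes AND(a, b, c).

  #include <memory>
  #include <vector>

  // Illustrative condition node: either a leaf or an AND over children.
  struct Cond
  {
    bool is_and= false;
    std::vector<std::unique_ptr<Cond>> args;   // only used when is_and
  };

  // Merge nested ANDs into the parent so that an AND node never has an
  // AND child.
  static void flatten_and(Cond *cond)
  {
    if (!cond->is_and)
      return;
    std::vector<std::unique_ptr<Cond>> flat;
    for (auto &child : cond->args)
    {
      flatten_and(child.get());
      if (child->is_and)
        for (auto &grandchild : child->args)
          flat.push_back(std::move(grandchild));
      else
        flat.push_back(std::move(child));
    }
    cond->args= std::move(flat);
  }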
There are no source code changes in this commit!
This is an empty follow-up commit for
284ac6f2b7
to describe what was done, as the patch itself did not have
change comments.
Problems solved in this patch:
1. The function calc_hash_for_unique() erroneously took the string
length into account, so equal strings (in terms of the collation)
with different lengths got different hash values.
For example:
- LATIN LETTER A - 1 byte
- LATIN LETTER A WITH ACUTE - 2 bytes
are equal in utf8_general_ci, but as their lengths
are different, calc_hash_for_unique() returned
different hash values.
2. calc_hash_for_unique() also erroneously used val_str()
result to calculate hashes. This may not be correct for
some data types, e.g. TIMESTAMP, as its string
value depends on the session environment (e.g. @@time_zone).
Change summary:
Instead of doing Item::val_str(), we should always call
Field::hash() of the underlying Field. It properly
handles both cases (equal strings with different
lengths, as well as tricky data types like TIMESTAMP).
Detailed change description:
Non-functional changes (make the code cleaner):
- Adding a helper class Hasher, to pass the hash parts
nr1 and nr2 through function arguments more easily.
- Splitting the virtual Field::hash() into a non-virtual
wrapper Field::hash() and a virtual Field::hash_not_null().
This helps to get rid of duplicated code handling SQL NULL,
which was identical in all Field_xxx implementations
(see the sketch after this list).
- Adding a new method THD::my_ok_with_recreate_info().
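A minimal sketch of this wrapper/virtual split and of the Hasher idea
(illustrative classes only, not the real Field or Hasher code):

  #include <cstddef>
  #include <cstdint>
  #include <string>
  #include <utility>

  // Stand-in for the Hasher helper: it carries the two hash accumulators
  // (nr1, nr2) so they do not have to be passed around separately.
  struct Hasher
  {
    uint64_t nr1= 1, nr2= 4;
    void add(const void *data, size_t len)
    {
      // Simple byte mixing, for illustration only.
      const unsigned char *p= static_cast<const unsigned char *>(data);
      for (size_t i= 0; i < len; i++)
      {
        nr1= nr1 * 31 + p[i];
        nr2+= 3;
      }
    }
    void add_null() { nr1^= 1; }        // fixed contribution for SQL NULL
    uint64_t finalize() const { return nr1; }
  };

  // The SQL NULL check lives once in the non-virtual wrapper; subclasses
  // only implement hashing of a non-NULL value.
  class Field
  {
  public:
    virtual ~Field() = default;
    bool is_null() const { return m_is_null; }
    void hash(Hasher *hasher) const
    {
      if (is_null())
        hasher->add_null();
      else
        hash_not_null(hasher);
    }
  protected:
    virtual void hash_not_null(Hasher *hasher) const= 0;
    bool m_is_null= false;
  };

  // Fake string field: it hashes a canonical form of the value, so strings
  // that compare as equal also hash equally (here trailing spaces are
  // ignored, as a PAD SPACE collation would do; a real collation-aware
  // field hashes collation weights, not raw bytes).
  class Field_fake_string : public Field
  {
  public:
    explicit Field_fake_string(std::string v) : m_val(std::move(v)) {}
  protected:
    void hash_not_null(Hasher *hasher) const override
    {
      std::string canon= m_val;
      while (!canon.empty() && canon.back() == ' ')
        canon.pop_back();
      hasher->add(canon.data(), canon.size());
    }
  private:
    std::string m_val;
  };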
Actual fix changes (make new tables work properly):
- Adding a virtual method Item::hash_not_null()
This helps to handle hashes on full fields (Item_field)
and hashes on prefix fields (Item_func_left(Item_field))
in a polymorphic way.
Implementing overrides for Item_field and Item_func_left.
- Rewriting Item_func_hash::val_int() to use Item::hash_not_null(),
instead of the combination of val_str() and calc_hash_for_unique().
Backward compatibility changes (make old tables work in the new server):
- Adding a new class Item_func_hash_mariadb_100403.
Moving the old version of Item_func_hash::val_int()
into Item_func_hash_mariadb_100403::val_int().
The old class Item_func_hash_mariadb_100403 is still needed
to open old tables before the upgrade is done.
- Adding TABLE_SHARE::old_long_hash_function() and
handler::check_long_hash_compatibility() to test
if a table is using an old hash function.
- Adding a helper method TABLE_SHARE::make_long_hash_func()
to instantiate either Item_func_hash_mariadb_100403 (for old
not upgraded tables) or Item_func_hash (for new tables).
Upgrade changes (make old tables upgrade in the new server properly):
Upgrading an old table to a new hash can be done using either
of these two statements:
ALTER IGNORE TABLE t1 FORCE;
REPAIR TABLE t1;
!!! These statements find and filter out erroneous duplicates!!!
The table after these statements will have fewer records
if there were erroneous duplicates (such as A and A WITH ACUTE).
The information about the filtered-out records is reported by both statements.
- Adding a new class Recreate_info to return information
about copied and duplicate rows from these functions:
- mysql_alter_table()
- mysql_recreate_table()
- admin_recreate_table()
This helps to print a warning during REPAIR:
MariaDB [test]> repair table mdev27653_100422_text;
+----------------------------+--------+----------+------------------------------------+
| Table | Op | Msg_type | Msg_text |
+----------------------------+--------+----------+------------------------------------+
| test.mdev27653_100422_text | repair | Warning | Number of rows changed from 2 to 1 |
| test.mdev27653_100422_text | repair | status | OK |
+----------------------------+--------+----------+------------------------------------+
2 rows in set (0.018 sec)
When built with ubsan and trying to load the spider plugin, the
hidden-visibility compile flag of mysqld causes ha_spider.so to be missing
the symbol ha_partition. This commit fixes that, as well as some
memcpy null pointer issues when built with ubsan.
Signed-off-by: Yuchen Pei <yuchen.pei@mariadb.com>
Use SELECT_LEX to save lists for ORDER BY and GROUP BY before parsing
WINDOW clauses / specifications. This is needed for proper parsing
of a nested WINDOW clause when a WINDOW clause is used in a subquery
contained in another WINDOW clause.
Fix assignment of an empty SQL_I_List to another one (in the case of an
empty list, next should point to first).
This also fixes part of MDEV-29835 Partial server freeze
which is caused by violations of the latching order that was
defined in https://dev.mysql.com/worklog/task/?id=6326
(WL#6326: InnoDB: fix index->lock contention). Unless the
current thread is holding an exclusive dict_index_t::lock,
it must acquire page latches in a strict parent-to-child,
left-to-right order. Not all cases are fixed yet. Failure to
follow the correct latching order will cause deadlocks of threads
due to lock order inversion.
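As a generic illustration of the latching-order rule (plain C++ with toy
structures, not InnoDB code): when every thread acquires latches parent
before child and left sibling before right sibling, no cycle of threads
waiting for each other's latches can form.

  #include <mutex>
  #include <vector>

  struct Page
  {
    std::mutex latch;
    std::vector<Page*> children;        // kept in left-to-right order
  };

  // Latch a parent and its children in the agreed order: parent first,
  // then children left to right.  Because every thread follows the same
  // order, lock-order inversion (and hence deadlock) cannot occur.
  void latch_parent_and_children(Page &parent,
                                 std::vector<std::unique_lock<std::mutex>> &held)
  {
    held.emplace_back(parent.latch);
    for (Page *child : parent.children)
      held.emplace_back(child->latch);
  }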
As part of these changes, the BTR_MODIFY_TREE mode is modified
so that an Update latch (U a.k.a. SX) will be acquired on the
root page, and eXclusive latches (X) will be acquired on all pages
leading to the leaf page, as well as any left and right siblings
of the pages along the path. The test innodb.innodb_wl6326
will be removed, because at the time the DEBUG_SYNC point is hit,
the thread is actually holding several page latches that will be
blocking a concurrent SELECT statement.
We also remove double bookkeeping that was caused by excessive
information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
store information of latched pages, and ensure that
mtr_memo_slot_t::object is never a null pointer.
The tree_blocks[] and tree_savepoints[] were redundant.
mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
This avoids many redundant entries in mtr_t::m_memo, as well as
redundant calls to buf_page_get_gen() for blocks that had already
been looked up in a mini-transaction.
btr_get_latched_root(): Return a pointer to an already latched root page.
This replaces btr_root_block_get() in cases where the mini-transaction
has already latched the root page.
btr_page_get_parent(): Fetch a parent page that was already latched
in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
If needed, upgrade the root page U latch to X.
This avoids bloating mtr_t::m_memo as well as redundant
buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().
btr_cur_search_to_nth_level(): This will only be used for non-leaf
(level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
removed altogether, or retained for the case of
CHECK TABLE without QUICK.
btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
for searches to level=0 (the leaf level).
btr_cur_t::pessimistic_search_leaf(): Implement the new
BTR_MODIFY_TREE latching logic in the case that page splits
or merges will be needed. The parent pages (and their siblings)
should already be latched on the first dive to the leaf and be
present in mtr_t::m_memo; there should be no need for
BTR_CONT_MODIFY_TREE. This pre-latching almost suffices;
MDEV-29835 will have to revise it and remove work-arounds where
mtr_t::get_already_latched() fails to find a block.
rtr_search_to_nth_level(): A SPATIAL INDEX version of
btr_search_to_nth_level() that can search to any level
(including the leaf level).
rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
rtr_search_to_nth_level().
rtr_search(): Replaces rtr_pcur_open().
rtr_cur_restore_position(): Remove an unused constant parameter.
btr_pcur_open_on_user_rec(): Remove the constant parameter
mode=PAGE_CUR_GE.
btr_cur_latch_leaves(): Update a pre-existing mtr_t::m_memo entry
for the current leaf page.
row_ins_clust_index_entry_low(): Use a new
mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.
btr_cur_t::open_leaf(): Some clean-up.
mtr_t::lock_register(): Register a page latch on a buffer-fixed block.
BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.
BTR_CONT_MODIFY_TREE: Note that this is only used by
rtr_search_to_nth_level().
btr_pcur_optimistic_latch_leaves(): Replaces
btr_cur_optimistic_latch_leaves().
ibuf_delete_rec(): Acquire ibuf.index->lock.u_lock() in order
to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).
Tested by: Matthias Leich
To avoid heap memory allocation overhead for mtr_t::m_memo,
we will allocate a small number of elements statically in
mtr_t::m_memo::small. Only if that preallocated data is
insufficient, we will invoke my_alloc() or my_realloc() for
more storage. The implementation of the data structure is
inspired by llvm::SmallVector.
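A minimal sketch of the small-buffer idea (generic C++ for trivially
copyable elements, not the actual mtr_t::m_memo code): elements live in an
in-object array until it overflows, and only then is heap storage used.

  #include <cstddef>
  #include <cstdlib>
  #include <cstring>

  template <typename T, size_t N>
  class small_vector
  {
  public:
    // Copying/moving omitted for brevity.
    small_vector()= default;
    ~small_vector() { if (m_data != m_small) std::free(m_data); }

    void push_back(const T &value)
    {
      if (m_size == m_capacity)
        grow();
      m_data[m_size++]= value;
    }

    T *begin() { return m_data; }
    T *end() { return m_data + m_size; }
    size_t size() const { return m_size; }

  private:
    void grow()
    {
      size_t new_capacity= m_capacity * 2;
      if (m_data == m_small)
      {
        // First overflow: move from the in-object buffer to the heap.
        T *heap= static_cast<T*>(std::malloc(new_capacity * sizeof(T)));
        std::memcpy(heap, m_small, m_size * sizeof(T));
        m_data= heap;
      }
      else
        m_data= static_cast<T*>(std::realloc(m_data, new_capacity * sizeof(T)));
      m_capacity= new_capacity;
    }

    T m_small[N];                // preallocated; no malloc for small sizes
    T *m_data= m_small;
    size_t m_size= 0;
    size_t m_capacity= N;
  };

The real implementation uses my_alloc()/my_realloc() as described above;
the sketch uses plain malloc()/realloc() to stay self-contained.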
Updated wsrep-lib to a version in which server_state
wait_until_state() and sst_received() were changed to report
errors via return codes instead of throwing exceptions. Added
error handling accordingly.
Tested manually that a failure in sst_received() caused by server
misconfiguration (an unknown configuration variable in the server
configuration) does not cause a crash due to an uncaught exception.
Since commit d7d3ad698a, "hard" kill is
required to interrupt debug sync waits.
Affected the following tests:
- galera_var_retry_autocommit,
- galera_bf_abort_at_after_statement
- galera_parallel_apply_3nodes
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The condition is freed in sp_head::execute, after calling
ha_spider::reset. This commit partially reverts the change in commit
e954d9de88, so that the condition is
always freed regardless of the wide_handler->sql_command, which will
prevent access to the freed condition later.
Signed-off-by: Yuchen Pei <yuchen.pei@mariadb.com>
Analysis:
When we skip a level once the path is found, it changes the state of the
json engine. This breaks the sequence for json_get_path_next(), which is
called at the end to ensure the json document is valid, and leads to a crash.
Fix:
Use json_scan_next() at the end to check whether the json document has
correct syntax (is valid).
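A sketch of the pattern (the setup and member names are approximated from
memory, not copied from the patch; rest_of_json_is_valid() is a
hypothetical helper):

  /* Assumes MariaDB's json_lib.h (json_engine_t, json_scan_next()). */
  static int rest_of_json_is_valid(json_engine_t *je)
  {
    /* A pure syntax scan: json_scan_next() does not rely on the path
       state, so it can be used after levels were skipped. */
    while (json_scan_next(je) == 0)
      /* keep scanning until the end of the document or an error */ ;
    return je->s.error == 0;
  }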
MySQL 5.7.41 includes one InnoDB change
mysql/mysql-server@d2d6b2dd00
that seems to be applicable to MariaDB Server 10.3 and 10.4.
Even though commit 5b9ee8d819
seems to have fixed sporadic failures on our CI systems, it is
theoretically possible that another race condition remained.
buf_flush_page_cleaner_coordinator(): In the final loop,
wait also for buf_get_n_pending_read_ios() to reach 0.
In this way, if a secondary index leaf page was read into the
buffer pool and ibuf_merge_or_delete_for_page() modified that
page or some change buffer pages, the flush loop would execute
until the buffer pool really is in a clean state.
This potential data corruption bug does not affect MariaDB Server 10.5
or later, thanks to commit b42294bc64
which removed change buffer merges that are not explicitly requested.