Item_singlerow_subselect may be converted to Item_cond during
optimization. So there is a possibility of constructing nested
Item_cond_and or Item_cond_or which is not allowed (such
conditions must be flattened).
This commit checks whether such an optimization has been applied
and flattens the condition if needed.
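The flattening step described above can be sketched as follows. This is
a minimal illustration only: Cond, its fields and flatten() are
hypothetical stand-ins, not the server's Item_cond classes or the code
of this commit.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified condition node (not the server's Item classes).
struct Cond {
  enum Kind { LEAF, AND, OR };
  Kind kind;
  std::string name;        // meaningful for LEAF only
  std::vector<Cond> args;  // meaningful for AND / OR only
};

// Merge the arguments of any child that has the same kind as its parent
// into the parent, so that AND(a, AND(b, c)) becomes AND(a, b, c).
void flatten(Cond &c) {
  if (c.kind == Cond::LEAF)
    return;
  std::vector<Cond> flat;
  for (Cond &arg : c.args) {
    flatten(arg);  // children first, so one pass per level is enough
    if (arg.kind == c.kind) {
      for (Cond &sub : arg.args)
        flat.push_back(std::move(sub));
    } else {
      flat.push_back(std::move(arg));
    }
  }
  c.args = std::move(flat);
}
```

A child of a different kind (an OR under an AND) is kept as a single
argument; only same-kind nesting is merged.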
There are no source code changes in this commit!
This is an empty follow-up commit for
284ac6f2b7
to comment what was done, as the patch itself did not have
change comments.
Problems solved in this patch:
1. The function calc_hash_for_unique() erroneously took the string
length into account, so strings that are equal (in terms of the
collation) but differ in length got different hash values.
For example:
- LATIN LETTER A - 1 byte
- LATIN LETTER A WITH ACUTE - 2 bytes
are equal in utf8_general_ci, but as their lengths
are different, calc_hash_for_unique() returned
different hash values.
2. calc_hash_for_unique() also erroneously used val_str()
result to calculate hashes. This may not be correct for
some data types, e.g. TIMESTAMP, as its string
value depends on the session environment (e.g. @@time_zone).
Change summary:
Instead of doing Item::val_str(), we should always call
Field::hash() of the underlying Field. It properly
handles both cases (equal strings with different
lengths, as well as tricky data types like TIMESTAMP).
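Why the length must not feed into the hash can be seen in a toy model.
The sketch below is an illustration under stated assumptions, not
Field::hash() or the server's collation code: toy_weight() is a made-up
weight table that covers only ASCII letters and A WITH ACUTE, the
decoder handles 1- and 2-byte UTF-8 sequences only, and the mixing step
is FNV-1a style rather than the server's hash function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy weight function: NOT the real utf8_general_ci weight tables.
static uint32_t toy_weight(uint32_t cp) {
  if (cp >= 'a' && cp <= 'z')
    return cp - 'a' + 'A';    // case-insensitive
  if (cp == 0xC1 || cp == 0xE1)
    return 'A';               // A WITH ACUTE (upper and lower) sorts as A
  return cp;
}

// Hash the sequence of collation weights. Byte length never enters.
uint64_t hash_by_weight(const std::string &s) {
  uint64_t h = 1469598103934665603ULL;  // FNV-1a offset basis
  for (size_t i = 0; i < s.size();) {
    unsigned char b = static_cast<unsigned char>(s[i]);
    uint32_t cp;
    if (b < 0x80 || i + 1 >= s.size()) {
      cp = b;                 // ASCII, or a truncated sequence taken as-is
      i += 1;
    } else {                  // 2-byte UTF-8 sequence (sketch limitation)
      cp = ((b & 0x1Fu) << 6) |
           (static_cast<unsigned char>(s[i + 1]) & 0x3Fu);
      i += 2;
    }
    h = (h ^ toy_weight(cp)) * 1099511628211ULL;  // FNV prime
  }
  return h;
}

// For contrast: the old style mixes raw bytes and then the byte length,
// so collation-equal strings of different lengths will in general differ.
uint64_t hash_bytes_and_length(const std::string &s) {
  uint64_t h = 1469598103934665603ULL;
  for (unsigned char b : s)
    h = (h ^ b) * 1099511628211ULL;
  return (h ^ s.size()) * 1099511628211ULL;
}
```

With this model, hash_by_weight("A") and hash_by_weight("\xC3\x81")
agree, because both strings reduce to the single weight 'A'.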
Detailed change description:
Non-functional changes (make the code cleaner):
- Adding a helper class Hasher, to pass hash parts
nr1 and nr2 through function arguments easier.
- Splitting virtual Field::hash() into non-virtual
wrapper Field::hash() and virtual Field::hash_not_null().
This helps to get rid of duplicate code handling SQL NULL,
as it was equal in all Field_xxx implementations.
- Adding a new method THD::my_ok_with_recreate_info().
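The split into a non-virtual wrapper and a virtual hash_not_null() can
be sketched as below. The Hasher layout, its mixing arithmetic and the
class Field_int_sketch are assumptions made for illustration; only the
names Hasher, nr1, nr2, Field::hash() and Field::hash_not_null() come
from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical Hasher: bundles the two running hash parts so that they
// travel as one argument. The arithmetic is illustrative only.
struct Hasher {
  uint64_t nr1 = 1;
  uint64_t nr2 = 4;
  void add_null() { nr1 ^= (nr1 << 1) | 1; }
  void add_byte(unsigned char b) {
    nr1 ^= (((nr1 & 63) + nr2) * b) + (nr1 << 8);
    nr2 += 3;
  }
};

class Field {
public:
  virtual ~Field() = default;
  virtual bool is_null() const = 0;
  // Non-virtual wrapper: SQL NULL is handled once, here.
  void hash(Hasher *hasher) const {
    assert(hasher != nullptr);
    if (is_null())
      hasher->add_null();
    else
      hash_not_null(hasher);
  }
protected:
  // Type-specific part: only ever called for non-NULL values.
  virtual void hash_not_null(Hasher *hasher) const = 0;
};

// Hypothetical concrete field holding a 32-bit integer.
class Field_int_sketch : public Field {
  int32_t value_;
  bool null_;
public:
  Field_int_sketch(int32_t value, bool null_value)
      : value_(value), null_(null_value) {}
  bool is_null() const override { return null_; }
protected:
  void hash_not_null(Hasher *hasher) const override {
    uint32_t v = static_cast<uint32_t>(value_);
    for (int i = 0; i < 4; i++)  // fixed byte order, independent of host
      hasher->add_byte(static_cast<unsigned char>(v >> (8 * i)));
  }
};
```

Each concrete field class then implements only the non-NULL case, and
two NULL fields hash identically whatever value bytes they still hold.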
Actual fix changes (make new tables work properly):
- Adding a virtual method Item::hash_not_null()
This helps to handle hashes on full fields (Item_field)
and hashes on prefix fields (Item_func_left(Item_field))
in a polymorphic way.
Implementing overrides for Item_field and Item_func_left.
- Rewriting Item_func_hash::val_int() to use Item::hash_not_null(),
instead of the combination of val_str() and calc_hash_for_unique().
Backward compatibility changes (make old tables work in the new server):
- Adding a new class Item_func_hash_mariadb_100403.
Moving the old version of Item_func_hash::val_int()
into Item_func_hash_mariadb_100403::val_int().
The old class Item_func_hash_mariadb_100403 is still needed,
to open old tables before upgrade is done.
- Adding TABLE_SHARE::old_long_hash_function() and
handler::check_long_hash_compatibility() to test
if a table is using an old hash function.
- Adding a helper method TABLE_SHARE::make_long_hash_func()
to instantiate either Item_func_hash_mariadb_100403 (for old
not upgraded tables) or Item_func_hash (for new tables).
Upgrade changes (make old tables upgrade in the new server properly):
Upgrading an old table to a new hash can be done using either
of these two statements:
ALTER IGNORE TABLE t1 FORCE;
REPAIR TABLE t1;
!!! These statements find and filter out erroneous duplicates !!!
The table after these statements will have fewer records
if there were erroneous duplicates (such as A and A WITH ACUTE).
The information about filtered-out records is reported by both statements.
- Adding a new class Recreate_info to return information
about copied and duplicate rows from these functions:
- mysql_alter_table()
- mysql_recreate_table()
- admin_recreate_table()
This helps to print a warning during REPAIR:
MariaDB [test]> repair table mdev27653_100422_text;
+----------------------------+--------+----------+------------------------------------+
| Table | Op | Msg_type | Msg_text |
+----------------------------+--------+----------+------------------------------------+
| test.mdev27653_100422_text | repair | Warning | Number of rows changed from 2 to 1 |
| test.mdev27653_100422_text | repair | status | OK |
+----------------------------+--------+----------+------------------------------------+
2 rows in set (0.018 sec)
When built with ubsan and trying to load the spider plugin, the hidden
visibility compile flag of mysqld causes ha_spider.so to be missing
the symbol ha_partition. This commit fixes that, as well as some
memcpy null pointer issues when built with ubsan.
Signed-off-by: Yuchen Pei <yuchen.pei@mariadb.com>
Use SELECT_LEX to save lists for ORDER BY and GROUP BY before parsing
WINDOW clauses / specifications. This is needed for proper parsing
of a nested WINDOW clause when a WINDOW clause is used in a subquery
contained in another WINDOW clause.
Fix assignment of an empty SQL_I_List to another one (in case of an
empty list, next should point to first).
This also fixes part of MDEV-29835 Partial server freeze
which is caused by violations of the latching order that was
defined in https://dev.mysql.com/worklog/task/?id=6326
(WL#6326: InnoDB: fix index->lock contention). Unless the
current thread is holding an exclusive dict_index_t::lock,
it must acquire page latches in a strict parent-to-child,
left-to-right order. Not all cases are fixed yet. Failure to
follow the correct latching order will cause deadlocks of threads
due to lock order inversion.
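The ordering rule can be illustrated with a small sketch. Everything in
it is hypothetical: PageLatch, its level and position fields and the
use of std::mutex are stand-ins chosen for the example, not InnoDB's
page latches or mini-transaction code.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical page descriptor: `level` counts up from the leaves (0),
// and `position` is the left-to-right rank of the page within its level.
struct PageLatch {
  unsigned level;
  unsigned position;
  std::mutex *latch;
};

// Order latches parent-to-child (higher level first), then left-to-right.
void sort_into_latching_order(std::vector<PageLatch> &pages) {
  std::sort(pages.begin(), pages.end(),
            [](const PageLatch &a, const PageLatch &b) {
              if (a.level != b.level)
                return a.level > b.level;
              return a.position < b.position;
            });
}

// Acquire every latch in the agreed order. If every thread takes all of
// its latches through one such call, no two threads can end up waiting
// for each other in opposite orders.
void latch_all(std::vector<PageLatch> &pages) {
  sort_into_latching_order(pages);
  for (PageLatch &p : pages) {
    assert(p.latch != nullptr);
    p.latch->lock();
  }
}

// Release in reverse order of acquisition.
void unlatch_all(std::vector<PageLatch> &pages) {
  for (auto it = pages.rbegin(); it != pages.rend(); ++it)
    it->latch->unlock();
}
```

A thread that takes a leaf latch first and its parent afterwards, while
another thread goes parent first, is exactly the inversion that the
rule forbids.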
As part of these changes, the BTR_MODIFY_TREE mode is modified
so that an Update latch (U a.k.a. SX) will be acquired on the
root page, and eXclusive latches (X) will be acquired on all pages
leading to the leaf page, as well as any left and right siblings
of the pages along the path. The test innodb.innodb_wl6326
will be removed, because at the time the DEBUG_SYNC point is hit,
the thread is actually holding several page latches that will be
blocking a concurrent SELECT statement.
We also remove double bookkeeping that was caused by excessive
information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
store information of latched pages, and ensure that
mtr_memo_slot_t::object is never a null pointer.
The tree_blocks[] and tree_savepoints[] were redundant.
mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
This avoids many redundant entries in mtr_t::m_memo, as well as
redundant calls to buf_page_get_gen() for blocks that had already
been looked up in a mini-transaction.
btr_get_latched_root(): Return a pointer to an already latched root page.
This replaces btr_root_block_get() in cases where the mini-transaction
has already latched the root page.
btr_page_get_parent(): Fetch a parent page that was already latched
in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
If needed, upgrade the root page U latch to X.
This avoids bloating mtr_t::m_memo as well as redundant
buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().
btr_cur_search_to_nth_level(): This will only be used for non-leaf
(level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
removed altogether, or retained for the case of
CHECK TABLE without QUICK.
btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
for searches to level=0 (the leaf level).
btr_cur_t::pessimistic_search_leaf(): Implement the new
BTR_MODIFY_TREE latching logic in the case that page splits
or merges will be needed. The parent pages (and their siblings)
should already be latched on the first dive to the leaf and be
present in mtr_t::m_memo; there should be no need for
BTR_CONT_MODIFY_TREE. This pre-latching almost suffices;
MDEV-29835 will have to revise it and remove work-arounds where
mtr_t::get_already_latched() fails to find a block.
rtr_search_to_nth_level(): A SPATIAL INDEX version of
btr_cur_search_to_nth_level() that can search to any level
(including the leaf level).
rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
rtr_search_to_nth_level().
rtr_search(): Replaces rtr_pcur_open().
rtr_cur_restore_position(): Remove an unused constant parameter.
btr_pcur_open_on_user_rec(): Remove the constant parameter
mode=PAGE_CUR_GE.
btr_cur_latch_leaves(): Update a pre-existing mtr_t::m_memo entry
for the current leaf page.
row_ins_clust_index_entry_low(): Use a new
mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.
btr_cur_t::open_leaf(): Some clean-up.
mtr_t::lock_register(): Register a page latch on a buffer-fixed block.
BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.
BTR_CONT_MODIFY_TREE: Note that this is only used by
rtr_search_to_nth_level().
btr_pcur_optimistic_latch_leaves(): Replaces
btr_cur_optimistic_latch_leaves().
ibuf_delete_rec(): Acquire ibuf.index->lock.u_lock() in order
to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).
Tested by: Matthias Leich
To avoid heap memory allocation overhead for mtr_t::m_memo,
we will allocate a small number of elements statically in
mtr_t::m_memo::small. Only if that preallocated storage is
insufficient will we invoke my_alloc() or my_realloc() for
more storage. The implementation of the data structure is
inspired by llvm::SmallVector.
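The idea can be sketched as follows. This is not the mtr_t::m_memo
implementation: the class small_vector and its members are
hypothetical, std::malloc() and std::realloc() stand in for my_alloc()
and my_realloc(), and the sketch only supports trivially copyable
element types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

template <typename T, size_t N>
class small_vector {
  static_assert(N > 0, "need at least one inline element");
  static_assert(std::is_trivially_copyable<T>::value,
                "sketch handles trivially copyable types only");
  T small_[N];          // inline storage, no heap allocation needed
  T *begin_ = small_;   // points at small_ or at a heap buffer
  size_t size_ = 0;
  size_t capacity_ = N;

public:
  small_vector() = default;
  small_vector(const small_vector &) = delete;
  small_vector &operator=(const small_vector &) = delete;
  ~small_vector() {
    if (!is_small())
      std::free(begin_);
  }

  bool is_small() const { return begin_ == small_; }
  size_t size() const { return size_; }
  T &operator[](size_t i) {
    assert(i < size_);
    return begin_[i];
  }

  // Taken by value so that pushing an element of this container is safe
  // even when the buffer is reallocated.
  void push_back(T v) {
    if (size_ == capacity_)
      grow();
    begin_[size_++] = v;
  }

private:
  void grow() {
    size_t new_cap = capacity_ * 2;
    T *p;
    if (is_small()) {
      // First overflow: move from the inline array to the heap.
      p = static_cast<T *>(std::malloc(new_cap * sizeof(T)));
      if (!p)
        std::abort();
      std::memcpy(p, small_, size_ * sizeof(T));
    } else {
      p = static_cast<T *>(std::realloc(begin_, new_cap * sizeof(T)));
      if (!p)
        std::abort();
    }
    begin_ = p;
    capacity_ = new_cap;
  }
};
```

As long as no more than N elements are pushed, the container never
touches the heap, which is the common case the change aims at.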
Updated wsrep-lib to a version in which server_state
wait_until_state() and sst_received() were changed to report
errors via return codes instead of throwing exceptions. Added
error handling accordingly.
Tested manually that a failure in sst_received() caused by
server misconfiguration (an unknown configuration variable
in the server configuration) does not cause a crash due to an
uncaught exception.
Since commit d7d3ad698a, a "hard" kill is
required to interrupt debug sync waits.
This affected the following tests:
- galera_var_retry_autocommit
- galera_bf_abort_at_after_statement
- galera_parallel_apply_3nodes
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The condition is freed in sp_head::execute, after calling
ha_spider::reset. This commit partially reverts the change in commit
e954d9de88, so that the condition is
always freed regardless of the wide_handler->sql_command, which will
prevent access to the freed condition later.
Signed-off-by: Yuchen Pei <yuchen.pei@mariadb.com>
Analysis:
When we skip a level after the path is found, the state of the json
engine changes. This breaks the sequence for json_get_path_next(), which
is called at the end to ensure that the json document is valid, and leads
to a crash.
Fix:
Use json_scan_next() at the end to check whether the json document has
correct syntax (is valid).
MySQL 5.7.41 includes one InnoDB change
mysql/mysql-server@d2d6b2dd00
that seems to be applicable to MariaDB Server 10.3 and 10.4.
Even though commit 5b9ee8d819
seems to have fixed sporadic failures on our CI systems, it is
theoretically possible that another race condition remained.
buf_flush_page_cleaner_coordinator(): In the final loop,
wait also for buf_get_n_pending_read_ios() to reach 0.
In this way, if a secondary index leaf page was read into the
buffer pool and ibuf_merge_or_delete_for_page() modified that
page or some change buffer pages, the flush loop would execute
until the buffer pool really is in a clean state.
This potential data corruption bug does not affect MariaDB Server 10.5
or later, thanks to commit b42294bc64
which removed change buffer merges that are not explicitly requested.
Analysis:
Parsing of the json path happens only once. When parsing, we set the types
of path (types_used) to use later. Only if the path type has a range or a
wildcard do multiple values get added to the result set.
However, for each row in the table, types_used still gets
overwritten with the default (no multiple values) and is not set again
(because the path is already parsed). Since multiple values depend on the
type of path, they do not get added to the result set either.
Fix:
Set the default for types_used only if the path is not parsed already.
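The bug and the fix can be reproduced in a few lines. This is a sketch
under assumptions: PathCache, the PATH_* flags and setup_path() are
hypothetical names, and wildcard detection is reduced to looking for
'*'; only types_used is taken from the description above.

```cpp
#include <cassert>
#include <string>

// Hypothetical flags and cache; the real json path structures differ.
enum : unsigned { PATH_SINGLE = 0, PATH_WILD = 1 };

struct PathCache {
  bool parsed = false;
  unsigned types_used = PATH_SINGLE;
};

// Called once per row. Parsing happens only on the first call, so the
// default must not be applied again on later calls.
void setup_path(PathCache &cache, const std::string &path, bool with_fix) {
  if (!with_fix || !cache.parsed)
    cache.types_used = PATH_SINGLE;  // pre-fix code did this for every row
  if (!cache.parsed) {
    if (path.find('*') != std::string::npos)
      cache.types_used |= PATH_WILD;
    cache.parsed = true;
  }
}
```

With with_fix set to false, the second row sees types_used back at its
default although the path contains a wildcard; with the fix the value
set during parsing survives.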
galera_gcache_recover and galera_gcache_recover_manytrx
Grepping the error log is not always successful, as messages
might be in a different order or contain different values.
galera_vote_sr
We need to make sure that the required table creation has replicated,
as we use WSREP_ON=off.
This commit changes backup execution (namely the block DDL phase)
so that the node is not paused from the cluster. Instead, the subsequent
backup execution is declared vulnerable to possible cluster-level
conflicts, especially with the applying of DDL statements.
With this, the mariabackup execution may be aborted if DDL
statements happen during backup execution. This abortable
backup execution is an optional feature and may be
enabled or disabled by wsrep_mode: BF_ABORT_MARIABACKUP.
Note that the old-style node desync and pause is still needed,
regardless of WSREP_MODE_BF_MARIABACKUP, if the node is operating
as an SST donor.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
If two high priority threads have a lock conflict, we look at the
order of these transactions and honor the earlier transaction.
The for_locking parameter in lock_rec_has_to_wait() has become
obsolete and is now removed from the code.
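The ordering rule amounts to a comparison of sequence numbers. The
sketch below is hypothetical: TrxSketch and honored_in_conflict() are
illustrative names, and it assumes that a smaller seqno means an
earlier transaction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical transaction descriptor.
struct TrxSketch {
  bool high_priority;
  int64_t seqno;  // assumption: smaller means earlier in the global order
};

// For a conflict between two high priority transactions, return the one
// that is honored, that is, the earlier one.
const TrxSketch &honored_in_conflict(const TrxSketch &a, const TrxSketch &b) {
  assert(a.high_priority && b.high_priority);
  return a.seqno <= b.seqno ? a : b;
}
```

The result does not depend on which of the two transactions requested
the lock and which one already holds it.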
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The problem was that the federated engine does not support comparable
rowids, which was not taken into account by the semijoin code.
Fixed by checking that we don't use semijoin with tables that do not
support comparable rowids.
Other things:
- Fixed some typos in the code comments
The reason things fail in 10.5 and above is that test_quick_select()
returns -1 (impossible range) for empty tables if there are any
conditions attached.
This didn't happen in 10.4 as the cost for a range was more than for
a table scan with 0 rows and get_key_scan_params() did not create any
range plans and thus did not mark the range as impossible.
The code that checked the 'impossible range' conditions did not take
into account all cases of LEFT JOIN usage.
Adding an extra check for whether the table is used with an ON condition
in the 'impossible range' case fixes the issue.
The rather recent thd_need_ordering_with() function does not take
the order of high priority transactions into consideration. Changed this
function to also compare transaction seqnos and favor the earlier transaction.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>