redundant table rebuild
- InnoDB alter fails to apply the online log during a redundant table
rebuild. The problem is that InnoDB wrongly reads the length flags of the
record while applying the temporary log record.
rec_init_offsets_comp_ordinary(): For finding the n_core_null_bytes,
InnoDB should use the same logic as rec_convert_dtuple_to_rec_comp().
rec_init_offsets_comp_ordinary(), rec_init_offsets(),
rec_get_offsets_reverse(), rec_get_nth_field_offs_old():
Simplify some bitwise arithmetic to avoid conditional jumps,
and add branch prediction hints with the assumption that most
variable-length columns are short.
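As a rough illustration of that pattern (a hypothetical sketch, not the
actual rec_init_offsets_comp_ordinary() code): a COMPACT-format
variable-length column stores its length in one byte when short and in
two bytes otherwise, so the decoder can hint the compiler that the
one-byte case dominates.

  #include <cstdint>

  // Hypothetical length decoder: the 0x80 flag bit selecting a two-byte
  // length follows InnoDB's COMPACT convention; the branch hint (which
  // InnoDB wraps as UNIV_LIKELY) assumes most columns are short.
  static inline const uint8_t *read_col_len(const uint8_t *p, uint32_t *len)
  {
    uint32_t first = *p++;
    if (__builtin_expect(first < 0x80, 1))   // likely: one-byte length
      *len = first;
    else                                     // unlikely: two-byte length
      *len = ((first & 0x3fU) << 8) | *p++;
    return p;
  }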
Tested by: Matthias Leich
When using the MySQL table type, the CONNECT engine converted the YEAR
datatype to DATETIME for INSERT queries. This is incorrect and causes an
error on INSERT. It should be SHORT instead.
- The MY_I_S_MAYBE_NULL field attribute is added to the PAGE_NO and SPACE
fields in the innodb_sys_index table. By doing this, InnoDB can set these
fields to NULL when it encounters a discarded tablespace.
When one session executes SELECT ... FOR UPDATE and holds the lock,
subsequent sessions that SELECT ... FOR UPDATE will wait to get the lock.
Currently, that event is labeled as `wait/io/table/sql/handler`, which
is incorrect. Instead, it should have been
`wait/lock/table/sql/handler`.
Two factors contribute to this bug:
1. The instrumentation interface and the heavy usage of `TABLE_IO_WAIT` in
the `sql/handler.cc` file. See the interface documentation [^1] for better
understanding;
2. The balancing act [^2] of doing instrumentation aggregation _AND_
having good performance. For example, EVENTS_WAITS_SUMMARY... is
aggregated using EVENTS_WAITS_CURRENT. Aggregation needs to be based
on the same wait class, and the code was overly aggressive in labeling a
LOCK operation as an IO operation in this case.
The proposed fix is pretty simple, but understanding the bug took a
while. Hence the footnotes below. For future improvement and
refactoring, we may want to consider renaming `TABLE_IO_WAIT` and making
it less coarse and more targeted.
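To make the aggregation constraint concrete, here is a minimal hedged
sketch (not the actual pfs.cc code) of why the label matters: summaries
are bucketed by wait class, so a lock wait recorded under the IO class
lands in the wrong bucket and never surfaces in the LOCK rows.

  #include <cstdio>

  enum wait_class { WAIT_IO, WAIT_LOCK };        // simplified classes

  struct wait_event { wait_class klass; unsigned long long wait_ns; };

  // EVENTS_WAITS_SUMMARY... is aggregated from EVENTS_WAITS_CURRENT,
  // keyed by wait class; the label chosen at instrumentation time
  // decides the bucket.
  static unsigned long long summary[2];

  static void aggregate(const wait_event &e) { summary[e.klass] += e.wait_ns; }

  int main()
  {
    aggregate({WAIT_IO, 120});    // genuine table IO
    aggregate({WAIT_IO, 500});    // SELECT ... FOR UPDATE wait, mislabeled
                                  // as IO before the fix
    std::printf("io=%llu lock=%llu\n", summary[WAIT_IO], summary[WAIT_LOCK]);
    return 0;
  }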
Note that the newly added test case, events_waits_current_MDEV-29091,
initially didn't pass Buildbot CI for embedded build tests. Further
research showed that the other impacted tests all included not_embedded.inc.
This oversight was fixed later.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
[^1]: To understand the `performance_schema` instrumentation interface, I
found this URL the most helpful:
https://dev.mysql.com/doc/dev/mysql-server/latest/PAGE_PFS_PSI.html
[^2]: The best place to understand instrumentation projection,
composition, and aggregation is the source file. Although I prefer
reading the Doxygen-produced HTML file, the rendering is not ideal for
whatever reason. Here is a link to 10.6's pfs.cc:
https://github.com/MariaDB/server/blob/10.6/storage/perfschema/pfs.cc
- InnoDB tries to build the previous version of the record for
the virtual index, but the undo log record doesn't contain
virtual column information. This leads to an assertion failure while
building the tuple.
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. This change leads to the
following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. The compiler can better understand the intent of the code, which leads
to more optimization possibilities. Additionally, it enabled detecting
unused variables that had an empty default constructor but were not
explicitly marked as such.
A particular change was required following this patch in sql/opt_range.cc:
result_keys, an unused Bitmap template class instance, now correctly issues
an unused variable warning.
Setting the Bitmap template class constructor to default allows the compiler
to identify that there are no side effects when instantiating the class.
Previously the compiler could not issue the warning, as it assumed the
Bitmap class (being a template) would not be performing a no-op in its
default constructor. This prevented the "unused variable" warning.
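A minimal standalone illustration of that effect (not the server's
actual Bitmap class):

  // With a user-provided empty constructor the compiler must assume
  // possible side effects, so an unused instance produces no warning.
  // Defaulting the constructor makes the type trivially constructible
  // and lets -Wunused-variable fire.
  template <unsigned N>
  class Bitmap
  {
    unsigned long buf[(N + 63) / 64];
  public:
    Bitmap() = default;   // was: Bitmap() {}
  };

  void f()
  {
    Bitmap<64> result_keys;   // unused variable warning now triggers here
  }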
The existing storage/rocksdb/CMakeLists.txt defined
ATOMIC_EXTRA_LIBS when atomics were required. This was
determined by the toplevel configure.cmake test
(HAVE_GCC_C11_ATOMICS_WITH_LIBATOMIC).
As build_rocksdb.cmake is included after ATOMIC_EXTRA_LIBS
was set, we just need to use it. As such, no riscv64-specific
macro is needed in build_rocksdb.cmake.
As highlighted by Gianfranco Costamagna (@LocutusOfBorg)
in #2472, overwriting SYSTEM_LIBS was problematic.
This is corrected in case SYSTEM_LIBS is changed
elsewhere in the future.
Closes #2472.
The function spider_db_mbase::print_warnings() can potentially result
in a null pointer dereference.
Remove the null pointer dereference by cleaning up the function.
Some small changes to the original commit
422fb63a9b.
Co-Authored-By: Yuchen Pei <yuchen.pei@mariadb.com>
This is Kentoku's patch for MDEV-22979 (e6e41f04f4 + 22a0097727),
which fixes MDEV-30370.
It changes the wait to a timed wait for the first sts thread, which
waits on server start to execute the init queries for spider. It also
flips the flag init_command to false when the sts thread is being
freed. With these changes the sts thread can check the flag regularly
and abort the init_queries when it finds that init_command is
false. This avoids the deadlock that causes the problem in MDEV-30370.
It also fixes MDEV-22979 for 10.4, but not 10.5. I have not tested
higher versions for MDEV-22979.
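The shape of the change, as a hedged standalone sketch (names such as
wait_to_run_init_queries and the one-second interval are illustrative,
not the exact Spider code):

  #include <atomic>
  #include <chrono>
  #include <condition_variable>
  #include <mutex>

  static std::mutex m;
  static std::condition_variable cv;
  static std::atomic<bool> server_started{false};
  static std::atomic<bool> init_command{true};  // flipped to false when
                                                // the sts thread is freed

  // Before: an untimed cv.wait() here could block forever and deadlock
  // against shutdown. A timed wait lets the thread re-check the flag.
  static bool wait_to_run_init_queries()
  {
    std::unique_lock<std::mutex> lk(m);
    while (!server_started.load())
    {
      cv.wait_for(lk, std::chrono::seconds(1));
      if (!init_command.load())
        return false;                           // abort the init queries
    }
    return true;                                // safe to run init queries
  }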
A test has also been done on MDEV-29904 to avoid regression, given
that MDEV-27233 is a similar problem and its patch caused the
regression. The test passes for 10.4-11.0.
However, this ad hoc test only works consistently when placed in the
main testsuite. We should not place Spider tests in the main suite, so
we do not include it in this commit. A patch for MDEV-27912 should fix
this problem and allow a proper test for MDEV-29904. See the comments in
the Jira tickets MDEV-30370/MDEV-29904 for the ad hoc testcase used for this
commit.
The MariaDB code base uses strcat() and strcpy() in several
places. These are known to have memory safety issues and their usage is
discouraged; common security scanners like Flawfinder flag them. In MariaDB we
should start using modern and safer variants of these functions.
This is similar to the memory issue fixes in 19af1890b5
and 9de9f105b5, but now replaces the use of strcat()
and strcpy() with the safer options strncat() and strncpy().
However, a '\0' is added forcefully to make sure the resulting string is
correct, since for these two functions it is not guaranteed that the new
string will be null-terminated.
Example:

  size_t dest_len = sizeof(g->Message);
  /* strncpy() does not null-terminate when the source fills the buffer */
  strncpy(g->Message, "Null json tree", dest_len);
  /* leave room for the '\0' that strncat() always appends */
  strncat(g->Message, ":", sizeof(g->Message) - strlen(g->Message) - 1);
  /* force null-termination in case the string was truncated */
  size_t wrote_sz = strlen(g->Message);
  size_t cur_len = wrote_sz >= dest_len ? dest_len - 1 : wrote_sz;
  g->Message[cur_len] = '\0';
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services.
-- Reviewer and co-author Vicențiu Ciorbaru <vicentiu@mariadb.org>
-- Reviewer additions:
* The initial function implementation was flawed. Replaced with a simpler
and also correct version.
* Simplified code by making use of snprintf instead of chaining strcat.
* Simplified code by removing dynamic string construction in the first
place and using static strings if possible. See connect storage engine
changes.
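For comparison, a sketch of the snprintf() form the review moved
towards (a fragment using the same hypothetical buffer as the example
above): snprintf() truncates safely and always null-terminates, which
removes the manual bookkeeping entirely.

  /* Writes at most sizeof(g->Message) - 1 characters plus a
     guaranteed '\0', equivalent to the strncpy/strncat chain above. */
  snprintf(g->Message, sizeof(g->Message), "%s:", "Null json tree");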
When built with ubsan and trying to load the spider plugin, the hidden
visibility of the mysqld compile flags causes ha_spider.so to be missing
the symbol ha_partition. This commit fixes that, as well as some
memcpy null pointer issues when built with ubsan.
Signed-off-by: Yuchen Pei <yuchen.pei@mariadb.com>
MySQL 5.7.41 includes one InnoDB change
mysql/mysql-server@d2d6b2dd00
that seems to be applicable to MariaDB Server 10.3 and 10.4.
Even though commit 5b9ee8d819
seems to have fixed sporadic failures on our CI systems, it is
theoretically possible that another race condition remained.
buf_flush_page_cleaner_coordinator(): In the final loop,
wait also for buf_get_n_pending_read_ios() to reach 0.
In this way, if a secondary index leaf page was read into the
buffer pool and ibuf_merge_or_delete_for_page() modified that
page or some change buffer pages, the flush loop would execute
until the buffer pool really is in a clean state.
This potential data corruption bug does not affect MariaDB Server 10.5
or later, thanks to commit b42294bc64
which removed change buffer merges that are not explicitly requested.
If two high priority threads have a lock conflict, we look at the
order of these transactions and honor the earlier transaction.
The for_locking parameter in lock_rec_has_to_wait() has become
obsolete and is now removed from the code.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
A cluster conflict victim's THD is marked with wsrep_aborter.
THD::wsrep_aborter holds the thread ID of the high priority thread
which is currently carrying out BF aborting for this victim.
However, the BF abort operation is not always successful,
and in such a case the wsrep_aborter mark should be removed.
In the old code, this wsrep_aborter resetting did not happen,
and this could lead to a situation where the sticky wsrep_aborter
mark prevents any further attempts to BF abort this transaction.
This commit fixes this issue and resets wsrep_aborter after an
unsuccessful BF abort attempt.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
node->is_delete was incorrectly set to NO_DELETE for a set of operations.
In general we shouldn't rely on sql_command and should look for more
abstract ways to control the behavior.
trg_event_map seems to be a suitable way. To account for replica nodes,
it is ORed with slave_fk_event_map, which stores trg_event_map when the
replica has triggers disabled.
clang15 finally errors on old prototype definitions.
It's also a lot fussier about variables that aren't used,
as is the case a number of times with loop counters that
aren't examined.
RocksDB was complaining that its get_range function was
declared without the array length in ha_rocksdb.h. A
constant is used rather than trying to import the
Rdb_key_def::INDEX_NUMBER_SIZE header (which was causing a lot of
errors in the definitions of other headers). If the constant
does change, we can be assured that the same compile warnings will
tell us of the error.
The ha_rocksdb::index_read_map_impl DBUG_EXECUTE_IF was similar
to the existing endless functions used in replication tests.
It's a rather moot point, as the rocksdb.force_shutdown test that
uses myrocks_busy_loop_on_row_read is currently disabled.
1. In case of a system-versioned table, add row_end into the FTS_DOC_ID
index in fts_create_common_tables() and innobase_create_key_defs().
fts_n_uniq() returns 1 or 2 depending on whether the table is
system-versioned (see the sketch after this list).
After this patch, recreation of the FTS_DOC_ID index is required for
existing system-versioned tables. If you see this message in the error
log or server warnings: "InnoDB: Table db/t1 contains 2 indexes
inside InnoDB, which is different from the number of indexes 1
defined in the MariaDB", use this command to fix the table:
ALTER TABLE db.t1 FORCE;
2. Fix duplicate history for a secondary unique index like it was done
in MDEV-23644 for the clustered index (932ec586aa). In case of an
existing history row which conflicts with the currently inserted row, we
check in row_ins_scan_sec_index_for_duplicate() whether that row
was inserted as part of the current transaction. In that case we
indicate with DB_FOREIGN_DUPLICATE_KEY that the new history row is not
needed and should be silently skipped.
3. Some parts of MDEV-21138 (7410ff436e) are reverted. Skipping the
FTS_DOC_ID index for history rows caused problems with the purge
system. Now this is fixed differently, by point 2.
4. wait_all_purged.inc checks that we didn't affect non-history rows
so they are deleted and purged correctly.
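A hedged sketch of the fts_n_uniq() rule referenced in point 1 (the
real signature differs; this only states the rule):

  // Number of unique key parts in the FTS_DOC_ID index: versioned
  // tables carry row_end as an extra part, i.e. (FTS_DOC_ID, row_end)
  // instead of (FTS_DOC_ID) alone.
  static unsigned fts_n_uniq(bool table_is_versioned)
  {
    return table_is_versioned ? 2 : 1;
  }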
Additional FTS fixes:
fts_init_get_doc_id(): exclude history rows from max_doc_id
calculation. fts_init_get_doc_id() callback is used only for crash
recovery.
fts_add_doc_by_id(): set max value for row_end field.
fts_read_stopword(): the stopwords table can be system-versioned too. We
now read stopwords only for current data.
row_insert_for_mysql(): exclude history rows from doc_id validation.
row_merge_read_clustered_index(): exclude history_rows from doc_id
processing.
fts_load_user_stopword(): for a versioned table retrieve the row_end field
and skip history rows. For a non-versioned table we retrieve the 'value'
field twice (just for uniformity).
FTS tests for System Versioning now include maybe_versioning.inc which
adds 3 combinations:
'vers' for debug builds sets sysvers_force and
sysvers_hide. sysvers_force makes every created table
system-versioned; sysvers_hide hides WITH SYSTEM VERSIONING
in SHOW CREATE.
Note: basic.test, stopword.test and versioning.test do not
require debug for the 'vers' combination. This is controlled by
$modify_create_table in maybe_versioning.inc; these
tests add WITH SYSTEM VERSIONING explicitly, which allows
testing the 'vers' combination on non-debug builds.
'vers_trx', like 'vers', sets sysvers_force_trx and sysvers_hide. That
tests FTS with trx_id-based System Versioning.
'orig' works like before: no System Versioning is added, no debug is
required.
Upgrade/downgrade test for System Versioning is done by
innodb_fts.versioning. It has 2 combinations:
'prepare' makes binaries in std_data (requires old server and OLD_BINDIR).
It tests upgrade/downgrade against old server as well.
'upgrade' tests upgrade against binaries in std_data.
Cleanups:
Removed innodb-fts-stopword.test as it duplicates stopword.test
Works like vers_force, but forces trx_id-based system-versioned tables
if the storage engine supports it (currently InnoDB only). Otherwise it
creates a timestamp-based system-versioned table.
The incorrect type handler caused an incorrect result_type() for
Item_cache_row (STRING_RESULT rather than ROW_RESULT). Updating the
constructor of Item_cache_row with the correct type handler fixes
this problem.
Signed-off-by: Yuchen Pei <yuchen.pei@mariadb.com>
Reviewed-by: Sergei Golubchik <serg@mariadb.com>
Before the fix, a next-key lock was requested only if a record was
delete-marked, for a locking unique search in the RR isolation level.
There can be several delete-marked records for the same unique key;
that's why InnoDB scans the records until either a non-delete-marked record
is reached or all delete-marked records with the same unique key are
scanned.
For range scans, next-key locks are used in RR to protect the scanned range
from insertion of new records by other transactions. And this is the reason
why next-key locks are used for delete-marked records in unique searches.
If a record is not delete-marked, the requested lock type was "not-gap".
When a record is not delete-marked during a lock request by trx 1, and
some other transaction holds a conflicting lock, trx 1 creates a waiting
not-gap lock on the record and suspends. While trx 1 is suspended, the
record can become delete-marked. And when the lock is granted on the
conflicting transaction's commit or rollback, its type is still "not-gap".
So we have a "not-gap" lock on a delete-marked record in RR. And this lets
some other transaction insert a record with the same unique key while trx 1
is not committed, which can cause an isolation level violation.
The fix is to set next-key locks for both delete-marked and
non-delete-marked records for unique search in RR.
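A compact way to state the change (a sketch using InnoDB's lock mode
names; the real decision sits in the row search code and is more
involved):

  enum lock_mode_kind { LOCK_ORDINARY /* next-key */, LOCK_REC_NOT_GAP };

  // Lock type requested for a matching record in a locking unique
  // search under REPEATABLE READ.
  static lock_mode_kind unique_search_lock(bool rec_is_delete_marked)
  {
    // Before the fix:
    //   return rec_is_delete_marked ? LOCK_ORDINARY : LOCK_REC_NOT_GAP;
    // After the fix: always next-key, so the lock still protects the
    // gap even if the record becomes delete-marked while we waited.
    (void) rec_is_delete_marked;
    return LOCK_ORDINARY;
  }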
When trying to create a spider table with banned charsets including
utf32, utf16, ucs2 and utf16le[1], spider should emit an error
immediately, rather than waiting until a separate statement that
establishes a connection (e.g. SELECT). This also applies to ALTER
TABLE statements that change charsets.
[1] https://mariadb.com/kb/en/server-system-variables/#character_set_client
Signed-off-by: Yuchen Pei <yuchen.pei@mariadb.com>
Reviewed-by: Nayuta Yanagisawa <nayuta.yanagisawa@mariadb.com>
Hide the errors related to missing innodb stats tables in bootstrap mode,
on the assumption that, because we are in bootstrap mode, they are going
to be created.
mysql_discard_or_import_tablespace(): On successful
ALTER TABLE...DISCARD TABLESPACE, evict the table handle from the
table definition cache, so that ha_innobase::close() will be invoked,
like InnoDB expects to be the case. This will avoid an assertion failure
ut_a(table->get_ref_count() == 0) during IMPORT TABLESPACE.
ha_innobase::open(): Do not issue any ER_TABLESPACE_DISCARDED warning.
Member functions for DML will do that.
ha_innobase::truncate(), ha_innobase::check_if_supported_inplace_alter():
Issue ER_TABLESPACE_DISCARDED warnings, to compensate for the removal of
the warning in ha_innobase::open().
row_quiesce_write_indexes(): Only write information about committed
indexes. The ALTER TABLE t NOWAIT ADD INDEX(c) in the nondeterministic
test case will most of the time fail due to a metadata lock (MDL) timeout
and leave behind an uncommitted index.
Reviewed by: Sergei Golubchik
The bug is caused by a similar mechanism as MDEV-21027.
The function check_insert_or_replace_autoincrement failed to open
all the partitions on REPLACE SELECT statements, which results in an
assertion error.
- The bug happens only when the range function is used on an empty-key
single-column index (XINDEXS).
- The solution is to return an empty result in this scenario.
Reviewed by: <>
The crash occurs because of the following call to TABLE_LIST::init_one_table():
table_list.init_one_table(
&table_list.db, &table_list.table_name, 0, TL_WRITE);
One should not pass table_list.db and table_list.table_name to the function,
because it updates these very members internally.
The function has already been called previously, so there is no need to
call it again. Simply removing the call resolves the problem.
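A hedged standalone illustration of the aliasing hazard (simplified
types, not the actual TABLE_LIST; the real clobbering mechanism may
differ): when the arguments point at the very members the function is
about to reinitialize, the "source" values are wiped before being read.

  #include <cstring>
  #include <cstdio>

  struct table_ref
  {
    const char *db;
    const char *table_name;

    void init_one_table(const char *const *db_arg,
                        const char *const *name_arg)
    {
      std::memset(this, 0, sizeof *this);  // internal reset wipes members
      db = *db_arg;                        // if db_arg == &this->db, this
      table_name = *name_arg;              // reads back the wiped value
    }
  };

  int main()
  {
    table_ref t;
    const char *db = "test", *name = "t1";
    t.init_one_table(&db, &name);           // fine: external storage
    t.init_one_table(&t.db, &t.table_name); // hazard: members now null
    std::printf("%p %p\n", (const void *) t.db, (const void *) t.table_name);
    return 0;
  }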
- The InnoDB AHI tries to access the index during a concurrent instant
alter column operation, leading to an ASAN failure. Instant alter column
should acquire the clustered index search latch in exclusive mode before
changing the table cache definition.
- Removed the default parameter for the function
btr_search_drop_page_hash_index()
- Addressed the -DWITH_INNODB_AHI=0 compilation failure
by passing two parameters from all callers of
btr_search_drop_page_hash_index().