dict_table_rename_in_cache(), dict_table_get_highest_foreign_id():
Reserve sufficient space for the fkid[] buffer, and ensure that the
fkid[] will be NUL-terminated.
The fkid[] must accommodate both the database name (which is already
encoded in my_charset_filename) and the constraint name
(which must be converted to my_charset_filename) so that we can check
if it is in the format databasename/tablename_ibfk_1 (all encoded in
my_charset_filename).
- InnoDB fts table initially added to LRU table cache
while creating the table. Later, table was marked
as non-evicted when we add the table to fts optimizer
list. Before marking the table as non-evicted, master
thread can try to evict the fts table.
- Import tablespace re-evicts and reload the table definition. During that
time, innodb has to load the table even though the secondary fts index
marked as corrupted
dict_load_foreigns(): Use a correctly sized buffer for the maximum-length
SYS_FOREIGN.ID. In case of overflow, do not crash the server but instead
return DB_CORRUPTION.
- InnoDB mistakenly identifies the non-unique FTS_DOC_ID index as
FTS_DOC_ID_INDEX while loading the table. dict_load_indexes()
should check whether the index is unique before assigning
fts_doc_id_index
- Redundant InnoDB table fails to set the flags2 while loading
the table. It leads to "Upgrade index name failure" during alter
operation. InnoDB should set the flags2 to FTS_AUX_HEX_NAME when
fts is being loaded
buf_page_print(): Dump the buffer page 32 bytes (64 hexadecimal digits)
per line. In this way, the limitation in mtr
("Data too long for column 'line'") will not be triggered.
Also, do not bother decoding the page contents, because everything
is present in the hexadecimal output.
dict_index_find_on_id_low(): Merge to dict_index_get_if_in_cache_low().
The direct call in buf_page_print() was prone to crashing, in case the
table definition was concurrently evicted or dropped from the
data dictionary cache.
- InnoDB fails to create a fts cache while loading the innodb fts
table which is stored in system tablespace. InnoDB should create
the fts cache while loading FTS_DOC_ID column from system column.
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.
During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:
sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.
sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.
The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.
Reviewed by: Jan Lindström
To avoid potential race conditions between concurrent access to
dict_table_t::freed_indexes, let us consistently use
dict_table_t::autoinc_mutex.
dict_table_remove_from_cache_low(): To avoid extensive hold time
of table->autoinc_mutex, unconditionally free the FTS data structures.
Problem:
=======
The last AHI page for two indexes of an dropped table is being
freed at the same time by two threads. One thread frees the
table heap and other thread tries to access table heap again.
It leads to asan failure in btr_search_lazy_free().
Solution:
========
InnoDB uses autoinc_mutex to avoid the race condition
in btr_search_lazy_free()
Occasionally, the test innodb.alter_copy would fail in MariaDB 10.6.1,
reporting DB_MISSING_HISTORY during CHECK TABLE. It started to occur during
the development of MDEV-25180, which introduced purge_sys.stop_SYS().
If we delay purge more during DDL operations, then the test would
almost always fail. The reason is that during startup we will restore
a purge view, and CHECK TABLE would still use REPEATABLE READ
even though innodb_read_only is set and other isolation levels
than READ UNCOMMITTED are not guaranteed to work.
ha_innobase::check(): Use READ UNCOMMITTED isolation level if
innodb_read_only is set or innodb_force_recovery exceeds 3.
dict_set_corrupted(): Do not update the persistent data dictionary
if innodb_force_recovery exceeds 3.
InnoDB startup hangs if a DDL transaction needs to be
rolled back and a recovered transaction on statistics
tables exists. In that case, InnoDB should rollback
the transaction which holds locks on innodb_table_stats
or innodb_index_stats during trx_rollback_or_clean_recovered().
Between btr_pcur_store_position() and btr_pcur_restore_position()
it is possible that purge empties a table and enlarges
index->n_core_fields and index->n_core_null_bytes.
Therefore, we must cache index->n_core_fields in
btr_pcur_t::old_n_core_fields so that btr_pcur_t::old_rec can be
parsed correctly.
Unfortunately, this is a huge change, because we will replace
"bool leaf" parameters with "ulint n_core"
(passing index->n_core_fields, or 0 for non-leaf pages).
For special cases where we know that index->is_instant() cannot hold,
we may also pass index->n_fields.
Problem:
========
InnoDB fails to clean the index stub if it fails to add the
virtual index which contains new virtual column. But it clears
the newly virtual column from index in clear_added_indexes()
during inplace_alter_table. On commit, InnoDB evicts and
reload the table. In case of rollback, it doesn't happen.
InnoDB clears the ABORTED index while opening the table
or doing the DDL. In the mean time, InnoDB can access
the dropped virtual index columns while creating prebuilt
or rollback of concurrent DML.
Solution:
==========
(1) InnoDB should maintain newly added virtual column while
rollbacking the newly added virtual index.
(2) InnoDB must not defer the index removal
if the alter table is executed with LOCK=EXCLUSIVE.
(3) For LOCK=SHARED, InnoDB should check whether the table
has any other transaction lock other than alter transaction
before deferring the index stub.
Replaced has_new_v_col with dict_add_vcol_info in dict_index_t to
indicate whether the index has any new virtual column.
dict_index_t::has_new_v_col(): Returns whether the index has
newly added virtual column, it doesn't say which columns are
newly added virtual column
ha_innobase_inplace_ctx::is_new_vcol(): Return whether the
given column is added as a part of the current alter.
ha_innobase_inplace_ctx::clean_new_vcol_index(): Copy the newly
added virtual column to new_vcol_info in dict_index_t. Replace
the column in the index fields with virtual column stored
in new_vcol_info.
dict_index_t::assign_new_v_col(): Store the number of virtual
column added in index as a part of alter table.
dict_index_t::get_n_new_vcol(): Get the number of newly added
virtual column
dict_index_t::assign_drop_v_col(): Allocate the memory for
adding new virtual column in new_vcol_info.
dict_index_t::add_drop_v_col(): Add the newly added virtual
column in new_vcol_info.
dict_table_t::has_lock_for_other_trx(): Whether the table has
any other transaction lock than given transaction.
row_merge_drop_indexes(): Add parameter alter_trx and check
whether the table has any other lock than alter transaction.
When a table has been evicted from dict_sys and reloaded internally by
InnoDB for FOREIGN KEY processing, statistics may not be initialized,
but nevertheless row_update_cascade_for_mysql() could invoke
dict_stats_update_if_needed(). In that case, we cannot really update
the statistics. For tables that have STATS_PERSISTENT=1 and
STATS_AUTO_RECALC=1, ANALYZE TABLE might have to be executed later.
dict_stats_update_if_needed(): Replace the assertion with
a conditional early return.
The last_updated column of innodb_table_stats and innodb_index_stats
hasn't been DATA_FIXBINARY for many years.
Innodb represents TIMESTAMP as INT of length 4. Let's test it with this
and stop hiding the result in mysql_upgrade test.
Reviewer: Marko
This is a fixup patch for MDEV-23991 afc9d00c66
We really should read result.n_leaf_pages, which was set previously.
Analysis and fix was provided by Jukka Santala. Thanks!
Reviewed by: Marko Mäkelä
The nonnull attribute is not applicable to parameters that are
passed by reference, at least not in the Intel compiler.
Let us remove the reference indirection, which was only there
so that the pointer could be assigned to NULL, and let the
callers perform that task.
row_log_allocate(): Fix a bug in out-of-memory error handling
that would leave a pointer to freed memory.
As noted in commit 0b66d3f70d,
MariaDB does not support CREATE TABLESPACE for InnoDB.
Hence, some code that was added in
commit fec844aca8
and originally in
mysql/mysql-server@c71dd213bd
is unused in MariaDB and should be removed.
Patch removes dict_index_t::stats_latch. Table/index statistics now
protected with dict_sys->mutex. That way statistics computation can
happen in parallel in several threads and dict_sys->mutex will be locked
only for a short period of time.
This patch is a joint work with Marko Mäkelä
dict_index_t:🔒 make mutable which allows to pass const pointer
when only lock is touched in an object
btr_height_get()
btr_get_size(): make index argument const for better type safety
btr_estimate_number_of_different_key_vals(): now returns computed values
instead of setting fields in dict_index_t directly
remove everything related to dict_index_t::stats_latch
dict_stats_index_set_n_diff(): now returns computed values instead
of setting fields in dict_index_t directly
dict_stats_analyze_index(): now returns computed values instead
of setting fields in dict_index_t directly
Reviewed by: Marko Mäkelä
This issue is caused by MDEV-22456 ad6171b91c. Fix involves the backported version of 10.4 patch
MDEV-22778 5f2628d1ee and few parts of
MDEV-17441 (e9a5f288f2).
dict_table_t::stats_latch_created: Removed
dict_table_t::stats_latch: make value member and always lock it for
simplicity even for stats cloned table.
zip_pad_info_t::mutex_created: Removed
zip_pad_info_t::mutex: make member value instead of pointer
os0once.h: Removed
dict_table_remove_from_cache_low(): Ensure that fts_free() is always
called, even if dict_mem_table_free() is deferred until
btr_search_lazy_free().
InnoDB would always zip_pad_info_t::mutex and
dict_table_t::autoinc_mutex, even for tables are not in
ROW_FORMAT=COMPRESSED nor include any AUTO_INCREMENT column.
Marking of deletion of row in fts index happens twice in
self-referential foreign key relation. So while performing
referential checks of foreign key, InnoDB can avoid updating
of fts index if the foreign key has self-referential relationship.
Reviewed-by: Marko Mäkelä
dict_load_table_low(): Copy the 'discarded' flag to file_unreadable.
This allows to avoid a potentially harmful call to dict_stats_init()
in ha_innobase::open().
After DISCARD TABLESPACE, the tablespace of a table will no longer
exist, and dict_get_and_save_data_dir_path() would invoke
dict_get_first_path() to read an entry from SYS_DATAFILES.
For some reason, DISCARD TABLESPACE would not to remove the entry
from there.
dict_get_and_save_data_dir_path(): If the tablespace has been
discarded, do not bother trying to read the name.
Side note: The tables SYS_TABLESPACES and SYS_DATAFILES are
redundant and subject to removal in MDEV-22343.
dict_foreign_qualify_index(): Reject corrupted or garbage indexes.
For index stubs that are created on virtual columns, no
dict_field_t::col would be assign. Instead, the entire table
definition would be reloaded on a successful operation.
buf_page_create() is invoked when page is initialized. So that
previous contents of the page ignored. In few cases, it calls
buf_page_get_gen() is called to fetch the page from buffer pool.
It should take x-latch on the page. If other thread uses the block
or block io state is different from BUF_IO_NONE then release the
mutex and check the state and buffer fix count again. For compressed
page, use the existing free block from LRU list to create new page.
Retry to fetch the compressed page if it is in flush list
fseg_create(), fseg_create_general(): Introduce block as a parameter
where segment header is placed. It is used to avoid repetitive
x-latch on the same page
Change the assert to check whether the page has SX latch and
X latch in all callee function of buf_page_create()
mtr_t::get_fix_count(): Get the buffer fix count of the given
block added by the mtr
FindBlock is added to find the buffer fix count of the given
block acquired by the mini-transaction
Problem:
========
dict_load_table_one() doesn't handle the scenario where clustered
index page is FIL_NULL when DICT_ERR_IGNORE_INDEX_ROOT mode
is set.
Fix:
====
InnoDB should set the file_unreadable when it can't find the
clustered index root page.
MemorySanitizer (clang -fsanitize=memory) requires that all code
be compiled with instrumentation enabled. The only exception is the
C runtime library. Failure to use instrumented libraries will cause
bogus messages about memory being uninitialized.
In WITH_MSAN builds, we must avoid calling getservbyname(),
because even though it is a standard library function, it is
not instrumented, not even in clang 10.
Note: Before MariaDB Server 10.5, ./mtr will typically fail
due to the old PCRE library, which was updated in MDEV-14024.
The following cmake options were tested on 10.5
in commit 94d0bb4dbe:
cmake \
-DCMAKE_C_FLAGS='-march=native -O2' \
-DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \
-DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \
-DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \
-DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \
-DWITH_SAFEMALLOC=OFF \
-DWITH_{ZLIB,SSL,PCRE}=bundled \
-DHAVE_LIBAIO_H=0 \
-DWITH_MSAN=ON
MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED()
and __msan_unpoison().
MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for
VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow().
InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros.
ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in
functions instead of inline assembler when building WITH_MSAN.
This will require at least -msse4.2 when building for IA-32 or AMD64.
The inline assembler would not be instrumented, and would thus cause
bogus failures.
We are supposed to commit and restart the mini-transaction
between records. There is no point to store and restore the
persistent cursor position otherwise.
If buf_page_optimistic_get() is patched to always fail, the
debug build would fail to start up due to trying to re-acquire
an already S-latched block.
This bug (which should not have visible impact to users, because
the code is only executed during startup, while no other threads
are accessing B-trees or causing pages to be evicted from the
buffer pool) was caught as part of a debugging effort for
something else.
The debugging approach was: Make buf_page_optimistic_get()
always return FALSE, and add ut_a(block->lock.lock_word == X_LOCK_DECR)
to both buf_LRU_get_free_only() and buf_LRU_block_free_non_file_page().
This would catch misuse of the buffer pool. If it were not for
buf_page_optimistic_get(), no buf_block_t::lock of any BUF_BLOCK_NOT_USED
block would ever be acquired.