10,230 commits

Author SHA1 Message Date
Marko Mäkelä
895cd553a3 MDEV-32175: Reduce page_align(), page_offset() calls
When srv_page_size and innodb_page_size were introduced,
the functions page_align() and page_offset() got more expensive.
Let us try to replace such calls with simpler pointer arithmetic
with respect to the buffer page frame.

page_rec_get_next_non_del_marked(): Add a page frame as a parameter,
and template<bool comp>.

page_rec_next_get(): A more efficient variant of page_rec_get_next(),
with template<bool comp> and const page_t* parameters.

lock_get_heap_no(): Replaces page_rec_get_heap_no() outside debug checks.

fseg_free_step(), fseg_free_step_not_header(): Take the header block
as a parameter.

Reviewed by: Vladislav Lesin
2024-11-21 11:01:30 +02:00
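The idea of computing an offset relative to a known page frame, instead of masking the pointer with a run-time page size, can be sketched as follows. This is only an illustration and not InnoDB code: `page_size`, `offset_by_mask()`, `offset_by_frame()` and `example_frame` are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical run-time page size, standing in for srv_page_size.
static std::size_t page_size = 16384;

// Mask-based variant: has to load the run-time page size to build the mask.
inline std::size_t offset_by_mask(const unsigned char* ptr)
{
  return reinterpret_cast<std::uintptr_t>(ptr) & (page_size - 1);
}

// Frame-based variant: when the caller already knows the page frame,
// the offset is a plain pointer subtraction.
inline std::size_t offset_by_frame(const unsigned char* frame,
                                   const unsigned char* ptr)
{
  return static_cast<std::size_t>(ptr - frame);
}

// A page-aligned buffer to experiment with (alignment equals the page size).
alignas(16384) static unsigned char example_frame[16384];
```

Both variants agree for any pointer inside an aligned frame; the second one simply avoids the extra load and mask when the frame pointer is already at hand.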
Marko Mäkelä
df3855a471 MDEV-35247: ut_hash_ulint() is a waste
ut_hash_ulint(): Remove. The exclusive OR before a modulus operation
does not serve any useful purpose; it is only obfuscating code and
wasting some CPU cycles.

Reviewed by: Debarun Banerjee
2024-11-21 08:59:31 +02:00
Marko Mäkelä
a9b0a1c5d0 MDEV-35247: ut_fold_ull() is a waste
ut_fold_ull(): For SIZEOF_SIZE_T < 8, we simulate universal hashing
(Carter and Wegman, 1977) by pretending that SIZE_T_MAX + 1
is a prime. In other words, we implement a Rabin–Karp rolling
hash algorithm similar to java.lang.String.hashCode().
This is used for representing 64-bit dict_index_t::id or
dict_table_t::id in the native word size.

For SIZEOF_SIZE_T >= 8, we just use an identity mapping.

Reviewed by: Debarun Banerjee
2024-11-21 08:59:17 +02:00
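The folding described above might look roughly like the sketch below. It is not the actual ut_fold_ull() body: the function names are hypothetical, and the multiplier 31 is borrowed from java.lang.String.hashCode() as an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Fold a 64-bit identifier into 32 bits, as a 32-bit build might.
// Unsigned arithmetic wraps modulo 2^32, which plays the role of the
// "prime" modulus. The multiplier 31 is an assumption of this sketch.
inline std::uint32_t fold_to_32(std::uint64_t id)
{
  const std::uint32_t high = static_cast<std::uint32_t>(id >> 32);
  const std::uint32_t low = static_cast<std::uint32_t>(id);
  return high * 31u + low;
}

// On a 64-bit build the identifier already fits in the native word,
// so the mapping is the identity.
inline std::uint64_t fold_to_64(std::uint64_t id)
{
  return id;
}
```

Identifiers below 2^32 are left unchanged by either variant, so small table and index IDs remain easy to recognize in a debugger.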
Marko Mäkelä
3c312d247c MDEV-35190 HASH_SEARCH duplicates effort before HASH_INSERT or HASH_DELETE
The HASH_ macros are unnecessarily obfuscating the logic,
so we had better replace them.

hash_cell_t::search(): Implement most of the HASH_DELETE logic,
for a subsequent insert or remove().

hash_cell_t::remove(): Remove an element.

hash_cell_t::find(): Implement the HASH_SEARCH logic.

xb_filter_hash_free(): Avoid any hash table lookup;
just traverse the hash bucket chains and free each element.

xb_register_filter_entry(): Search databases_hash only once.

rm_if_not_found(): Make use of find_filter_in_hashtable().

dict_sys_t::acquire_temporary_table(), dict_sys_t::find_table():
Define non-inline to avoid unnecessary code duplication.

dict_sys_t::add(dict_table_t *table), dict_table_rename_in_cache():
Look for duplicate while finding the insert position.

dict_table_change_id_in_cache(): Merged to the only caller
row_discard_tablespace().

hash_insert(): Helper function of dict_sys_t::resize().

fil_space_t::create(): Look for a duplicate (and crash if found)
when searching for the insert position.

lock_rec_discard(): Take the hash array cell as a parameter
to avoid a duplicated lookup.

lock_rec_free_all_from_discard_page(): Remove a parameter.

Reviewed by: Debarun Banerjee
2024-11-21 08:59:02 +02:00
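The single-traversal pattern behind search(), find() and remove() can be illustrated with a small intrusive hash chain. This is a sketch under assumptions, not the hash_cell_t interface itself: `Node`, `Cell` and the member signatures are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical element type with an intrusive "next" pointer.
struct Node
{
  std::uint64_t id;
  Node* next = nullptr;
};

// One hash bucket: a singly linked chain of nodes.
struct Cell
{
  Node* head = nullptr;

  // Return the address of the pointer that refers to the node with the
  // given id, or the address of the terminating null pointer if absent.
  Node** search(std::uint64_t id)
  {
    Node** slot = &head;
    while (*slot && (*slot)->id != id)
      slot = &(*slot)->next;
    return slot;
  }

  // Plain lookup built on top of search().
  Node* find(std::uint64_t id) { return *search(id); }

  // Insert unless a duplicate exists; the chain is traversed only once.
  bool insert(Node* node)
  {
    Node** slot = search(node->id);
    if (*slot)
      return false;  // duplicate found while locating the insert position
    node->next = nullptr;
    *slot = node;
    return true;
  }

  // Unlink a node, again with a single traversal.
  bool remove(std::uint64_t id)
  {
    Node** slot = search(id);
    if (!*slot)
      return false;
    *slot = (*slot)->next;
    return true;
  }
};
```

Because search() returns the slot rather than the node, the duplicate check and the insert position come from the same walk of the chain, which is the effort the old macro pairs repeated.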
Vlad Lesin
bcbeef6772 MDEV-35457 Remove btr_cur_t::path_arr
After the MDEV-21136 fix, the btr_cur_t::path_arr field remained declared but
unused, wasting space in each btr_cur_t and btr_pcur_t. Remove it.
2024-11-20 17:43:04 +03:00
Marko Mäkelä
ba69d811fa MDEV-35409 InnoDB can still hang while running out of buffer pool
buf_pool_t::LRU_warn(): Also clear the try_LRU_scan flag, to ensure
that need_LRU_eviction() will hold. This should ensure progress when
buf_LRU_get_free_block() is expecting buf_flush_page_cleaner() to
make some room, even when buf_pool.LRU.count is small.

This hang was observed in trx_lists_init_at_db_start() while the last
batch of crash recovery was in progress, but it could theoretically
be possible also when a large part of the buffer pool is occupied by
record locks or the adaptive hash index.

Reviewed by: Debarun Banerjee
2024-11-18 08:13:18 +02:00
ParadoxV5
d5f16d6305 Extract some of #3360 fixes to 10.6.x
That PR uncovered countless issues on `my_snprintf` uses.
This commit backports a squashed subset of their fixes (excludes #3485).
2024-11-18 13:29:04 +11:00
Thirunarayanan Balathandayuthapani
b8f48d09cf MDEV-35363 Avoid cloning of table statistics while saving the InnoDB table stats
- While saving the InnoDB table statistics, InnoDB clones
the table object and its statistics to protect the metadata
in case the table is being dropped. From 10.6 onwards, any
background task inside InnoDB on the table takes MDL,
so the metadata is protected by the MDL of the table. Avoid
cloning the table and its associated index statistics.
2024-11-14 10:58:39 +05:30
Marko Mäkelä
ccb6cd8053 MDEV-35189: Updating cache for INNODB_LOCKS et al is suboptimal
ha_storage_put_memlim(): Invoke my_crc32c() to "fold", and traverse
the hash table only once.

fold_lock(): Remove some redundant conditions and use my_crc32c()
instead of ut_fold_ulint_pair().

trx_i_s_cache_t::add(): Replaces add_lock_to_cache(),
search_innodb_locks(), and locks_row_eq_lock(). Avoid duplicated
traversal of the hash table.

Reviewed by: Debarun Banerjee
2024-11-12 12:17:34 +02:00
Thirunarayanan Balathandayuthapani
074831ec61 Merge branch 10.5 into 10.6 2024-11-08 18:17:15 +05:30
Thirunarayanan Balathandayuthapani
7afee25b08 MDEV-35115 Inconsistent Replace behaviour when multiple unique index exist
- The REPLACE statement fails with a duplicate key error when multiple
unique key conflicts happen. The reason is that the server expects the
InnoDB engine to store the conflicting keys in ascending order,
but InnoDB doesn't store the conflicting keys
in ascending order.

Fix:
===
- Enable HA_DUPLICATE_KEY_NOT_IN_ORDER for InnoDB storage engine
only when unique index order is different in .frm and innodb dictionary.
2024-11-08 16:46:41 +05:30
Thirunarayanan Balathandayuthapani
98d57719e2 MDEV-32667 dict_stats_save_index_stat() reads uninitialized index->stats_error_printed
Problem:
========
- dict_stats_table_clone_create() does not initialize the
flag stats_error_printed in either dict_table_t or dict_index_t.
Because dict_stats_save_index_stat() is operating on a copy
of a dict_index_t object, it appears that
dict_index_t::stats_error_printed will always be false
for actual metadata objects, and uninitialized in
dict_stats_save_index_stat().

Solution:
=========
dict_stats_table_clone_create(): Assign stats_error_printed
for table and index while copying the statistics
2024-11-08 11:35:19 +05:30
Vladislav Vaintroub
7a62b029b3 post-merge cleanup - remove copy&paste code in fil_node_t::find_metadata 2024-11-05 21:44:35 +01:00
Oleksandr Byelkin
f2bb2ab58c Merge branch '10.6' into mariadb-10.6.20 2024-11-04 07:40:45 +01:00
Vlad Lesin
3734ff7c7e MDEV-34690 lock_rec_unlock_unmodified() causes deadlock
Post-push fix: row_vers_impl_x_locked() must be invoked with lock_sys
unlatched. The corresponding assertion was removed in MDEV-34466 and
was not restored in MDEV-34690. This fix restores it.
2024-10-31 12:16:21 +03:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Marko Mäkelä
decdd4bf49 MDEV-29015/MDEV-29260/MDEV-34938: os_file_get_size() WSL work-around
When MariaDB Server is run in a container under
Windows Subsystem for Linux, the fstat(2) system calls that InnoDB
invokes in os_file_set_size() or os_file_get_size() are causing a
failure in case the file had been renamed in the past while the file
handle was open. This affects at least ALTER TABLE and OPTIMIZE TABLE.

os_file_get_size(): Invoke lseek(2) instead of fstat(2). We do not mind
if the file pointer is moving to the end of the file, because InnoDB
exclusively invokes positioned reads and writes, or in some rare cases,
appends to an existing file.

os_file_set_size(): Invoke os_file_get_size() instead of fstat(2).
Define the POSIX and Windows versions separately. Formerly, the
Windows version was called os_file_change_size_win32().

fil_node_t::read_page0(): Use os_file_get_size() to determine the
size, and do not crash on error.

fil_node_t::read_metadata(): Remove the non-Windows stat* parameter
and always invoke fstat(2) outside Windows, but do tolerate errors.
Because fstat(2) is more likely to fail than lseek(2), and this is
not time critical code, we can afford the extra lseek(2) system call.

Reviewed by: Vladislav Vaintroub
2024-10-24 16:08:56 +03:00
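Determining a file size with lseek(2) rather than fstat(2) can be sketched as below. The helper name `file_size_by_seek()` is hypothetical, and the sketch covers only the POSIX side.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <unistd.h>

// Return the file size in bytes, or -1 on error.
// Side effect: the file offset is left at the end of the file, which is
// harmless only if the caller uses positioned I/O (pread/pwrite) or appends.
inline off_t file_size_by_seek(int fd)
{
  return lseek(fd, 0, SEEK_END);
}
```

The side effect on the file offset is the trade-off the commit message accepts: it does not matter to a caller that never relies on the implicit file position.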
Vlad Lesin
8c7786e7d5 MDEV-34690 lock_rec_unlock_unmodified() causes deadlock
lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock()
or under a combination of lock_sys.rd_lock() + record locks hash table
cell latch. It also requests page latch to check if locked records were
changed by the current transaction or not.

Usually InnoDB requests a page latch to find a certain record on the
page, and then requests lock_sys and/or the record lock hash cell latch to
request a record lock. lock_rec_unlock_unmodified() requests the latches
in the opposite order, which causes deadlocks. One possible
scenario for the deadlock is the following:

thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table
           cell latch, the latch is acquired;
thread 2 - purge thread acquires page latch and tries to remove
           delete-marked record, it invokes lock_update_delete(), which
           requests locks hash table cell latch, held by thread 1;
thread 1 - requests page latch, held by thread 2.

To fix it we need to release lock_sys.latch and/or lock hash cell latch,
acquire page latch and re-acquire lock_sys related latches.

When lock_sys.latch and/or lock hash cell latch are released in
lock_release_on_prepare() and lock_release_on_prepare_try(), the page on
which the current lock is held, can be merged. In this case the bitmap
of the current lock must be cleared, and the new lock must be added to
the end of trx->lock.trx_locks list, or bitmap of already existing lock
must be changed.

The new field trx_lock_t::set_nth_bit_calls indicates if new locks
(bits in existing lock bitmaps or new lock objects) were created during
the period when lock_sys was released in trx->lock.trx_locks list
iteration loop in lock_release_on_prepare() or
lock_release_on_prepare_try(). And, if so, we traverse the list again.

The block can be freed during page merging, which causes an assertion
failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as page
get mode to it. That's why page_get_mode parameter was added to
btr_block_get() to pass BUF_GET_POSSIBLY_FREED from
lock_release_on_prepare() and lock_release_on_prepare_try() to
buf_page_get_gen().

As searching for the id of the trx that modified a secondary index record is
quite an expensive operation, restrict its usage on the master. A system variable
was added to remove the restriction in order to simplify testing. The
variable exists only in debug builds or in builds with the
-DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option, to increase the
probability of catching bugs in release builds with RQG.

Note that the code, which does primary index lookup to find out what
transaction modified secondary index record, is necessary only when
there is no primary key and no unique secondary key on replica with row
based replication, because only in this case extra X locks on unmodified
records can be set during scan phase.

Reviewed by Marko Mäkelä.
2024-10-23 12:36:17 +03:00
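The "release, acquire the page latch, re-acquire, and rescan if something changed" approach described in the MDEV-34690 message can be sketched with standard mutexes. Everything here is a hypothetical stand-in: `LockTable` plays the role of lock_sys or a hash cell latch, `changes` loosely mirrors the purpose of trx_lock_t::set_nth_bit_calls, and `inspect_page()` is not an InnoDB function.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-ins: LockTable for lock_sys or a hash cell latch,
// Page for a buffer page latch.
struct LockTable
{
  std::mutex latch;
  std::uint64_t changes = 0;  // incremented under latch on every modification
};

struct Page
{
  std::mutex latch;
  bool modified = false;
};

// The global order is "page latch first, lock table latch second".
// The caller arrives holding table_guard, so that latch is dropped and
// re-acquired after the page latch. Returns true if the lock table changed
// in the meantime, in which case the caller should restart its scan.
inline bool inspect_page(LockTable& table, Page& page,
                         std::unique_lock<std::mutex>& table_guard,
                         bool& page_modified)
{
  const std::uint64_t seen = table.changes;  // read while still latched
  table_guard.unlock();
  std::lock_guard<std::mutex> page_guard(page.latch);
  table_guard.lock();
  page_modified = page.modified;
  return table.changes != seen;
}
```

The caller still owns the lock table latch on return, so the surrounding loop can continue or restart without any further latching.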
Vlad Lesin
92180ad513 MDEV-34466 XA prepare don't release unmodified records for some cases
There is no need to exclude exclusive non-gap locks from the procedure
of locks releasing on XA PREPARE execution in
lock_release_on_prepare_try() after commit
17e59ed3aa (MDEV-33454), because
lock_rec_unlock_unmodified() should check if the record was modified
with the XA, and release the lock if it was not.

lock_release_on_prepare_try(): don't skip X-locks, let
lock_rec_unlock_unmodified() process them.

lock_sec_rec_some_has_impl(): add template parameter for not acquiring
trx_t::mutex for the case if a caller already holds the mutex, don't
crash if lock's bitmap is clean.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add new argument
to skip trx_t::mutex acquiring.

rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the
current thread already holds it.

Thanks to Andrei Elkin for finding the bug.
Reviewed by Marko Mäkelä, Debarun Banerjee.
2024-10-23 12:36:17 +03:00
Marko Mäkelä
1cad1dbde6 MDEV-35235 innodb_snapshot_isolation=ON fails to signal transaction rollback
convert_error_code_to_mysql(): Treat DB_DEADLOCK and DB_RECORD_CHANGED
in the same way, that is, signal to the SQL layer that the transaction
had been rolled back.
2024-10-23 07:55:22 +03:00
Jan Lindström
b3be3c2157 MDEV-30653 : With wsrep_mode=REPLICATE_ARIA only part of mixed-engine transactions is replicated
Replication of non-transactional engines is experimental and
uses TOI. This naturally means that if there is an open transaction
with a transactional engine, its changes will be rolled back.

Fixed by adding an error message, with a warning, if a non-transactional
engine is part of a multi-engine transaction.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-10-23 04:00:52 +02:00
Marko Mäkelä
b38edd09ff MDEV-34830 fixup: Relax an assertion
This follows up 1067046b7f
2024-10-22 11:35:33 +03:00
Marko Mäkelä
1067046b7f MDEV-34830 fixup: Relax an assertion
It is possible that recv_sys.scanned_lsn is ahead of recv_sys.recovered_lsn
by a few 512-byte log blocks in case the last mini-transaction in the log
had not been written out completely before the server was killed.
This is occasionally the case when running the test
innodb.innodb-32k-crash.
2024-10-22 09:09:11 +03:00
Marko Mäkelä
bea4adcb5a MDEV-35225 Bogus debug assertion failures in innodb.innodb-32k-crash
log_sort_flush_list(): Correct some debug assertions that had been added in
commit 0d175968d1 (MDEV-31354).
The writes of some blocks may be completed and the oldest_modification()
set to 1 at any time.

The bogus assertion failures led to occasional failures of the test
innodb.innodb-32k-crash.
2024-10-22 09:07:57 +03:00
Vladislav Vaintroub
e8db5c8760 MDEV-35171 OS_FILE_NORMAL and OS_FILE_AIO are misleading
Removed 'purpose' parameter from os_file_create() and related functions.
Always use FILE_FLAG_OVERLAPPED when opening Windows files.

No performance regression was measured, nor is there any measurable
improvement.
2024-10-21 15:31:32 +02:00
Marko Mäkelä
7701ccb72d MDEV-35149 Race condition around SET GLOBAL innodb_lru_scan_depth
A debug assertion in buf_LRU_get_free_block() could fail if
SET GLOBAL innodb_lru_scan_depth is being executed during a workload
that involves allocating buffer pool pages.

buf_pool_t::LRU_scan_depth: Replaces srv_LRU_scan_depth.

buf_pool_t::flush_neighbors: Replaces srv_flush_neighbors.

innodb_buf_pool_update<T>(): Update a parameter of buf_pool
while holding buf_pool.mutex.
2024-10-21 10:08:58 +03:00
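Updating a tunable while holding the pool mutex, as innodb_buf_pool_update<T>() is described to do, could be sketched like this. `Pool`, `pool_update()` and the initial values are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical miniature of a buffer pool with two tunable parameters.
struct Pool
{
  std::mutex mutex;
  unsigned long LRU_scan_depth = 1536;  // arbitrary initial value
  unsigned flush_neighbors = 1;         // arbitrary initial value
};

// Assign a new value to one member while holding the pool mutex, so that
// code running under the same mutex never sees the value change midway.
template <typename T>
void pool_update(Pool& pool, T Pool::*member, T value)
{
  std::lock_guard<std::mutex> guard(pool.mutex);
  pool.*member = value;
}
```

Passing a pointer to member keeps one template usable for parameters of different integer types.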
Thirunarayanan Balathandayuthapani
7f7d78bc18 MDEV-35183 ADD FULLTEXT INDEX unnecessarily DROPS FTS COMMON TABLES
- InnoDB fulltext rebuilds the FTS COMMON tables while adding a
new fulltext index. This can be optimized by not rebuilding
the FTS COMMON tables if they already exist.

Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
2024-10-21 12:27:09 +05:30
Marko Mäkelä
bb47e575de MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record
must first have been durably written.

On crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

A section of the test mariabackup.innodb_redo_overwrite
that is parsing some mariadb-backup --backup output has
been removed, because that output "redo log block is overwritten"
would often be missing in a Microsoft Windows environment
as a result of these changes.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain about the LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have led to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
in the same way on both 32-bit and 64-bit architectures.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-17 17:24:20 +03:00
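The write-ahead logging check and the "smallest usable LSN" selection described in the MDEV-34830 message can be sketched as follows. The names `PageCheck`, `check_page_lsn()` and `pick_copy()` are hypothetical, and the bounds of the valid range are passed in as plain parameters, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using lsn_t = std::uint64_t;

enum class PageCheck { VALID, FUTURE_LSN };

// Write-ahead logging: a page may not carry an LSN newer than the end of
// the durably written log.
inline PageCheck check_page_lsn(lsn_t page_lsn, lsn_t log_end_lsn)
{
  return page_lsn > log_end_lsn ? PageCheck::FUTURE_LSN : PageCheck::VALID;
}

// Among several copies of a page, pick the smallest LSN that lies in
// [min_lsn, max_lsn]; copies outside that range are ignored.
// Returns false if no copy qualifies, leaving chosen untouched.
inline bool pick_copy(const std::vector<lsn_t>& copies,
                      lsn_t min_lsn, lsn_t max_lsn, lsn_t& chosen)
{
  bool found = false;
  for (lsn_t lsn : copies)
    if (lsn >= min_lsn && lsn <= max_lsn && (!found || lsn < chosen))
    {
      chosen = lsn;
      found = true;
    }
  return found;
}
```

Treating a future LSN as a distinct outcome lets the caller report it as a hard error instead of folding it into a generic corruption message.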
Marko Mäkelä
740519e15a MDEV-35125: Unnecessary buf_pool.page_hash lookups
dict_index_t::clear(), btr_drop_temporary_table(): Make use of the
root page guess if it is available.

btr_read_autoinc(): Invoke btr_root_block_get() to access the root page.

btr_blob_free(): Retain a buffer-fix on the page across mtr_t::commit()
in order to avoid a buf_pool.page_hash lookup.

dict_load_table_one(): Remove a redundant check for page id. It was
already validated in buf_page_t::read_complete().

trx_t::apply_log(): Make use of buf_pool.page_fix() to avoid some
mtr_t related overhead.

Reviewed by: Thirunarayanan Balathandayuthapani
2024-10-17 09:10:45 +03:00
Vladislav Vaintroub
c1fc59277a MDEV-34929 page-compressed tables do not work on Windows
Remove the workaround for MDEV-13941. It served for 5 years, and all affected
pre-release 10.2 installations should have been fixed in the meantime.

Apparently InnoDB is using the is_sparse parameter in os_file_set_size()
inconsistently, and it now passes is_sparse=false during the first file
extension. With the MDEV-13941 workaround in place, that would unsparse
the file, which makes compression stop working altogether.
2024-10-16 16:02:13 +02:00
Marko Mäkelä
a4d2cc931d MDEV-35174 Possible hang in trx_undo_prev_version()
In commit b7b9f3ce82 (MDEV-34515) we
accidentally made the InnoDB MVCC code acquire a shared
purge_sys.latch twice. Recursive shared latch acquisition may cause a
deadlock of InnoDB threads if another thread in between will start waiting
for an exclusive latch.

purge_sys_t::latch: In debug builds, use srw_lock_debug instead of
srw_spin_lock, so that bugs like this will result in debug assertion
failures.

trx_undo_report_row_operation(): Pass the view_guard to
trx_undo_prev_version() and the rest of the arguments in the same
order, so that the work to permute argument registers is minimized.
2024-10-16 14:37:44 +03:00
Thirunarayanan Balathandayuthapani
6aaae4c03b MDEV-35122 Incorrect NULL value handling for instantly dropped BLOB columns
Problem:
=======
- A redundant table fails to accept inserts after an
instant drop of a BLOB column. Instant drop column only marks
the column as hidden, and a subsequent insert statement tries
to insert a NULL value for the dropped BLOB column and returns
the fixed length of the BLOB type as 65535. This leads to a
"row size too large" error.

Fix:
====
For a redundant table, if the non-fixed-length dropped column can be NULL,
then set the length of the field type to 0.
2024-10-15 12:04:37 +05:30
Yuchen Pei
cd5577ba4a Merge branch '10.5' into 10.6 2024-10-15 16:00:44 +11:00
Thirunarayanan Balathandayuthapani
5777d9f282 MDEV-35116 InnoDB fails to set error index for HA_ERR_NULL_IN_SPATIAL
- InnoDB fails to set the index information or index number
for the spatial index error HA_ERR_NULL_IN_SPATIAL.

row_build_spatial_index_key(): Initialize the tmp_mbr array completely.

check_if_supported_inplace_alter(): Fix the spelling mistake of alter
2024-10-14 14:28:24 +05:30
Oleksandr Byelkin
1d0e94c55f Merge branch '10.5' into 10.6 2024-10-09 08:38:48 +02:00
Thirunarayanan Balathandayuthapani
23820f1d79 MDEV-34392 Inplace algorithm violates the foreign key constraint
- Fix the compilation issue for compilers older than gcc-6

Reviewed-by : Marko Mäkelä <marko.makela@mariadb.com>
2024-10-09 10:14:29 +05:30
Thirunarayanan Balathandayuthapani
65418ca9ad MDEV-34392 Inplace algorithm violates the foreign key constraint
- Fix the compilation error in gcc-5
2024-10-08 16:43:57 +05:30
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Marko Mäkelä
cc70ca7eab MDEV-35059 ALTER TABLE...IMPORT TABLESPACE with FULLTEXT SEARCH may corrupt the adaptive hash index
build_fts_hidden_table(): Correct a mistake that had been made in
commit 903ae30069 (MDEV-30655).
2024-10-02 11:09:31 +03:00
Sergei Golubchik
b1bbdbab9e cleanup: remove redundant if()
likely, a result of auto-merge of two fixes in different versions
2024-10-01 18:29:11 +02:00
Marko Mäkelä
464055fe65 MDEV-34078 Memory leak in InnoDB purge with 32-column PRIMARY KEY
row_purge_reset_trx_id(): Reserve large enough offsets for accommodating
the maximum width PRIMARY KEY followed by DB_TRX_ID,DB_ROLL_PTR.

Reviewed by: Thirunarayanan Balathandayuthapani
2024-10-01 18:35:39 +03:00
Marko Mäkelä
a298dfb84c MDEV-35053 Crash in purge_sys_t::iterator::free_history_rseg()
purge_sys_t::get_page(): Avoid accessing a freed reference to pages[id]
after pages.erase(id).  This heap-use-after-free would sometimes be
caught by AddressSanitizer.

purge_sys_t::iterator::free_history_rseg(): Do not crash if undo=nullptr
(the database is corrupted).

Reviewed by: Debarun Banerjee
2024-10-01 15:03:04 +03:00
Marko Mäkelä
2d031f4a71 MDEV-34973 fixup for POWER,s390x
xtest(): Correct the declaration.
2024-10-01 13:29:59 +03:00
Max Kellermann
6715e4dfe1 MDEV-34973: innobase/dict0dict: add noexcept to lock/unlock methods
Another chance for cutting back overhead due to C++ exceptions being
enabled; the `dict_sys_t` class is a good candidate because its
locking methods are called frequently.

Binary size reduction this time:

    text	  data	   bss	   dec	   hex	filename
 24448622	2436488	9473537	36358647	22ac9f7	build/release/sql/mariadbd
 24448474	2436488	9473601	36358563	22ac9a3	build/release/sql/mariadbd
2024-10-01 09:53:16 +03:00
Max Kellermann
813123e3e0 MDEV-34973: innobase/lock0lock: add noexcept
MariaDB is compiled with C++ exceptions enabled, and that disallows
some optimizations (e.g. the stack must always be unwinding-safe).  By
adding `noexcept` to functions that are guaranteed to never throw,
some of these optimizations can be regained.  Low-level locking
functions that are called often are a good candidate for this.

This shrinks the executable a bit (tested with GCC 14 on aarch64):

    text	  data	   bss	   dec	   hex	filename
 24448910	2436488	9473185	36358583	22ac9b7	build/release/sql/mariadbd
 24448622	2436488	9473537	36358647	22ac9f7	build/release/sql/mariadbd
2024-10-01 09:53:16 +03:00
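The effect of marking lock and unlock methods `noexcept` can be illustrated with a small wrapper. `Latch` is a hypothetical class written for this sketch, not the dict_sys_t or lock_sys interface.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical latch wrapper in the spirit of the methods discussed above.
class Latch
{
  std::mutex mutex_;
  bool held_ = false;  // bookkeeping for the single-threaded demonstration

public:
  // Declared noexcept: callers need no unwinding code around these calls.
  // If the underlying mutex ever did throw, std::terminate() would be called.
  void lock() noexcept { mutex_.lock(); held_ = true; }
  void unlock() noexcept { held_ = false; mutex_.unlock(); }
  bool is_held() const noexcept { return held_; }
};

// Compile-time confirmation that the promise is part of the signature.
static_assert(noexcept(std::declval<Latch&>().lock()),
              "lock() must be noexcept");
static_assert(noexcept(std::declval<Latch&>().unlock()),
              "unlock() must be noexcept");
```

The size tables in the two commit messages are measurements of the real binary; this sketch only shows where the specifier goes and how to verify it at compile time.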
Thirunarayanan Balathandayuthapani
cc810e64d4 MDEV-34392 Inplace algorithm violates the foreign key constraint
Don't allow changing the referencing key column from NULL to NOT NULL
when

 1) Foreign key constraint type is ON UPDATE SET NULL
 2) Foreign key constraint type is ON DELETE SET NULL
 3) Foreign key constraint type is UPDATE CASCADE and referenced
 column declared as NULL

Don't allow changing the referenced key column from NOT NULL to NULL
when the foreign key constraint type is UPDATE CASCADE
and the referencing key columns don't allow NULL values

get_foreign_key_info(): InnoDB sends the information about
nullability of the foreign key fields and referenced key fields.

fk_check_column_changes(): Enforce the above rules for COPY
algorithm

innobase_check_foreign_drop_col(): Checks whether the dropped
column exists in existing foreign key relation

innobase_check_foreign_low() : Enforce the above rules for
INPLACE algorithm

dict_foreign_t::check_fk_constraint_valid(): This is used
by CREATE TABLE statement to check nullability for foreign
key relation.
2024-10-01 09:41:56 +05:30
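The nullability rules listed in the MDEV-34392 message can be written down as two predicates. This is a sketch of the rules as stated, not of the actual checks in fk_check_column_changes() or innobase_check_foreign_low(); `FkAction`, `ForeignKey` and both function names are hypothetical.

```cpp
#include <cassert>

enum class FkAction { NO_ACTION, CASCADE, SET_NULL };

// Hypothetical summary of one foreign key constraint.
struct ForeignKey
{
  FkAction on_update;
  FkAction on_delete;
  bool referenced_nullable;   // referenced (parent) column allows NULL
  bool referencing_nullable;  // referencing (child) column allows NULL
};

// May the referencing column be changed from NULL to NOT NULL?
inline bool may_make_referencing_not_null(const ForeignKey& fk)
{
  if (fk.on_update == FkAction::SET_NULL || fk.on_delete == FkAction::SET_NULL)
    return false;  // rules 1 and 2
  if (fk.on_update == FkAction::CASCADE && fk.referenced_nullable)
    return false;  // rule 3
  return true;
}

// May the referenced column be changed from NOT NULL to NULL?
inline bool may_make_referenced_nullable(const ForeignKey& fk)
{
  return !(fk.on_update == FkAction::CASCADE && !fk.referencing_nullable);
}
```

Each rule blocks a change that would let a cascaded or SET NULL action try to store NULL in a column that no longer accepts it.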
Max Kellermann
45298b730b sql/handler: referenced_by_foreign_key() returns bool
The method was declared to return an unsigned integer, but it is
really a boolean (and used as such by all callers).

A secondary change is the addition of "const" and "noexcept" to this
method.

In ha_mroonga.cpp, I also added "inline" to the two helper methods of
referenced_by_foreign_key().  This allows the compiler to flatten the
method.
2024-09-30 16:33:25 +03:00
Marko Mäkelä
d28ac3f82d MDEV-34207: ALTER TABLE...STATS_PERSISTENT=0 fails to drop statistics
commit_try_norebuild(): If the STATS_PERSISTENT attribute of the table
is being changed to disabled, drop the persistent statistics of the table.
2024-09-30 15:27:38 +03:00
Denis Protivensky
231900e5bb MDEV-34836: TOI on parent table must BF abort SR in progress on a child
Applied SR transaction on the child table was not BF aborted by TOI running
on the parent table for several reasons:

Although SR correctly collected FK-referenced keys to parent, TOI in Galera
disregards common certification index and simply sets itself to depend on
the latest certified write set seqno.

Since this write set was the fragment of SR transaction, TOI was allowed to
run in parallel with SR presuming it would BF abort the latter.

At the same time, DML transactions in the server don't grab MDL locks on
FK-referenced tables, thus parent table wasn't protected by an MDL lock from
SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction.

In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not
enough to trigger MDL conflict in Galera.

InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to
the fact that it was believed MDL locking should always produce conflicts correctly.

The fix brings conflict resolution rules similar to MDL-level checks to InnoDB,
thus accounting for the problematic case.

Apart from that, wsrep_thd_is_SR() is patched to return true only for executing
SR transactions. It should be safe as any other SR state is either the same as
for any single write set (thus making the two logically equivalent), or it reflects
an SR transaction as being aborting or prepared, which is handled separately in
BF-aborting logic, and for regular execution path it should not matter at all.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-24 11:14:01 +02:00
Marko Mäkelä
638c62acac MDEV-34983: Remove x86 asm from InnoDB
Starting with GCC 7 and clang 15, single-bit operations such as
fetch_or(1) & 1 are translated into 80386 instructions such as
LOCK BTS, instead of using the generic translation pattern
of emitting a loop around LOCK CMPXCHG.

Given that the oldest currently supported GNU/Linux distributions
ship GCC 7, and that older versions of GCC are out of support,
let us remove some work-arounds that are not strictly necessary.
If someone compiles the code using an older compiler, it will work
but possibly less efficiently.

srw_mutex_impl::HOLDER: Changed from 1U<<31 to 1 in order to
work around https://github.com/llvm/llvm-project/issues/37322
which is specific to setting the most significant bit.

srw_mutex_impl::WAITER: A multiplier of waiting requests.
This used to be 1, which would now collide with HOLDER.

fil_space_t::set_stopping(): Remove this unused function.

In MSVC we need _interlockedbittestandset() for LOCK BTS.
2024-09-23 12:51:27 +03:00
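A lock word that keeps the holder flag in the least significant bit and counts waiters in multiples of a larger constant can be sketched with std::atomic. `LockWord` is hypothetical, and the value 2 for WAITER is an assumption; the message only says that WAITER is a multiplier that must not collide with HOLDER.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical lock word layout: bit 0 marks the holder, and each waiter
// adds WAITER, so the two fields never collide.
class LockWord
{
  std::atomic<std::uint32_t> word_{0};

public:
  static constexpr std::uint32_t HOLDER = 1;
  static constexpr std::uint32_t WAITER = 2;  // assumed value for this sketch

  // Set the HOLDER bit; succeed if it was previously clear.
  // A single-bit fetch_or like this is what compilers can turn into
  // LOCK BTS on x86 instead of a compare-and-swap loop.
  bool try_lock() noexcept
  {
    return !(word_.fetch_or(HOLDER, std::memory_order_acquire) & HOLDER);
  }

  void unlock() noexcept
  {
    word_.fetch_and(~HOLDER, std::memory_order_release);
  }

  void add_waiter() noexcept
  {
    word_.fetch_add(WAITER, std::memory_order_relaxed);
  }

  void remove_waiter() noexcept
  {
    word_.fetch_sub(WAITER, std::memory_order_relaxed);
  }

  std::uint32_t waiters() const noexcept
  {
    return word_.load(std::memory_order_relaxed) / WAITER;
  }
};
```

Dividing by WAITER discards the HOLDER bit, so the waiter count reads the same whether or not the lock is currently held.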