Commit graph

9997 commits

Author SHA1 Message Date
Marko Mäkelä
986d39c3f5 MDEV-29694 follow-up: Simplify mlog_init_t
The Boolean flag mlog_init_t::init::created was only needed by
mark_ibuf_exist(), which commit f27e9c8947
removed. We only need to store the page initialization LSN in the map.
2023-01-25 10:18:12 +02:00
Marko Mäkelä
75c78316d6 Merge 10.11 into 11.0 2023-01-25 10:17:54 +02:00
Marko Mäkelä
a30d4250c2 MDEV-26790 InnoDB read-ahead may cause page writes
buf_LRU_get_free_block(): Replace the Boolean parameter with a
ternary parameter, so that have_no_mutex_soft can be specified to
reduce the chances of initiating page eviction flushing in read-ahead.

buf_read_acquire(): Invoke buf_LRU_get_free_block(have_no_mutex_soft)
and check in each caller for a nullptr return value.
2023-01-24 15:23:01 +02:00
Marko Mäkelä
d6aed21621 MDEV-30216 Read-ahead unnecessarily allocates and frees pages when a page is in the buffer pool
buf_pool_t::page_hash_contains(): Check if a page is cached.

buf_read_ahead_random(), buf_read_page_background(),
buf_read_ahead_linear(): Before invoking buf_read_page_low(),
preallocate a buffer page for the read request.

buf_read_page(), buf_page_init_for_read(), buf_read_page_low():
Add a parameter for the buf_pool.page_hash chain, to avoid duplicated
computations.

buf_page_t::read_complete(): Only attempt recovery if an uncompressed
page frame has been allocated.

buf_page_init_for_read(): Before trying to acquire buf_pool.mutex, acquire
an exclusive buf_pool.page_hash latch and check if the page is already
located in the buffer pool. If the buf_pool.mutex is not immediately
available, release both latches and acquire them in the correct order,
and then recheck if the page is already in the buffer pool. This should
hopefully reduce some contention on buf_pool.mutex.
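
The back-off described for buf_page_init_for_read() can be sketched as follows. This is only an illustration under stated assumptions: ToyPool, pool_mutex, hash_latch and init_for_read() are hypothetical names, plain std::mutex objects stand in for the InnoDB latches, and the canonical order is assumed to be "pool mutex first, hash latch second".

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_set>

// Hypothetical stand-in for a buffer pool; not the InnoDB data structures.
struct ToyPool {
  std::mutex pool_mutex;  // plays the role of buf_pool.mutex (ordered first)
  std::mutex hash_latch;  // plays the role of a page_hash latch (ordered second)
  std::unordered_set<std::uint64_t> cached;  // identifiers of cached pages

  // Returns true if the page was newly registered, false if already present.
  bool init_for_read(std::uint64_t page_id) {
    std::unique_lock<std::mutex> hash(hash_latch);
    if (cached.count(page_id))
      return false;  // cheap exit without touching pool_mutex

    std::unique_lock<std::mutex> pool(pool_mutex, std::try_to_lock);
    if (!pool.owns_lock()) {
      // Waiting here would invert the lock order, so back off and
      // reacquire in the canonical order: pool_mutex, then hash_latch.
      hash.unlock();
      pool.lock();
      hash.lock();
      if (cached.count(page_id))
        return false;  // another thread added the page while we waited
    }
    cached.insert(page_id);
    return true;
  }
};
```

The recheck after the second acquisition matters: between releasing and reacquiring the hash latch, another thread may have registered the same page.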

buf_page_init_for_read(), buf_read_page_low(): Input the "recovery needed"
flag in the least significant bit of zip_size.
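
A minimal sketch of carrying a flag in the least significant bit of a size value. The helper names are hypothetical, and the sketch assumes that zip_size is always even (0 or a power of two), so that bit 0 is otherwise unused; it is not the actual InnoDB code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant: the flag occupies bit 0 of the packed value.
constexpr std::size_t RECOVERY_NEEDED = 1;

// Combine an even zip_size with the Boolean flag.
inline std::size_t pack_zip_size(std::size_t zip_size, bool recovery_needed) {
  assert((zip_size & RECOVERY_NEEDED) == 0);  // assumption: bit 0 is free
  return zip_size | (recovery_needed ? RECOVERY_NEEDED : 0);
}

// Recover the original zip_size by masking out the flag bit.
inline std::size_t unpack_zip_size(std::size_t packed) {
  return packed & ~RECOVERY_NEEDED;
}

// Recover the flag.
inline bool unpack_recovery_needed(std::size_t packed) {
  return (packed & RECOVERY_NEEDED) != 0;
}
```

Packing the flag this way avoids adding a separate parameter to every function on the call path.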

buf_read_acquire(), buf_read_release(): Interface for allocating and
freeing buffer pages for reading.

buf_read_recv_pages(): Set the flag that recovery is needed.
Other ROW_FORMAT=COMPRESSED reads during recovery
will not need any recovery.
2023-01-24 15:23:01 +02:00
Marko Mäkelä
10635c2833 Merge 10.10 into 10.11 2023-01-24 15:17:39 +02:00
Marko Mäkelä
51fc6b91d2 Merge 10.9 into 10.10 2023-01-24 15:17:10 +02:00
Marko Mäkelä
4d9fe4032b Merge 10.8 into 10.9 2023-01-24 14:59:42 +02:00
Marko Mäkelä
fa543a0f62 Merge 10.7 into 10.8 2023-01-24 14:52:25 +02:00
Marko Mäkelä
cea50896d2 Merge 10.6 into 10.7 2023-01-24 14:35:36 +02:00
Marko Mäkelä
de4030e4d4 MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT
This also fixes part of MDEV-29835 Partial server freeze
which is caused by violations of the latching order that was
defined in https://dev.mysql.com/worklog/task/?id=6326
(WL#6326: InnoDB: fix index->lock contention). Unless the
current thread is holding an exclusive dict_index_t::lock,
it must acquire page latches in a strict parent-to-child,
left-to-right order. Not all cases of MDEV-29835 are fixed yet.
Failure to follow the correct latching order will cause deadlocks
of threads due to lock order inversion.

As part of these changes, the BTR_MODIFY_TREE mode is modified
so that an Update latch (U a.k.a. SX) will be acquired on the
root page, and eXclusive latches (X) will be acquired on all pages
leading to the leaf page, as well as any left and right siblings
of the pages along the path. The DEBUG_SYNC test innodb.innodb_wl6326
will be removed, because at the time the DEBUG_SYNC point is hit,
the thread is actually holding several page latches that will be
blocking a concurrent SELECT statement.

We also remove double bookkeeping that was caused by excessive
information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
store information of latched pages, and ensure that
mtr_memo_slot_t::object is never a null pointer.
The tree_blocks[] and tree_savepoints[] were redundant.

buf_page_get_low(): If innodb_change_buffering_debug=1, to avoid
a hang, do not try to evict blocks if we are holding a latch on
a modified page. The test innodb.innodb-change-buffer-recovery
will be removed, because change buffering may no longer be forced
by debug injection when the change buffer comprises multiple pages.
Remove a debug assertion that could fail when
innodb_change_buffering_debug=1 fails to evict a page.
For other cases, the assertion is redundant, because we already
checked that right after the got_block: label. The test
innodb.innodb-change-buffering-recovery will be removed, because
due to this change, we will be unable to evict the desired page.

mtr_t::lock_register(): Register a change of a page latch
on an unmodified buffer-fixed block.

mtr_t::x_latch_at_savepoint(), mtr_t::sx_latch_at_savepoint():
Replaced by the use of mtr_t::upgrade_buffer_fix(), which now
also handles RW_S_LATCH.

mtr_t::set_modified(): For temporary tables, invoke
buf_page_t::set_modified() here and not in mtr_t::commit().
We will never set the MTR_MEMO_MODIFY flag on anything other than
persistent data pages, nor set mtr_t::m_modifications when
temporary data pages are modified.

mtr_t::commit(): Only invoke the buf_flush_note_modification() loop
if persistent data pages were modified.

mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
This avoids many redundant entries in mtr_t::m_memo, as well as
redundant calls to buf_page_get_gen() for blocks that had already
been looked up in a mini-transaction.

btr_get_latched_root(): Return a pointer to an already latched root page.
This replaces btr_root_block_get() in cases where the mini-transaction
has already latched the root page.

btr_page_get_parent(): Fetch a parent page that was already latched
in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
If needed, upgrade the root page U latch to X.
This avoids bloating mtr_t::m_memo as well as performing redundant
buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().

btr_cur_search_to_nth_level(): This will only be used for non-leaf
(level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
removed altogether, or retained for the case of
CHECK TABLE without QUICK.

btr_cur_t::left_block: Remove. btr_pcur_move_backward_from_page()
can retrieve the left sibling from the end of mtr_t::m_memo.

btr_cur_t::open_leaf(): Some clean-up.

btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
for searches to level=0 (the leaf level). We will never release
parent page latches before acquiring leaf page latches. If we need to
temporarily release the level=1 page latch in the BTR_SEARCH_PREV or
BTR_MODIFY_PREV latch_mode, we will reposition the cursor on the
child node pointer so that we will land on the correct leaf page.

btr_cur_t::pessimistic_search_leaf(): Implement new BTR_MODIFY_TREE
latching logic in the case that page splits or merges will be needed.
The parent pages (and their siblings) should already be latched on
the first dive to the leaf and be present in mtr_t::m_memo; there
should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost
suffices; it must be revised in MDEV-29835 and work-arounds removed
for cases where mtr_t::get_already_latched() fails to find a block.

rtr_search_to_nth_level(): A SPATIAL INDEX version of
btr_search_to_nth_level() that can search to any level
(including the leaf level).

rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
rtr_search_to_nth_level().

rtr_search(): Replaces rtr_pcur_open().

rtr_latch_leaves(): Replaces btr_cur_latch_leaves(). Note that unlike
in the B-tree code, there is no error handling in case the sibling
pages are corrupted.

rtr_cur_restore_position(): Remove an unused constant parameter.

btr_pcur_open_on_user_rec(): Remove the constant parameter
mode=PAGE_CUR_GE.

row_ins_clust_index_entry_low(): Use a new
mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.

BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.

BTR_CONT_MODIFY_TREE: Note that this is only used by
rtr_search_to_nth_level().

btr_pcur_optimistic_latch_leaves(): Replaces
btr_cur_optimistic_latch_leaves().

ibuf_delete_rec(): Acquire exclusive ibuf.index->lock in order
to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).

btr_blob_log_check_t(): Acquire a U latch on the root page,
so that btr_page_alloc() in btr_store_big_rec_extern_fields()
will avoid a deadlock.

btr_store_big_rec_extern_fields(): Assert that the root page latch
is being held.

Tested by: Matthias Leich
Reviewed by: Vladislav Lesin
2023-01-24 14:09:21 +02:00
Denis Protivensky
39f4674599 MDEV-24623 Replicate bulk insert as table-level exclusive key
- introduce table key construction function in wsrep service interface
- don't add row keys when replicating bulk insert
- don't start bulk insert on applier or when transaction is not active
- don't start bulk insert on system versioned tables
- implement actual bulk insert table-level key replication

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2023-01-24 11:54:25 +02:00
Thirunarayanan Balathandayuthapani
ef6b3806bb MDEV-30393 InnoDB: Assertion failure in dict0dict.cc upon ADD FULLTEXT INDEX
Problem:
========
- InnoDB fails to remove the newly created table or index from
the data dictionary and table cache if the alter fails in the commit phase

Solution:
========
- InnoDB should restart the transaction to remove the newly
created table and index when it fails in the commit phase of an alter
operation. innodb_fts.misc_debug tests the scenario with the
help of the debug point "stats_lock_fail"
2023-01-24 13:13:52 +05:30
Marko Mäkelä
aafe85ecb1 MDEV-30447: use of undeclared identifier O_DIRECT
In commit 24648768b4, some use of
O_DIRECT was added without a proper #ifdef guard. That broke the
compilation in environments that do not define O_DIRECT, such as
OpenBSD.
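
A minimal sketch of such a guard, assuming a POSIX-like <fcntl.h>. The function name direct_io_flags() is hypothetical and is not taken from the MariaDB source.

```cpp
#include <cassert>
#include <fcntl.h>

// Returns the extra open(2) flag that requests unbuffered I/O, or 0 when
// the platform's <fcntl.h> does not define O_DIRECT.
inline int direct_io_flags() {
#ifdef O_DIRECT
  return O_DIRECT;
#else
  return 0;  // O_DIRECT is unavailable: fall back to ordinary buffered I/O
#endif
}
```

A caller could then write open(path, O_RDWR | direct_io_flags()) and compile unchanged on platforms with and without O_DIRECT.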
2023-01-24 09:03:06 +02:00
Marko Mäkelä
e41fb3697c Revert "MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT"
This reverts commit f9cac8d2cb
which was accidentally pushed prematurely.
2023-01-23 14:52:49 +02:00
Marko Mäkelä
851c56771e Merge 10.5 into 10.6 2023-01-23 13:15:41 +02:00
Thirunarayanan Balathandayuthapani
647a7232ff MDEV-30438 innodb.undo_truncate,4k fails when innodb-immediate-scrub-data-uncompressed is enabled
- InnoDB fails to clear the freed ranges during truncation of an InnoDB
undo log tablespace. During shutdown, InnoDB flushes the freed page
ranges and reports an out-of-bounds error.

mtr_t::commit_shrink(): clear the freed ranges while doing undo
tablespace truncation
2023-01-23 09:55:49 +05:30
Marko Mäkelä
f9cac8d2cb MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT
This also fixes part of MDEV-29835 Partial server freeze
which is caused by violations of the latching order that was
defined in https://dev.mysql.com/worklog/task/?id=6326
(WL#6326: InnoDB: fix index->lock contention). Unless the
current thread is holding an exclusive dict_index_t::lock,
it must acquire page latches in a strict parent-to-child,
left-to-right order. Not all cases are fixed yet. Failure to
follow the correct latching order will cause deadlocks of threads
due to lock order inversion.

As part of these changes, the BTR_MODIFY_TREE mode is modified
so that an Update latch (U a.k.a. SX) will be acquired on the
root page, and eXclusive latches (X) will be acquired on all pages
leading to the leaf page, as well as any left and right siblings
of the pages along the path. The test innodb.innodb_wl6326
will be removed, because at the time the DEBUG_SYNC point is hit,
the thread is actually holding several page latches that will be
blocking a concurrent SELECT statement.

We also remove double bookkeeping that was caused by excessive
information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
store information of latched pages, and ensure that
mtr_memo_slot_t::object is never a null pointer.
The tree_blocks[] and tree_savepoints[] were redundant.

mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
This avoids many redundant entries in mtr_t::m_memo, as well as
redundant calls to buf_page_get_gen() for blocks that had already
been looked up in a mini-transaction.

btr_get_latched_root(): Return a pointer to an already latched root page.
This replaces btr_root_block_get() in cases where the mini-transaction
has already latched the root page.

btr_page_get_parent(): Fetch a parent page that was already latched
in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
If needed, upgrade the root page U latch to X.
This avoids bloating mtr_t::m_memo as well as redundant
buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().

btr_cur_search_to_nth_level(): This will only be used for non-leaf
(level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
removed altogether, or retained for the case of
CHECK TABLE without QUICK.

btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
for searches to level=0 (the leaf level).

btr_cur_t::pessimistic_search_leaf(): Implement the new
BTR_MODIFY_TREE latching logic in the case that page splits
or merges will be needed. The parent pages (and their siblings)
should already be latched on the first dive to the leaf and be
present in mtr_t::m_memo; there should be no need for
BTR_CONT_MODIFY_TREE. This pre-latching almost suffices;
MDEV-29835 will have to revise it and remove work-arounds where
mtr_t::get_already_latched() fails to find a block.

rtr_search_to_nth_level(): A SPATIAL INDEX version of
btr_search_to_nth_level() that can search to any level
(including the leaf level).

rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
rtr_search_to_nth_level().

rtr_search(): Replaces rtr_pcur_open().

rtr_cur_restore_position(): Remove an unused constant parameter.

btr_pcur_open_on_user_rec(): Remove the constant parameter
mode=PAGE_CUR_GE.

btr_cur_latch_leaves(): Update a pre-existing mtr_t::m_memo entry
for the current leaf page.

row_ins_clust_index_entry_low(): Use a new
mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.

btr_cur_t::open_leaf(): Some clean-up.

mtr_t::lock_register(): Register a page latch on a buffer-fixed block.

BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.

BTR_CONT_MODIFY_TREE: Note that this is only used by
rtr_search_to_nth_level().

btr_pcur_optimistic_latch_leaves(): Replaces
btr_cur_optimistic_latch_leaves().

ibuf_delete_rec(): Acquire ibuf.index->lock.u_lock() in order
to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).

Tested by: Matthias Leich
2023-01-19 17:19:18 +02:00
Marko Mäkelä
67dc8af2a7 MDEV-30289: Implement small_vector for mtr_t::m_memo
To avoid heap memory allocation overhead for mtr_t::m_memo,
we will allocate a small number of elements statically in
mtr_t::m_memo::small. Only if that preallocated data is
insufficient, we will invoke my_alloc() or my_realloc() for
more storage. The implementation of the data structure is
inspired by llvm::SmallVector.
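
The idea can be sketched with a toy container. This is not the mtr_t::m_memo implementation: SmallVec and its members are hypothetical names, the element type is restricted to trivially copyable types, and std::malloc()/std::realloc() stand in for the allocator calls named above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <type_traits>

// Toy small vector: the first N elements live inside the object itself;
// heap memory is requested only once they are exhausted.
template <typename T, std::size_t N>
class SmallVec {
  static_assert(N > 0, "need at least one inline element");
  static_assert(std::is_trivially_copyable<T>::value,
                "elements are moved with memcpy()/realloc()");
  T inline_[N];        // preallocated storage
  T *data_ = inline_;  // points to inline_ until the first growth
  std::size_t size_ = 0, capacity_ = N;

 public:
  SmallVec() = default;
  SmallVec(const SmallVec &) = delete;
  SmallVec &operator=(const SmallVec &) = delete;
  ~SmallVec() { if (on_heap()) std::free(data_); }

  bool on_heap() const { return data_ != inline_; }
  std::size_t size() const { return size_; }
  T &operator[](std::size_t i) { assert(i < size_); return data_[i]; }

  void push_back(const T &value) {
    if (size_ == capacity_) grow();
    data_[size_++] = value;
  }

 private:
  // Double the capacity, copying out of the inline storage on first use.
  void grow() {
    const std::size_t new_capacity = capacity_ * 2;
    void *p;
    if (on_heap()) {
      p = std::realloc(data_, new_capacity * sizeof(T));
      if (!p) throw std::bad_alloc();
    } else {
      p = std::malloc(new_capacity * sizeof(T));
      if (!p) throw std::bad_alloc();
      std::memcpy(p, inline_, size_ * sizeof(T));
    }
    data_ = static_cast<T *>(p);
    capacity_ = new_capacity;
  }
};
```

As long as a container holds at most N elements, it performs no heap allocation at all, which is the overhead the commit aims to avoid.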
2023-01-19 16:10:29 +02:00
Marko Mäkelä
7fa5cce305 MDEV-30289: Remove the pointer indirection for mtr_t::m_memo 2023-01-19 16:10:18 +02:00
Oleksandr Byelkin
66bd8cd6c3 Merge branch '10.10' into 10.11 2023-01-18 16:58:28 +01:00
Oleksandr Byelkin
45087dd0b3 Merge branch '10.9' into 10.10 2023-01-18 16:45:59 +01:00
Oleksandr Byelkin
08d4968404 Merge branch '10.8' into 10.9 2023-01-18 16:39:11 +01:00
Oleksandr Byelkin
26d8485244 Merge branch '10.7' into 10.8 2023-01-18 16:37:40 +01:00
Oleksandr Byelkin
795ff0daf0 Merge branch '10.6' into 10.7 2023-01-18 16:36:13 +01:00
Marko Mäkelä
a8c5635cf1 Merge 10.5 into 10.6 2023-01-17 20:02:29 +02:00
Jan Lindström
179c283372 Merge branch 10.4 into 10.5 2023-01-14 08:25:57 +02:00
sjaakola
a44d896f98 10.4-MDEV-29684 Fixes for cluster wide write conflict resolving
If two high priority threads have a lock conflict, we look at the
order of these transactions and honor the earlier transaction.
The for_locking parameter in lock_rec_has_to_wait() has become
obsolete and is now removed from the code.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2023-01-14 07:50:04 +02:00
sjaakola
66c05326d2 MDEV-29684 Fixes for cluster wide write conflict resolving
The cluster conflict victim's THD is marked with wsrep_aborter.
THD::wsrep_aborter holds the thread ID of the high priority thread
that is currently carrying out BF aborting for this victim.

However, the BF abort operation is not always successful,
and in such a case the wsrep_aborter mark should be removed.
In the old code, this wsrep_aborter resetting did not happen,
and this could lead to a situation where the sticky wsrep_aborter
mark prevents any further attempt to BF abort this transaction.

This commit fixes this issue, and resets wsrep_aborter after
an unsuccessful BF abort attempt.
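
The fix can be illustrated with a toy model. ToyVictim, try_bf_abort() and the callback are hypothetical; the real THD and wsrep code paths differ, and 0 is assumed here to mean "no aborter".

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical victim: aborter holds the ID of the thread that is
// currently trying to abort it, or 0 if there is none.
struct ToyVictim {
  std::atomic<unsigned long> aborter{0};
};

// Try to abort the victim on behalf of my_thread_id. abort_operation
// performs the actual abort and reports whether it succeeded.
inline bool try_bf_abort(ToyVictim &victim, unsigned long my_thread_id,
                         const std::function<bool()> &abort_operation) {
  assert(my_thread_id != 0);
  unsigned long expected = 0;
  if (!victim.aborter.compare_exchange_strong(expected, my_thread_id) &&
      expected != my_thread_id)
    return false;  // another thread is already aborting this victim
  if (abort_operation())
    return true;   // the mark stays in place after a successful abort
  // The step that was missing: clear the mark after a failed attempt,
  // so that it does not block every later attempt.
  victim.aborter.store(0);
  return false;
}
```

Without the final store, the first failed attempt would leave the mark set and every later caller would be turned away.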

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2023-01-13 13:11:03 +02:00
Marko Mäkelä
44dce3b207 MDEV-29986 Set innodb_undo_tablespaces=3 by default
Starting with commit baf276e6d4 (MDEV-19229)
the parameter innodb_undo_tablespaces can be increased from its
previous default value 0 while allowing an upgrade from old databases.

We will change the default setting to innodb_undo_tablespaces=3
so that the space occupied by possible bursts of undo log records
can be reclaimed after SET GLOBAL innodb_undo_log_truncate=ON.

We will not enable innodb_undo_log_truncate by default, because it
causes some observable performance degradation.

Special thanks to Thirunarayanan Balathandayuthapani for diagnosing
and fixing a number of bugs related to this new default setting.

Tested by: Matthias Leich, Axel Schwenke, Vladislav Vaintroub
(with both values of innodb_undo_log_truncate)
2023-01-13 12:46:30 +02:00
Marko Mäkelä
d6d85c92ee Merge 10.11 into 11.0 2023-01-13 12:33:12 +02:00
Marko Mäkelä
bb3a63903e Merge 10.10 into 10.11 2023-01-13 12:22:30 +02:00
Marko Mäkelä
6ffe9ad0d4 Merge 10.9 into 10.10 2023-01-13 11:45:57 +02:00
Marko Mäkelä
5d5735c181 Merge 10.8 into 10.9 2023-01-13 11:22:29 +02:00
Marko Mäkelä
88c35781cc Merge 10.7 into 10.8 2023-01-13 11:11:04 +02:00
Marko Mäkelä
1e04cafcba Merge 10.6 into 10.7 2023-01-13 10:47:56 +02:00
Marko Mäkelä
3386b30975 Merge 10.5 into 10.6 2023-01-13 10:45:41 +02:00
Marko Mäkelä
73ecab3d26 Merge 10.4 into 10.5 2023-01-13 10:18:30 +02:00
Marko Mäkelä
71e8e4934d Merge 10.3 into 10.4 2023-01-13 09:28:25 +02:00
Nikita Malyavin
7a98d232e4 MDEV-30378 Versioned REPLACE succeeds with ON DELETE RESTRICT constraint
node->is_delete was incorrectly set to NO_DELETE for a set of operations.

In general we shouldn't rely on sql_command and look for more abstract ways
to control the behavior.

trg_event_map seems to be a suitable way. To account for replica nodes, it is ORed
with slave_fk_event_map, which stores trg_event_map when replica has
triggers disabled.
2023-01-12 21:51:48 +03:00
Marko Mäkelä
944beb9e7a MDEV-19506 Remove the global sequence DICT_HDR_ROW_ID for DB_ROW_ID
InnoDB tables that lack a primary key (and any UNIQUE INDEX all of
whose columns are NOT NULL) will use an internally generated index,
called GEN_CLUST_INDEX(DB_ROW_ID) in the InnoDB data dictionary,
and hidden from the SQL layer.

The 48-bit (6-byte) DB_ROW_ID is being assigned from a
global sequence that is persisted in the DICT_HDR page.

There is absolutely no reason for the DB_ROW_ID to be globally
unique across all InnoDB tables.

A downgrade to earlier versions will be prevented by the file format
change related to removing the InnoDB change buffer (MDEV-29694).

DICT_HDR_ROW_ID, dict_sys_t::row_id: Remove.

dict_table_t::row_id: The per-table sequence of DB_ROW_ID.

commit_try_rebuild(): Copy dict_table_t::row_id from the old table.

btr_cur_instant_init(), row_import_cleanup(): If needed, perform
the equivalent of SELECT MAX(DB_ROW_ID) to initialize
dict_table_t::row_id.

row_ins(): If needed, obtain DB_ROW_ID from dict_table_t::row_id.
Should it exceed the maximum 48-bit value, return DB_OUT_OF_FILE_SPACE
to prevent further inserts into the table.
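
A sketch of a per-table 48-bit sequence, under stated assumptions: ToyTable and next_row_id() are hypothetical names, identifiers are assumed to start at 1, and a false return value stands in for the DB_OUT_OF_FILE_SPACE error mentioned above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Largest value that fits in a 6-byte (48-bit) DB_ROW_ID.
constexpr std::uint64_t MAX_ROW_ID = (std::uint64_t{1} << 48) - 1;

// Hypothetical table object; each table owns its own counter.
struct ToyTable {
  std::atomic<std::uint64_t> row_id{0};

  // Stores the next identifier in out and returns true, or returns false
  // once the 48-bit space is exhausted.
  bool next_row_id(std::uint64_t &out) {
    std::uint64_t id = row_id.load(std::memory_order_relaxed);
    do {
      if (id >= MAX_ROW_ID)
        return false;  // refuse further inserts into this table
    } while (!row_id.compare_exchange_weak(id, id + 1,
                                           std::memory_order_relaxed));
    out = id + 1;
    return true;
  }
};
```

Because each table has its own counter, two tables may hand out the same identifier, which is acceptable when the identifier only needs to be unique within one table.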

dict_load_table_one(): Move a condition to btr_cur_instant_init_low()
so that dict_table_t::row_id will be restored also for
ROW_FORMAT=COMPRESSED tables.

Tested by: Matthias Leich
2023-01-11 17:59:55 +02:00
Marko Mäkelä
f27e9c8947 MDEV-29694 Remove the InnoDB change buffer
The purpose of the change buffer was to reduce random disk access,
which could be useful on rotational storage, but maybe less so on
solid-state storage.
When we wished to
(1) insert a record into a non-unique secondary index,
(2) delete-mark a secondary index record,
(3) delete a secondary index record as part of purge (but not ROLLBACK),
and the B-tree leaf page to which the record belongs was not in the buffer
pool, we inserted a record into the change buffer B-tree, indexed by
the page identifier. When the page was eventually read into the buffer
pool, we looked up the change buffer B-tree for any modifications to the
page and applied them upon the completion of the read operation. This
was called the insert buffer merge.

We remove the change buffer, because it has been the source of
various hard-to-reproduce corruption bugs, including those fixed in
commit 5b9ee8d819 and
commit 165564d3c3 but not limited to them.

A downgrade will fail with a clear message starting with
commit db14eb16f9 (MDEV-30106).

buf_page_t::state: Merge IBUF_EXIST to UNFIXED and
WRITE_FIX_IBUF to WRITE_FIX.

buf_pool_t::watch[]: Remove.

trx_t: Move isolation_level, check_foreigns, check_unique_secondary,
bulk_insert into the same bit-field. The only purpose of
trx_t::check_unique_secondary is to enable bulk insert into an
empty table. It no longer enables insert buffering for UNIQUE INDEX.

btr_cur_t::thr: Remove. This field was originally needed for change
buffering. Later, its use was extended to cover SPATIAL INDEX.
Much of the time, rtr_info::thr holds this field. When it does not,
we will add parameters to SPATIAL INDEX specific functions.

ibuf_upgrade_needed(): Check if the change buffer needs to be updated.

ibuf_upgrade(): Merge and upgrade the change buffer after all redo log
has been applied. Free any pages consumed by the change buffer, and
zero out the change buffer root page to mark the upgrade completed,
and to prevent a downgrade to an earlier version.

dict_load_tablespaces(): Renamed from
dict_check_tablespaces_and_store_max_id(). This needs to be invoked
before ibuf_upgrade().

btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics.
The change buffer merge does not need this function anymore.

btr_page_alloc(): Renamed from btr_page_alloc_low(). We no longer
allocate any change buffer pages.

row_search_index_entry(), btr_lift_page_up(): Add a parameter thr
for the SPATIAL INDEX case.

rtr_page_split_and_insert(): Specialized from btr_page_split_and_insert().

rtr_root_raise_and_insert(): Specialized from btr_root_raise_and_insert().

Note: The support for upgrading from the MySQL 3.23 or MySQL 4.0
change buffer format that predates the MySQL 4.1 introduction of
the option innodb_file_per_table was removed in MySQL 5.6.5
as part of mysql/mysql-server@69b6241a79
and MariaDB 10.0.11 as part of 1d0f70c2f8.

In the tests innodb.log_upgrade and innodb.log_corruption, we create
valid (upgraded) change buffer pages.

Tested by: Matthias Leich
2023-01-11 17:59:36 +02:00
Marko Mäkelä
24648768b4 MDEV-30136: Deprecate innodb_flush_method
We introduce the following settable Boolean global variables:

innodb_log_file_write_through: Whether writes to ib_logfile0 are
write-through (disabling any caching, as in O_SYNC or O_DSYNC).

innodb_data_file_write_through: Whether writes to any InnoDB data files
(including the temporary tablespace) are write-through.

innodb_data_file_buffering: Whether the file system cache is enabled
for InnoDB data files.

All these parameters are OFF by default, that is, the file system cache
will be disabled, but any hardware caching is enabled, that is,
explicit calls to fsync(), fdatasync() or similar functions are needed.

On systems that support FUA it may make sense to enable write-through,
to avoid extra system calls.

If the deprecated read-only start-up parameter innodb_flush_method is set to one of the
following values, then the values of the 4 Boolean flags (the above 3
plus innodb_log_file_buffering) will be set as follows:

O_DSYNC:
innodb_log_file_write_through=ON, innodb_data_file_write_through=ON,
innodb_data_file_buffering=OFF, and
(if supported) innodb_log_file_buffering=OFF.

fsync, littlesync, nosync, or (Microsoft Windows specific) normal:
innodb_log_file_write_through=OFF, innodb_data_file_write_through=OFF,
and innodb_data_file_buffering=ON.
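
The two groups of values listed above can be written as a lookup. This is a sketch only: FlushFlags and flags_for_flush_method() are hypothetical names, the platform-dependent innodb_log_file_buffering flag is left out, and any value not listed above is reported as unknown rather than guessed.

```cpp
#include <cassert>
#include <string>

// The three flags whose values are stated unconditionally above.
struct FlushFlags {
  bool log_file_write_through;
  bool data_file_write_through;
  bool data_file_buffering;
};

// Fills out and returns true for the values listed above; returns false
// for any other value, whose mapping is not given in the text above.
inline bool flags_for_flush_method(const std::string &method,
                                   FlushFlags &out) {
  if (method == "O_DSYNC") {
    out = {true, true, false};
    return true;
  }
  if (method == "fsync" || method == "littlesync" || method == "nosync" ||
      method == "normal") {  // "normal" is specific to Microsoft Windows
    out = {false, false, true};
    return true;
  }
  return false;
}
```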

Note: fsync() or fdatasync() will only be disabled if the separate
parameter debug_no_sync (in the code, my_disable_sync) is set.

In mariadb-backup, the parameter innodb_flush_method will be ignored.

The Boolean parameters can be modified by SET GLOBAL while the
server is running. This will require reopening the ib_logfile0
or all currently open InnoDB data files.

We will open files straight in O_DSYNC or O_SYNC mode when applicable.
We will try to open data files straight in O_DIRECT mode when the
page size is at least 4096 bytes. For atomically creating data files,
we will invoke os_file_set_nocache() to enable O_DIRECT afterwards,
because O_DIRECT is not supported on some file systems. We will also
continue to invoke os_file_set_nocache() on ib_logfile0 when
innodb_log_file_buffering=OFF can be fulfilled.

For reopening the ib_logfile0, we use the same logic that was developed
for online log resizing and reused for updates of
innodb_log_file_buffering.

Reopening all data files is implemented in the new function
fil_space_t::reopen_all().

Reviewed by: Vladislav Vaintroub
Tested by: Matthias Leich
2023-01-11 17:55:56 +02:00
Marko Mäkelä
e581396b7a MDEV-29983 Deprecate innodb_file_per_table
Before commit 6112853cda in MySQL 4.1.1
introduced the parameter innodb_file_per_table, all InnoDB data was
written to the InnoDB system tablespace (often named ibdata1).
A serious design problem is that once the system tablespace has grown to
some size, it cannot shrink even if the data inside it has been deleted.

There are also other design problems, such as the server hang MDEV-29930
that should only be possible when using innodb_file_per_table=0 and
innodb_undo_tablespaces=0 (storing both tables and undo logs in the
InnoDB system tablespace).

The parameter innodb_change_buffering was deprecated
in commit b5852ffbee.
Starting with commit baf276e6d4
(MDEV-19229) the number of innodb_undo_tablespaces can be increased,
so that the undo logs can be moved out of the system tablespace
of an existing installation.

If all these things (tables, undo logs, and the change buffer) are
removed from the InnoDB system tablespace, the only variable-size
data structure inside it is the InnoDB data dictionary.

DDL operations on .ibd files were optimized in
commit 86dc7b4d4c (MDEV-24626).
That should have removed any conceivable performance advantage of
using innodb_file_per_table=0.

Since there should be no benefit of setting innodb_file_per_table=0,
the parameter should be deprecated. Starting with MySQL 5.6 and
MariaDB Server 10.0, the default value is innodb_file_per_table=1.
2023-01-11 17:55:56 +02:00
Marko Mäkelä
3a237f7666 Merge 10.10 into 10.11 2023-01-11 11:13:56 +02:00
Marko Mäkelä
b218dfead2 Remove an unused parameter
lock_rec_has_to_wait(): Remove the unused parameter for_locking
that had been originally added
in commit df4dd593f2
2023-01-11 08:37:27 +02:00
Marko Mäkelä
cae5a0328b Merge 10.9 into 10.10 2023-01-10 15:06:25 +02:00
Marko Mäkelä
820ebcec86 Merge 10.8 into 10.9 2023-01-10 14:50:58 +02:00
Marko Mäkelä
92c8d6f168 Merge 10.7 into 10.8
The MDEV-25004 test innodb_fts.versioning is omitted because ever since
commit 685d958e38 InnoDB would not allow
writes to a database where the redo log file ib_logfile0 is missing.
2023-01-10 14:42:50 +02:00
Marko Mäkelä
ab36eac584 Merge 10.6 into 10.7 2023-01-10 13:58:03 +02:00
Marko Mäkelä
56c9b0bca0 Merge 10.5 into 10.6 2023-01-10 13:54:17 +02:00