Commit graph

1122 commits

Author SHA1 Message Date
Marko Mäkelä
18eab4a832 MDEV-26682 Replication timeouts with XA PREPARE
The purpose of non-exclusive locks in a transaction is to guarantee
that the records covered by those locks must remain in that way until
the transaction is committed. (The purpose of gap locks is to ensure
that a record that was nonexistent will remain that way.)

Once a transaction has reached the XA PREPARE state, the only allowed
further actions are XA ROLLBACK or XA COMMIT. Therefore, it can be
argued that only the exclusive locks that the XA PREPARE transaction
is holding are essential.

Furthermore, InnoDB never preserved explicit locks across server restart.
For XA PREPARE transations, we will only recover implicit exclusive locks
for records that had been modified.

Because of the fact that XA PREPARE followed by a server restart will
cause some locks to be lost, we might as well always release all
non-exclusive locks during the execution of an XA PREPARE statement.

lock_release_on_prepare(): Release non-exclusive locks on XA PREPARE.

trx_prepare(): Invoke lock_release_on_prepare() unless the
isolation level is SERIALIZABLE or this is an internal distributed
transaction with the binlog (not actual XA PREPARE statement).

This has been discussed with Sergei Golubchik and Andrei Elkin.

Reviewed by: Sergei Golubchik
2021-10-18 12:49:10 +03:00
Marko Mäkelä
df94aa344b MDEV-26445 followup: Try to work around GCC 4.8.5 ICE on ARMv8
GCC 4.8.5 would crash when compiling trx_purge_truncate_history().
Let us try to avoid that by disabling optimizations for the function.
2021-10-05 07:13:14 +03:00
Marko Mäkelä
88f38661b7 Merge 10.4 into 10.5 2021-09-24 12:14:35 +03:00
Marko Mäkelä
d5bd704f4b Merge 10.3 into 10.4 2021-09-24 12:11:52 +03:00
Marko Mäkelä
4bfdba2e89 MDEV-26672 innodb_undo_log_truncate may reset transaction ID sequence
trx_rseg_header_create(): Add a parameter for the value that is
to be written to TRX_RSEG_MAX_TRX_ID. If we omit this write, then
the updated test innodb.undo_truncate will fail for the 4k, 8k, 16k
page sizes. This was broken ever since
commit 947efe17ed (MDEV-15158)
removed the writes of transaction identifiers to the TRX_SYS page.

srv_do_purge(): Truncate undo tablespaces also during slow shutdown
(innodb_fast_shutdown=0).

Thanks to Krunal Bauskar for noticing this problem.
2021-09-24 11:23:37 +03:00
Marko Mäkelä
f5794e1dc6 MDEV-26445 innodb_undo_log_truncate is unnecessarily slow
trx_purge_truncate_history(): Do not force a write of the undo tablespace
that is being truncated. Instead, prevent page writes by acquiring
an exclusive latch on all dirty pages of the tablespace.

fseg_create(): Relax an assertion that could fail if a dirty undo page
is being initialized during undo tablespace truncation (and
trx_purge_truncate_history() already acquired an exclusive latch on it).

fsp_page_create(): If we are truncating a tablespace, try to reuse
a page that we may have already latched exclusively (because it was
in buf_pool.flush_list). To some extent, this helps the test
innodb.undo_truncate,16k to avoid running out of buffer pool.

mtr_t::commit_shrink(): Mark as clean all pages that are outside the
new bounds of the tablespace, and only add the newly reinitialized pages
to the buf_pool.flush_list.

buf_page_create(): Do not unnecessarily invoke change buffer merge on
undo tablespaces.

buf_page_t::clear_oldest_modification(bool temporary): Move some
assertions to the caller buf_page_write_complete().

innodb.undo_truncate: Use a bigger innodb_buffer_pool_size=24M.
On my system, it would otherwise hang 1 out of 1547 attempts
(on the 40th repeat of innodb.undo_truncate,16k).
Other page sizes were not affected.
2021-09-24 08:24:03 +03:00
Marko Mäkelä
f5fddae3cb MDEV-26450: Corruption due to innodb_undo_log_truncate
At least since commit 055a3334ad
(MDEV-13564) the undo log truncation in InnoDB did not work correctly.

The main issue is that during the execution of
trx_purge_truncate_history() some pages of the newly truncated
undo tablespace could be discarded.

This is improved from commit 1cb218c37c
which was applied to earlier-version branches.

fsp_try_extend_data_file(): Apply the peculiar rounding of
fil_space_t::size_in_header only to the system tablespace,
whose size can be expressed in megabytes in a configuration parameter.
Other files may freely grow by a number of pages.

fseg_alloc_free_page_low(): Do allow the extension of undo tablespaces,
and mention the file name in the error message.

mtr_t::commit_shrink(): Implement crash-safe shrinking of a tablespace:
(1) durably write the log
(2) release the page latches of the rebuilt tablespace
(3) release the mutexes
(4) truncate the file
(5) release the tablespace latch
This is refactored from trx_purge_truncate_history().

log_write_and_flush_prepare(), log_write_and_flush(): New functions
to durably write log during mtr_t::commit_shrink().
2021-09-24 08:22:19 +03:00
Marko Mäkelä
9024498e88 Merge 10.3 into 10.4 2021-09-22 18:26:54 +03:00
Marko Mäkelä
b46cf33ab8 Merge 10.2 into 10.3 2021-09-22 18:01:41 +03:00
Marko Mäkelä
1cb218c37c MDEV-26450: Corruption due to innodb_undo_log_truncate
At least since commit 055a3334ad
(MDEV-13564) the undo log truncation in InnoDB did not work correctly.

The main issue is that during the execution of
trx_purge_truncate_history() some pages of the newly truncated
undo tablespace could be discarded.

fsp_try_extend_data_file(): Apply the peculiar rounding of
fil_space_t::size_in_header only to the system tablespace,
whose size can be expressed in megabytes in a configuration parameter.
Other files may freely grow by a number of pages.

fseg_alloc_free_page_low(): Do allow the extension of undo tablespaces,
and mention the file name in the error message.

mtr_t::commit_shrink(): Implement crash-safe shrinking of a tablespace
file. First, durably write the log, then shrink the file, and finally
release the page latches of the rebuilt tablespace. Refactored from
trx_purge_truncate_history().

log_write_and_flush_prepare(), log_write_and_flush(): New functions
to durably write log during mtr_t::commit_shrink().
2021-09-22 14:15:00 +03:00
Marko Mäkelä
1b2acc5b9d Merge 10.4 into 10.5 2021-08-25 07:53:23 +03:00
Marko Mäkelä
e696e9e63f Merge 10.3 into 10.4 2021-08-25 07:30:47 +03:00
Marko Mäkelä
c0a84fb9b0 MDEV-26465 Race condition in trx_purge_rseg_get_next_history_log()
trx_purge_rseg_get_next_history_log(): Fix a race condition that
was introduced in commit e46f76c974
(MDEV-15912). The buffer pool page contents must not be accessed
while not holding a page latch. The page latch was released by
mtr_t::commit().

This race resulted in an ASAN heap-use-after-poison during a stress test.
2021-08-23 17:00:01 +03:00
Oleksandr Byelkin
ae6bdc6769 Merge branch '10.4' into 10.5 2021-07-31 23:19:51 +02:00
Oleksandr Byelkin
7841a7eb09 Merge branch '10.3' into 10.4 2021-07-31 22:59:58 +02:00
Marko Mäkelä
f50eb0d398 Merge 10.2 into 10.3 2021-07-27 10:47:17 +03:00
Marko Mäkelä
cf1fc59856 MDEV-25594: Improve debug checks
trx_t::will_lock: Changed the type to bool.

trx_t::is_autocommit_non_locking(): Replaces
trx_is_autocommit_non_locking().

trx_is_ac_nl_ro(): Remove (replaced with equivalent assertion expressions).

assert_trx_nonlocking_or_in_list(): Remove.
Replaced with at least as strict checks in each place.

check_trx_state(): Moved to a static function; partially replaced with
individual debug assertions implementing equivalent or stricter checks.

This is a backport of commit 7b51d11cca
from 10.5.
2021-07-27 08:52:01 +03:00
Sergei Golubchik
6190a02f35 Merge branch '10.2' into 10.3 2021-07-21 20:11:07 +02:00
Jagdeep Sidhu
5f8651ac23 Fix switch case statement in trx_flush_log_if_needed_low()
In commit 2e814d4702 on MariaDB 10.2
the switch case statement in trx_flush_log_if_needed_low() regressed.

Since 10.2 this code was refactored to have switches in descending
order, so value of 3 for innodb_flush_log_at_trx_commit is behaving
the same as value of 2, that is no FSYNC is being enforced during
COMMIT phase. The switch should however not be empty and cases 2 and 3
should not have the identical contents.

As per documentation, setting innodb_flush_log_at_trx_commit to 3
should do FSYNC to disk if innodb_flush_log_at_trx_commit is set to 3.
This fixes the regression so that the switch statement again does
what users expect the setting should do.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
2021-07-20 16:05:40 +03:00
Vladislav Vaintroub
fc2ec25733 MDEV-26166 replace log_write_up_to(LSN_MAX,...) with log_buffer_flush_to_disk()
Also, remove comparison lsn > flush/write lsn, prior to calling
log_write_up_to. The checks and early returns are part of this function.
2021-07-16 18:44:58 +02:00
Marko Mäkelä
8af538979b MDEV-25801: buf_flush_dirty_pages() is very slow
In commit 7cffb5f6e8 (MDEV-23399)
the implementation of buf_flush_dirty_pages() was replaced with
a slow one, which would perform excessive scans of the
buf_pool.flush_list and make little progress.

buf_flush_list(), buf_flush_LRU(): Split from buf_flush_lists().
Vladislav Vaintroub noticed that we will not need to invoke
log_flush_task.wait() for the LRU eviction flushing.

buf_flush_list_space(): Replaces buf_flush_dirty_pages().
This is like buf_flush_list(), but operating on a single
tablespace at a time. Writes at most innodb_io_capacity
pages. Returns whether some of the tablespace might remain
in the buffer pool.
2021-06-23 19:06:52 +03:00
Marko Mäkelä
344e59904d Merge 10.4 into 10.5 2021-06-23 08:17:49 +03:00
Marko Mäkelä
09b03ff31b Merge 10.3 into 10.4 2021-06-23 08:05:27 +03:00
Marko Mäkelä
35a9aaebe2 MDEV-25981 InnoDB upgrade fails
trx_undo_mem_create_at_db_start(): Relax too strict upgrade checks
that were introduced in
commit e46f76c974 (MDEV-15912).
On commit, pages will typically be set to TRX_UNDO_CACHED state.
Having the type TRX_UNDO_INSERT in such pages is common and
unproblematic; the type would be reset in trx_undo_reuse_cached().

trx_rseg_array_init(): On failure, clean up the rollback segments
that were initialized so far, to avoid an assertion failure later
during shutdown.
2021-06-22 09:30:25 +03:00
Marko Mäkelä
a1907fed60 Merge 10.4 into 10.5 2021-06-21 18:49:36 +03:00
Marko Mäkelä
ce868cd89e Merge 10.3 into 10.4 2021-06-21 18:44:14 +03:00
Marko Mäkelä
9dc50ea229 MDEV-25979 Invalid page number written to DB_ROLL_PTR
trx_undo_report_row_operation(): Fix a race condition that was introduced
in commit f74023b955 (MDEV-15090).
We must not access undo_block after the page latch has been released
in mtr_t::commit(), because the block could be evicted or replaced.
2021-06-21 18:13:28 +03:00
Marko Mäkelä
a42c80bd48 Merge 10.4 into 10.5 2021-06-21 14:22:22 +03:00
Marko Mäkelä
d3e4fae797 Merge 10.3 into 10.4 2021-06-21 12:38:25 +03:00
Marko Mäkelä
e46f76c974 MDEV-15912: Remove traces of insert_undo
Let us simply refuse an upgrade from earlier versions if the
upgrade procedure was not followed. This simplifies the purge,
commit, and rollback of transactions.

Before upgrading to MariaDB 10.3 or later, a clean shutdown
of the server (with innodb_fast_shutdown=1 or 0) is necessary,
to ensure that any incomplete transactions are rolled back.
The undo log format was changed in MDEV-12288. There is only
one persistent undo log for each transaction.
2021-06-21 12:34:07 +03:00
Marko Mäkelä
7b51d11cca MDEV-25594: Improve debug checks
trx_t::will_lock: Changed the type to bool.

trx_t::is_autocommit_non_locking(): Replaces
trx_is_autocommit_non_locking().

trx_is_ac_nl_ro(): Remove (replaced with equivalent assertion expressions).

assert_trx_nonlocking_or_in_list(): Remove.
Replaced with at least as strict checks in each place.

check_trx_state(): Moved to a static function; partially replaced with
individual debug assertions implementing equivalent or stricter checks.
2021-05-18 09:27:59 +03:00
Nikita Malyavin
3f55c56951 Merge branch bb-10.4-release into bb-10.5-release 2021-05-05 23:57:11 +03:00
Nikita Malyavin
509e4990af Merge branch bb-10.3-release into bb-10.4-release 2021-05-05 23:03:01 +03:00
Nikita Malyavin
a8a925dd22 Merge branch bb-10.2-release into bb-10.3-release 2021-05-04 14:49:31 +03:00
Thirunarayanan Balathandayuthapani
b862377c3e MDEV-25503 InnoDB hangs on startup during recovery
InnoDB startup hangs if a DDL transaction needs to be
rolled back and a recovered transaction on statistics
tables exists. In that case, InnoDB should rollback
the transaction which holds locks on innodb_table_stats
or innodb_index_stats during trx_rollback_or_clean_recovered().
2021-04-27 17:07:37 +05:30
Marko Mäkelä
6c3e860cbf Merge 10.4 into 10.5 2021-04-14 11:35:39 +03:00
Marko Mäkelä
5008171b05 Merge 10.3 into 10.4 2021-04-14 10:33:59 +03:00
Marko Mäkelä
b8c8692fd9 MDEV-24620 ASAN heap-buffer-overflow in btr_pcur_restore_position()
Between btr_pcur_store_position() and btr_pcur_restore_position()
it is possible that purge empties a table and enlarges
index->n_core_fields and index->n_core_null_bytes.
Therefore, we must cache index->n_core_fields in
btr_pcur_t::old_n_core_fields so that btr_pcur_t::old_rec can be
parsed correctly.

Unfortunately, this is a huge change, because we will replace
"bool leaf" parameters with "ulint n_core"
(passing index->n_core_fields, or 0 for non-leaf pages).
For special cases where we know that index->is_instant() cannot hold,
we may also pass index->n_fields.
2021-04-13 10:28:13 +03:00
Marko Mäkelä
6e6318b29b Merge 10.2 into 10.3 2021-04-13 10:26:01 +03:00
Marko Mäkelä
75dd7a0483 MDEV-24434 Assertion trx->in_rw_trx_list... in trx_sys_any_active_transactions()
trx_sys_any_active_transactions(): Remove a bogus debug assertion.
In trx_commit_in_memory() and trx_erase_lists(), we will remove
the transaction from trx_sys->rw_trx_list and set the state to
TRX_STATE_COMMITTED_IN_MEMORY.
2021-04-12 10:53:08 +03:00
Marko Mäkelä
be881ec457 Merge 10.4 into 10.5 2021-03-19 13:09:21 +02:00
Marko Mäkelä
44d70c01f0 Merge 10.3 into 10.4 2021-03-19 11:42:44 +02:00
Marko Mäkelä
867724fd30 MDEV-25125 Assertion failure in fetch_data_into_cache_low()
Before MDEV-14638, there was no race condition between the
execution of fetch_data_into_cache() and transaction commit.

fetch_data_into_cache(): Acquire trx_t::mutex before checking
trx_t::state, to prevent a concurrent transition from
TRX_STATE_COMMITTED_IN_MEMORY to TRX_STATE_NOT_STARTED
in trx_commit_in_memory().
2021-03-18 13:36:02 +02:00
Marko Mäkelä
10d544aa7b Merge 10.4 into 10.5 2021-03-05 12:54:43 +02:00
Marko Mäkelä
8bab5bb332 Merge 10.3 into 10.4 2021-03-05 10:36:51 +02:00
Marko Mäkelä
5bd994b0d5 MDEV-24811 Assertion find(table) failed with innodb_evict_tables_on_commit_debug
This is a backport of commit 18535a4028
from 10.6.

lock_release(): Implement innodb_evict_tables_on_commit_debug.
Before releasing any locks, collect the identifiers of tables to
be evicted. After releasing all locks, look up for the tables and
evict them if it is safe to do so.

trx_commit_in_memory(): Invoke trx_update_mod_tables_timestamp()
before lock_release(), so that our locks will protect the tables
from being evicted.
2021-03-03 10:13:56 +02:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Marko Mäkelä
961c7938bb Merge 10.4 into 10.5 2021-01-25 12:44:24 +02:00