Commit graph

6011 commits

Author SHA1 Message Date
Eugene Kosov
4a90bb85c0 InnoDB: fix debug assertion 2020-08-24 20:54:19 +03:00
Marko Mäkelä
f3160ee44f MDEV-22782 AddressSanitizer race condition in trx_free()
In trx_free() we used to declare the entire trx_t unaccessible
and then declare that some data members are accessible.
This involves a race condition with other threads that may concurrently
access the data members that must remain accessible.
One type of error is "AddressSanitizer: unknown-crash", whose
exact cause we have not determined.

Another type of error (reported in MDEV-23472) is "use-after-poison",
where the reported shadow bytes would in fact be 00, indicating that
the memory was no longer poisoned. The poison-access-unpoison race
condition was confirmed by "rr replay".

We eliminate the race condition by invoking MEM_NOACCESS on each
individual data member of trx_t before freeing the memory to the pool.
The memory would not be unpoisoned until the pool is freed
or the memory is being reused for another allocation.

trx_t::free(): Replaces trx_free().

trx_t::active_commit_ordered: Changed to bool, so that MEM_NOACCESS
can be invoked. Removed some accessor functions.

Pool: Remove all MEM_ instrumentation.

TrxFactory: Move the MEM_ instrumentation from Pool.

TrxFactory::debug(): Removed. Moved to trx_t::free(). Because
the memory was already marked unaccessible in trx_t::free(), the
Factory::debug() call in Pool::putl() would be unable to access it.

trx_allocate_for_background(): Replaces trx_create_low().

trx_t::free(): Perform all consistency checks while avoiding
duplication, and declare most data members unaccessible.
2020-08-21 18:23:28 +03:00
Thirunarayanan Balathandayuthapani
a79c257894 MDEV-23452 Assertion `buf_page_get_io_fix(bpage) == BUF_IO_NONE' failed
in buf_page_set_sticky

- Adding os_thread_yield() in buf_page_create() to avoid the continuous
buffer pool mutex acquistions.
2020-08-20 11:38:10 +05:30
Thirunarayanan Balathandayuthapani
e9d6f1c7ac MDEV-23452 Assertion `buf_page_get_io_fix(bpage) == BUF_IO_NONE' failed
in buf_page_set_sticky

commit a1f899a8ab (MDEV-23233) added the
code to make page sticky. So that InnoDB can't allow the page to
be grabbed by other thread while doing lazy drop of ahi.

But the block could be in flush list and it could have io_fix value
as BUF_IO_WRITE. It could lead to the failure in buf_page_set_sticky().

buf_page_create(): If btr_search_drop_page_hash_index() must be invoked,
take x-latch on the block. If the block io_fix value is other than
BUF_IO_NONE, release the buffer pool mutex and page hash lock and
wait for I/O to complete.
2020-08-20 11:34:53 +05:30
Marko Mäkelä
22c4a7512f MDEV-23514 Race conditions between ROLLBACK and ALTER TABLE
Since commit 1509363970 (MDEV-23484)
the rollback of InnoDB transactions is no longer protected by
dict_operation_lock. Removing that protection revealed a race
condition between transaction rollback and the rollback of an
online table-rebuilding operation (OPTIMIZE TABLE, or any online
ALTER TABLE that is rebuilding the table).

row_undo_mod_clust(): Re-check dict_index_is_online_ddl() after
acquiring index->lock, similar to how row_undo_ins_remove_clust_rec()
is doing it. Because innobase_online_rebuild_log_free() is holding
exclusive index->lock while invoking row_log_free(), this re-check
will ensure that row_log_table_low() will not be invoked when
index->online_log=NULL.

A different race condition is possible between the rollback of a
recovered transaction and the start of online secondary index creation.
Because prepare_inplace_alter_table_dict() is not acquiring an InnoDB
table lock in this case, and because recovered transactions are not
covered by metadata locks (MDL), the dict_table_t::indexes could be
modified by prepare_inplace_alter_table_dict() while the rollback of
a recovered transaction is being executed. Normal transactions would
be covered by MDL, and during prepare_inplace_alter_table_dict() we
do hold MDL_EXCLUSIVE, that is, an online ALTER TABLE operation may
not execute concurrently with other transactions that have accessed
the table.

row_undo(): To prevent a race condition with
prepare_inplace_alter_table_dict(), acquire dict_operation_lock
for all recovered transactions. Before MDEV-23484 we used to acquire
it for all transactions, not only recovered ones.

Note: row_merge_drop_indexes() would not invoke
dict_index_remove_from_cache() while transactional locks
exist on the table, or while any thread is holding an open table handle.
OK, it does that for FULLTEXT INDEX, but ADD FULLTEXT INDEX is not
supported as an online operation, and therefore
prepare_inplace_alter_table_dict() would acquire a table S lock,
which cannot succeed as long as recovered transactions on the table
exist, because they would hold a conflicting IX lock on the table.
2020-08-20 08:34:55 +03:00
Marko Mäkelä
309302a3da MDEV-23475 InnoDB performance regression for write-heavy workloads
In commit fe39d02f51 (MDEV-20638)
we removed some wake-up signaling of the master thread that should
have been there, to ensure a steady log checkpointing workload.

Common sense suggests that the commit omitted some necessary calls
to srv_inc_activity_count(). But, an attempt to add the call to
trx_flush_log_if_needed_low() as well as to reinstate the function
innobase_active_small() did not restore the performance for the
case where sync_binlog=1 is set.

Therefore, we will revert the entire commit in MariaDB Server 10.2.
In MariaDB Server 10.5, adding a srv_inc_activity_count() call to
trx_flush_log_if_needed_low() did restore the performance, so we
will not revert MDEV-20638 across all versions.
2020-08-19 11:18:56 +03:00
Marko Mäkelä
1509363970 MDEV-23484 Rollback unnecessarily acquires dict_operation_lock for every row
InnoDB transaction rollback includes an unnecessary work-around for
a data corruption bug that was fixed by me in MySQL 5.6.12
mysql/mysql-server@935ba09d52
and ported to MariaDB 10.0.8 by
commit c291ddfdf7
in 2013 and 2014, respectively.

By acquiring and releasing dict_operation_lock in shared mode,
row_undo() hopes to prevent the table from being dropped while
the undo log record is being rolled back. But, thanks to mentioned fix,
debug assertions (that we are adding) show that the rollback is
protected by transactional locks (table IX lock, in addition to
implicit or explicit exclusive locks on the records that had been modified).

Because row_drop_table_for_mysql() would invoke
row_add_table_to_background_drop_list() if any locks exist on the table,
the mere existence of locks (which is guaranteed during ROLLBACK) is
enough to protect the table from disappearing. Hence, acquiring and
releasing dict_operation_lock for every row that is being rolled back is
unnecessary.

row_undo(): Remove the unnecessary acquisition and release of
dict_operation_lock.

Note: row_add_table_to_background_drop_list() is mostly working around
bugs outside InnoDB:
MDEV-21175 (insufficient MDL protection of FOREIGN KEY operations)
MDEV-21602 (incorrect error handling of CREATE TABLE...SELECT).
2020-08-18 17:30:34 +03:00
Marko Mäkelä
4c50120d14 MDEV-23474 InnoDB fails to restart after SET GLOBAL innodb_log_checksums=OFF
Regretfully, the parameter innodb_log_checksums was introduced
in MySQL 5.7.9 (the first GA release of that series) by
mysql/mysql-server@af0acedd88
which partly replaced a parameter that had been introduced in 5.7.8
mysql/mysql-server@22ba38218e
as innodb_log_checksum_algorithm.

Given that the CRC-32C operations are accelerated on many processor
implementations (AMD64 with SSE4.2; since MDEV-22669 also on IA-32
with SSE4.2, POWER 8 and later, ARMv8 with some extensions)
and by lookup tables when only generic SISD instructions are available,
there should be no valid reason to disable checksums.

In MariaDB 10.5.2, as a preparation for MDEV-12353, MDEV-19543 deprecated
and ignored the parameter innodb_log_checksums altogether. This should
imply that after a clean shutdown with innodb_log_checksums=OFF one
cannot upgrade to MariaDB Server 10.5 at all.

Due to these problems, let us deprecate the parameter innodb_log_checksums
and honor it only during server startup.
The command SET GLOBAL innodb_log_checksums will always set the
parameter to ON.
2020-08-18 16:46:07 +03:00
Thirunarayanan Balathandayuthapani
8268f26605 MDEV-22934 Table disappear after two alter table command
Problem:
=======
InnoDB drops the column which has foreign key relations on it. So it
tries to load the foreign key during rename process of copy algorithm
even though the foreign_key_check is disabled.

Solution:
========
During alter copy algorithm, InnoDB ignores the error while loading
the foreign key constraint if foreign key check is disabled. It
should throw the warning about failure of the foreign key constraint
when foreign key check is disabled.
2020-08-18 15:05:23 +05:30
Thirunarayanan Balathandayuthapani
362b18c536 MDEV-23380 InnoDB reads a page from disk despite parsing MLOG_INIT_FILE_PAGE2 record
This problem is caused by 6697135c6d
(MDEV-21572). During recovery, InnoDB prefetches the siblings of
change buffer index leaf page. It does asynchronous page read
and recovery scenario wasn't handled in buf_read_page_background().
It leads to the refusal of startup of the server.

Solution:
=========
  InnoDB shouldn't allow the change buffer index page siblings
to be prefetched.
2020-08-18 14:59:16 +05:30
Marko Mäkelä
3e617b8bef Merge 10.1 into 10.2 2020-08-13 17:50:40 +03:00
Marko Mäkelä
7c2aad6be2 MDEV-23463 fil_page_decompress() debug check wastes 128KiB of stack
fil_page_decompress(): Remove a rather useless debug check.
We should have test coverage for reading page_compressed pages
from files, either due to buffer pool page eviction or due to
server restarts.

A similar check was removed from fil_space_encrypt() in
commit 0b36c27e0c (MDEV-20307).
2020-08-13 17:43:37 +03:00
Marko Mäkelä
182e2d4a6c Merge 10.1 into 10.2 2020-08-13 07:38:35 +03:00
Marko Mäkelä
101ce10d0d MDEV-20672 Inconsistent usage message for innodb_compression_algorithm
The usage message for the innodb_compression_algorithm system variable
did not list snappy, which was added as an optional compression algorithm
in MariaDB 10.1.3 and might actually work since
commit 90c52e5291 (MDEV-12615)
in MariaDB 10.1.24.

Unfortunately, we will include also unavailable compression algorithms
in the list, because ENUM parameters allow numeric values, and we do
not want innodb_compression_algorithm=3 to change meaning depending on
the way how the source code was compiled.
2020-08-12 18:35:21 +03:00
Marko Mäkelä
efd8af535a MDEV-19526 heap number overflow on innodb_page_size=64k
InnoDB only reserves 13 bits for the heap number in the record header,
limiting the heap number to be at most 8191. But, when using
innodb_page_size=64k and secondary index records of 7 bytes each,
it is possible to exceed the maximum heap number.

btr_cur_optimistic_insert(): Let the operation fail if the
maximum number of records would be exceeded.

page_mem_alloc_heap(): Move to the same compilation unit with the
only caller, and let the operation fail if the maximum heap number
has been allocated already.
2020-08-12 18:21:53 +03:00
Marko Mäkelä
18f374cb20 MDEV-23439 Assertion size == space->size failed in buf_read_ahead_random
The debug assertion is bogus, and we had removed it in
commit b1ab211dee (MDEV-15053)
in the MariaDB Server 10.5 branch.

For a small data file, fil_space_extend_must_retry() would always
allocate a minimum size of 4*innodb_page_size.

It is possible that random read-ahead will be triggered for
a smaller file than this. In the observed case, the read-ahead
was triggered for a 6-page file that used ROW_FORMAT=COMPRESSED
with 8KiB page size. So, the desired file size was 49152 bytes,
but the actual size was 65536 bytes.
2020-08-12 13:12:51 +03:00
Marko Mäkelä
c96be848d3 MDEV-14119 Assertion cmp_rec_rec() in ALTER TABLE
innobase_pk_order_preserved(): Treat an added AUTO_INCREMENT
column in the same way as an added existing column.
In either case, the column values are not guaranteed to
be constant, and thus the ordering may change if such a column
is added before any existing PRIMARY KEY columns.

prepare_inplace_alter_table_dict(): Initialize
dict_table_t::persistent_autoinc before invoking
innobase_pk_order_preserved().
2020-08-11 18:52:38 +03:00
Marko Mäkelä
de8d57e522 MDEV-23447 SIGSEGV in fil_system_t::keyrotate_next()
fil_system_t::keyrotate_next(): If space && space->is_in_rotation_list
does not hold, iterate from the start of the list.

In debug builds, we would typically have hit SIGSEGV because the
iterator would have wrapped a null pointer. It might also be that
we are dereferencing a stale pointer.

There is no test case, because the encryption is very nondeterministic
in nature, due to the use of background threads.

This scenario can be hit by setting the following:

SET GLOBAL innodb_encryption_threads=5;
SET GLOBAL innodb_encryption_rotate_key_age=0;
2020-08-11 15:58:17 +03:00
Marko Mäkelä
31aef3ae99 Fix GCC 10.2.0 -Og -Wmaybe-uninitialized
For some reason, GCC emits more -Wmaybe-uninitialized warnings
when using the flag -Og than when using -O2. Many of the warnings
look genuine.
2020-08-11 15:58:16 +03:00
Marko Mäkelä
3b6dadb5eb Merge 10.1 into 10.2 2020-08-10 17:57:14 +03:00
Marko Mäkelä
7f67ef1485 MDEV-16115 Hang after reducing innodb_encryption_threads
The test encryption.create_or_replace would occasionally fail,
because some fil_space_t::n_pending_ops would never be decremented.

fil_crypt_find_space_to_rotate(): If rotate_thread_t::should_shutdown()
holds due to innodb_encryption_threads having been reduced, do
release the reference.

fil_space_remove_from_keyrotation(), fil_space_next(): Declare the
functions static, simplify a little, and define in the same compilation
unit with the only caller, fil_crypt_find_space_to_rotate().

fil_crypt_key_mutex: Remove (unused).
2020-08-10 17:17:25 +03:00
Marko Mäkelä
91caf130b7 MDEV-23101 fixup: Remove redundant code
lock_rec_has_to_wait_in_queue(): Remove an obviously redundant assertion
that was added in commit a8ec45863b
and also enclose a Galera-specific condition in #ifdef WITH_WSREP.
2020-08-04 09:56:09 +03:00
Jan Lindström
a8ec45863b MDEV-23101: SIGSEGV in lock_rec_unlock() when Galera is enabled
lock_rec_has_to_wait
wsrep_kill_victim
lock_rec_create_low
lock_rec_add_to_queue
DeadlockChecker::select_victim()

	THD can't change from normal transaction to BF (brute force) transaction
	here, thus there is no need to syncronize access in wsrep_thd_is_BF
	function.

lock_rec_has_to_wait_in_queue

	Add condition that lock is not NULL and add assertions if we are in
	strong state.
2020-08-03 15:15:40 +03:00
Oleksandr Byelkin
ef7cb0a0b5 Merge branch '10.1' into 10.2 2020-08-02 11:05:29 +02:00
Thirunarayanan Balathandayuthapani
5ec40fbb27 MDEV-14711 Fix-up 2020-07-31 16:45:35 +05:30
Marko Mäkelä
879ba1979b MDEV-11799 Doublewrite recovery can corrupt data pages
The purpose of the InnoDB doublewrite buffer is to make InnoDB
tolerant against cases where the server was killed in the middle
of a page write. (In Linux, killing a process may interrupt a
write system call, typically on a 4096-byte boundary.)

There may exist multiple copies of a page number in the doublewrite
buffer. Recovery should choose the latest valid copy of the page.
By design, the FIL_PAGE_LSN must not precede the latest checkpoint LSN
nor be later than the end of the recovered log.

For page_compressed and encrypted pages, we were missing proper
consistency checks. In the 10.4 data set generated for in MDEV-23231,
the data file contained a valid page_compressed page, and an
identical copy of that page was also present in the doublewrite
buffer. But, recovery would incorrectly consider the page invalid
and restore an uncompressed copy of the same page that had been
written before the log checkpoint. (In fact, no redo log was to
be applied to that page.)

buf_dblwr_process(): Validate the FIL_PAGE_LSN in the doublewrite
buffer pages, and always skip page 0, because those pages should
have been recovered by Datafile::restore_from_doublewrite() if
necessary.

Datafile::restore_from_doublewrite(): Choose the latest applicable
page from the doublewrite buffer.

recv_dblwr_t::find_page(): Also validate encrypted or
page_compressed pages.

recv_dblwr_t::validate_page(): New function to validate a page,
either a copy in a data file or in the doublewrite buffer.
Also validate encrypted or page_compressed pages.

This is joint work with Thirunarayanan Balathandayuthapani.
2020-07-31 11:54:35 +03:00
Marko Mäkelä
f35d172103 MDEV-23198 Crash in REPLACE
row_vers_impl_x_locked_low(): clust_offsets may point to memory
that is allocated by mem_heap_alloc() and may have been freed.
For initializing clust_offsets, try to use the stack-allocated
buffer instead of a pointer that may point to freed memory.

This fixes a regression that was introduced in
commit f0aa073f2b (MDEV-20950).
2020-07-31 11:54:35 +03:00
Thirunarayanan Balathandayuthapani
8a612314d0 MDEV-23332 Index online status assert failure in btr_search_drop_page_hash_index
Problem:
========
In row_merge_drop_indexes(), InnoDB drops only the index from
dictionary and frees the index pages but it maintains the index
object if the table is being used by other DML threads. It sets
the online status of the index to ONLINE_INDEX_ABORTED_DROPPED.
Removing the index from dictionary doesn't remove the
corressponding ahi entries of the index. When block is being
reused, InnoDB tries to remove ahi entries for the block and
it fails if index online status is ONLINE_INDEX_ABORTED_DROPPED.

Fix:
====
MDEV-22456 allows the index ahi entries to be dropped lazily.
so checking online status in btr_search_drop_page_hash_index()
is meaningless and should be removed.
2020-07-30 13:59:03 +05:30
Marko Mäkelä
c5d4dd2533 MDEV-23339 innodb_force_recovery=2 may still abort the rollback of recovered transactions
trx_rollback_active(), trx_rollback_resurrected(): Replace
an incorrect condition that we failed to replace in
commit b68f1d847f (MDEV-21217).
2020-07-30 09:24:36 +03:00
Marko Mäkelä
3c3f172f17 MDEV-23308 CHECK TABLE attempts to access parent_right_page_no=FIL_NULL
mysql/mysql-server@e00ad49edc
which introduced WL#6326 to MySQL 5.7.2 added a buffer page
acquisition to CHECK TABLE code (solely for the purpose of
obeying the changed latching order), but failed to check that
a parent page actually exists. It would not necessarily exist in a
corrupted index where a parent page is missing pointer records
to child pages.
2020-07-28 13:00:59 +03:00
Thirunarayanan Balathandayuthapani
a1f899a8ab MDEV-23233: Race condition for btr_search_drop_page_hash_index() in buf_page_create()
commit ad6171b91c (MDEV-22456)
introduced code to buf_page_create() that would lazily drop
adaptive hash index entries for an index that has been
evicted from the data dictionary cache.

Unfortunately, that call was missing adequate protection.
While the btr_search_drop_page_hash_index(block) was executing,
the block could be reused for something else.

buf_page_create(): If btr_search_drop_page_hash_index() must be
invoked, pin the block before releasing the buf_pool->page_hash lock,
so that the block cannot be grabbed by other threads.
2020-07-27 22:17:51 +05:30
Thirunarayanan Balathandayuthapani
5f1ec5cbb7 MDEV-14711 Assertion `mode == 16 || mode == 12 || !fix_block->page.file_page_was_freed' failed in buf_page_get_gen (rollback requesting a freed undo page)
Problem:
=======
In buf_cur_optimistic_latch_leaves(), requesting a left block with BTR_GET
after releasing current block. But there is no guarantee that left block
could be still available.

Fix:
====

(1) In btr_cur_optimistic_latch_leaves(), replace the BUF_GET with
BUF_GET_POSSIBLY_FREED for fetching left block.
(2) Once InnoDB acquires left block, it should check FIL_PAGE_NEXT with
current block page number. If not, release cursor->left_block and return
false.
2020-07-24 20:32:27 +05:30
Teemu Ollakka
8ef41c6084 MDEV-23272 Galera stack-use-after-scope error with ASAN build
THD proc info was assigned from stack allocated temporary buffer
which went out of scope immediately after assignment.

Fixed by removing the use of temp buffer and assign proc info
from string literal.
2020-07-24 08:50:15 +03:00
Thirunarayanan Balathandayuthapani
b3b1c51e4d MDEV-23134 SEGV in dict_load_table_one during restart after server crash
Problem:
========
dict_load_table_one() doesn't handle the scenario where clustered
index page is FIL_NULL when DICT_ERR_IGNORE_INDEX_ROOT mode
is set.

Fix:
====
InnoDB should set the file_unreadable when it can't find the
clustered index root page.
2020-07-23 20:43:28 +05:30
Thirunarayanan Balathandayuthapani
adeb736f9a MDEV-22903 heap-use-after-free while accessing fts cache deleted doc ids
Problem:
=======
  fts_cache_append_deleted_doc_ids() holds the deleted_lock and tries to
access size of deleted_doc_ids. In the meantime, fts_cache_clear()
clears the sync_heap before clearing deleted_doc_ids. It leads to
invalid access of deleted_doc_ids.

Fix:
===
fts_cache_clear() should free the sync_heap after clearing
deleted_doc_ids.
2020-07-23 16:34:38 +05:30
Thirunarayanan Balathandayuthapani
fe39d02f51 MDEV-20638 Remove the deadcode from srv_master_thread() and srv_active_wake_master_thread_low()
- Due to commit fe95cb2e40 (MDEV-16125),
InnoDB master thread does not need to call srv_resume_thread()
and therefore there is no need to wake up the thread.
Due to the above patch, InnoDB should remove the following dead code.

srv_check_activity(): Makes the parameter as in,out and returns the
recent activity value

innobase_active_small(): Removed

srv_active_wake_master_thread(): Removed

srv_wake_master_thread(): Removed

srv_active_wake_master_thread_low(): Removed

Simplify srv_master_thread() and remove switch cases, added the assert.

Replace srv_wake_master_thread() with srv_inc_activity_count()

INNOBASE_WAKE_INTERVAL: Removed
2020-07-23 16:23:20 +05:30
Marko Mäkelä
52ccedd6dd MDEV-23268 SIGSEGV on srv_monitor_event if InnoDB is read-only
The srv_monitor_event and the srv_monitor_thread would not be
created when InnoDB is in read-only mode. Yet, some code would
unconditionally invoke os_event_set(srv_monitor_event).
2020-07-23 09:59:16 +03:00
Thirunarayanan Balathandayuthapani
3a8943ae73 MDEV-17481 mariadb service won't shutdown when it's running and the OS datetime updated backwards
__pthread_cond_timedwait() in page cleaner hangs if os time moved
backwards.Workaround could be waking up the page cleaner thread in
logs_empty_and_mark_files_at_shutdown(). But there is possibility that
server could hang when server is running. So InnoDB should wake up page
cleaner thread periodically in srv_master_do_idle_tasks().
2020-07-22 18:02:52 +05:30
Thirunarayanan Balathandayuthapani
2a3bc0b9cd MDEV-13830 Assertion failed: recv_sys->mlog_checkpoint_lsn <= recv_sys->recovered_lsn
There can be multiple MLOG_CHECKPOINT record for the same checkpoint.
During recovery, InnoDB could encounter the previous MLOG_CHECKPOINT
for the checkpoint lsn. So the assertion
mlog_checkpoint_lsn <= recovered_lsn is wrong.
2020-07-22 18:00:03 +05:30
Marko Mäkelä
ca9276e37e Merge 10.1 into 10.2 2020-07-20 14:53:24 +03:00
Marko Mäkelä
57ec42bc32 MDEV-23190 InnoDB data file extension is not crash-safe
When InnoDB is extending a data file, it is updating the FSP_SIZE
field in the first page of the data file.

In commit 8451e09073 (MDEV-11556)
we removed a work-around for this bug and made recovery stricter,
by making it track changes to FSP_SIZE via redo log records, and
extend the data files before any changes are being applied to them.

It turns out that the function fsp_fill_free_list() is not crash-safe
with respect to this when it is initializing the change buffer bitmap
page (page 1, or generally, N*innodb_page_size+1). It uses a separate
mini-transaction that is committed (and will be written to the redo
log file) before the mini-transaction that actually extended the data
file. Hence, recovery can observe a reference to a page that is
beyond the current end of the data file.

fsp_fill_free_list(): Initialize the change buffer bitmap page in
the same mini-transaction.

The rest of the changes are fixing a bug that the use of the separate
mini-transaction was attempting to work around. Namely, we must ensure
that no other thread will access the change buffer bitmap page before
our mini-transaction has been committed and all page latches have been
released.

That is, for read-ahead as well as neighbour flushing, we must avoid
accessing pages that might not yet be durably part of the tablespace.

fil_space_t::committed_size: The size of the tablespace
as persisted by mtr_commit().

fil_space_t::max_page_number_for_io(): Limit the highest page
number for I/O batches to committed_size.

MTR_MEMO_SPACE_X_LOCK: Replaces MTR_MEMO_X_LOCK for fil_space_t::latch.

mtr_x_space_lock(): Replaces mtr_x_lock() for fil_space_t::latch.

mtr_memo_slot_release_func(): When releasing MTR_MEMO_SPACE_X_LOCK,
copy space->size to space->committed_size. In this way, read-ahead
or flushing will never be invoked on pages that do not yet exist
according to FSP_SIZE.
2020-07-20 14:48:56 +03:00
Marko Mäkelä
98e2c17e9e Cleanup: Remove fil_check_adress_in_tablespace() 2020-07-20 14:48:56 +03:00
Marko Mäkelä
14543afd59 Cleanup: Remove unused AbstractCallback::m_free_limit 2020-07-20 14:48:56 +03:00
Marko Mäkelä
147d4b1ec0 MDEV-21347 innodb_log_optimize_ddl=OFF is not crash safe
In commit 0f90728bc0 (MDEV-16809)
we introduced the configuration option innodb_log_optimize_ddl
for controlling whether native index creation or table-rebuild
in InnoDB should avoid writing full redo log.

Fungo Wang reported that this option is causing occasional failures.
The reason is that pages may be written to data files in an
inconsistent state. Applying log records to such inconsistent pages
may fail.

The solution is to always invoke PageBulk::finish() before page latches
may be released, to ensure that the page contents is in a consistent
state.

Something similar was implemented in MySQL 8.0.13:
mysql/mysql-server@d1254b9473

buf_block_t::skip_flush_check: Remove. Suppressing consistency checks
is a bad idea.

PageBulk::needs_finish(): New predicate: Determine whether
PageBulk::finish() must fix up the page.

PageBulk::init(): Clear PAGE_DIRECTION to ensure that needs_finish()
will hold. We change the field from PAGE_NO_DIRECTION to 0
and back without writing redo log. This trick avoids the need
to introduce any new data member to PageBulk.

PageBulk::insert(): Replace some high-level accessors to bypass
debug assertions related to PAGE_HEAP_TOP that we will be violating
until finish() has been executed.

PageBulk::finish(): Tolerate m_rec_no==0. We must invoke this also
on an empty page, to ensure that PAGE_HEAP_TOP is initialized.

PageBulk::commit(): Always invoke finish().

PageBulk::release(), BtrBulk::pageSplit(), BtrBulk::storeExt(),
BtrBulk::finish(): Invoke PageBulk::finish().
2020-07-16 06:35:15 +03:00
Marko Mäkelä
fee11c7727 Make page validation stricter
page_simple_validate_old(), page_simple_validate_new():
Require PAGE_N_DIR_SLOTS to be at least 2.
2020-07-15 19:41:01 +03:00
Marko Mäkelä
38b4c07833 MDEV-23183 Infinite loop on page_validate() on corrupted page
MDEV-22721 (commit eba2d10ac5)
inadvertently introduced an infinite loop.

page_validate(): Remove the infinite loop.
2020-07-15 19:41:01 +03:00
Marko Mäkelä
07e89bf7d1 MDEV-23163 Merge new release of InnoDB 5.7.31 to 10.2
The only InnoDB change between MySQL 5.7.30 and MySQL 5.7.31
that is applicable to MariaDB Server was applied in
commit 8d061996e6 (MDEV-23161).
2020-07-14 15:13:40 +03:00
Marko Mäkelä
646a6005e7 Merge 10.1 into 10.2 2020-07-14 15:10:59 +03:00
Marko Mäkelä
142f85142a Update the InnoDB version number to 5.6.49
There were no InnoDB changes between MySQL 5.6.48 and MySQL 5.6.49.
2020-07-14 13:25:18 +03:00
Marko Mäkelä
8d061996e6 MDEV-23161 avg_count_reset may wrongly be NULL in I_S.INNODB_METRICS
This issue was originally reported by Fungo Wang, along with a fix, as
MySQL Bug #98990.

His suggested fix was applied as part of
mysql/mysql-server@a003fc373d
and released in MySQL 5.7.31.

i_s_metrics_fill(): Add the missing call to Field::set_notnull(),
and simplify some code.
2020-07-14 13:21:01 +03:00