Commit graph

335 commits

Author SHA1 Message Date
Marko Mäkelä
f9f2f37495 MDEV-24188: Merge 10.2 into 10.3 2020-11-13 20:41:48 +02:00
Marko Mäkelä
bb328a2a27 MDEV-24188 Hang in buf_page_create() after reusing a previously freed page
The fix of MDEV-23456 (commit b1009ae5c1)
introduced a livelock between page flushing and a thread that is
executing buf_page_create().

buf_page_create(): If the current mini-transaction is holding
an exclusive latch on the page, do not attempt to acquire another
one, and do not care about any I/O fix.

mtr_t::have_x_latch(): Replaces mtr_t::get_fix_count().

dyn_buf_t::for_each_block(const Functor&) const: A new variant.

rw_lock_own(): Add a const qualifier.

Reviewed by: Thirunarayanan Balathandayuthapani
2020-11-13 20:16:39 +02:00
Marko Mäkelä
150f447af1 Merge 10.2 into 10.3 2020-11-12 10:37:21 +02:00
Marko Mäkelä
bd528b0c93 MDEV-24182 ibuf_merge_or_delete_for_page() contains dead code
The function ibuf_merge_or_delete_for_page() was always being
invoked with update_ibuf_bitmap=true ever since
commit cd623508df
fixed up something after MDEV-9566.

Furthermore, the parameter page_size is never being passed as a
null pointer, and therefore it should better be a reference to
a constant object.
2020-11-11 15:48:43 +02:00
Marko Mäkelä
a8de8f261d Merge 10.2 into 10.3 2020-10-28 10:01:50 +02:00
Thirunarayanan Balathandayuthapani
bc540b8706 MDEV-23693 Failing assertion: my_atomic_load32_explicit(&lock->lock_word, MY_MEMORY_ORDER_RELAXED) == X_LOCK_DECR
InnoDB frees the block lock during buffer pool shrinking when other
thread is yet to release the block lock.  While shrinking the
buffer pool, InnoDB allows the page to be freed unless it is buffer
fixed. In some cases, InnoDB releases the latch after unfixing the
block.

Fix:
====
- InnoDB should unfix the block after releases the latch.

- Add more assertion to check buffer fix while accessing the page.

- Introduced block_hint structure to store buf_block_t pointer
and allow accessing the buf_block_t pointer only by passing a
functor. It returns original buf_block_t* pointer if it is valid
or nullptr if the pointer become stale.

- Replace buf_block_is_uncompressed() with
buf_pool_t::is_block_pointer()

This change is motivated by a change in mysql-5.7.32:
mysql/mysql-server@46e60de444
Bug #31036301 ASSERTION FAILURE: SYNC0RW.IC:429:LOCK->LOCK_WORD
2020-10-27 18:30:00 +05:30
Marko Mäkelä
7e07e38cf6 Merge 10.2 into 10.3 2020-09-09 13:06:46 +03:00
Thirunarayanan Balathandayuthapani
b1009ae5c1 MDEV-23456 fil_space_crypt_t::write_page0() is accessing an uninitialized page
buf_page_create() is invoked when page is initialized. So that
previous contents of the page ignored. In few cases, it calls
buf_page_get_gen() is called to fetch the page from buffer pool.
It should take x-latch on the page. If other thread uses the block
or block io state is different from BUF_IO_NONE then release the
mutex and check the state and buffer fix count again. For compressed
page, use the existing free block from LRU list to create new page.
Retry to fetch the compressed page if it is in flush list

fseg_create(), fseg_create_general(): Introduce block as a parameter
where segment header is placed. It is used to avoid repetitive
x-latch on the same page

Change the assert to check whether the page has SX latch and
X latch in all callee function of buf_page_create()

mtr_t::get_fix_count(): Get the buffer fix count of the given
block added by the mtr

FindBlock is added to find the buffer fix count of the given
block acquired by the mini-transaction
2020-09-09 11:58:15 +05:30
Marko Mäkelä
de0e7cd72a Merge 10.2 into 10.3 2020-08-20 09:12:16 +03:00
Thirunarayanan Balathandayuthapani
a79c257894 MDEV-23452 Assertion `buf_page_get_io_fix(bpage) == BUF_IO_NONE' failed
in buf_page_set_sticky

- Adding os_thread_yield() in buf_page_create() to avoid the continuous
buffer pool mutex acquistions.
2020-08-20 11:38:10 +05:30
Thirunarayanan Balathandayuthapani
e9d6f1c7ac MDEV-23452 Assertion `buf_page_get_io_fix(bpage) == BUF_IO_NONE' failed
in buf_page_set_sticky

commit a1f899a8ab (MDEV-23233) added the
code to make page sticky. So that InnoDB can't allow the page to
be grabbed by other thread while doing lazy drop of ahi.

But the block could be in flush list and it could have io_fix value
as BUF_IO_WRITE. It could lead to the failure in buf_page_set_sticky().

buf_page_create(): If btr_search_drop_page_hash_index() must be invoked,
take x-latch on the block. If the block io_fix value is other than
BUF_IO_NONE, release the buffer pool mutex and page hash lock and
wait for I/O to complete.
2020-08-20 11:34:53 +05:30
Marko Mäkelä
66ec3a770f Merge 10.2 into 10.3 2020-07-31 13:51:28 +03:00
Thirunarayanan Balathandayuthapani
a1f899a8ab MDEV-23233: Race condition for btr_search_drop_page_hash_index() in buf_page_create()
commit ad6171b91c (MDEV-22456)
introduced code to buf_page_create() that would lazily drop
adaptive hash index entries for an index that has been
evicted from the data dictionary cache.

Unfortunately, that call was missing adequate protection.
While the btr_search_drop_page_hash_index(block) was executing,
the block could be reused for something else.

buf_page_create(): If btr_search_drop_page_hash_index() must be
invoked, pin the block before releasing the buf_pool->page_hash lock,
so that the block cannot be grabbed by other threads.
2020-07-27 22:17:51 +05:30
Marko Mäkelä
73aa31fbfd Merge 10.2 into 10.3 2020-07-16 06:55:23 +03:00
Marko Mäkelä
147d4b1ec0 MDEV-21347 innodb_log_optimize_ddl=OFF is not crash safe
In commit 0f90728bc0 (MDEV-16809)
we introduced the configuration option innodb_log_optimize_ddl
for controlling whether native index creation or table-rebuild
in InnoDB should avoid writing full redo log.

Fungo Wang reported that this option is causing occasional failures.
The reason is that pages may be written to data files in an
inconsistent state. Applying log records to such inconsistent pages
may fail.

The solution is to always invoke PageBulk::finish() before page latches
may be released, to ensure that the page contents is in a consistent
state.

Something similar was implemented in MySQL 8.0.13:
mysql/mysql-server@d1254b9473

buf_block_t::skip_flush_check: Remove. Suppressing consistency checks
is a bad idea.

PageBulk::needs_finish(): New predicate: Determine whether
PageBulk::finish() must fix up the page.

PageBulk::init(): Clear PAGE_DIRECTION to ensure that needs_finish()
will hold. We change the field from PAGE_NO_DIRECTION to 0
and back without writing redo log. This trick avoids the need
to introduce any new data member to PageBulk.

PageBulk::insert(): Replace some high-level accessors to bypass
debug assertions related to PAGE_HEAP_TOP that we will be violating
until finish() has been executed.

PageBulk::finish(): Tolerate m_rec_no==0. We must invoke this also
on an empty page, to ensure that PAGE_HEAP_TOP is initialized.

PageBulk::commit(): Always invoke finish().

PageBulk::release(), BtrBulk::pageSplit(), BtrBulk::storeExt(),
BtrBulk::finish(): Invoke PageBulk::finish().
2020-07-16 06:35:15 +03:00
Marko Mäkelä
f3f23b5c4b One more ASAN/MSAN cleanup
commit 484931325e included a
workaround for a 10.5 merge issue that should now be properly
addressed in commit ab4069909d.

buf_chunk_init(): Remove an unnecessary MEM_MAKE_ADDRESSABLE().
We might invoke MEM_UNDEFINED() here, but actually the allocated
memory ought to be guaranteed to be zero-initialized.
2020-07-05 16:31:34 +03:00
Marko Mäkelä
453dc4b300 Fixup the parent commit for MSAN and Valgrind
commit 484931325e was a necessary
fix for the buffer pool resizing tests in 10.5 in
AddressSanitizer. However, that change would break the tests
innodb.innodb_buffer_pool_resize and
innodb.innodb_buffer_pool_resize_with_chunks
when run in MemorySanitizer, or presumably in Valgrind as well.
(Those tests run "forever" in Valgrind.)

buf_pool_resize(): Cancel the effect of MEM_NOACCESS() in Valgrind
and ASAN. In MSAN, MEM_NOACCESS() is a no-op, and hence we must do
nothing special here.

MEM_MAKE_ADDRESSABLE() would declare the memory contents undefined.
In this particular case, we must actually declare the contents
defined for Valgrind.
2020-07-04 22:02:30 +03:00
Monty
484931325e Fix for MSAN from Marko related to buf_pool_resize 2020-07-03 18:37:33 +03:00
Marko Mäkelä
1df1a63924 Merge 10.2 into 10.3 2020-07-02 06:17:51 +03:00
Marko Mäkelä
c36834c832 MDEV-20377: Make WITH_MSAN more usable
MemorySanitizer (clang -fsanitize=memory) requires that all code
be compiled with instrumentation enabled. The only exception is the
C runtime library. Failure to use instrumented libraries will cause
bogus messages about memory being uninitialized.

In WITH_MSAN builds, we must avoid calling getservbyname(),
because even though it is a standard library function, it is
not instrumented, not even in clang 10.

Note: Before MariaDB Server 10.5, ./mtr will typically fail
due to the old PCRE library, which was updated in MDEV-14024.

The following cmake options were tested on 10.5
in commit 94d0bb4dbe:

cmake \
-DCMAKE_C_FLAGS='-march=native -O2' \
-DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \
-DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \
-DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \
-DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \
-DWITH_SAFEMALLOC=OFF \
-DWITH_{ZLIB,SSL,PCRE}=bundled \
-DHAVE_LIBAIO_H=0 \
-DWITH_MSAN=ON

MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED()
and __msan_unpoison().

MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for
VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow().

InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros.

ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in
functions instead of inline assembler when building WITH_MSAN.
This will require at least -msse4.2 when building for IA-32 or AMD64.
The inline assembler would not be instrumented, and would thus cause
bogus failures.
2020-07-01 17:23:00 +03:00
Marko Mäkelä
680463a8d9 Merge 10.2 into 10.3 2020-06-05 16:51:26 +03:00
Marko Mäkelä
138c11cce5 MDEV-22790 Race between btr_page_mtr_lock() dropping AHI on the same block
This race condition was introduced by
commit ad6171b91c (MDEV-22456).

In the observed case, two threads were executing
btr_search_drop_page_hash_index() on the same block,
to free a stale entry that was attached to a dropped index.
Both threads were holding an S latch on the block.

We must prevent the double-free of block->index by holding
block->lock in exclusive mode.

btr_search_guess_on_hash(): Do not invoke
btr_search_drop_page_hash_index(block) to get rid of
stale entries, because we are not necessarily holding
an exclusive block->lock here.

buf_defer_drop_ahi(): New function, to safely drop stale
entries in buf_page_mtr_lock(). We will skip the call to
btr_search_drop_page_hash_index(block) when only requesting
bufferfixing (no page latch), because in that case, we should
not be accessing the adaptive hash index, and we might get
a deadlock if we acquired the page latch.
2020-06-05 15:22:46 +03:00
Marko Mäkelä
eba2d10ac5 MDEV-22721 Remove bloat caused by InnoDB logger class
Introduce a new ATTRIBUTE_NOINLINE to
ib::logger member functions, and add UNIV_UNLIKELY hints to callers.

Also, remove some crash reporting output. If needed, the
information will be available using debugging tools.

Furthermore, remove some fts_enable_diag_print output that included
indexed words in raw form. The code seemed to assume that words are
NUL-terminated byte strings. It is not clear whether a NUL terminator
is always guaranteed to be present. Also, UCS2 or UTF-16 strings would
typically contain many NUL bytes.
2020-06-04 10:24:10 +03:00
Thirunarayanan Balathandayuthapani
ad2bf1129c MDEV-22646 Assertion `table2->cached' failed in dict_table_t::add_to_cache
Problem:
========
  During buffer pool resizing, InnoDB recreates the dictionary hash
tables. Dictionary hash table reuses the heap of AHI hash tables.
It leads to memory corruption.

Fix:
====
- While disabling AHI, free the heap and AHI hash tables. Recreate the
AHI hash tables and assign new heap when AHI is enabled.

- btr_blob_free() access invalid page if page was reallocated during
buffer poolresizing. So btr_blob_free() should get the page from
buf_pool instead of using existing block.

- btr_search_enabled and block->index should be checked after
acquiring the btr_search_sys latch

- Moved the buffer_pool_scan debug sync to earlier before accessing the
btr_search_sys latches to avoid the hang of truncate_purge_debug
test case

- srv_printf_innodb_monitor() should acquire btr_search_sys latches
before AHI hash tables.
2020-06-03 16:02:02 +05:30
Marko Mäkelä
8300f639a1 Merge 10.2 into 10.3 2020-06-02 10:25:11 +03:00
Thirunarayanan Balathandayuthapani
02f68552a4 MDEV-22650 Dirty compressed page checksum validation fails
Problem:
=======
  While evicting the uncompressed page from buffer pool, InnoDB writes
the checksum for the compressed page in buf_LRU_free_page().
So while flushing the compressed page, checksum validation fails
when innodb_checksum_algorithm variable changed to strict_none.

Solution:
========
- Calculate the checksum only during flushing of page. Removed the
checksum write in buf_LRU_free_page().
2020-06-01 14:34:16 +05:30
Marko Mäkelä
3d0bb2b7f1 Merge 10.2 into 10.3 2020-05-15 19:11:57 +03:00
Marko Mäkelä
6a6bcc53b8 Merge 10.2 into 10.3 2020-05-15 17:55:01 +03:00
Marko Mäkelä
ad6171b91c MDEV-22456 Dropping the adaptive hash index may cause DDL to lock up InnoDB
If the InnoDB buffer pool contains many pages for a table or index
that is being dropped or rebuilt, and if many of such pages are
pointed to by the adaptive hash index, dropping the adaptive hash index
may consume a lot of time.

The time-consuming operation of dropping the adaptive hash index entries
is being executed while the InnoDB data dictionary cache dict_sys is
exclusively locked.

It is not actually necessary to drop all adaptive hash index entries
at the time a table or index is being dropped or rebuilt. We can let
the LRU replacement policy of the buffer pool take care of this gradually.
For this to work, we must detach the dict_table_t and dict_index_t
objects from the main dict_sys cache, and once the last
adaptive hash index entry for the detached table is removed
(when the garbage page is evicted from the buffer pool) we can free
the dict_table_t and dict_index_t object.

Related to this, in MDEV-16283, we made ALTER TABLE...DISCARD TABLESPACE
skip both the buffer pool eviction and the drop of the adaptive hash index.
We shifted the burden to ALTER TABLE...IMPORT TABLESPACE or DROP TABLE.
We can remove the eviction from DROP TABLE. We must retain the eviction
in the ALTER TABLE...IMPORT TABLESPACE code path, so that in case the
discarded table is being re-imported with the same tablespace identifier,
the fresh data from the imported tablespace will replace any stale pages
in the buffer pool.

rpl.rpl_failed_drop_tbl_binlog: Remove the test. DROP TABLE can
no longer be interrupted inside InnoDB.

fseg_free_page(), fseg_free_step(), fseg_free_step_not_header(),
fseg_free_page_low(), fseg_free_extent(): Remove the parameter
that specifies whether the adaptive hash index should be dropped.

btr_search_lazy_free(): Lazily free an index when the last
reference to it is dropped from the adaptive hash index.

buf_pool_clear_hash_index(): Declare static, and move to the
same compilation unit with the bulk of the adaptive hash index
code.

dict_index_t::clone(), dict_index_t::clone_if_needed():
Clone an index that is being rebuilt while adaptive hash index
entries exist. The original index will be inserted into
dict_table_t::freed_indexes and dict_index_t::set_freed()
will be called.

dict_index_t::set_freed(), dict_index_t::freed(): Note that
or check whether the index has been freed. We will use the
impossible page number 1 to denote this condition.

dict_index_t::n_ahi_pages(): Replaces btr_search_info_get_ref_count().

dict_index_t::detach_columns(): Move the assignment n_fields=0
to ha_innobase_inplace_ctx::clear_added_indexes().
We must have access to the columns when freeing the
adaptive hash index. Note: dict_table_t::v_cols[] will remain
valid. If virtual columns are dropped or added, the table
definition will be reloaded in ha_innobase::commit_inplace_alter_table().

buf_page_mtr_lock(): Drop a stale adaptive hash index if needed.

We will also reduce the number of btr_get_search_latch() calls
and enclose some more code inside #ifdef BTR_CUR_HASH_ADAPT
in order to benefit cmake -DWITH_INNODB_AHI=OFF.
2020-05-15 17:23:08 +03:00
Marko Mäkelä
1b16572074 Merge 10.1 into 10.2 2020-05-14 17:48:06 +03:00
Marko Mäkelä
ee5152fc4b MDEV-22070 MSAN use-of-uninitialized-value in encryption.innodb-redo-badkey
On a checksum failure of a ROW_FORMAT=COMPRESSED page,
buf_LRU_free_one_page() would invoke buf_LRU_block_remove_hashed()
which will read the uncompressed page frame, although it would not
be initialized. With bad enough luck, fil_page_get_type(page)
could return an unrecognized value and cause the server to abort.

buf_page_io_complete(): On the corruption of a ROW_FORMAT=COMPRESSED
page, zerofill the uncompressed page frame.
2020-05-14 17:41:37 +03:00
Marko Mäkelä
15fa70b840 Merge 10.2 into 10.3 2020-05-13 11:45:05 +03:00
Marko Mäkelä
b9f177f66a MDEV-11254 cleanup: Remove buf_page_t::write_size
commit 6495806e59 removed all reads
of buf_page_t::write_size. Let us remove the field altogether.
2020-05-05 08:54:33 +03:00
Eugene Kosov
7f9dc0d84a split log_t::buf into two buffers
Maybe this patch will help catch problems like buffer overflow.

log_t::first_in_use: removed

log_t::buf: this is where mtr_t are supposed to append data
log_t::flush_buf: this is from server writes to a file

Those two buffers are std::swap()ped when some thread is gonna write
to a file
2020-04-30 11:56:16 +03:00
Marko Mäkelä
1a9b6c4c7f Merge 10.2 into 10.3 2020-03-30 11:12:56 +03:00
Thirunarayanan Balathandayuthapani
6697135c6d MDEV-21572 buf_page_get_gen() should apply buffered page initialized
redo log during recovery

- InnoDB unnecessarily reads the page even though it has fully initialized
buffered redo log records. Allow the page initialization redo log to
apply for the page in buf_page_get_gen() during recovery.
- Renamed buf_page_get_gen() to buf_page_get_low()
- Newly added buf_page_get_gen() will check for buffered redo log for
the particular page id during recovery
- Added new function buf_page_mtr_lock() which basically latches the page
for the given latch type.
- recv_recovery_create_page() is inline function which creates a page
if it has page initialization redo log records.
2020-03-23 16:41:48 +05:30
Eugene Kosov
5e9e0b8e3b MDEV-21993 asan failure in encryption.innochecksum
simplify fix

field_ref_zero: make bigger

buf_is_zeroes(): remove a loop and check in one go
2020-03-21 17:57:04 +03:00
Eugene Kosov
23993c0036 MDEV-21993 asan failure in encryption.innochecksum
buf_is_zeroes(): stop assuming that argument buffer size
is always a multiply of 4096. And thus stop reading past
that buffer.
2020-03-21 17:08:52 +03:00
Eugene Kosov
54b2da9535 correct comment in buf_page_is_corrupted() 2020-03-20 21:35:42 +03:00
Eugene Kosov
884d22f288 remove fishy reinterpret_cast from buf_page_is_zeroes()
In my micro-benchmarks memcmp(4196) 3 times faster than old
implementation. Also, it's generally better to use as less
reinterpret_casts<> as possible.

buf_is_zeroes(): renamed from buf_page_is_zeroes() and
argument changed to span<> for convenience.

st_::span<T>::const_iterator: fixed

page_zip-verify_checksum(): make argument byte* instead of void*
2020-03-20 21:35:42 +03:00
Marko Mäkelä
44298e4dea Merge 10.2 into 10.3
Also, clean up the test innodb_gis.geometry a little further.
2020-03-20 18:12:17 +02:00
Marko Mäkelä
9fd692aeca MDEV-13626: Clean up the buffer pool resizing tests from MySQL 5.7
buf_pool_resize(): Simplify the fault injection
for innodb.buf_pool_resize_oom.

innodb.buf_pool_resize_oom: Use a small buffer pool.

innodb.innodb_buffer_pool_load_now: Make use of the sequence engine,
to avoid creating explicit InnoDB record locks. Clean up the
accesses to information_schema.innodb_buffer_page_lru.
2020-03-19 13:01:20 +02:00
Marko Mäkelä
5fe87ac413 Merge 10.2 into 10.3 2020-03-13 12:31:55 +02:00
Eugene Kosov
5257bcfc7a InnoDB: improve error message for checksum mismatch 2020-03-12 14:47:45 +03:00
Oleksandr Byelkin
58b70dc136 Merge branch '10.2' into 10.3 2020-02-10 20:34:16 +01:00
Eugene Kosov
bd36a4ca12 introduce HASH_REPLACE() for hash_table_t
HASH_REPLACE(): allows to not travel through linked list twice
when HASH_INSERT() happens right after HASH_DELETE()
2020-01-31 22:14:18 +08:00
Marko Mäkelä
5ab70e7f68 Merge 10.2 into 10.3 2019-12-27 15:14:48 +02:00
Marko Mäkelä
a59a015a75 Plug memory leaks from numa_get_mems_allowed() 2019-12-23 07:47:16 +02:00
Marko Mäkelä
73985d8301 Merge 10.1 into 10.2 2019-12-23 07:14:51 +02:00
Eugene Kosov
a5a433e256 fix windows compilation 2019-12-18 23:35:33 +08:00