275 commits

Author SHA1 Message Date
Marko Mäkelä
44d70c01f0 Merge 10.3 into 10.4 2021-03-19 11:42:44 +02:00
Marko Mäkelä
19052b6deb Merge 10.2 into 10.3 2021-03-18 12:34:48 +02:00
Marko Mäkelä
1af8558193 MDEV-25101 Assertion !strcmp(field->name, "table_name") failed
btr_node_ptr_max_size(): Let us remove the debug assertion that was
added in MDEV-14637. The assertion assumed that no additional
indexes exist in mysql.innodb_index_stats or mysql.innodb_table_stats.
The code path is working around an incorrect definition of a table,
interpreting VARCHAR(64) as the more correct VARCHAR(199).

No test case will be added, because MDEV-24579 proves that executing
DDL on the statistics tables involves a race condition. The test
case included the following:

	ALTER TABLE mysql.innodb_index_stats ADD KEY (stat_name);
	CREATE TABLE t (a INT) ENGINE=InnoDB STATS_PERSISTENT=1;
2021-03-10 11:08:51 +02:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
sjaakola
beaea31ab1 MDEV-23851 BF-BF Conflict issue because of UK GAP locks
Some DML operations on tables having unique secondary keys cause scanning
in the secondary index, for instance to find potential unique key violations
in the secondary index. This scanning may involve GAP locking in the index.
As this locking also happens when applying replication events in high priority
applier threads, there is a probability of lock conflicts between two wsrep
high priority threads.

This PR avoids lock conflicts of high priority wsrep threads, which do
secondary index scanning e.g. for duplicate key detection.

The actual fix is the patch in sql_class.cc:thd_need_ordering_with(), where
we allow relaxed GAP locking protocol between wsrep high priority threads.
wsrep high priority threads (replication appliers, replayers and TOI processors)
are ordered by the replication provider, and they will not need serializability
support gained by secondary index GAP locks.

PR contains also a mtr test, which exercises a scenario where two replication
applier threads have a false positive conflict in GAP of unique secondary index.
The conflicting local committing transaction has to replay, and the test verifies
also that the replaying phase will not conflict with the latter replication applier.
Commit also contains new test scenario for galera.galera_UK_conflict.test,
where replayer starts applying after a slave applier thread, with later seqno,
has advanced to commit phase. The applier and replayer have false positive GAP
lock conflict on secondary unique index, and replayer should ignore this.
This test scenario caused crash with earlier version in this PR, and to fix this,
the secondary index uniqueness checking has been relaxed even further.

Now innodb trx_t structure has new member: bool wsrep_UK_scan, which is set to
true, when high priority thread is performing unique secondary index scanning.
The member trx_t::wsrep_UK_scan is defined inside WITH_WSREP directive, to make
it possible to prepare a MariaDB build where this additional trx_t member is
not present and is not used in the code base. trx->wsrep_UK_scan is set to true
only for the duration of the lock_rec_lock() function call. trx->wsrep_UK_scan
is used only in the lock_rec_has_to_wait() function, to relax the need to wait if
wsrep_UK_scan is set and the conflicting transaction is also high priority.
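
The relaxed wait rule can be pictured with a minimal sketch. Only the
wsrep_UK_scan flag is taken from this commit message; the Trx type, the
high_priority member and must_wait_for_gap_lock() are hypothetical
simplifications, not the actual InnoDB definitions.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for trx_t. Only wsrep_UK_scan is named
// in the commit message; high_priority is assumed for illustration.
struct Trx {
  bool high_priority;  // assumed: true for wsrep appliers, replayers, TOI
  bool wsrep_UK_scan;  // set while scanning a unique secondary index
};

// Sketch of the relaxed rule. 'conflict' is the outcome of the ordinary
// lock compatibility check, which is not modelled here.
inline bool must_wait_for_gap_lock(const Trx &requester, const Trx &holder,
                                   bool conflict) {
  if (!conflict) return false;
  // A high priority requester doing a unique key scan does not wait for
  // another high priority transaction: the provider already orders them.
  if (requester.high_priority && requester.wsrep_UK_scan &&
      holder.high_priority)
    return false;
  return true;
}
```

In this sketch a conflict with a local (non high priority) transaction
still results in a wait.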

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-01-18 08:09:06 +02:00
Marko Mäkelä
7b2bb67113 Merge 10.3 into 10.4 2020-10-29 13:38:38 +02:00
Marko Mäkelä
2b6f804490 Merge 10.2 into 10.3 2020-10-28 10:44:40 +02:00
Marko Mäkelä
a8de8f261d Merge 10.2 into 10.3 2020-10-28 10:01:50 +02:00
Eugene Kosov
afc9d00c66 MDEV-23991 dict_table_stats_lock() has unnecessarily long scope
This patch removes dict_index_t::stats_latch. Table/index statistics are now
protected by dict_sys->mutex. That way statistics computation can
happen in parallel in several threads and dict_sys->mutex will be locked
only for a short period of time.

This patch is a joint work with Marko Mäkelä

dict_index_t::lock: make mutable which allows to pass const pointer
when only lock is touched in an object

btr_height_get()
btr_get_size(): make index argument const for better type safety

btr_estimate_number_of_different_key_vals(): now returns computed values
instead of setting fields in dict_index_t directly

remove everything related to dict_index_t::stats_latch

dict_stats_index_set_n_diff(): now returns computed values instead
of setting fields in dict_index_t directly

dict_stats_analyze_index():  now returns computed values instead
of setting fields in dict_index_t directly

Reviewed by: Marko Mäkelä
2020-10-27 19:09:20 +03:00
Thirunarayanan Balathandayuthapani
bc540b8706 MDEV-23693 Failing assertion: my_atomic_load32_explicit(&lock->lock_word, MY_MEMORY_ORDER_RELAXED) == X_LOCK_DECR
InnoDB frees the block lock during buffer pool shrinking while another
thread has yet to release the block lock.  While shrinking the
buffer pool, InnoDB allows the page to be freed unless it is buffer
fixed. In some cases, InnoDB releases the latch after unfixing the
block.

Fix:
====
- InnoDB should unfix the block after releasing the latch.

- Add more assertions to check the buffer fix while accessing the page.

- Introduce a block_hint structure to store a buf_block_t pointer
and allow accessing the buf_block_t pointer only by passing a
functor. It returns the original buf_block_t* pointer if it is valid,
or nullptr if the pointer has become stale.

- Replace buf_block_is_uncompressed() with
buf_pool_t::is_block_pointer()
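
The functor-based access of block_hint could look roughly like the
following sketch. Block, Pool, BlockHint and run_with_hint() are
hypothetical names chosen for illustration; the latching and buffer-fixing
that real code would need around the validity check are omitted.

```cpp
#include <cassert>
#include <unordered_set>

struct Block { int page_no; };  // hypothetical stand-in for buf_block_t

// Hypothetical stand-in for the buffer pool's pointer validity check.
struct Pool {
  std::unordered_set<const Block*> live;
  bool is_block_pointer(const Block *b) const { return live.count(b) != 0; }
};

// Stores a possibly stale pointer and exposes it only through a functor.
class BlockHint {
 public:
  void store(Block *b) { block_ = b; }
  void clear() { block_ = nullptr; }

  // Calls f(block) if the stored pointer is still valid, f(nullptr) otherwise.
  template <typename F>
  auto run_with_hint(const Pool &pool, F &&f) {
    Block *b = (block_ && pool.is_block_pointer(block_)) ? block_ : nullptr;
    if (!b) clear();  // forget a stale hint
    return f(b);
  }

 private:
  Block *block_ = nullptr;
};
```

Because the raw pointer is private, a caller cannot dereference it
without going through the validity check first.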

This change is motivated by a change in mysql-5.7.32:
mysql/mysql-server@46e60de444
Bug #31036301 ASSERTION FAILURE: SYNC0RW.IC:429:LOCK->LOCK_WORD
2020-10-27 18:30:00 +05:30
Marko Mäkelä
46957a6a77 Merge 10.3 into 10.4 2020-10-22 13:27:18 +03:00
Thirunarayanan Balathandayuthapani
7b7ea33124 MDEV-23072 Diskspace not reused for Blob in data file
- This issue is caused by commit a4948dafcd.
Purge doesn't free the externally stored page associated with the
last record of the root page. In that case, the purge thread empties
the root page, which leaves orphaned BLOB pages in the tablespace.
The purge thread should free the BLOB even for the last record of the
root page.

Reviewed-by: Marko Mäkelä
2020-10-20 12:34:06 +05:30
Marko Mäkelä
2fa9f8c53a Merge 10.3 into 10.4 2020-08-20 11:01:47 +03:00
Marko Mäkelä
de0e7cd72a Merge 10.2 into 10.3 2020-08-20 09:12:16 +03:00
Thirunarayanan Balathandayuthapani
362b18c536 MDEV-23380 InnoDB reads a page from disk despite parsing MLOG_INIT_FILE_PAGE2 record
This problem is caused by 6697135c6d
(MDEV-21572). During recovery, InnoDB prefetches the siblings of
a change buffer index leaf page. This is an asynchronous page read,
and the recovery scenario wasn't handled in buf_read_page_background().
As a result, the server refuses to start up.

Solution:
=========
  InnoDB shouldn't allow the change buffer index page siblings
to be prefetched.
2020-08-18 14:59:16 +05:30
Marko Mäkelä
2f7b37b021 Merge 10.3 into 10.4, except MDEV-22543
Also, fix GCC -Og -Wmaybe-uninitialized in run_backup_stage()
2020-08-13 18:48:41 +03:00
Marko Mäkelä
4bd56a697f Merge 10.2 into 10.3 2020-08-13 18:18:25 +03:00
Marko Mäkelä
182e2d4a6c Merge 10.1 into 10.2 2020-08-13 07:38:35 +03:00
Marko Mäkelä
efd8af535a MDEV-19526 heap number overflow on innodb_page_size=64k
InnoDB only reserves 13 bits for the heap number in the record header,
limiting the heap number to be at most 8191. But, when using
innodb_page_size=64k and secondary index records of 7 bytes each,
it is possible to exceed the maximum heap number.

btr_cur_optimistic_insert(): Let the operation fail if the
maximum number of records would be exceeded.

page_mem_alloc_heap(): Move to the same compilation unit with the
only caller, and let the operation fail if the maximum heap number
has been allocated already.
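
The arithmetic behind the overflow can be checked with a small sketch.
It assumes that each record occupies 7 bytes in total and ignores the page
header, trailer and directory; max_records() and can_alloc_heap_no() are
hypothetical helpers, not InnoDB functions.

```cpp
#include <cassert>
#include <cstdint>

// 13 bits are reserved for the heap number (from the commit message).
constexpr std::uint32_t HEAP_NO_BITS = 13;
constexpr std::uint32_t MAX_HEAP_NO = (1U << HEAP_NO_BITS) - 1;

// Rough upper bound on the number of records that fit in a page, ignoring
// page header, trailer and directory (an assumption made for simplicity).
constexpr std::uint32_t max_records(std::uint32_t page_size,
                                    std::uint32_t rec_size) {
  return page_size / rec_size;
}

// Sketch of the new failure condition: refuse to go past the maximum.
constexpr bool can_alloc_heap_no(std::uint32_t next_heap_no) {
  return next_heap_no <= MAX_HEAP_NO;
}

static_assert(MAX_HEAP_NO == 8191, "13 bits give a maximum of 8191");
static_assert(max_records(65536, 7) == 9362, "64k page, 7-byte records");
static_assert(max_records(65536, 7) > MAX_HEAP_NO, "overflow is possible");
static_assert(max_records(16384, 7) < MAX_HEAP_NO, "16k page stays in range");
```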
2020-08-12 18:21:53 +03:00
Marko Mäkelä
9216114ce7 Merge 10.3 into 10.4 2020-07-31 18:09:08 +03:00
Thirunarayanan Balathandayuthapani
5ec40fbb27 MDEV-14711 Fix-up 2020-07-31 16:45:35 +05:30
Marko Mäkelä
66ec3a770f Merge 10.2 into 10.3 2020-07-31 13:51:28 +03:00
Thirunarayanan Balathandayuthapani
5f1ec5cbb7 MDEV-14711 Assertion `mode == 16 || mode == 12 || !fix_block->page.file_page_was_freed' failed in buf_page_get_gen (rollback requesting a freed undo page)
Problem:
=======
In btr_cur_optimistic_latch_leaves(), the left block is requested with BUF_GET
after the current block has been released. But there is no guarantee that the
left block is still available.

Fix:
====

(1) In btr_cur_optimistic_latch_leaves(), replace the BUF_GET with
BUF_GET_POSSIBLY_FREED for fetching left block.
(2) Once InnoDB acquires the left block, it should check that FIL_PAGE_NEXT
matches the page number of the current block. If it does not, release
cursor->left_block and return false.
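
The sibling check in step (2) amounts to something like the sketch below.
The Page type and left_sibling_is_valid() are hypothetical; FIL_PAGE_NEXT
is modelled as a plain member, and a null pointer stands for a fetch that
found the page already freed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified page descriptor.
struct Page {
  std::uint32_t page_no;
  std::uint32_t next;  // models FIL_PAGE_NEXT: page number of right sibling
};

// Returns true if 'left' may be used as the left sibling of 'current'.
inline bool left_sibling_is_valid(const Page *left, const Page &current) {
  return left != nullptr && left->next == current.page_no;
}
```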
2020-07-24 20:32:27 +05:30
Marko Mäkelä
de20872331 MDEV-22988 Corrupted table after DROP INDEX
This form of corruption was only reproduced on MariaDB 10.5.4
after the MDEV-22867 fix was applied in
commit 431200090e.

While we do not know how to reproduce this corruption in
MariaDB 10.4, we are applying the code fix without a test case.

btr_cur_pessimistic_update(): Invoke btr_set_instant() if needed.
2020-07-13 16:44:46 +03:00
Monty
5211af1c16 Merge remote-tracking branch 'origin/10.3' into 10.4 2020-07-03 00:35:28 +03:00
Marko Mäkelä
b6ec1e8bbf MDEV-20377 post-fix: Introduce MEM_MAKE_ADDRESSABLE
In AddressSanitizer, we only want memory poisoning to happen
in connection with custom memory allocation or freeing.

The primary use of MEM_UNDEFINED is for declaring memory uninitialized
in Valgrind or MemorySanitizer. We do not want MEM_UNDEFINED to
have the unwanted side effect that AddressSanitizer would no longer
be able to complain about accessing unallocated memory.

MEM_UNDEFINED(): Define as no-op for AddressSanitizer.

MEM_MAKE_ADDRESSABLE(): Define as MEM_UNDEFINED() or
ASAN_UNPOISON_MEMORY_REGION().

MEM_CHECK_ADDRESSABLE(): Wrap also __asan_region_is_poisoned().
2020-07-02 17:59:28 +03:00
Monty
65f831d17c Fixed bugs found by valgrind
- Some of the bug fixes are backports from 10.5!
- The fix in innobase/fil/fil0fil.cc is just a backport to get fewer
  error messages in mysqld.1.err when running with valgrind.
- Renamed HAVE_valgrind_or_MSAN to HAVE_valgrind
2020-07-02 17:57:34 +03:00
Marko Mäkelä
f347b3e0e6 Merge 10.3 into 10.4 2020-07-02 07:39:33 +03:00
Marko Mäkelä
1df1a63924 Merge 10.2 into 10.3 2020-07-02 06:17:51 +03:00
Marko Mäkelä
c36834c832 MDEV-20377: Make WITH_MSAN more usable
MemorySanitizer (clang -fsanitize=memory) requires that all code
be compiled with instrumentation enabled. The only exception is the
C runtime library. Failure to use instrumented libraries will cause
bogus messages about memory being uninitialized.

In WITH_MSAN builds, we must avoid calling getservbyname(),
because even though it is a standard library function, it is
not instrumented, not even in clang 10.

Note: Before MariaDB Server 10.5, ./mtr will typically fail
due to the old PCRE library, which was updated in MDEV-14024.

The following cmake options were tested on 10.5
in commit 94d0bb4dbe:

cmake \
-DCMAKE_C_FLAGS='-march=native -O2' \
-DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \
-DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \
-DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \
-DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \
-DWITH_SAFEMALLOC=OFF \
-DWITH_{ZLIB,SSL,PCRE}=bundled \
-DHAVE_LIBAIO_H=0 \
-DWITH_MSAN=ON

MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED()
and __msan_unpoison().

MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for
VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow().

InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros.

ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in
functions instead of inline assembler when building WITH_MSAN.
This will require at least -msse4.2 when building for IA-32 or AMD64.
The inline assembler would not be instrumented, and would thus cause
bogus failures.
2020-07-01 17:23:00 +03:00
Marko Mäkelä
68d9d512e9 Merge 10.3 into 10.4 2020-06-05 18:05:22 +03:00
Marko Mäkelä
680463a8d9 Merge 10.2 into 10.3 2020-06-05 16:51:26 +03:00
Thirunarayanan Balathandayuthapani
ad2bf1129c MDEV-22646 Assertion `table2->cached' failed in dict_table_t::add_to_cache
Problem:
========
  During buffer pool resizing, InnoDB recreates the dictionary hash
tables. Dictionary hash table reuses the heap of AHI hash tables.
It leads to memory corruption.

Fix:
====
- While disabling AHI, free the heap and AHI hash tables. Recreate the
AHI hash tables and assign new heap when AHI is enabled.

- btr_blob_free() accesses an invalid page if the page was reallocated
during buffer pool resizing. So btr_blob_free() should get the page from
buf_pool instead of using the existing block.

- btr_search_enabled and block->index should be checked after
acquiring the btr_search_sys latch

- Moved the buffer_pool_scan debug sync earlier, before accessing the
btr_search_sys latches to avoid the hang of truncate_purge_debug
test case

- srv_printf_innodb_monitor() should acquire btr_search_sys latches
before AHI hash tables.
2020-06-03 16:02:02 +05:30
Marko Mäkelä
8059148154 Merge 10.3 into 10.4 2020-06-03 07:32:09 +03:00
Marko Mäkelä
8300f639a1 Merge 10.2 into 10.3 2020-06-02 10:25:11 +03:00
Marko Mäkelä
83d0e72b34 Cleanup: Remove thr_is_recv(), trx_is_recv()
Compare to trx_roll_crash_recv_trx directly where needed.
2020-06-01 10:23:11 +03:00
Marko Mäkelä
9e6e43551f Merge 10.3 into 10.4
We will expose some more std::atomic internals in Atomic_counter,
so that dict_index_t::lock will support the default assignment operator.
2020-05-16 07:39:15 +03:00
Marko Mäkelä
6a6bcc53b8 Merge 10.2 into 10.3 2020-05-15 17:55:01 +03:00
Marko Mäkelä
ad6171b91c MDEV-22456 Dropping the adaptive hash index may cause DDL to lock up InnoDB
If the InnoDB buffer pool contains many pages for a table or index
that is being dropped or rebuilt, and if many of such pages are
pointed to by the adaptive hash index, dropping the adaptive hash index
may consume a lot of time.

The time-consuming operation of dropping the adaptive hash index entries
is being executed while the InnoDB data dictionary cache dict_sys is
exclusively locked.

It is not actually necessary to drop all adaptive hash index entries
at the time a table or index is being dropped or rebuilt. We can let
the LRU replacement policy of the buffer pool take care of this gradually.
For this to work, we must detach the dict_table_t and dict_index_t
objects from the main dict_sys cache, and once the last
adaptive hash index entry for the detached table is removed
(when the garbage page is evicted from the buffer pool) we can free
the dict_table_t and dict_index_t object.

Related to this, in MDEV-16283, we made ALTER TABLE...DISCARD TABLESPACE
skip both the buffer pool eviction and the drop of the adaptive hash index.
We shifted the burden to ALTER TABLE...IMPORT TABLESPACE or DROP TABLE.
We can remove the eviction from DROP TABLE. We must retain the eviction
in the ALTER TABLE...IMPORT TABLESPACE code path, so that in case the
discarded table is being re-imported with the same tablespace identifier,
the fresh data from the imported tablespace will replace any stale pages
in the buffer pool.

rpl.rpl_failed_drop_tbl_binlog: Remove the test. DROP TABLE can
no longer be interrupted inside InnoDB.

fseg_free_page(), fseg_free_step(), fseg_free_step_not_header(),
fseg_free_page_low(), fseg_free_extent(): Remove the parameter
that specifies whether the adaptive hash index should be dropped.

btr_search_lazy_free(): Lazily free an index when the last
reference to it is dropped from the adaptive hash index.

buf_pool_clear_hash_index(): Declare static, and move to the
same compilation unit with the bulk of the adaptive hash index
code.

dict_index_t::clone(), dict_index_t::clone_if_needed():
Clone an index that is being rebuilt while adaptive hash index
entries exist. The original index will be inserted into
dict_table_t::freed_indexes and dict_index_t::set_freed()
will be called.

dict_index_t::set_freed(), dict_index_t::freed(): Mark the index as
freed, or check whether the index has been freed. We will use the
impossible page number 1 to denote this condition.

dict_index_t::n_ahi_pages(): Replaces btr_search_info_get_ref_count().

dict_index_t::detach_columns(): Move the assignment n_fields=0
to ha_innobase_inplace_ctx::clear_added_indexes().
We must have access to the columns when freeing the
adaptive hash index. Note: dict_table_t::v_cols[] will remain
valid. If virtual columns are dropped or added, the table
definition will be reloaded in ha_innobase::commit_inplace_alter_table().

buf_page_mtr_lock(): Drop a stale adaptive hash index if needed.

We will also reduce the number of btr_get_search_latch() calls
and enclose some more code inside #ifdef BTR_CUR_HASH_ADAPT
in order to benefit cmake -DWITH_INNODB_AHI=OFF.
2020-05-15 17:23:08 +03:00
Marko Mäkelä
a12aed0398 Fix GCC 9.3.0 -Wunused-but-set-variable 2020-05-14 13:36:11 +03:00
Marko Mäkelä
38f6c47f8a Merge 10.3 into 10.4 2020-05-13 12:52:57 +03:00
Marko Mäkelä
15fa70b840 Merge 10.2 into 10.3 2020-05-13 11:45:05 +03:00
Marko Mäkelä
ba3d58ad4c MDEV-22523 index->rtr_ssn.mutex is wasting memory
As part of the SPATIAL INDEX implementation in InnoDB,
dict_index_t was expanded by a rtr_ssn_t field. There are only
3 operations for this field, all protected by rtr_ssn_t::mutex:

* btr_cur_search_to_nth_level() stores the least significant 32 bits
of the 64-bit value that is stored in the index root page.
(This would better be done when the table is opened for the
very first time.)
* rtr_get_new_ssn_id() increments the value by 1.
* rtr_get_current_ssn_id() reads the current value.

All these operations can be implemented equally safely by using
atomic memory access operations.
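
A minimal sketch of the three operations on top of std::atomic follows.
The SplitSeqNo type and its member names are hypothetical, and the relaxed
memory order and the "return the new value" convention of next() are
assumptions, not taken from the actual patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical mutex-free counterpart of rtr_ssn_t.
struct SplitSeqNo {
  std::atomic<std::uint32_t> value{0};

  // Store the least significant 32 bits of the 64-bit root page value.
  void init_from_root(std::uint64_t root_value) {
    value.store(static_cast<std::uint32_t>(root_value),
                std::memory_order_relaxed);
  }

  // Increment by 1 and return the new value (assumed convention).
  std::uint32_t next() {
    return value.fetch_add(1, std::memory_order_relaxed) + 1;
  }

  // Read the current value.
  std::uint32_t current() const {
    return value.load(std::memory_order_relaxed);
  }
};
```

Dropping the mutex also shrinks the object to a single 32-bit word in
this sketch, which is the memory saving the title refers to.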
2020-05-11 14:23:37 +03:00
Marko Mäkelä
2c3c851d2c Merge 10.3 into 10.4 2020-05-05 20:33:10 +03:00
Oleksandr Byelkin
7fb73ed143 Merge branch '10.2' into 10.3 2020-05-04 16:47:11 +02:00
Daniel Black
ba2061da52 MDEV-21595: innodb offset_t rename to rec_offs
thanks to:

perl -i -pe 's/\boffset_t\b/rec_offs/g' $(git grep -lw offset_t storage/innobase)
2020-04-29 12:02:47 +03:00
Marko Mäkelä
87a61355e8 Merge 10.3 into 10.4
The MDEV-17062 fix in commit c4195305b2
was omitted.
2020-01-20 15:49:48 +02:00
Marko Mäkelä
6373ec3ec7 Merge 10.2 into 10.3 2020-01-18 16:56:16 +02:00
Marko Mäkelä
457ce97ef2 MDEV-21512 InnoDB may hang due to SPATIAL INDEX
MySQL 5.7.29 includes the following fix:
Bug #30287668 INNODB: A LONG SEMAPHORE WAIT
mysql/mysql-server@5cdbb22b51

There is no test case. It seems that the problem could occur when
a spatial index is large and peculiar enough so that multiple R-tree
leaf pages will have exactly the same minimum bounding rectangle (MBR).

The commit message suggests that the hang can occur when R-tree
non-leaf pages are being merged, which should only be possible
during transaction rollback or the purge of transaction history,
when the R-tree index is at least 2 levels high and very many records
are being deleted. The message says that a comparison result that two
spatial index node pointer records are equal will cause an infinite loop
in rtr_page_copy_rec_list_end_no_locks(). Hence, we must include the
child page number in the comparison to be consistent with
mysql/mysql-server@2e11fe0e15.
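
The idea of breaking ties with the child page number can be sketched as
follows. NodePtr and cmp_node_ptr() are hypothetical, and the
lexicographic comparison of the coordinates only stands in for the real
MBR comparison.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified node pointer record of a spatial index.
struct NodePtr {
  std::array<double, 4> mbr;  // coordinate layout assumed for illustration
  std::uint32_t child_page_no;
};

// Returns a negative value, 0 or a positive value. Two records with equal
// MBRs compare equal only if they also point to the same child page.
inline int cmp_node_ptr(const NodePtr &a, const NodePtr &b) {
  if (a.mbr != b.mbr) return a.mbr < b.mbr ? -1 : 1;
  if (a.child_page_no != b.child_page_no)
    return a.child_page_no < b.child_page_no ? -1 : 1;
  return 0;
}
```

With the tie-breaker, two distinct node pointers never compare equal,
so a loop that advances until it finds a different record terminates.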

We fix this bug in a simpler way, involving fewer code changes.

cmp_rec_rec(): Renamed from cmp_rec_rec_with_match().
Assert that rec2 always resides in an index page.
Treat non-leaf spatial index pages specially.
2020-01-17 14:27:29 +02:00