Commit graph

22478 commits

Author SHA1 Message Date
Marko Mäkelä
b42294bc64 MDEV-19514 Defer change buffer merge until pages are requested
We will remove the InnoDB background operation of merging buffered
changes to secondary index leaf pages. Changes will only be merged as a
result of an operation that accesses a secondary index leaf page,
such as a SQL statement that performs a lookup via that index,
or is modifying the index. Also ROLLBACK and some background operations,
such as purging the history of committed transactions, or computing
index cardinality statistics, can cause change buffer merge.
Encryption key rotation will not perform change buffer merge.

The motivation of this change is to simplify the I/O logic and to
allow crash recovery to happen in the background (MDEV-14481).
We also hope that this will reduce the number of "mystery" crashes
due to corrupted data. Because change buffer merge will typically
take place as a result of executing SQL statements, there should be
a clearer connection between the crash and the SQL statements that
were executed when the server crashed.

In many cases, a slight performance improvement was observed.

This is joint work with Thirunarayanan Balathandayuthapani
and was tested by Axel Schwenke and Matthias Leich.

The InnoDB monitor counter innodb_ibuf_merge_usec will be removed.

On slow shutdown (innodb_fast_shutdown=0), we will continue to
merge all buffered changes (and purge all undo log history).

Two InnoDB configuration parameters will be changed as follows:

innodb_disable_background_merge: Removed.
This parameter existed only in debug builds.
All change buffer merges will use synchronous reads.

innodb_force_recovery will be changed as follows:
* innodb_force_recovery=4 will be the same as innodb_force_recovery=3
(the change buffer merge cannot be disabled; it can only happen as
a result of an operation that accesses a secondary index leaf page).
The option used to be capable of corrupting secondary index leaf pages.
Now that capability is removed, and innodb_force_recovery=4 becomes 'safe'.
* innodb_force_recovery=5 (which essentially hard-wires
SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED)
becomes safe to use. Bogus data can be returned to SQL, but
persistent InnoDB data files will not be corrupted further.
* innodb_force_recovery=6 (ignore the redo log files)
will be the only option that can potentially cause
persistent corruption of InnoDB data files.

Code changes:

buf_page_t::ibuf_exist: New flag, to indicate whether buffered
changes exist for a buffer pool page. Pages with pending changes
can be returned by buf_page_get_gen(). Previously, the changes
were always merged inside buf_page_get_gen() if needed.

ibuf_page_exists(const buf_page_t&): Check if a buffered changes
exist for an X-latched or read-fixed page.

buf_page_get_gen(): Add the parameter allow_ibuf_merge=false.
All callers that know that they may be accessing a secondary index
leaf page must pass this parameter as allow_ibuf_merge=true,
unless it does not matter for that caller whether all buffered
changes have been applied. Assert that whenever allow_ibuf_merge
holds, the page actually is a leaf page. Attempt change buffer
merge only to secondary B-tree index leaf pages.

btr_block_get(): Add parameter 'bool merge'.
All callers of btr_block_get() should know whether the page could be
a secondary index leaf page. If it is not, we should avoid consulting
the change buffer bitmap to even consider a merge. This is the main
interface to requesting index pages from the buffer pool.

ibuf_merge_or_delete_for_page(), recv_recover_page(): Replace
buf_page_get_known_nowait() with much simpler logic, because
it is now guaranteed that that the block is x-latched or read-fixed.

mlog_init_t::mark_ibuf_exist(): Renamed from mlog_init_t::ibuf_merge().
On crash recovery, we will no longer merge any buffered changes
for the pages that we read into the buffer pool during the last batch
of applying log records.

buf_page_get_gen_known_nowait(), BUF_MAKE_YOUNG, BUF_KEEP_OLD: Remove.

btr_search_guess_on_hash(): Merge buf_page_get_gen_known_nowait()
to its only remaining caller.

buf_page_make_young_if_needed(): Define as an inline function.
Add the parameter buf_pool.

buf_page_peek_if_young(), buf_page_peek_if_too_old(): Add the
parameter buf_pool.

fil_space_validate_for_mtr_commit(): Remove a bogus comment
about background merge of the change buffer.

btr_cur_open_at_rnd_pos_func(), btr_cur_search_to_nth_level_func(),
btr_cur_open_at_index_side_func(): Use narrower data types and scopes.

ibuf_read_merge_pages(): Replaces buf_read_ibuf_merge_pages().
Merge the change buffer by invoking buf_page_get_gen().
2019-10-11 17:28:15 +03:00
Marko Mäkelä
d04f2de80a Merge 10.4 into 10.5 2019-10-11 08:41:36 +03:00
Marko Mäkelä
09afd3da1a Merge 10.3 into 10.4 2019-10-10 21:30:40 +03:00
Marko Mäkelä
4cdb72f237 MDEV-19783: Relax an assertion
btr_page_get_split_rec_to_left(): Assert that in the leftmost leaf page,
if the metadata record exists, index->is_instant() must hold.
The assertion of commit 01f45becd1
could fail during innobase_instant_try().
2019-10-10 21:22:38 +03:00
Marko Mäkelä
01f45becd1 MDEV-19783: Add more assertions
btr_page_get_split_rec_to_left(): Assert that in the leftmost leaf page,
the metadata record exists if and only if index->is_instant().

page_validate(): Correct the wording of a message.

rec_init_offsets(): Assert that whenever a record is in "instant ALTER"
format, index->is_instant() must hold.
2019-10-10 20:40:26 +03:00
Marko Mäkelä
7f84e3ad75 Merge 10.2 into 10.3 2019-10-10 20:38:44 +03:00
Marko Mäkelä
6d7a826953 MDEV-20788: Bogus assertion failure for PAGE_FREE list
In MDEV-11369 (instant ADD COLUMN) in MariaDB Server 10.3,
we introduced the hidden metadata record that must be the
first record in the clustered index if and only if
index->is_instant() holds.

To catch MDEV-19783, in
commit ed0793e096 and
commit 99dc40d6ac
we added some assertions to find cases where
the metadata record is missing while it should not be, or a
record exists when it should not. Those assertions were invalid
when traversing the PAGE_FREE list. That list can contain anything;
we must only be able to determine the successor and the size of
each garbage record in it.

page_validate(), page_simple_validate_old(), page_simple_validate_new():
Do not invoke page_rec_get_next_const() for traversing the PAGE_FREE
list, but instead use a lower-level accessor that does not attempt to
validate the REC_INFO_MIN_REC_FLAG.

page_copy_rec_list_end_no_locks(),
page_copy_rec_list_start(), page_delete_rec_list_start():
Add assertions.

btr_page_get_split_rec_to_left(): Remove a redundant return value,
and make the output parameter the return value.

btr_page_get_split_rec_to_right(), btr_page_split_and_insert(): Clean up.
2019-10-10 20:29:30 +03:00
Marko Mäkelä
0f7732d1d1 MDEV-19335 adjustment for innodb_checksum_algorithm=full_crc32
When MDEV-12026 introduced innodb_checksum_algorithm=full_crc32 in
MariaDB 10.4, it accidentally added a dependency on buf_page_t::encrypted.
Now that the flag has been removed, we must adjust the page-read routine.

buf_page_io_complete(): When the full_crc32 page checksum matches but the
tablespace ID in the page does not match after decrypting, we should
declare it a decryption failure and suppress the page dump output and
any attempts to re-read the page.
2019-10-10 15:24:14 +03:00
Aleksey Midenkov
545c545206 Fix compilation 2 (GCC 9)
Fixed warning: -Woverloaded-virtual for GCC 9 (Clang treats it differently)

Caused by c9cba59749
2019-10-10 13:37:02 +03:00
Marko Mäkelä
c11e5cdd12 Merge 10.3 into 10.4 2019-10-10 11:19:25 +03:00
Aleksey Midenkov
3c78d1b640 Fix Mroonga compilation
Fixed warnings: -Woverloaded-virtual, -Winconsistent-missing-override

Caused by c9cba59749
2019-10-10 11:13:05 +03:00
Aleksey Midenkov
75ba5c815d MDEV-16210 FK constraints on versioned tables use historical rows, which may cause constraint violation
Constraint check is done on secondary index update.
F.ex. DELETE does row_upd_sec_index_entry() and checks constraints in
row_upd_check_references_constraints(). UPDATE is optimized for the
case when order is not changed (node->cmpl_info & UPD_NODE_NO_ORD_CHANGE)
and doesn't do row_upd_sec_index_entry(), so it doesn't check constraints.

Since for versioned DELETE we do UPDATE actually, but expect behaviour
of DELETE in terms of constraints, we should deny this optimization to
get constraints checked.

Fix wrong referenced table check when versioned DELETE inserts history in parent
table. Set check_ref to false in this case.

Removed unused dup_chk_only argument for row_ins_sec_index_entry() and
added check_ref argument.

MDEV-18057 fix was superseded by this fix and reverted.

foreign.test:
All key_type combinations: pk, unique, sec(ondary).
2019-10-10 00:20:34 +03:00
Marko Mäkelä
6fde0073bf Rename log_make_checkpoint_at() to log_make_checkpoint()
The function was always called with lsn=LSN_MAX.
Remove that redundant parameter.

Spotted by Thirunarayanan Balathandayuthapani.
2019-10-09 18:47:14 +03:00
Marko Mäkelä
892378fb9d Merge 10.2 into 10.3 2019-10-09 13:25:11 +03:00
Thirunarayanan Balathandayuthapani
c65cb244b3 MDEV-19335 Remove buf_page_t::encrypted
The field buf_page_t::encrypted was added in MDEV-8588.
It was made mostly redundant in MDEV-12699. Remove the field.
2019-10-09 13:13:12 +03:00
Oleksandr Byelkin
896b869685 MDEV-19238 Mariadb spider - crashes on where null
(fix and explanation came with MDEV-20753 (duplicate of this bug))
2019-10-09 08:55:00 +02:00
Oleksandr Byelkin
b7408be0c3 MDEV-20753: Sequence with limit 0 crashes server
Do not try to push down conditions to engine if query was resolved without tables (and so the engine).
2019-10-09 08:55:00 +02:00
Marko Mäkelä
24232ec12c Merge 10.1 into 10.2 2019-10-09 08:30:23 +03:00
Eugene Kosov
ed0793e096 MDEV-19783: Add more REC_INFO_MIN_REC_FLAG checks
btr_cur_pessimistic_delete(): code changed in a way that allows
to put more REC_INFO_MIN_REC_FLAG assertions inside btr_set_min_rec_mark().
Without that change tests innodb.innodb-table-online,
innodb.temp_table_savepoint and innodb_zip.prefix_index_liftedlimit fail.

Removed basically duplicated page_zip_validate() calls
which fails because of temporary(!) invariant violation.
That fixed innodb_zip.wl5522_debug_zip and
innodb_zip.prefix_index_liftedlimit
2019-10-09 08:29:26 +03:00
Eugene Kosov
99dc40d6ac MDEV-19783 Random crashes and corrupt data in INSTANT-added columns
The bug affects MariaDB Server 10.3 or later, but it makes sense
to improve CHECK TABLE in earlier versions already.

page_validate(): Check REC_INFO_MIN_REC_FLAG in the records.
This allows CHECK TABLE to catch more bugs.
2019-10-09 08:29:26 +03:00
Marko Mäkelä
d480d28f4f Add page_has_prev(), page_has_next(), page_has_siblings()
Until now, InnoDB inefficiently compared the aligned fields
FIL_PAGE_PREV, FIL_PAGE_NEXT to the byte-order-agnostic value FIL_NULL.

This is a backport of 32170f8c6d
from MariaDB Server 10.3.
2019-10-09 08:29:26 +03:00
Alexander Barkov
6ea5c2b5b6 MDEV-274 The data type for IPv6/IPv4 addresses in MariaDB 2019-10-08 23:42:02 +04:00
Marko Mäkelä
6bc75a4253 MDEV-15528 preparation: Remove a constant parameter
fsp_free_seg_inode(): Remove the constant parameter log=true.
Only fsp_free_page() could also be called with log=false.

The dead code was introduced in
commit 304ae942f7
by me, and spotted by Thirunarayanan Balathandayuthapani.
2019-10-08 13:05:35 +03:00
Marko Mäkelä
d95f96ad1b Merge 5.5 into 10.1 2019-10-08 12:43:37 +03:00
Marko Mäkelä
db9a4d928d Remove orphan declaration buf_flush_wait_batch_end_wait_only()
The function was declared but not defined in
commit 9d6d1902e0
2019-10-07 17:18:10 +03:00
Eugene Kosov
5e65c67cfc fix clang warning 2019-10-03 17:45:36 +03:00
Sergey Vojtovich
5b2fa078e8 Cleanup mman.h includes
As it is included from my_global.h already.
2019-10-02 20:21:30 +04:00
Daniel Black
716c748f97 MDEV-20684: innodb/query cache use madvise CORE/NOCORE on FreeBSD
This applies to large allocations.

This maps to the way Linux does it in MDEV-10814 except FreeBSD uses
different constants.

Adjust error string to match to implementation.

Tested on FreeBSD-12.0
2019-10-02 20:00:05 +04:00
Alexander Barkov
1ae09ec863 Merge remote-tracking branch 'origin/10.4' into 10.5 2019-10-01 11:44:27 +04:00
Alexander Barkov
dc588e3d3f Merge remote-tracking branch 'origin/10.3' into 10.4 2019-10-01 10:45:52 +04:00
Alexander Barkov
7e44c455f4 Merge remote-tracking branch 'origin/10.2' into 10.3 2019-10-01 09:37:40 +04:00
Robert Bindar
576a5f091e MDEV-20647 Fix and enable SphinxSE tests 2019-09-30 15:47:09 +03:00
Marko Mäkelä
46b785262b Fix -Wunused for CMAKE_BUILD_TYPE=RelWithDebInfo
For release builds, do not declare unused variables.

unpack_row(): Omit a debug-only variable from WSREP diagnostic message.

create_wsrep_THD(): Fix -Wmaybe-uninitialized for the PSI_thread_key.
2019-09-30 12:49:53 +03:00
Marko Mäkelä
12414cd9f2 Merge 10.4 into 10.5 2019-09-27 19:12:07 +03:00
Marko Mäkelä
9b5cdeeb0f Merge 10.3 into 10.4 2019-09-27 16:26:53 +03:00
Marko Mäkelä
ea2b19dee6 MDEV-20117: Fix another scenario
Thanks to Eugene Kosov for noting that the fix is incomplete.
It turns out that on instant DROP/reorder column (MDEV-15562),
we must always write the metadata record, even though the table
was empty. Alternatively, we should guarantee that all undo
log records for the table have been purged. (Attempting to do
that by updating table_id leads to other problems; see
commit 1b31d8852c00b4bab6e6fe179b97db45ccb8d535.)

It would be tempting to remove dict_index_t::clear_instant_alter()
altogether, but it turns that we need that when the instant ALTER TABLE
operation of a first-time DROP COLUMN is being rolled back.

innobase_instant_try(): Clarify a comment. Purge never calls
dict_index_t::clear_instant_alter(), but it may invoke
dict_index_t::clear_instant_add(). On first-time instant DROP/reorder,
always write a metadata record, even if the table is empty.
2019-09-27 16:01:55 +03:00
Marko Mäkelä
2911a9a693 Merge 10.2 into 10.3 2019-09-27 15:56:15 +03:00
Thirunarayanan Balathandayuthapani
c76873f23d MDEV-20688 Recovery crashes after unnecessarily reading a corrupted page
The test encryption.innodb-redo-badkey was accidentally disabled
until commit 23657a2101 enabled
it recently. Once it was enabled, it started failing randomly.

recv_recover_corrupt_page(): Do not assume that any redo log exists
for the page. A page may be unnecessarily read by read-ahead.
When noting the corruption, reset recv_addr->state to RECV_PROCESSED,
so that even if the same page is re-read again, we will only
decrement recv_sys->n_addrs once.
2019-09-27 17:46:10 +05:30
Marko Mäkelä
d874cdeccc dict_load_table(): Remove constant parameter cached=true
Spotted by Thirunarayanan Balathandayuthapani.
2019-09-27 14:29:22 +03:00
Marko Mäkelä
718fcee0a3 Reduce rw_lock_debug_mutex contention
rw_lock_own(), rw_lock_own_flagged(): Traverse the rw_lock_t::debug_list
only after quickly checking if the thread is holding X-latch or SX-latch.
2019-09-27 14:22:59 +03:00
Marko Mäkelä
4ec0c346b8 Remove a useless large test, and add a debug assertion
The test innodb_fts.fulltext_table_evict was only creating 1000 tables
with fulltext indexes, only to check that no tables with fulltext
indexes are being evicted.

The reason why tables containing fulltext indexes cannot be evicted is
that fts_optimize_init() invokes dict_table_prevent_eviction().
2019-09-27 14:05:39 +03:00
Marko Mäkelä
ca9e0089d5 MDEV-19740: Fix GCC 9.2.1 -Wmaybe-uninitialized on AMD64
For CMAKE_BUILD_TYPE=Debug, the default MYSQL_MAINTAINER_MODE=AUTO
implies -Werror along with other flags in cmake/maintainer.cmake,
which would break the debug builds when CMAKE_CXX_FLAGS include -O2.

This fix includes a backport of 6dd3f24090
from MariaDB 10.3.
2019-09-27 10:43:23 +03:00
Marko Mäkelä
72f671ab7b Merge 10.4 into 10.5 2019-09-27 07:15:07 +03:00
Marko Mäkelä
1f4ee3fa5a MDEV-20117 Assertion 0 failed in row_sel_get_clust_rec_for_mysql
The crash scenario is as follows:

(1) A non-empty table exists.
(2) MDEV-15562 instant ADD/DROP/reorder has been invoked.
(3) Some purgeable undo log exists for the table.
(4) The table becomes empty, containing not even any delete-marked records,
only containing the hidden metadata record that was added in (2).
(5) An instant ADD/DROP/reorder column is executed, and the table
is emptied and the (2) metadata removed.
(6) Purge processes an undo log record from (3), which will refer to
a non-existent clustered index field, because the metadata that
was created in (2) was remoeved in (5).

We fix this by adjusting step (5) so that we will never remove the
MDEV-15562-style metadata record. Removing the MDEV-11369 metadata
record (instant ADD COLUMN to the end of the table) is completely
fine at any time when the table becomes empty, because
dict_index_t::n_fields will remain unchanged.

innobase_instant_try(): Never remove the MDEV-15562 metadata record.

page_cur_delete_rec(): Do not reset FIL_PAGE_TYPE when the
MDEV-15562 metadata record is being removed as part of
btr_cur_pessimistic_update() invoked by innobase_instant_try().
2019-09-26 19:45:10 +03:00
Marko Mäkelä
bb5afc7ceb Merge 10.3 into 10.4 2019-09-26 16:56:02 +03:00
Marko Mäkelä
3e4931cdf3 MDEV-20675 Crash in SHOW ENGINE INNODB STATUS with innodb_force_recovery=5
lock_print_info::operator(): Do not dereference purge_sys.query in case
it is NULL. We would not initialize purge_sys if innodb_force_recovery
is set to 5 or 6.

The test case will be added by merge from 10.2.
2019-09-26 13:18:22 +03:00
Marko Mäkelä
e3c39c0be8 MDEV-13564 follow-up: Remove dead code
In MariaDB 10.4.0, commit 09af00cbde
removed the crash-upgrade logic for the MariaDB 10.2
innodb_safe_truncate=OFF TRUNCATE TABLE (which was the only option
between MariaDB 10.2.2 and 10.2.18), but failed to adjust some
comments and code.

buf_page_io_complete(): Remove a bogus comment about TRUNCATE.

dict_recreate_index_tree(): Unused function; remove.

fil_space_t::stop_new_ops: Clarify the comment.

fil_space_acquire_low(): Remove a bogus comment about TRUNCATE.

fil_check_pending_ops(), fil_check_pending_io(): Adjust a warning message.
This code is only invoked as part of DISCARD TABLESPACE or DROP TABLE.
DROP TABLE is internally used as part of ALTER TABLE, OPTIMIZE TABLE,
or TRUNCATE TABLE.

RemoteDatafile::create_link_file(): Clarify a comment.

ibuf_delete_for_discarded_space(): Clarify the function comment.

dict_table_x_lock_indexes(), dict_table_x_unlock_indexes():
Merge with the only remaining caller, row_quiesce_set_state().

page_create_zip(): Remove a bogus comment about TRUNCATE.
2019-09-26 10:25:34 +03:00
Marko Mäkelä
a340af9223 btr_block_get(): Remove redundant parameters 2019-09-25 16:08:48 +03:00
Marko Mäkelä
5d0bab47fc btr_block_get(), btr_block_get_func(): Change the parameter to
const dict_index_t&

btr_level_list_remove(): Clean up the parameters. Renamed from
btr_level_list_remove_func().
2019-09-25 13:34:49 +03:00
Marko Mäkelä
fba9883bc4 Merge 10.4 into 10.5 2019-09-25 10:18:22 +03:00