Commit graph

5770 commits

Author SHA1 Message Date
Marko Mäkelä
fa929f7cdf Simplify row_undo_ins_remove_sec_low()
Reduce the scope of some variables, remove a goto and a redundant
assertion.

For B-tree secondary indexes, this function can remove a delete-marked
purgeable record, in case a row rollback of the INSERT was initiated
due to an error in an earlier secondary index.
2019-10-17 17:12:23 +03:00
Marko Mäkelä
b027830232 MDEV-20850 Merge new release of InnoDB 5.7.28 to 10.2 2019-10-17 17:08:58 +03:00
Marko Mäkelä
fa32d28f2f MDEV-20852 BtrBulk is unnecessarily holding dict_index_t::lock
The BtrBulk class, which was introduced in MySQL 5.7, is by design
the exclusive writer to an index. It is therefore unnecessary to
acquire the dict_index_t::lock in that code.

Holding the dict_index_t::lock would unnecessarily block other threads
(SQL connections and the InnoDB purge threads) from buffering concurrent
modifications to being-created secondary indexes.

This fix is motivated by a change in MySQL 5.7.28:
Bug #29008298 MYSQLD CRASHES ITSELF WHEN CREATING INDEX
mysql/mysql-server@f9fb96c20f

PageBulk::init(), PageBulk::latch(): Never acquire m_index->lock.

PageBulk::storeExt(): Remove some pointer indirection, and improve
a debug assertion that seems to prove that some code is redundant.

BtrBulk::pageCommit(): Assert that m_index->lock is not being held.

btr_blob_log_check_t: Do not acquire m_index->lock if
m_op == BTR_STORE_INSERT_BULK. Add UNIV_UNLIKELY hints around
that condition.

btr_store_big_rec_extern_fields(): Allow index->lock not to be held
while op == BTR_STORE_INSERT_BULK. Add UNIV_UNLIKELY hints around
that condition.
2019-10-17 14:04:07 +03:00
Marko Mäkelä
f989c0ce66 MDEV-20813: Do not rotate keys for unallocated pages
fil_crypt_rotate_page(): Skip the key rotation for pages that carry 0
in FIL_PAGE_TYPE. This avoids not only unnecessary writes, but also
failures of the recently added debug assertion in
buf_flush_init_for_writing() that the FIL_PAGE_TYPE should be nonzero.

Note: the debug assertion can fail if the file was originally created
before MySQL 5.5. In old InnoDB versions, FIL_PAGE_TYPE was only
initialized for B-tree pages, to FIL_PAGE_INDEX. For any other pages,
the field could be garbage, including FIL_PAGE_INDEX. In MariaDB 10.2
and later, buf_flush_init_for_writing() would initialize the
FIL_PAGE_TYPE on such old pages, but only after passing the debug
assertion that insists that pages have a nonzero FIL_PAGE_TYPE.
Thus, the debug assertion at the start of buf_flush_init_for_writing()
can fail when upgrading from very old debug files. This assertion is
only present in debug builds, not release builds.
2019-10-14 17:26:21 +03:00
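The check described in the entry above can be sketched as follows. This is a minimal illustration, not the code of fil_crypt_rotate_page(): the helper names page_type_sketch() and should_rotate_key() are hypothetical, and the sketch assumes that FIL_PAGE_TYPE is a 2-byte big-endian field at byte offset 24 of the page frame.

```cpp
#include <cassert>

typedef unsigned char byte_t;

// Assumed offset of the 2-byte page type field in the page header.
const unsigned FIL_PAGE_TYPE = 24;
// Page type value of B-tree index pages.
const unsigned FIL_PAGE_INDEX = 17855;

// Hypothetical helper: read the big-endian 2-byte page type.
inline unsigned page_type_sketch(const byte_t* page) {
  return (unsigned(page[FIL_PAGE_TYPE]) << 8) | page[FIL_PAGE_TYPE + 1];
}

// Hypothetical helper: a page that carries 0 in FIL_PAGE_TYPE is treated
// as unallocated, so rewriting it with a new key would be a wasted write.
inline bool should_rotate_key(const byte_t* page) {
  return page_type_sketch(page) != 0;
}
```

With an all-zero page frame, should_rotate_key() returns false; once the type bytes are set to FIL_PAGE_INDEX, it returns true.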
Marko Mäkelä
361e8284f3 MDEV-20813 Assertion failure in buf_flush_init_for_writing() for innodb_immediate_scrub_data_uncompressed=ON
The assertion that was added in
commit c0c003beb4
to augment the fix of MDEV-20805 turns out to be invalid when
innodb_immediate_scrub_data_uncompressed is enabled.
In this mode, fsp_init_file_page() will be invoked on data pages
that have been freed, causing writes of almost-all-zero pages.

btr_page_free(): Adjust the comment.

buf_flush_init_for_writing(): Disable the assertion with a note
that it should be re-enabled in MDEV-15528.
2019-10-12 15:28:55 +03:00
Marko Mäkelä
38736928e7 Fix -std=c++98 -Wzero-length-array
This is another follow-up fix to
commit b393e2cb0c
which turned out to be still broken.

Replace the C++11 keyword 'constexpr' with #define.

debug_sync_t::str: Remove the zero-length array.
Replace sync->str with reinterpret_cast<char*>(&sync[1]).
2019-10-11 21:26:16 +03:00
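The replacement of a zero-length array member with reinterpret_cast<char*>(&sync[1]) in the entry above relies on a common idiom: the string is stored in the same allocation, directly after the fixed-size header. The following is a sketch of that idiom only. The type sync_point and the helpers sync_str(), sync_create() and sync_free() are hypothetical and are not the real debug_sync_t; the sketch also uses C++11 syntax for brevity, whereas the commit targets -std=c++98.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical header type; not the real debug_sync_t.
struct sync_point {
  std::size_t len;  // length of the trailing string, excluding the NUL
};

// The string starts at the first byte after the header, so no
// zero-length array member is needed to name it.
inline char* sync_str(sync_point* s) {
  return reinterpret_cast<char*>(&s[1]);
}

// Allocate the header and a copy of the name in a single block.
inline sync_point* sync_create(const char* name) {
  const std::size_t len = std::strlen(name);
  void* mem = std::malloc(sizeof(sync_point) + len + 1);
  if (!mem) return nullptr;
  sync_point* s = new (mem) sync_point{len};
  std::memcpy(sync_str(s), name, len + 1);  // includes the NUL terminator
  return s;
}

inline void sync_free(sync_point* s) { std::free(s); }
```

Because the trailing bytes are plain char data, no alignment padding beyond sizeof(sync_point) is required in this sketch.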
Marko Mäkelä
1e1b53ccfd After-merge fix: Correct an assertion
During IMPORT TABLESPACE, we do invoke
buf_flush_init_for_writing() with block==NULL and newest_lsn!=0.
2019-10-11 21:24:48 +03:00
Marko Mäkelä
966d97b5f9 Merge 10.1 into 10.2 2019-10-11 18:38:18 +03:00
Marko Mäkelä
1fd1ef25c2 Fix CMAKE_BUILD_TYPE=Debug
Remove unused variables and type mismatch that was introduced
in commit b393e2cb0c

Also, fix a typo in the documentation of the parameter, and
update the test.
2019-10-11 18:36:08 +03:00
Marko Mäkelä
c0c003beb4 MDEV-20805 follow-up: Catch writes of bogus pages
buf_flush_init_for_writing(): Assert that FIL_PAGE_TYPE is set
except when creating a new data file with a dummy first page.

buf_dblwr_create(): Ensure that FIL_PAGE_TYPE on all pages
will be initialized. Reset buf_dblwr_being_created at the end.
2019-10-11 15:32:04 +03:00
Marko Mäkelä
cbfd6882f4 Merge 5.5 into 10.1 2019-10-11 15:19:55 +03:00
Marko Mäkelä
ea61b79694 MDEV-20805 ibuf_add_free_page() is not initializing FIL_PAGE_TYPE first
In the function recv_parse_or_apply_log_rec_body() there are debug checks
for validating the state of the page when redo log records are being
applied. Most notably, FIL_PAGE_TYPE should be set before anything else
is being written to the page.

ibuf_add_free_page(): Set FIL_PAGE_TYPE before performing any other changes.
2019-10-11 14:12:36 +03:00
Nikita Malyavin
350e46a8b5 MDEV-18546 ASAN heap-use-after-free in innobase_get_computed_value / row_purge
The bug was already fixed in MDEV-17005, so only a test is added now.
2019-10-11 17:02:39 +10:00
Nikita Malyavin
b393e2cb0c add innodb_debug_sync var to support DEBUG_SYNC from purge threads 2019-10-11 17:02:39 +10:00
Marko Mäkelä
6d7a826953 MDEV-20788: Bogus assertion failure for PAGE_FREE list
In MDEV-11369 (instant ADD COLUMN) in MariaDB Server 10.3,
we introduced the hidden metadata record that must be the
first record in the clustered index if and only if
index->is_instant() holds.

To catch MDEV-19783, in
commit ed0793e096 and
commit 99dc40d6ac
we added some assertions to find cases where
the metadata record is missing while it should not be, or a
record exists when it should not. Those assertions were invalid
when traversing the PAGE_FREE list. That list can contain anything;
we must only be able to determine the successor and the size of
each garbage record in it.

page_validate(), page_simple_validate_old(), page_simple_validate_new():
Do not invoke page_rec_get_next_const() for traversing the PAGE_FREE
list, but instead use a lower-level accessor that does not attempt to
validate the REC_INFO_MIN_REC_FLAG.

page_copy_rec_list_end_no_locks(),
page_copy_rec_list_start(), page_delete_rec_list_start():
Add assertions.

btr_page_get_split_rec_to_left(): Remove a redundant return value,
and make the output parameter the return value.

btr_page_get_split_rec_to_right(), btr_page_split_and_insert(): Clean up.
2019-10-10 20:29:30 +03:00
Marko Mäkelä
6fde0073bf Rename log_make_checkpoint_at() to log_make_checkpoint()
The function was always called with lsn=LSN_MAX.
Remove that redundant parameter.

Spotted by Thirunarayanan Balathandayuthapani.
2019-10-09 18:47:14 +03:00
Thirunarayanan Balathandayuthapani
c65cb244b3 MDEV-19335 Remove buf_page_t::encrypted
The field buf_page_t::encrypted was added in MDEV-8588.
It was made mostly redundant in MDEV-12699. Remove the field.
2019-10-09 13:13:12 +03:00
Marko Mäkelä
24232ec12c Merge 10.1 into 10.2 2019-10-09 08:30:23 +03:00
Eugene Kosov
ed0793e096 MDEV-19783: Add more REC_INFO_MIN_REC_FLAG checks
btr_cur_pessimistic_delete(): Restructure the code so that more
REC_INFO_MIN_REC_FLAG assertions can be placed inside
btr_set_min_rec_mark(). Without that change, the tests
innodb.innodb-table-online, innodb.temp_table_savepoint and
innodb_zip.prefix_index_liftedlimit fail.

Remove essentially duplicated page_zip_validate() calls, which
fail because of a temporary(!) invariant violation.
That fixes innodb_zip.wl5522_debug_zip and
innodb_zip.prefix_index_liftedlimit.
2019-10-09 08:29:26 +03:00
Eugene Kosov
99dc40d6ac MDEV-19783 Random crashes and corrupt data in INSTANT-added columns
The bug affects MariaDB Server 10.3 or later, but it makes sense
to improve CHECK TABLE in earlier versions already.

page_validate(): Check REC_INFO_MIN_REC_FLAG in the records.
This allows CHECK TABLE to catch more bugs.
2019-10-09 08:29:26 +03:00
Marko Mäkelä
d480d28f4f Add page_has_prev(), page_has_next(), page_has_siblings()
Until now, InnoDB inefficiently compared the aligned fields
FIL_PAGE_PREV, FIL_PAGE_NEXT to the byte-order-agnostic value FIL_NULL.

This is a backport of 32170f8c6d
from MariaDB Server 10.3.
2019-10-09 08:29:26 +03:00
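The remark about FIL_NULL being byte-order-agnostic in the entry above can be illustrated with a short sketch. FIL_NULL is 0xFFFFFFFF, whose four bytes are identical in any byte order, so the stored field can be compared as a raw 32-bit value without first converting it from the big-endian on-disk format. The function bodies below are an assumption about how such predicates could look, not a copy of the InnoDB source; the offsets 8 and 12 for FIL_PAGE_PREV and FIL_PAGE_NEXT are likewise stated as assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

typedef unsigned char byte_t;

// Assumed header offsets of the sibling page numbers.
const unsigned FIL_PAGE_PREV = 8;
const unsigned FIL_PAGE_NEXT = 12;
// All bytes are 0xFF, so the value reads the same in any byte order.
const std::uint32_t FIL_NULL = 0xFFFFFFFFU;

// Load four raw bytes without any byte-order conversion.
inline std::uint32_t raw_read_4(const byte_t* p) {
  std::uint32_t raw;
  std::memcpy(&raw, p, sizeof raw);
  return raw;
}

inline bool page_has_prev(const byte_t* page) {
  return raw_read_4(page + FIL_PAGE_PREV) != FIL_NULL;
}

inline bool page_has_next(const byte_t* page) {
  return raw_read_4(page + FIL_PAGE_NEXT) != FIL_NULL;
}

inline bool page_has_siblings(const byte_t* page) {
  return page_has_prev(page) || page_has_next(page);
}
```

The sketch uses memcpy() so that it does not depend on the alignment of the buffer; with fields that are known to be aligned, a direct 32-bit load would serve the same purpose.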
Marko Mäkelä
d95f96ad1b Merge 5.5 into 10.1 2019-10-08 12:43:37 +03:00
Marko Mäkelä
db9a4d928d Remove orphan declaration buf_flush_wait_batch_end_wait_only()
The function was declared but not defined in
commit 9d6d1902e0
2019-10-07 17:18:10 +03:00
Marko Mäkelä
46b785262b Fix -Wunused for CMAKE_BUILD_TYPE=RelWithDebInfo
For release builds, do not declare unused variables.

unpack_row(): Omit a debug-only variable from WSREP diagnostic message.

create_wsrep_THD(): Fix -Wmaybe-uninitialized for the PSI_thread_key.
2019-09-30 12:49:53 +03:00
Thirunarayanan Balathandayuthapani
c76873f23d MDEV-20688 Recovery crashes after unnecessarily reading a corrupted page
The test encryption.innodb-redo-badkey was accidentally disabled
until commit 23657a2101 enabled
it recently. Once it was enabled, it started failing randomly.

recv_recover_corrupt_page(): Do not assume that any redo log exists
for the page. A page may be unnecessarily read by read-ahead.
When noting the corruption, reset recv_addr->state to RECV_PROCESSED,
so that even if the same page is read again, we will only
decrement recv_sys->n_addrs once.
2019-09-27 17:46:10 +05:30
Marko Mäkelä
d874cdeccc dict_load_table(): Remove constant parameter cached=true
Spotted by Thirunarayanan Balathandayuthapani.
2019-09-27 14:29:22 +03:00
Marko Mäkelä
718fcee0a3 Reduce rw_lock_debug_mutex contention
rw_lock_own(), rw_lock_own_flagged(): Traverse the rw_lock_t::debug_list
only after quickly checking if the thread is holding X-latch or SX-latch.
2019-09-27 14:22:59 +03:00
Marko Mäkelä
4ec0c346b8 Remove a useless large test, and add a debug assertion
The test innodb_fts.fulltext_table_evict was only creating 1000 tables
with fulltext indexes, only to check that no tables with fulltext
indexes are being evicted.

The reason why tables containing fulltext indexes cannot be evicted is
that fts_optimize_init() invokes dict_table_prevent_eviction().
2019-09-27 14:05:39 +03:00
Marko Mäkelä
ef701bfd07 Remove the unused function btr_page_get()
btr_block_get(): Remove #ifdef around the definition
2019-09-24 13:29:23 +03:00
Marko Mäkelä
60cb5559a9 MDEV-17614 post-fix: Remove dead dup_chk_only=true code.
The parameter dup_chk_only was always passed as a constant false.
Remove the parameter and the dead code related to it.
2019-09-23 08:29:39 +03:00
Thirunarayanan Balathandayuthapani
f94d9ab9f8 MDEV-20483 Follow-up fix
At commit, trx->lock.table_locks (which is a cache of trx_locks) can
consist of NULL pointers. Add a debug assertion for that, and clear
the vector.
2019-09-18 20:20:04 +05:30
Marko Mäkelä
bb4214272a Merge 10.1 into 10.2 2019-09-18 16:24:48 +03:00
Marko Mäkelä
a624b99f91 Remove an unused declaration 2019-09-18 16:10:03 +03:00
Thirunarayanan Balathandayuthapani
8a79fa0e4d MDEV-19529 InnoDB hang on DROP FULLTEXT INDEX
Problem:
=======
  While dropping an FTS index, InnoDB waits in fts_optimize_remove_table()
while holding dict_sys->mutex and dict_operation_lock, even though the
table id is not present in the queue. Meanwhile, fts_optimize_thread waits
for dict_sys->mutex in order to process an unrelated table id from the slot.

Solution:
========
  Whenever a table is added to fts_optimize_wq, update the fts_status
of the in-memory FTS subsystem to TABLE_IN_QUEUE. Whenever DROP INDEX
wants to remove a table from the queue, it can check the fts_status
to decide whether it should send MSG_DELETE_TABLE to the queue.

Remove the following functions, because they are all dead code:
dict_table_wait_for_bg_threads_to_exit(),
fts_wait_for_background_thread_to_start(), fts_start_shutdown(),
fts_shutdown().
2019-09-18 13:22:08 +05:30
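The idea in the solution of the entry above, tracking queue membership in a status flag so that a removal request is only sent for tables that are actually queued, can be sketched as follows. This is a single-threaded model under stated assumptions: the types fts_table_sketch, msg and optimize_queue_sketch are hypothetical, MSG_ADD_TABLE is an illustrative name, and the in_queue member merely stands in for the TABLE_IN_QUEUE state of fts_status.

```cpp
#include <cassert>
#include <deque>

// MSG_DELETE_TABLE is named in the commit message; MSG_ADD_TABLE is
// a hypothetical counterpart used only in this sketch.
enum msg_type { MSG_ADD_TABLE, MSG_DELETE_TABLE };

struct msg {
  msg_type type;
  unsigned long table_id;
};

struct fts_table_sketch {
  unsigned long id;
  bool in_queue;  // stands in for TABLE_IN_QUEUE in fts_status
};

struct optimize_queue_sketch {
  std::deque<msg> messages;

  // Enqueue the table once and remember that it is queued.
  void add_table(fts_table_sketch& t) {
    if (t.in_queue) return;
    messages.push_back(msg{MSG_ADD_TABLE, t.id});
    t.in_queue = true;
  }

  // Send a delete message only if the table is known to be queued.
  // Returns false when nothing was sent, so the caller has nothing
  // to wait for.
  bool remove_table(fts_table_sketch& t) {
    if (!t.in_queue) return false;
    messages.push_back(msg{MSG_DELETE_TABLE, t.id});
    t.in_queue = false;
    return true;
  }
};
```

In this model, a DROP INDEX on a table that was never queued returns immediately instead of waiting on the optimize thread while holding other locks.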
Thirunarayanan Balathandayuthapani
708f1e3419 MDEV-19647 Server hangs after dropping full text indexes and restart
- There is no need to add the table to fts_optimize_wq if there are
no FTS indexes associated with it.
2019-09-17 20:47:58 +05:30
Thirunarayanan Balathandayuthapani
fb3e3a6a3d MDEV-20483 trx_lock_t::table_locks is not a subset of trx_lock_t::trx_locks
Problem:
=======
  A transaction can be left with a nonempty table_locks list. This leads
to a situation where table_locks is not a subset of trx_locks. The problem
is that lock_wait_timeout_thread() doesn't remove the table lock from
the table_locks of the transaction.

Solution:
========
  In lock_wait_timeout_thread(), remove the lock from the table_locks
vector of the transaction.
2019-09-17 19:54:55 +05:30
Marko Mäkelä
0f950e53f0 MDEV-20562 btr_cur_open_at_rnd_pos() fails to return error for corrupted page
In mysql-server/commit@f46329044f
the InnoDB function btr_cur_open_at_rnd_pos() was corrected so that
it would return a status that indicates whether the cursor was
successfully positioned. But this change was not correctly merged to
MariaDB in 2e814d4702.

btr_cur_open_at_rnd_pos(): In the code path that was introduced in
MDEV-8588, properly return failure status.

No deterministic test case was found for this failure.
It was caught after removing the function
page_copy_rec_list_end_to_created_page() in a development branch.
As a result, the fill factor of index trees would improve, and
supposedly, so would the probability of btr_cur_open_at_rnd_pos()
reaching the intentionally corrupted page in the test
innodb.leaf_page_corrupted_during_recovery.
The wrong return value would cause
btr_estimate_number_of_different_key_vals() to wrongly invoke
btr_rec_get_externally_stored_len() on a non-leaf page and
trigger an assertion failure at the start of that function.
2019-09-11 15:30:19 +03:00
Thirunarayanan Balathandayuthapani
df4dee4b84 MDEV-17939 Assertion `++loop_count < 2' failed in trx_undo_report_rename
- During trx_undo_report_rename(), InnoDB can fail to write the undo log
record if it doesn't fit in the undo page. In that case, InnoDB
adds one more undo log page and retries writing the rename undo log.
But the assertion is wrong: it does not allow the write to fail even once.
2019-09-11 16:02:41 +05:30
Marko Mäkelä
43a6e81ccb MDEV-19514 preparation: Remove innodb_change_buffering_debug=2
The setting innodb_change_buffering_debug=2 was supposed to inject
a crash during change buffer merge. There is no public test for
that functionality, and even if there were, it would be better
to use DEBUG_SYNC to halt the thread that does change buffer merge,
force a redo log flush from another thread, and finally kill the
server externally.
2019-09-09 18:18:52 +03:00
Marko Mäkelä
292e2649d4 MDEV-12121: Avoid unused variable
With cmake -DWITH_INNODB_AHI=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo
the variable 'i' in fseg_free_extent() was declared but not used.
2019-09-06 12:50:53 +03:00
Marko Mäkelä
dae1b3b04c MDEV-15326: Backport trx_t::is_referenced()
Backport the applicable part of Sergey Vojtovich's commit
0ca2ea1a65 from MariaDB Server 10.3.

trx reference counter was updated under mutex and read without any
protection. This is both slow and unsafe. Use atomic operations for
reference counter accesses.
2019-09-04 09:42:38 +03:00
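The change described in the entry above, replacing a counter that was updated under a mutex but read without protection by atomic accesses, might look roughly like the sketch below. The struct trx_ref_sketch and its member names are hypothetical; only is_referenced() echoes a name from the commit subject, and the real trx_t has many more members and may use different memory ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the reference-counting part of trx_t.
struct trx_ref_sketch {
  std::atomic<std::int32_t> n_ref{0};

  // Every access is atomic, so readers need no mutex and never
  // race with a concurrent update.
  void reference() { n_ref.fetch_add(1); }

  void release_reference() {
    const std::int32_t old = n_ref.fetch_sub(1);
    assert(old > 0);  // a release must match an earlier reference
    (void) old;       // avoid an unused-variable warning with NDEBUG
  }

  bool is_referenced() const { return n_ref.load() > 0; }
};
```

The default sequentially consistent ordering is used here for simplicity; whether a weaker ordering would suffice depends on what else the counter is meant to synchronize.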
Marko Mäkelä
b07beff894 MDEV-15326: InnoDB: Failing assertion: !other_lock
MySQL 5.7.9 (and MariaDB 10.2.2) introduced a race condition
between InnoDB transaction commit and the conversion of implicit
locks into explicit ones.

The assertion failure can be triggered with a test that runs
3 concurrent single-statement transactions in a loop on a simple
table:

CREATE TABLE t (a INT PRIMARY KEY) ENGINE=InnoDB;
thread1: INSERT INTO t SET a=1;
thread2: DELETE FROM t;
thread3: SELECT * FROM t FOR UPDATE; -- or DELETE FROM t;

The failure scenarios are like the following:
(1) The INSERT statement is being committed, waiting for lock_sys->mutex.
(2) At the time of the failure, both the DELETE and SELECT transactions
are active but have not logged any changes yet.
(3) The transaction where the !other_lock assertion fails started
lock_rec_convert_impl_to_expl().
(4) After this point, the commit of the INSERT removed the transaction from
trx_sys->rw_trx_set, in trx_erase_lists().
(5) The other transaction consulted trx_sys->rw_trx_set and determined
that there is no implicit lock. Hence, it grabbed the lock.
(6) The !other_lock assertion fails in lock_rec_add_to_queue()
for the lock_rec_convert_impl_to_expl(), because the lock was 'stolen'.
This assertion failure looks genuine, because the INSERT transaction
is still active (trx->state=TRX_STATE_ACTIVE).

The problematic step (4) was introduced in
mysql/mysql-server@e27e0e0bb7
which fixed something related to MVCC (covered by the test
innodb.innodb-read-view). Basically, it reintroduced an error
that had been mentioned in an earlier commit
mysql/mysql-server@a17be6963f:
"The active transaction was removed from trx_sys->rw_trx_set prematurely."

Our fix goes along the following lines:

(a) Implicit locks will be released by assigning
trx->state=TRX_STATE_COMMITTED_IN_MEMORY as the first step.
This transition will no longer be protected by lock_sys_t::mutex,
only by trx->mutex. This idea is by Sergey Vojtovich.
(b) We detach the transaction from trx_sys before starting to release
explicit locks.
(c) All callers of trx_rw_is_active() and trx_rw_is_active_low() must
recheck trx->state after acquiring trx->mutex.
(d) Before releasing any explicit locks, we will ensure that any activity
by other threads to convert implicit locks into explicit will have ceased,
by checking !trx_is_referenced(trx). There was a glitch
in this check when it was part of lock_trx_release_locks(); at the end
we would release trx->mutex and acquire lock_sys->mutex and trx->mutex,
and fail to recheck (trx_is_referenced() is protected by trx_t::mutex).
(e) Explicit locks can be released in batches (LOCK_RELEASE_INTERVAL=1000)
just like we did before.

trx_t::state: Document that the transition to COMMITTED is only
protected by trx_t::mutex, no longer by lock_sys_t::mutex.

trx_rw_is_active_low(), trx_rw_is_active(): Document that the transaction
state should be rechecked after acquiring trx_t::mutex.

trx_t::commit_state(): New function to change a transaction to committed
state, to release implicit locks.

trx_t::release_locks(): New function to release the explicit locks
after commit_state().

lock_trx_release_locks(): Move much of the logic to the caller
(which must invoke trx_t::commit_state() and trx_t::release_locks()
as needed), and assert that the transaction will have locks.

trx_get_trx_by_xid(): Make the parameter a pointer to const.

lock_rec_other_trx_holds_expl(): Recheck trx->state after acquiring
trx->mutex, and avoid a redundant lookup of the transaction.

lock_rec_queue_validate(): Recheck impl_trx->state while holding
impl_trx->mutex.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low():
Document that the transaction state must be rechecked after
trx_mutex_enter().

trx_free_prepared(): Adjust for the changes to lock_trx_release_locks().
2019-09-04 09:42:38 +03:00
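Items (a) and (c) of the fix in the entry above can be modelled in a few lines. This is a deliberately simplified sketch: trx_model, trx_state_sketch and still_holds_implicit_lock() are hypothetical names, the transaction lookup and lock_sys are left out entirely, and only the rule "recheck the state after acquiring the transaction mutex" is shown.

```cpp
#include <cassert>
#include <mutex>

enum trx_state_sketch { STATE_ACTIVE, STATE_COMMITTED_IN_MEMORY };

// Hypothetical stand-in for the relevant part of trx_t.
struct trx_model {
  std::mutex mutex;
  trx_state_sketch state = STATE_ACTIVE;

  // Item (a): implicit locks are released by a state transition
  // that is protected only by the transaction's own mutex.
  void commit_state() {
    std::lock_guard<std::mutex> guard(mutex);
    state = STATE_COMMITTED_IN_MEMORY;
  }
};

// Item (c): a caller that found the transaction to be active earlier
// must recheck the state while holding the mutex, because the commit
// may have happened in between.
inline bool still_holds_implicit_lock(trx_model& trx) {
  std::lock_guard<std::mutex> guard(trx.mutex);
  return trx.state == STATE_ACTIVE;
}
```

A thread that wants to convert an implicit lock into an explicit one would, in this model, proceed only while still_holds_implicit_lock() returns true.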
Marko Mäkelä
7c79c12784 MDEV-15326 preparation: Remove trx_sys_t::n_prepared_trx
This is a backport of 900b07908b
from MariaDB Server 10.3.
2019-09-04 09:42:38 +03:00
Marko Mäkelä
b2775ae855 MVCC::view_close(): Correct comments 2019-09-04 09:42:38 +03:00
Marko Mäkelä
2842ae03bc Remove a bogus comment
Changes of PAGE_MAX_TRX_ID must be redo-logged for correctness.
That was fixed in the InnoDB Plugin for MySQL 5.1 already.
2019-08-28 15:27:35 +03:00
Marko Mäkelä
25af2a183b MDEV-15326/MDEV-16136 dead code removal
Revert part of fa2a74e08d.

trx_reference(): Remove, and merge the relevant part to the only caller
trx_rw_is_active(). If the statements trx = NULL; were ever executed,
the function would have dereferenced a NULL pointer and crashed in
trx_mutex_exit(trx). Hence, those statements must have been unreachable,
and they can be replaced with debug assertions.

trx_rw_is_active(): Avoid unnecessary acquisition and release of trx->mutex
when do_ref_count=false.

lock_trx_release_locks(): Do not reset trx->id=0. Had the statement been
necessary, we would have experienced crashes in trx_reference().
2019-08-27 16:38:57 +03:00
Marko Mäkelä
b01a95f6fc row_undo_mod_remove_clust_low(): Remove duplicated code
Some code was duplicated near the start of the function,
only for InnoDB, not XtraDB. This was noticed by
comparing the InnoDB between MariaDB and MySQL.
2019-08-22 17:37:13 +03:00
Marko Mäkelä
9de2e60d74 MDEV-17187 table doesn't exist in engine after ALTER of FOREIGN KEY
ha_innobase::open(): Always ignore problems with FOREIGN KEY constraints
(pass DICT_ERR_IGNORE_FK_NOKEY), no matter whether foreign_key_checks
is enabled. Instead, we must report errors when enforcing the FOREIGN KEY
constraints. As a result of ignoring these errors, the tables will be
loaded with dict_foreign_t objects whose foreign_index or referenced_index
will be NULL.

Also, pass DICT_ERR_IGNORE_FK_NOKEY instead of DICT_ERR_IGNORE_NONE
to dict_table_open_on_id_low() in many other cases. Notably, on
CREATE TABLE and ALTER TABLE, we will keep validating the FOREIGN KEY
constraints as before.

dict_table_open_on_name(): If no other flags than
DICT_ERR_IGNORE_FK_NOKEY are set, refuse access to unreadable tables.
Some encryption tests rely on this code path.

For the DML code path, we used to have the problem that when
one of the indexes was missing in dict_foreign_t, we would ignore
the FOREIGN KEY constraint altogether. The following changes
address that.

row_ins_check_foreign_constraints(): Add the parameter pk.
For the primary key, consider also foreign key constraints for which
foreign->foreign_index=NULL (no underlying index is available).

row_ins_check_foreign_constraint(): Report errors also for !check_ref.
Remove a redundant check for srv_read_only_mode.

row_ins_foreign_report_add_err(): Tolerate foreign->foreign_index=NULL.
2019-08-21 11:38:33 +03:00
Marko Mäkelä
e279c0076d MDEV-17187: Code cleanup
fkerr_t: Errors for the foreign key checks. Replaces ulint,
which used #define that looked like dberr_t literals.

wsrep_dict_foreign_find_index(): Remove. Use
dict_foreign_find_index() instead, with default parameters.

dict_foreign_push_index_error(): Do not add redundant quotes
around quoted table names.
2019-08-21 11:38:33 +03:00
Marko Mäkelä
ddaebdd210 dict_table_open_on_index_id(): Remove a redundant parameter 2019-08-21 11:38:33 +03:00