Commit graph

2912 commits

Author SHA1 Message Date
Marko Mäkelä
d346763479 Merge 10.5 into 10.6 2021-03-08 10:51:31 +02:00
Marko Mäkelä
a5d3c1c819 Merge 10.4 into 10.5 2021-03-08 10:16:20 +02:00
Marko Mäkelä
a26e7a3726 Merge 10.3 into 10.4 2021-03-08 09:39:54 +02:00
Marko Mäkelä
03ff588d15 Merge 10.5 into 10.6 2021-03-05 16:05:47 +02:00
Marko Mäkelä
10d544aa7b Merge 10.4 into 10.5 2021-03-05 12:54:43 +02:00
Marko Mäkelä
8bab5bb332 Merge 10.3 into 10.4 2021-03-05 10:36:51 +02:00
Varun Gupta
f691d9865b MDEV-7317: Make an index ignorable to the optimizer
This feature adds the functionality of ignorability for indexes.
Indexes are not ignored by default.

To control index ignorability explicitly for a new index,
use IGNORE or NOT IGNORE as part of the index definition for
CREATE TABLE, CREATE INDEX, or ALTER TABLE.

Primary keys (explicit or implicit) cannot be made ignorable.

The table INFORMATION_SCHEMA.STATISTICS gets a new column named IGNORED that
stores whether an index is to be ignored or not.
2021-03-04 22:50:00 +05:30
Vicențiu Ciorbaru
e9b8b76f47 Merge branch '10.2' into 10.3 2021-03-04 16:04:30 +02:00
Thirunarayanan Balathandayuthapani
b044898b97 MDEV-24748 extern column check missing in btr_index_rec_validate()
In btr_index_rec_validate(), the check for externally stored
columns is missing when comparing the length of the field
with the length of the field data stored in the record.
Fetch the length of the externally stored part and compare it
with the fixed field length.
2021-03-03 17:20:43 +05:30
Marko Mäkelä
ddbc612692 Merge 10.2 into 10.3 2021-03-03 09:41:50 +02:00
Monty
676987c4a1 MDEV-24532 Table corruption ER_NO_SUCH_TABLE_IN_ENGINE .. on table with foreign key
When doing a TRUNCATE on an InnoDB table under LOCK TABLES, InnoDB would
rename the old table to #sql-... and create a new 't1' table. The table
lock would still be on the #sql- table.

When doing ALTER TABLE, InnoDB would do the changes on the #sql table
(which would disappear on close).
When the SQL layer, as part of inline ALTER TABLE, would close the
original t1 table (#sql in InnoDB) and then reopen the t1 table, InnoDB
would notice that this does not match its own (old) t1 table and
generate an error.

Fixed by adding code to TRUNCATE TABLE so that, if we are under LOCK TABLES
and truncating an InnoDB table, we close, reopen and lock the
table after the truncate. This will remove the #sql table and ensure that
LOCK TABLES is using the new empty table.

Reviewer: Marko Mäkelä
2021-03-02 15:23:56 +02:00
Marko Mäkelä
7cf4419fc4 MDEV-24789: Reduce lock_sys.wait_mutex contention
A performance regression was introduced by
commit e71e613353 (MDEV-24671)
and mostly addressed by
commit 455514c800.

The regression is likely caused by increased contention on
lock_sys.latch (formerly lock_sys.mutex), possibly indirectly
caused by contention on lock_sys.wait_mutex. This change aims to
reduce both, but further improvements will be needed.

lock_wait(): Minimize the lock_sys.wait_mutex hold time.

lock_sys_t::deadlock_check(): Add a parameter for indicating
whether lock_sys.latch is exclusively locked.

trx_t::was_chosen_as_deadlock_victim: Always use atomics.

lock_wait_wsrep(): Assume that no mutex is being held.

Deadlock::report(): Always kill the victim transaction.

lock_sys_t::timeout: New counter to back MONITOR_TIMEOUT.
2021-02-26 14:58:48 +02:00
Marko Mäkelä
5c9229b96f MDEV-24951 Assertion m.first->second.valid(trx->undo_no) failed
trx_t::commit_in_memory(): Invoke mod_tables.clear().

trx_free_at_shutdown(): Invoke mod_tables.clear() for transactions
that are discarded on shutdown.

Everywhere else, assert mod_tables.empty() on freed transaction objects.
2021-02-24 15:49:58 +02:00
Marko Mäkelä
7953bae22a Merge 10.5 into 10.6 2021-02-24 09:30:17 +02:00
Sergei Golubchik
f33e57a9e6 Merge branch '10.4' into 10.5 2021-02-23 13:06:22 +01:00
Sergei Golubchik
e841957416 Merge branch '10.3' into 10.4 2021-02-23 09:25:57 +01:00
Sergei Golubchik
0ab1e3914c Merge branch '10.2' into 10.3 2021-02-22 22:42:27 +01:00
Marko Mäkelä
93522bc9a9 MDEV-24917 Page cleaner wrongly remains idle
commit a993310593 (MDEV-24537)
introduced the regression that the page cleaner will keep sleeping
even if there is work to do.

innodb_max_dirty_pages_pct_update(): Always wake up the page cleaner
on any SET GLOBAL innodb_max_dirty_pages_pct= assignment.

buf_flush_page_cleaner(): If innodb_max_dirty_pages_pct is nonzero,
consult only that parameter when determining whether there is work
to do. Else, consult innodb_max_dirty_pages.
2021-02-18 18:20:50 +02:00
Marko Mäkelä
94b4578704 Merge 10.5 into 10.6 2021-02-17 19:39:05 +02:00
Marko Mäkelä
c68007d958 MDEV-24738 Improve the InnoDB deadlock checker
A new configuration parameter innodb_deadlock_report is introduced:
* innodb_deadlock_report=off: Do not report any details of deadlocks.
* innodb_deadlock_report=basic: Report transactions and waiting locks.
* innodb_deadlock_report=full (default): Report also the blocking locks.

The improved deadlock checker will consider all involved transactions
in one loop, even if the deadlock loop includes several transactions.
The theoretical maximum number of transactions that can be involved in
a deadlock is `innodb_page_size` * 8, limited by the persistent data
structures.

Note: Similar to
mysql/mysql-server@3859219875
our deadlock checker will consider at most one blocking transaction
for each waiting transaction. The new field trx->lock.wait_trx will be
nullptr if and only if trx->lock.wait_lock is nullptr. Note that
trx->lock.wait_lock->trx == trx (the waiting transaction), while
trx->lock.wait_trx points to one of the transactions whose lock is
conflicting with trx->lock.wait_lock.

Considering only one blocking transaction will greatly simplify
our deadlock checker, but it may also make the deadlock checker
blind to some deadlocks where the deadlock cycle is 'hidden' by
the fact that the registered trx->lock.wait_trx is not actually
waiting for any InnoDB lock, but something else. So, instead of
deadlocks, sometimes lock wait timeout may be reported.

To improve on this, whenever trx->lock.wait_trx is changed, we
will register further 'candidate' transactions in Deadlock::to_check(),
and check for 'revealed' deadlocks as soon as possible, in lock_release()
and innobase_kill_query().

The old DeadlockChecker was holding lock_sys.latch, even though using
lock_sys.wait_mutex should be less contended (and thus preferred)
in the likely case that no deadlock is present.

lock_wait(): Defer the deadlock check to this function, instead of
executing it in lock_rec_enqueue_waiting(), lock_table_enqueue_waiting().

DeadlockChecker: Complete rewrite:
(1) Explicitly keep track of transactions that are being waited for,
in trx->lock.wait_trx, protected by lock_sys.wait_mutex. Previously,
we were painstakingly traversing the lock heaps while blocking
concurrent registration or removal of any locks (even uncontended ones).
(2) Use Brent's cycle-detection algorithm for deadlock detection,
traversing each trx->lock.wait_trx edge at most 2 times.
(3) If a deadlock is detected, release lock_sys.wait_mutex,
acquire LockMutexGuard, re-acquire lock_sys.wait_mutex and re-invoke
find_cycle() to find out whether the deadlock is still present.
(4) Display information on all transactions that are involved in the
deadlock, and choose a victim to be rolled back.

lock_sys.deadlocks: Replaces lock_deadlock_found. Protected by wait_mutex.

Deadlock::find_cycle(): Quickly find a cycle of trx->lock.wait_trx...
using Brent's cycle detection algorithm.

Deadlock::report(): Report a deadlock cycle that was found by
Deadlock::find_cycle(), and choose a victim with the least weight.
Altogether, we may traverse each trx->lock.wait_trx edge up to 5
times (2*find_cycle()+1 time for reporting and choosing the victim).

Deadlock::check_and_resolve(): Find and resolve a deadlock.

lock_wait_rpl_report(): Report the waits-for information to
replication. This used to be executed as part of DeadlockChecker.
Replication must know the waits-for relations even if no deadlocks
are present in InnoDB.

Reviewed by: Vladislav Vaintroub
2021-02-17 12:44:08 +02:00
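
The message of c68007d958 (MDEV-24738) above says that Deadlock::find_cycle() follows trx->lock.wait_trx edges using Brent's cycle-detection algorithm. The following is a minimal, standalone sketch of that algorithm on a toy waits-for graph, not the InnoDB code. The names NONE and find_cycle_sketch, and the representation of transactions as vector indexes, are assumptions made for illustration only. Each transaction waits for at most one other transaction, as in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker for "not waiting for anyone" (wait_trx == nullptr).
constexpr std::size_t NONE = static_cast<std::size_t>(-1);

// wait_trx[i] is the index of the transaction that transaction i waits for,
// or NONE. Returns the index of some transaction on a cycle that is
// reachable from start, or NONE if the chain of waits ends.
std::size_t find_cycle_sketch(const std::vector<std::size_t>& wait_trx,
                              std::size_t start)
{
  assert(start < wait_trx.size());
  std::size_t tortoise = start;
  std::size_t hare = wait_trx[start];
  std::size_t power = 1; // current search window, doubled on each restart
  std::size_t steps = 1; // steps taken by the hare since the last restart

  while (hare != NONE && hare != tortoise) {
    if (power == steps) {
      // Window exhausted: move the tortoise to the hare, double the window.
      tortoise = hare;
      power *= 2;
      steps = 0;
    }
    hare = wait_trx[hare];
    ++steps;
  }
  return hare; // NONE, or a transaction that lies on a cycle
}
```

Note that the returned transaction is merely some member of the cycle; the start transaction itself may only lead into the cycle without being part of it. Choosing a victim, as Deadlock::report() is described to do, would need a further pass over the cycle.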
Marko Mäkelä
3ddb4fddf1 MDEV-24738: Extend the test innodb.deadlock_detect 2021-02-17 12:34:24 +02:00
Marko Mäkelä
067465cd2f MDEV-15641 fixup: Make the test faster
Let us avoid the excessive allocation of explicit record locks
(a work-around of MDEV-24813) so that the test will execute
much faster under AddressSanitizer, MemorySanitizer, Valgrind.
2021-02-16 12:07:48 +02:00
Marko Mäkelä
e926964cb8 Remove useless test innodb.innodb_bug60049
The test innodb.innodb_bug60049 used to check that the record
(ID,NAME)=(12,'SYS_FOREIGN_COLS') is the last record in the
secondary index of the system table SYS_TABLES.
But, ever since commit 2336558423
or mysql/mysql-server@082d59670f
that record no longer is the last one in the table!

The more recent test innodb.purge_secondary covers the purge
functionality much better.
2021-02-15 18:12:31 +02:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Marko Mäkelä
2e84846ec0 MDEV-24861 Assertion `trx->rsegs.m_redo.rseg' failed in innodb_prepare_commit_versioned
trx_t::commit_tables(): Ensure that mod_tables will be empty.
This was broken in commit b08448de64
where the query cache invalidation was moved from lock_release().
2021-02-15 10:19:57 +02:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Marko Mäkelä
da3211e487 MDEV-24763 fixup: Use deterministic ORDER BY 2021-02-12 14:03:25 +02:00
Marko Mäkelä
6f3f191cfa MDEV-24763 ALTER TABLE fails to rename a column in SYS_FIELDS
innobase_rename_column_try(): When renaming SYS_FIELDS records
for secondary indexes, try to use both formats of SYS_FIELDS.POS
as keys, in case the PRIMARY KEY includes a column prefix.

Without this fix, an ALTER TABLE that renames a column followed
by a server restart (or LRU eviction of the table definition
from dict_sys) would make the table inaccessible.
2021-02-12 09:48:36 +02:00
Marko Mäkelä
028ba10d0b MDEV-18468 fixup: Make test case robust w.r.t. deferred DROP TABLE 2021-02-12 09:41:15 +02:00
Thirunarayanan Balathandayuthapani
a2fbbba2e3 MDEV-24832 Root page AHI removal fails during rollback of bulk insert
This failure is caused by commit 43ca6059ca
(MDEV-24720). InnoDB fails to remove the AHI entries
during the rollback of a bulk insert operation. InnoDB should
remove the AHI entries of the root page before reinitialising it.

Reviewed-by: Marko Mäkelä
2021-02-10 15:27:25 +05:30
Marko Mäkelä
c42ee8a7cf MDEV-24781 fixup: Adjust innodb.innodb-index-debug
Now that an INSERT into an empty table is replicated more efficiently
during online ALTER, an old test case started to fail. Let us disable
the MDEV-515 logic for the critical INSERT statement.
2021-02-05 08:32:57 +02:00
Thirunarayanan Balathandayuthapani
597510adfc MDEV-24781 Assertion `mode == 16 || mode == 12 || fix_block->page.status != buf_page_t::FREED' failed in buf_page_get_low
This is caused by commit 3cef4f8f0f
(MDEV-515). dict_table_t::clear() frees all the BLOBs during the
rollback of a bulk insert. But the online log tries to read the
freed BLOBs while applying the log. It can be fixed if we
truncate the online log during rollback of bulk insert operation.
2021-02-05 10:32:36 +05:30
Marko Mäkelä
5f46385764 MDEV-24731 Excessive mutex contention in DeadlockChecker::check_and_resolve()
The DeadlockChecker expects to be able to freeze the waits-for graph.
Hence, it is best executed somewhere where we are not holding any
additional mutexes.

lock_wait(): Defer the deadlock check to this function, instead
of executing it in lock_rec_enqueue_waiting(), lock_table_enqueue_waiting().

DeadlockChecker::trx_rollback(): Merge with the only caller,
check_and_resolve().

LockMutexGuard: RAII accessor for lock_sys.mutex.

lock_sys.deadlocks: Replaces lock_deadlock_found.

trx_t: Clean up some comments.
2021-02-04 16:38:07 +02:00
Thirunarayanan Balathandayuthapani
43ca6059ca MDEV-24720 AHI removal during rollback of bulk insert
InnoDB fails to remove the AHI entries during the rollback
of a bulk insert operation. InnoDB reports an error when
validating the AHI hash tables. InnoDB should remove
the AHI entries while freeing the segment only during
bulk index rollback operation.

Reviewed-by: Marko Mäkelä
2021-02-02 19:24:05 +05:30
Marko Mäkelä
1110beccd4 Merge 10.5 into 10.6 2021-02-02 15:15:53 +02:00
Marko Mäkelä
324e5f02a9 MDEV-24754 Crash in ha_partition_inplace_ctx::~ha_partition_inplace_ctx()
ha_innobase::commit_inplace_alter_table(): Fix a regression that was
introduced in 6d1f1b61b5 (MDEV-24564).
2021-02-01 18:45:35 +02:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Marko Mäkelä
a70a47f2f3 MDEV-24661: Remove the test innodb.innodb_wl6326_big
The purpose of the test was to ensure that the SX (update) mode of
index tree and buffer page latches is being used.

The test has become unstable, possibly due to changes related to
buf_pool.mutex and buf_pool.page_hash, or to the use of MDL in the
purge of transaction history.

In 10.6, the test depends on instrumentation that was refactored
or removed in MDEV-24142.

The use of different latching modes can better be indirectly observed
through high-concurrency benchmarks. For MDEV-14637, a performance test
was conducted where the finer-grained latching and
BTR_CUR_FINE_HISTORY_LENGTH were removed. It caused a 20% performance
regression for UPDATE and a somewhat smaller one for INSERT.

Any new problem with latching granularity should be easily caught by
performance testing, or by stress tests with Random Query Generator.
2021-01-29 18:03:20 +02:00
Marko Mäkelä
c411393a84 MDEV-24715 Assertion !node->table->skip_alter_undo in CREATE...REPLACE SELECT
In commit 3cef4f8f0f (MDEV-515)
we inadvertently broke CREATE TABLE...REPLACE SELECT statements
by wrongly disabling row-level undo logging.

select_create::prepare(): Only invoke extra(HA_EXTRA_BEGIN_ALTER_COPY)
if no special treatment of duplicates is needed.
2021-01-28 15:26:53 +02:00
Marko Mäkelä
c6308355e5 MDEV-24612 fixup: Skip the test for --embedded 2021-01-28 07:51:43 +02:00
Marko Mäkelä
5dd028f8ee MDEV-24700 Assertion "lock not found"==0 in lock_table_x_unlock()
After an ignored INSERT IGNORE statement into an empty table, we would
wrongly use the MDEV-515 table-level undo logging for a subsequent
REPLACE statement.

ha_innobase::reset_template(): Clear m_prebuilt->ins_node->bulk_insert
on every statement boundary.

ha_innobase::start_stmt(): Invoke end_bulk_insert().

ha_innobase::extra(): Avoid accessing m_prebuilt->trx. Do not call
thd_to_trx(). Invoke end_bulk_insert() and try to reset bulk_insert
when changing the REPLACE or IGNORE settings.

trx_mod_table_time_t::WAS_BULK: Use a distinct value from BULK.

trx_undo_report_row_operation(): Add debug assertions.

Note: Some calls to end_bulk_insert() may be redundant, but statement
boundaries are not always clear in the API (especially in the
presence of LOCK TABLES or stored procedures).
2021-01-27 16:54:38 +02:00
Marko Mäkelä
3f871b3394 MDEV-515 fixup: Cover dict_table_t::clear() during ADD INDEX 2021-01-25 19:48:09 +02:00
Marko Mäkelä
3cef4f8f0f MDEV-515 Reduce InnoDB undo logging for insert into empty table
We implement an idea that was suggested by Michael 'Monty' Widenius
in October 2017: When InnoDB is inserting into an empty table or partition,
we can write a single undo log record TRX_UNDO_EMPTY, which will cause
ROLLBACK to clear the table.

For this to work, the insert into an empty table or partition must be
covered by an exclusive table lock that will be held until the transaction
has been committed or rolled back, or the INSERT operation has been
rolled back (and the table is empty again), in lock_table_x_unlock().

Clustered index records that are covered by the TRX_UNDO_EMPTY record
will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
be distinguished from what MDEV-12288 leaves behind after purging the
history of row-logged operations.

Concurrent non-locking reads must be adjusted: If the read view was
created before the INSERT into an empty table, then we must continue
to imagine that the table is empty, and not try to read any records.
If the read view was created after the INSERT was committed, then
all records must be visible normally. To implement this, we introduce
the field dict_table_t::bulk_trx_id.

This special handling only applies to the very first INSERT statement
of a transaction for the empty table or partition. If a subsequent
statement in the transaction is modifying the initially empty table again,
we must enable row-level undo logging, so that we will be able to
roll back to the start of the statement in case of an error (such as
duplicate key).

INSERT IGNORE will continue to use row-level logging and locking, because
implementing it would require the ability to roll back the latest row.
Since the undo log that we write only allows us to roll back the entire
statement, we cannot support INSERT IGNORE. We will introduce a
handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
engines that INSERT IGNORE is being executed.

In many test cases, we add an extra record to the table, so that during
the 'interesting' part of the test, row-level locking and logging will
be used.

Replicas will continue to use row-level logging and locking until
MDEV-24622 has been addressed. Likewise, this optimization will be
disabled in Galera cluster until MDEV-24623 enables it.

dict_table_t::bulk_trx_id: The latest active or committed transaction
that initiated an insert into an empty table or partition.
Protected by exclusive table lock and a clustered index leaf page latch.

ins_node_t::bulk_insert: Whether bulk insert was initiated.

trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
Unlike earlier, this collection will cover also temporary tables.

trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
is_bulk_insert(), was_bulk_insert().

trx_undo_report_row_operation(): Before accessing any undo log pages,
invoke trx->mod_tables.emplace() in order to determine whether undo
logging was disabled, or whether this is the first INSERT and we are
supposed to write a TRX_UNDO_EMPTY record.

row_ins_clust_index_entry_low(): If we are inserting into an empty
clustered index leaf page, set the ins_node_t::bulk_insert flag for
the subsequent trx_undo_report_row_operation() call.

lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
Remove the redundant parameter 'flags' that can be checked in the caller.

btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().

trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
the next statement will not be covered by table-level undo logging.

ReadView::changes_visible(trx_id_t) const: New accessor for the case
where the trx_id_t is not read from a potentially corrupted index page
but directly from the memory. In this case, we can skip a sanity check.

row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(),
row_sel_try_search_shortcut_for_mysql(),
row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.

row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().

lock_sec_rec_cons_read_sees(): Replaced with lower-level code.

btr_root_page_init(): Refactored from btr_create().

dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
for the ROLLBACK of an INSERT operation.

ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
into an empty table.

This is joint work with Thirunarayanan Balathandayuthapani,
who created a working prototype.
Thanks to Matthias Leich for extensive testing.
2021-01-25 18:41:27 +02:00
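
The message of 3cef4f8f0f (MDEV-515) above describes how non-locking reads consult dict_table_t::bulk_trx_id: a read view that cannot see the transaction that inserted into the empty table must keep treating the table as empty. The following toy model illustrates only that rule. ToyReadView, ToyTable and visible_rows are hypothetical names, the visibility test is a simplification of what a real read view does, and the per-record visibility checks that InnoDB still performs are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// Hypothetical, simplified read view: a snapshot of which transactions
// had not yet committed when the view was created.
struct ToyReadView {
  trx_id_t low_limit_id;        // ids >= this started after the view
  std::vector<trx_id_t> active; // ids that were uncommitted at creation

  bool changes_visible(trx_id_t id) const {
    if (id >= low_limit_id) return false;
    return std::find(active.begin(), active.end(), id) == active.end();
  }
};

// Hypothetical table: bulk_trx_id is 0 if no insert into the empty
// table has been recorded.
struct ToyTable {
  trx_id_t bulk_trx_id;
  std::size_t n_rows;
};

// Number of rows a non-locking read would consider in this toy model.
std::size_t visible_rows(const ToyTable& table, const ToyReadView& view)
{
  if (table.bulk_trx_id != 0 && !view.changes_visible(table.bulk_trx_id))
    return 0; // the view predates the bulk insert: the table looks empty
  return table.n_rows;
}
```

In this model, a view created before the insert, or while the inserting transaction is still active, sees zero rows; a view created after the commit sees all of them.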
Marko Mäkelä
e9fc61053d Merge 10.5 into 10.6 2021-01-25 15:12:24 +02:00
Marko Mäkelä
927a882341 Merge 10.4 into 10.5 2021-01-25 15:06:52 +02:00
Marko Mäkelä
e626f511f9 MDEV-24653 fixup: Make the test deterministic 2021-01-25 14:56:38 +02:00
Marko Mäkelä
5db3827689 Merge 10.3 into 10.4 2021-01-25 14:43:07 +02:00
Marko Mäkelä
75538f94ca MDEV-24653 fixup: Make the test deterministic 2021-01-25 14:40:22 +02:00
Marko Mäkelä
0c3d264207 instant_alter_debug: Cover everything with innodb_instant_alter_column 2021-01-25 13:56:10 +02:00
Marko Mäkelä
46234f03c8 Merge 10.5 into 10.6 2021-01-25 12:56:30 +02:00