
5245 commits

Author SHA1 Message Date
Sergey Vojtovich
ce04790065 MDEV-14482 - Cache line contention on ut_rnd_ulint_counter()
InnoDB RNG maintains global state, causing otherwise unnecessary bus
traffic. Even worse, this is cross-mutex traffic: that is, different
mutexes suffer from contention.

Fixed delay of 4 was verified to give best throughput by OLTP update
index and read-write benchmarks on Intel Broadwell (2/20/40) and
ARM (1/46/46).
2018-01-26 10:25:33 +04:00
Marko Mäkelä
92d233a512 MDEV-15061 TRUNCATE must honor InnoDB table locks
Traditionally, DROP TABLE and TRUNCATE TABLE discarded any locks that
may have been held on the table. This feels like an ACID violation.
Probably most occurrences of it were prevented by meta-data locks (MDL)
which were introduced in MySQL 5.5.

dict_table_t::n_foreign_key_checks_running: Reduce the number of
non-debug checks.

lock_remove_all_on_table(), lock_remove_all_on_table_for_trx(): Remove.

ha_innobase::truncate(): Acquire an exclusive InnoDB table lock
before proceeding. DROP TABLE and DISCARD/IMPORT were already doing
this.

row_truncate_table_for_mysql(): Convert the already started transaction
into a dictionary operation, and do not invoke lock_remove_all_on_table().

row_mysql_table_id_reassign(): Do not call lock_remove_all_on_table().
This function is only used in ALTER TABLE...DISCARD/IMPORT TABLESPACE,
which is already holding an exclusive InnoDB table lock.

TODO: Make n_foreign_key_checks_running a debug-only variable.
This would require two fixes:
(1) DROP TABLE: Exclusively lock the table beforehand, to prevent
the possibility of concurrently running foreign key checks (which
would acquire a table IS lock and then record S locks).
(2) RENAME TABLE: Find out if n_foreign_key_checks_running>0 actually
constitutes a potential problem.
2018-01-25 22:43:43 +02:00
Marko Mäkelä
9aa461b187 Minor cleanup
ReadView::ReadView(): Define inline, and remove the memset().

ReadView::~ReadView(): Use the default destructor.
2018-01-24 14:01:45 +02:00
Sergey Vojtovich
4575ae70da Plug a memory leak 2018-01-24 14:00:42 +02:00
Marko Mäkelä
9875d5c3e1 Merge bb-10.2-ext into 10.3 2018-01-24 14:00:33 +02:00
Howard Su
6fe953cb71 Fix build on OSX with 10.13 SDK 2018-01-24 11:28:00 +02:00
Marko Mäkelä
62740e02c8 Merge 10.2 into bb-10.2-ext 2018-01-24 11:15:11 +02:00
Marko Mäkelä
c269f1d6fe Allocate page_cleaner and page_cleaner.slot[] statically 2018-01-24 11:10:33 +02:00
Marko Mäkelä
ac3e7f788e MDEV-15016: multiple page cleaner threads use a lot of CPU
While the bug was reported as a regression of
MDEV-11025 Make number of page cleaner threads variable dynamic
in MariaDB Server 10.3, the code that MariaDB Server 10.2
inherited from MySQL 5.7.4 (WL#6642) looks prone to similar errors.

pc_flush_slot(): If there is no work to do, reset the is_requested
signal, to avoid potential busy-waiting in
buf_flush_page_cleaner_worker(). If the coordinator thread has shut
down, avoid resetting the is_requested event, to avoid a potential
hang at shutdown if there are multiple worker threads.
2018-01-24 11:10:33 +02:00
Alexander Barkov
ec6b8c546a Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2018-01-23 17:43:12 +04:00
Marko Mäkelä
29eeb527fd MDEV-12173 "[Warning] Trying to access missing tablespace"
ibuf_merge_or_delete_for_page(): Invoke fil_space_acquire_silent()
instead of fil_space_acquire() in order to avoid displaying
a useless message.

We know perfectly well that a tablespace can be dropped while a
change buffer merge is pending, because change buffer merges skip
any transactional locks.
2018-01-22 16:53:33 +02:00
Marko Mäkelä
89ae5d7f2f Allocate mutex_monitor, create_tracker statically 2018-01-22 16:30:38 +02:00
Marko Mäkelä
30f1d2f642 Remove useless method LatchCounter::sum_deregister() 2018-01-22 16:29:43 +02:00
Marko Mäkelä
d04e1d4bdc MDEV-15029 XA COMMIT and XA ROLLBACK operate on freed transaction object
innobase_commit_by_xid(), innobase_rollback_by_xid(): Decrement
the reference count before freeing the transaction object to the pool.
Failure to do so might corrupt the transaction bookkeeping
if trx_create_low() returns the same object to another thread
before we are done with it.

trx_sys_close(): Detach the recovered XA PREPARE transactions from
trx_sys->rw_trx_list before freeing them.
2018-01-22 16:25:37 +02:00
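
The release order described in d04e1d4bdc (MDEV-15029) can be illustrated with a minimal sketch. It assumes a simplified object pool; `Trx`, `TrxPool` and `finish_by_xid()` are hypothetical stand-ins and not the actual InnoDB types or functions. The point is only that the reference is dropped before the object becomes available for reuse.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a pooled transaction object (not InnoDB's trx_t).
struct Trx {
    int n_ref = 0;  // references held by threads that still use the object
};

// Hypothetical pool that hands out objects and takes them back for reuse.
class TrxPool {
public:
    Trx* get() {
        Trx* t;
        if (free_.empty()) {
            t = new Trx();
        } else {
            t = free_.back();
            free_.pop_back();
        }
        assert(t->n_ref == 0);  // a reused object must carry no stale references
        return t;
    }

    void put(Trx* t) {
        assert(t->n_ref == 0);  // nobody may still reference a pooled object
        free_.push_back(t);
    }

    ~TrxPool() {
        for (Trx* t : free_) delete t;
    }

private:
    std::vector<Trx*> free_;
};

// Drop our own reference first, and only then return the object to the pool,
// so that another caller of get() never sees a nonzero reference count.
inline void finish_by_xid(TrxPool& pool, Trx* t) {
    assert(t->n_ref > 0);
    --t->n_ref;
    pool.put(t);
}
```

With the opposite order, a second thread could obtain the same object from the pool while the first thread still decrements its counter, which is the kind of bookkeeping corruption the commit message warns about.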
Sergey Vojtovich
8e1adff959 Simplified away ReadView::complete()
It was supposed to be called outside the mutex, but was nevertheless called
under trx_sys.mutex for normal threads, adding one extra condition to the
critical section.
2018-01-22 16:23:16 +04:00
Sergey Vojtovich
4dc30f3c17 MDEV-15019 - InnoDB: store ReadView on trx
This will allow us to reduce critical section protected by
trx_sys.mutex:
- no need to maintain global m_free list
- eliminate if (trx->read_view == NULL) condition.

On x86_64 sizeof(ReadView) is 144 mostly due to padding, sizeof(trx_t)
with ReadView is 1200.

Also don't close ReadView for read-write transactions, just mark it
closed similarly to read-only.

Clean-up: removed n_prepared_recovered_trx and n_prepared_trx, which
accidentally re-appeared after some rebase.
2018-01-22 16:23:15 +04:00
Marko Mäkelä
c425dcd8f2 Merge 10.2 into bb-10.2-ext 2018-01-22 09:04:32 +02:00
Marko Mäkelä
4f8555f1f6 MDEV-14941 Timeouts on persistent statistics tables caused by MDEV-14511
MDEV-14511 tried to avoid some consistency problems related to InnoDB
persistent statistics. The persistent statistics are being written by
an InnoDB internal SQL interpreter that requires the InnoDB data dictionary
cache to be locked.

Before MDEV-14511, the statistics were written during DDL in separate
transactions, which could unnecessarily reduce performance (each commit
would require a redo log flush) and break atomicity, because the statistics
would be updated separately from the dictionary transaction.

However, because it is unacceptable to hold the InnoDB data dictionary
cache locked while suspending the execution for waiting for a
transactional lock (in the mysql.innodb_index_stats or
mysql.innodb_table_stats tables) to be released, any lock conflict
would immediately be reported as "lock wait timeout".

To fix MDEV-14941, an attempt to reduce these lock conflicts by acquiring
transactional locks on the user tables in both the statistics and DDL
operations was made, but it would still not entirely prevent lock conflicts
on the mysql.innodb_index_stats and mysql.innodb_table_stats tables.

Fixing the remaining problems would require a change that is too intrusive
for a GA release series, such as MariaDB 10.2.

Therefore, we revert the change MDEV-14511. To silence the
MDEV-13201 assertion, we use the pre-existing flag trx_t::internal.
2018-01-22 08:58:47 +02:00
Monty
27a5d96bcb Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext
Conflicts:
	sql/sp_rcontext.cc
2018-01-21 20:32:48 +02:00
Monty
f67b8273c0 Fixed wrong arguments to printf in InnoDB 2018-01-21 20:22:00 +02:00
Sergey Vojtovich
ec32c05072 Get rid of trx->read_view pointer juggling
trx->read_view|= 1 was done in a silly attempt to fix race condition
where trx->read_view was closed without trx_sys.mutex lock by read-only
transactions.

This just made the problem less likely to happen. In fact there was race
condition in const version of trx_get_read_view(): pointer may change to
garbage any moment after MVCC::is_view_active(trx->read_view) check and
before this function returns.

This patch doesn't fix this race condition, but rather makes its
consequences less destructive.
2018-01-20 16:10:38 +04:00
Sergey Vojtovich
95070bf939 MVCC simplifications
Simplified away MVCC::get_oldest_view()
Simplified away MVCC::get_view()
Removed unused MVCC::view_release()
2018-01-20 16:10:38 +04:00
Sergey Vojtovich
90bf55673e Misc trx_sys scalability fixes
trx_erase_lists(): trx->read_view is owned by current thread and thus
doesn't need trx_sys.mutex protection for reading its value. Move
trx->read_view check out of mutex.

trx_start_low(): moved assertion out of mutex.

Call ReadView::creator_trx_id() directly: this allows inlining this one-line
method.
2018-01-20 16:10:37 +04:00
Sergey Vojtovich
64048bafe0 Removed purge_trx_id_age and purge_view_trx_id_age
These were unused status variables available in debug builds only.
Also removed trx_sys.rw_max_trx_id: not used anymore.
2018-01-20 16:10:37 +04:00
Sergey Vojtovich
db5bb785f9 Allocate trx_sys.mvcc at link time
trx_sys.mvcc was allocated dynamically for no good reason.
2018-01-20 16:10:36 +04:00
Marko Mäkelä
f8882cce93 Replace trx_sys_t* trx_sys with trx_sys_t trx_sys
There is only one transaction system object in InnoDB.
Allocate the storage for it at link time, not at runtime.

lock_rec_fetch_page(): Use the correct fetch mode BUF_GET.
Pages may never be deallocated from a tablespace while
record locks are pointing to them.
2018-01-20 16:10:36 +04:00
Sergey Vojtovich
7078203389 MDEV-14756 - Remove trx_sys_t::rw_trx_list
Use atomic operations when accessing trx_sys_t::max_trx_id. We can't yet
move trx_sys_t::get_new_trx_id() out of mutex because it must be updated
atomically along with trx_sys_t::rw_trx_ids.
2018-01-20 16:10:35 +04:00
Sergey Vojtovich
c6d2842d9a MDEV-14756 - Remove trx_sys_t::rw_trx_list
Remove rw_trx_list.
2018-01-20 16:10:35 +04:00
Sergey Vojtovich
a447980ff3 MDEV-14756 - Remove trx_sys_t::rw_trx_list
Let lock_print_info_all_transactions() iterate rw_trx_hash instead of
rw_trx_list.

When printing info of locks for transactions, InnoDB monitor doesn't
attempt to read relevant page from disk anymore. The code was prone
to race conditions.

Note that TrxListIterator didn't work as advertised: it iterated
rw_trx_list only.
2018-01-20 16:10:34 +04:00
Sergey Vojtovich
886af392d3 MDEV-14756 - Remove trx_sys_t::rw_trx_list
Let trx_rollback_recovered() iterate rw_trx_hash instead of rw_trx_list.
2018-01-20 16:09:26 +04:00
Sergey Vojtovich
02270b44d0 MDEV-14756 - Remove trx_sys_t::rw_trx_list
Let lock_validate_table_locks(), lock_rec_other_trx_holds_expl(),
lock_table_locks_lookup(), trx_recover_for_mysql(), trx_get_trx_by_xid(),
trx_roll_must_shutdown(), fetch_data_into_cache() iterate rw_trx_hash
instead of rw_trx_list.
2018-01-20 16:09:26 +04:00
Sergey Vojtovich
d8c0caad32 MDEV-14756 - Remove trx_sys_t::rw_trx_list
Removed trx_sys_validate_trx_list(): with rw_trx_hash elements are not
required to be ordered by transaction id. Transaction state is now guarded
by asserts in rw_trx_hash_t.
2018-01-20 16:09:26 +04:00
Sergey Vojtovich
900b07908b MDEV-14756 - Remove trx_sys_t::rw_trx_list
Removed trx_sys_t::n_prepared_recovered_trx: never used.

Removed trx_sys_t::n_prepared_trx: used only at shutdown, we can perfectly
get this value from rw_trx_hash.
2018-01-20 16:09:26 +04:00
Sergey Vojtovich
a0b385ea2b MDEV-14756 - Remove trx_sys_t::rw_trx_list
Determine minimum transaction id by iterating rw_trx_hash, not rw_trx_list.

It is more expensive than previous implementation since it does linear
search, especially if there're many concurrent transactions running. But in
such case mutex is much bigger evil. And since it doesn't require
trx_sys->mutex protection it scales better.

For low concurrency, the performance difference is negligible.
2018-01-20 16:09:26 +04:00
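
The trade-off described in a0b385ea2b can be sketched as follows: the minimum id is found by a linear scan over an unordered container instead of being read from the head of an ordered list. This is a single-threaded illustration only; `RwTrxSet` and its `fallback` parameter are hypothetical, and `std::unordered_set` stands in for rw_trx_hash without modelling its concurrency behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

using trx_id_t = std::uint64_t;

// Hypothetical stand-in for a hash of active read-write transaction ids.
class RwTrxSet {
public:
    void insert(trx_id_t id) {
        assert(id != 0);  // assumption of this sketch: 0 means "no id assigned"
        ids_.insert(id);
    }

    void erase(trx_id_t id) { ids_.erase(id); }

    // Linear scan: O(n) in the number of active transactions.
    // `fallback` is returned when no smaller id is registered.
    trx_id_t min_id(trx_id_t fallback) const {
        trx_id_t result = fallback;
        for (trx_id_t id : ids_) {
            if (id < result) result = id;
        }
        return result;
    }

private:
    std::unordered_set<trx_id_t> ids_;  // elements are not ordered by id
};
```

The scan costs more per call than reading the first element of a sorted list, but it no longer depends on keeping a globally ordered structure consistent under one mutex.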
Sergey Vojtovich
868c77df3e MDEV-14756 - Remove trx_sys_t::rw_trx_list
Replaced UT_LIST_GET_LEN(trx_sys->rw_trx_list) with
trx_sys->rw_trx_hash.size().
Moved freeing of trx objects at shutdown to rw_trx_hash destructor.
Small clean-up in trx_rollback_recovered().
2018-01-20 16:09:26 +04:00
Sergey Vojtovich
d09f146934 MDEV-14756 - Remove trx_sys_t::rw_trx_list
Reduce divergence between trx_sys_t::rw_trx_hash and trx_sys_t::rw_trx_list
by not adding recovered COMMITTED transactions to trx_sys_t::rw_trx_list.

Such transactions are discarded immediately without creating trx object.

This also required splitting the rollback and cleanup phases of recovery. To
reflect these updates the following renames happened:
trx_rollback_or_clean_all_recovered() -> trx_rollback_all_recovered()
trx_rollback_or_clean_is_active -> trx_rollback_is_active
trx_rollback_or_clean_recovered() -> trx_rollback_recovered()
trx_cleanup_at_db_startup() -> trx_cleanup_recovered()

Also removed a hack from lock_trx_release_locks(). Instead, let the recovery
rollback thread skip committed XA transactions.
2018-01-20 16:09:26 +04:00
Marko Mäkelä
6c09a6542e MDEV-14985 innodb_undo_log_truncate may be blocked if transactions were recovered at startup
The field trx_rseg_t::trx_ref_count that was added in WL#6965 in
MySQL 5.7.5 is being incremented twice if a recovered transaction
includes both undo log partitions insert_undo and update_undo.

This reference count is being used in trx_purge(), which invokes
trx_purge_initiate_truncate() to try to truncate an undo tablespace
file. Because of the double-increment, the trx_ref_count would never
reach 0.

It is possible that after the failed truncation attempt, the undo
tablespace would be disabled for logging any new transactions until
the server is restarted (hopefully after committing or rolling back
all transactions, so that no transactions would be recovered
on the next startup).

trx_resurrect_insert(), trx_resurrect_update(): Do not increment
trx_ref_count. Instead, let the caller do that.

trx_lists_init_at_db_start(): Increment rseg->trx_ref_count only
once for each recovered transaction. Adjust comments.
Finally, if innodb_force_recovery prevents the undo log scan,
do not bother iterating the empty lists.
2018-01-18 16:26:09 +02:00
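
The counting rule from 6c09a6542e (MDEV-14985) can be sketched under simplifying assumptions. `Rseg`, `RecoveredTrx`, `register_recovered()` and `release_on_finish()` are hypothetical names for illustration; only the idea of incrementing the reference count once per recovered transaction, regardless of how many undo log partitions it has, is taken from the commit message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical rollback segment with a reference count that must reach 0
// before its undo tablespace can be truncated.
struct Rseg {
    int trx_ref_count = 0;
};

// Hypothetical recovered transaction that may own one or both undo partitions.
struct RecoveredTrx {
    Rseg* rseg;
    bool has_insert_undo;
    bool has_update_undo;
};

// The caller increments the count once per transaction, not once per
// undo log partition.
inline void register_recovered(const std::vector<RecoveredTrx>& trxs) {
    for (const RecoveredTrx& trx : trxs) {
        assert(trx.rseg != nullptr);
        if (trx.has_insert_undo || trx.has_update_undo) {
            ++trx.rseg->trx_ref_count;
        }
    }
}

// Each finished transaction releases exactly one reference.
inline void release_on_finish(const RecoveredTrx& trx) {
    assert(trx.rseg->trx_ref_count > 0);
    --trx.rseg->trx_ref_count;
}
```

If the count were incremented once per partition instead, a transaction with both partitions would leave the count at 1 after finishing, and the truncation condition would never be met.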
Marko Mäkelä
4ef2e43080 Merge bb-10.2-ext into 10.3 2018-01-17 16:33:40 +02:00
Marko Mäkelä
c6cd64f3cb Merge 10.2 into bb-10.2-ext 2018-01-17 16:22:27 +02:00
Marko Mäkelä
656f66def2 Follow-up fix to MDEV-14585 Automatically remove #sql- tables in InnoDB dictionary during recovery
If InnoDB is killed while ALTER TABLE...ALGORITHM=COPY is in progress,
after recovery there could be undo log records for some records that were
inserted into an intermediate copy of the table. Due to these undo log
records, InnoDB would resurrect locks at recovery, and the intermediate
table would be locked while we are trying to drop it. This would cause
a call to row_rename_table_for_mysql(), either from
row_mysql_drop_garbage_tables() or from the rollback of a RENAME
operation that was part of the ALTER TABLE.

row_rename_table_for_mysql(): Do not attempt to parse FOREIGN KEY
constraints when renaming from #sql-something to #sql-something-else,
because it does not make any sense.

row_drop_table_for_mysql(): When deferring DROP TABLE due to locks,
do not rename the table if its name already starts with the #sql-
prefix, which is what row_mysql_drop_garbage_tables() uses.
Previously, the too strict prefix #sql-ib was used, and some
tables were renamed unnecessarily.
2018-01-17 16:21:56 +02:00
Sergei Golubchik
8f102b584d Merge branch 'github/10.3' into bb-10.3-temporal 2018-01-17 00:45:02 +01:00
Sergei Golubchik
edb6375910 compilation warning on windows 2018-01-17 00:44:11 +01:00
Marko Mäkelä
f44017384a MDEV-14968 On upgrade, InnoDB reports "started; log sequence number 0"
srv_prepare_to_delete_redo_log_files(): Initialize srv_start_lsn.
2018-01-16 20:02:38 +02:00
Marko Mäkelä
d87531a6a0 Follow-up to MDEV-14952: Remove some more btr_get_search_latch()
Replace some !rw_lock_own() assertions with the stronger
!btr_search_own_any(). Remove some redundant btr_get_search_latch()
calls.

btr_search_update_hash_ref(): Remove a duplicated assertion.

btr_search_build_page_hash_index(): Remove a duplicated assertion.
rw_lock_s_lock() asserts that the latch is not being held.

btr_search_disable_ref_count(): Remove an assertion. The only caller
is acquiring all adaptive hash index latches.
2018-01-16 14:08:48 +02:00
Marko Mäkelä
2281fcf38a Follow-up fix to MDEV-14952 for Mariabackup
innodb_init_param(): Initialize btr_ahi_parts=1 for Mariabackup.

btr_search_enabled: Let the adaptive hash index be disabled
in Mariabackup. This would potentially only matter during --export,
and --export performs a table scan, not many index lookups.
2018-01-16 14:08:48 +02:00
Marko Mäkelä
be85c2dc88 Mariabackup --prepare: Do not access transactions or data dictionary
innobase_start_or_create_for_mysql(): Only start the data dictionary
and transaction subsystems in normal server startup and during
mariabackup --export.
2018-01-16 13:57:30 +02:00
Marko Mäkelä
33ecf8345d Follow-up fix to MDEV-14441: Fix a potential race condition
btr_cur_update_in_place(): Read block->index only once,
so that it cannot change to NULL after the first read.
When block->index != NULL, it must be equal to index.
2018-01-16 13:55:45 +02:00
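
The "read only once" pattern from 33ecf8345d can be sketched as below. The types `Index` and `Block`, the function `is_hashed()`, and the use of `std::atomic` are assumptions made for this illustration; the actual InnoDB code is structured differently.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for an index and a buffer block.
struct Index {
    int id;
};

struct Block {
    // May be reset to nullptr concurrently by another thread.
    std::atomic<const Index*> index{nullptr};
};

// Read block.index a single time into a local variable. All later decisions
// use the local copy, so the value cannot turn into nullptr between the
// check and the use.
inline bool is_hashed(const Block& block, const Index* index) {
    const Index* seen = block.index.load(std::memory_order_acquire);
    if (seen == nullptr) {
        return false;
    }
    assert(seen == index);  // when non-null, it must equal the expected index
    return true;
}
```

Reading the field twice, once for the null check and once for the comparison, would allow the second read to observe a different value than the first.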
Marko Mäkelä
822f4e6c10 Merge 10.2 into bb-10.2-ext 2018-01-16 07:51:02 +02:00
Marko Mäkelä
f5e158183c Follow-up fix to MDEV-14441: Correct a misplaced condition
btr_cur_update_in_place(): The call rw_lock_x_lock(ahi_latch) must
of course be inside the if (ahi_latch) condition. This is a mistake
that I made when backporting the fix-under-development from 10.3.
2018-01-16 07:50:15 +02:00
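
The corrected structure from f5e158183c, where the latch is acquired only when one was passed in, can be sketched like this. `std::mutex` stands in for the adaptive hash index latch, and the function signature is hypothetical.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical simplified update: ahi_latch may be nullptr when the caller
// has no adaptive hash index latch to take.
inline void update_in_place(std::mutex* ahi_latch, int* rec, int new_value) {
    assert(rec != nullptr);

    std::unique_lock<std::mutex> guard;  // owns nothing by default
    if (ahi_latch) {
        // The lock call belongs inside the condition; dereferencing a null
        // latch outside of it would be an error.
        guard = std::unique_lock<std::mutex>(*ahi_latch);
    }

    *rec = new_value;
    // guard releases the latch on scope exit, if it acquired one.
}
```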
Marko Mäkelä
0664d633e4 MDEV-14952 Avoid repeated calls to btr_get_search_latch()
btr_cur_search_to_nth_level(), btr_search_guess_on_hash(),
btr_pcur_open_with_no_init_func(), row_sel_open_pcur():
Replace the parameter has_search_latch with the ahi_latch
(passed as NULL if the caller does not hold the latch).

btr_search_update_hash_node_on_insert(),
btr_search_update_hash_on_insert(),
btr_search_build_page_hash_index(): Add the parameter ahi_latch.

btr_search_x_lock(), btr_search_x_unlock(),
btr_search_s_lock(), btr_search_s_unlock(): Remove.
2018-01-15 19:51:09 +02:00