Commit graph

201 commits

Author SHA1 Message Date
Marko Mäkelä
3e6722d88d Cleanup: More trx_t member functions
trx_t::rollback(): Renamed from trx_rollback_to_savepoint().

trx_t::rollback_low(): Renamed from trx_rollback_to_savepoint_low().

fts_sql_commit(): Defined as an alias of trx_commit_for_mysql().
fts_sql_rollback(): Defined as an alias of trx_t::rollback().

fts_rename_aux_tables_to_hex_format(): Fix the error handling
that likely never worked because we failed to roll back the
first transaction.
2020-04-29 16:00:17 +03:00
Marko Mäkelä
496d0372ef Merge 10.4 into 10.5 2020-04-29 15:40:51 +03:00
Marko Mäkelä
cfbbf5424b MDEV-7962: Follow-up fix for 10.4
Replace wsrep_on() with trx_t::is_wsrep() where possible.

Also, rename some functions to member functions and
remove unused DBUG_EXECUTE_IF instrumentation:

trx_t::commit(): Renamed from trx_commit().

trx_t::commit_low(): Renamed from trx_commit_low().

trx_t::commit_in_memory(): Renamed from trx_commit_in_memory().
2020-04-29 11:50:03 +03:00
Marko Mäkelä
b63446984c Merge 10.3 into 10.4 2020-04-27 17:38:17 +03:00
Marko Mäkelä
2e12d471ea Merge 10.2 into 10.3 2020-04-27 14:24:41 +03:00
Marko Mäkelä
c06845d6f0 Merge 10.1 into 10.2 2020-04-27 13:28:13 +03:00
Marko Mäkelä
edbdfc2f99 MDEV-7962 wsrep_on() takes 0.14% in OLTP RO
The function wsrep_on() was being called rather frequently
in InnoDB and XtraDB. Let us cache it in trx_t and invoke
trx_t::is_wsrep() instead.

innobase_trx_init(): Cache trx->wsrep = wsrep_on(thd).

ha_innobase::write_row(): Replace many repeated calls to current_thd,
and test the cheapest condition first.
2020-04-27 11:18:11 +03:00
Marko Mäkelä
fc2f2fa853 MDEV-19747: Deprecate and ignore innodb_log_optimize_ddl
During native table rebuild or index creation, InnoDB used to skip
redo logging and write MLOG_INDEX_LOAD records to inform crash recovery
and Mariabackup of the gaps in redo log. This is fragile and prohibits
some optimizations, such as skipping the doublewrite buffer for
newly (re)initialized pages (MDEV-19738).

row_merge_write_redo(): Remove. We do not write MLOG_INDEX_LOAD
records any more. Instead, we write full redo log.

FlushObserver: Remove.

fseg_free_page_func(): Remove the parameter log. Redo logging
cannot be disabled.

fil_space_t::redo_skipped_count: Remove.

We cannot remove buf_block_t::skip_flush_check, because PageBulk
will temporarily generate invalid B-tree pages in the buffer pool.
2020-02-11 18:44:26 +02:00
Marko Mäkelä
68fe5f534c Merge 10.4 into 10.5 2020-01-07 14:10:15 +02:00
Marko Mäkelä
d60dcabd0f Merge 10.3 into 10.4 2020-01-07 13:23:41 +02:00
Marko Mäkelä
eda719793a Merge 10.2 into 10.3 2020-01-07 12:14:35 +02:00
Marko Mäkelä
82187a1221 MDEV-21429 TRUNCATE and OPTIMIZE are being refused due to "row size too large"
By default (innodb_strict_mode=ON), InnoDB attempts to guarantee
at DDL time that any INSERT to the table can succeed.
MDEV-19292 recently revised the "row size too large" check in InnoDB.
The check still is somewhat inaccurate;
that should be addressed in MDEV-20194.

Note: If a table contains multiple long string columns so that each column
is part of a column prefix index, then an UPDATE that attempts to modify
all those columns at once may fail, because the undo log record might
not fit in a single undo log page (of innodb_page_size). In the worst case,
the undo log record would grow by about 3KiB of for each updated column.

The DDL-time check (since the InnoDB Plugin for MySQL 5.1) is optional
in the sense that when the maximum B-tree record size or undo log
record size would be exceeded, the DML operation will fail and the
transaction will be properly rolled back.

create_table_info_t::row_size_is_acceptable(): Add the parameter
'bool strict' so that innodb_strict_mode=ON can be overridden during
TRUNCATE, OPTIMIZE and ALTER TABLE...FORCE (when the storage format
is not changing).

create_table_info_t::create_table(): Perform a sloppy check for
TRUNCATE TABLE (create_fk=false).

prepare_inplace_alter_table_dict(): Perform a sloppy check for
simple operations.

trx_is_strict(): Remove. The function became unused in
commit 98694ab0cb (MDEV-20949).
2020-01-07 11:02:12 +02:00
Marko Mäkelä
4081b7b27a Merge 10.4 into 10.5 2019-09-06 17:16:40 +03:00
Sergei Golubchik
244f0e6dd8 Merge branch '10.3' into 10.4 2019-09-06 11:53:10 +02:00
Marko Mäkelä
2c9e75ccfe MDEV-15326 after-merge fixes
trx_t::is_recovered: Revert most of the changes that were made by the
merge of MDEV-15326 from 10.2. The trx_sys.rw_trx_hash and the recovery
of transactions at startup is quite different in 10.3.

trx_free_at_shutdown(): Avoid excessive mutex protection. Reading fields
that can only be modified by the current thread (owning the transaction)
can be done outside mutex.

trx_t::commit_state(): Restore a tighter assertion.

trx_rollback_recovered(): Clarify why there is no potential race condition
with other transactions.

lock_trx_release_locks(): Merge with trx_t::release_locks(),
and avoid holding lock_sys.mutex unnecessarily long.

rw_trx_hash_t::find(): Remove redundant code, and avoid starving the
committer by checking trx_t::state before trx_t::reference().
2019-09-05 15:58:31 +03:00
Marko Mäkelä
537f8594a6 Merge 10.2 into 10.3 2019-09-04 17:52:04 +03:00
Marko Mäkelä
dae1b3b04c MDEV-15326: Backport trx_t::is_referenced()
Backport the applicable part of Sergey Vojtovich's commit
0ca2ea1a65 from MariaDB Server 10.3.

trx reference counter was updated under mutex and read without any
protection. This is both slow and unsafe. Use atomic operations for
reference counter accesses.
2019-09-04 09:42:38 +03:00
Marko Mäkelä
b07beff894 MDEV-15326: InnoDB: Failing assertion: !other_lock
MySQL 5.7.9 (and MariaDB 10.2.2) introduced a race condition
between InnoDB transaction commit and the conversion of implicit
locks into explicit ones.

The assertion failure can be triggered with a test that runs
3 concurrent single-statement transactions in a loop on a simple
table:

CREATE TABLE t (a INT PRIMARY KEY) ENGINE=InnoDB;
thread1: INSERT INTO t SET a=1;
thread2: DELETE FROM t;
thread3: SELECT * FROM t FOR UPDATE; -- or DELETE FROM t;

The failure scenarios are like the following:
(1) The INSERT statement is being committed, waiting for lock_sys->mutex.
(2) At the time of the failure, both the DELETE and SELECT transactions
are active but have not logged any changes yet.
(3) The transaction where the !other_lock assertion fails started
lock_rec_convert_impl_to_expl().
(4) After this point, the commit of the INSERT removed the transaction from
trx_sys->rw_trx_set, in trx_erase_lists().
(5) The other transaction consulted trx_sys->rw_trx_set and determined
that there is no implicit lock. Hence, it grabbed the lock.
(6) The !other_lock assertion fails in lock_rec_add_to_queue()
for the lock_rec_convert_impl_to_expl(), because the lock was 'stolen'.
This assertion failure looks genuine, because the INSERT transaction
is still active (trx->state=TRX_STATE_ACTIVE).

The problematic step (4) was introduced in
mysql/mysql-server@e27e0e0bb7
which fixed something related to MVCC (covered by the test
innodb.innodb-read-view). Basically, it reintroduced an error
that had been mentioned in an earlier commit
mysql/mysql-server@a17be6963f:
"The active transaction was removed from trx_sys->rw_trx_set prematurely."

Our fix goes along the following lines:

(a) Implicit locks will released by assigning
trx->state=TRX_STATE_COMMITTED_IN_MEMORY as the first step.
This transition will no longer be protected by lock_sys_t::mutex,
only by trx->mutex. This idea is by Sergey Vojtovich.
(b) We detach the transaction from trx_sys before starting to release
explicit locks.
(c) All callers of trx_rw_is_active() and trx_rw_is_active_low() must
recheck trx->state after acquiring trx->mutex.
(d) Before releasing any explicit locks, we will ensure that any activity
by other threads to convert implicit locks into explicit will have ceased,
by checking !trx_is_referenced(trx). There was a glitch
in this check when it was part of lock_trx_release_locks(); at the end
we would release trx->mutex and acquire lock_sys->mutex and trx->mutex,
and fail to recheck (trx_is_referenced() is protected by trx_t::mutex).
(e) Explicit locks can be released in batches (LOCK_RELEASE_INTERVAL=1000)
just like we did before.

trx_t::state: Document that the transition to COMMITTED is only
protected by trx_t::mutex, no longer by lock_sys_t::mutex.

trx_rw_is_active_low(), trx_rw_is_active(): Document that the transaction
state should be rechecked after acquiring trx_t::mutex.

trx_t::commit_state(): New function to change a transaction to committed
state, to release implicit locks.

trx_t::release_locks(): New function to release the explicit locks
after commit_state().

lock_trx_release_locks(): Move much of the logic to the caller
(which must invoke trx_t::commit_state() and trx_t::release_locks()
as needed), and assert that the transaction will have locks.

trx_get_trx_by_xid(): Make the parameter a pointer to const.

lock_rec_other_trx_holds_expl(): Recheck trx->state after acquiring
trx->mutex, and avoid a redundant lookup of the transaction.

lock_rec_queue_validate(): Recheck impl_trx->state while holding
impl_trx->mutex.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low():
Document that the transaction state must be rechecked after
trx_mutex_enter().

trx_free_prepared(): Adjust for the changes to lock_trx_release_locks().
2019-09-04 09:42:38 +03:00
Marko Mäkelä
25af2a183b MDEV-15326/MDEV-16136 dead code removal
Revert part of fa2a74e08d.

trx_reference(): Remove, and merge the relevant part to the only caller
trx_rw_is_active(). If the statements trx = NULL; were ever executed,
the function would have dereferenced a NULL pointer and crashed in
trx_mutex_exit(trx). Hence, those statements must have been unreachable,
and they can be replaced with debug assertions.

trx_rw_is_active(): Avoid unnecessary acquisition and release of trx->mutex
when do_ref_count=false.

lock_trx_release_locks(): Do not reset trx->id=0. Had the statement been
necessary, we would have experienced crashes in trx_reference().
2019-08-27 16:38:57 +03:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
Marko Mäkelä
e9c1701e11 Merge 10.3 into 10.4 2019-07-25 18:42:06 +03:00
Marko Mäkelä
fdef9f9b89 Merge 10.2 into 10.3 2019-07-25 15:31:11 +03:00
Marko Mäkelä
b6ac67389d Merge 10.1 into 10.2 2019-07-25 12:14:27 +03:00
Marko Mäkelä
10727b6953 Always initialize trx_t::start_time_micro
This affects the function has_higher_priority() for internal or
recovered transactions.
2019-07-24 21:59:26 +03:00
Marko Mäkelä
d09aec7a15 MDEV-19940 Clean up INFORMATION_SCHEMA.INNODB_ tables
Shorten some VARCHAR attributes to a more reasonable length.

INNODB_METRICS: Rename the column STATUS to ENABLED, and make it Boolean.

Replace with INT(1) many Boolean attributes that were declared as VARCHAR
containing 'NO','YES','disabled','enabled','Uninitialized','Initialized'.

Replace some VARCHAR attributes with ENUM.

Replace some BIGINT with INT when 32 bits are sufficient.

Remove INNODB_SYS_TABLESPACES.SPACE_TYPE. The type of a tablespace
can be derived from the tablespace ID. A fixed number is used for
the system tablespace and the temporary tablespace. All other tablespaces
are single-table or single-partition tablespaces.

i_s_locks_row_t::lock_type, lock_get_type_str(): Remove.
This is a redundant field. Table and record locks can be
distinguished by whether i_s_locks_row_t::lock_index is NULL.

fill_trx_row(): Do not unnecessarily copy the constant strings that
trx->op_info is pointing to.

i_s_locks_row_t::lock_mode: Replace string with integer.

lock_get_mode_str(), lock_get_trx_id(), lock_get_trx(): Remove.

field_store_ulint(): Remove.
2019-07-04 00:09:16 +03:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Marko Mäkelä
198ed24cac MDEV-19513: Rename dict_operation_lock to dict_sys.latch
dict_sys.lock(), dict_sys_lock(): Acquire both mutex and latch.

dict_sys.unlock(), dict_sys_unlock(): Release both mutex and latch.

dict_sys.assert_locked(): Assert that both mutex and latch are held.
2019-05-17 15:26:33 +03:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
c0ac0b8860 Update FSF address 2019-05-11 19:25:02 +03:00
Marko Mäkelä
e6bdf77e4b Merge 10.3 into 10.4
In is_eits_usable(), we disable an assertion that fails due to
MDEV-19334.
2019-04-25 16:05:20 +03:00
Marko Mäkelä
acf6f92aa9 Merge 10.2 into 10.3 2019-04-25 09:05:52 +03:00
Marko Mäkelä
bc145193c1 Merge 10.1 into 10.2 2019-04-25 09:04:09 +03:00
Marko Mäkelä
bfb0726fc2 Merge 5.5 into 10.1 2019-04-24 12:03:11 +03:00
Thirunarayanan Balathandayuthapani
d5da8ae04d MDEV-15772 Potential list overrun during XA recovery
InnoDB could return the same list again and again if the buffer
passed to trx_recover_for_mysql() is smaller than the number of
transactions that InnoDB recovered in XA PREPARE state.

We introduce the transaction state TRX_PREPARED_RECOVERED, which
is like TRX_PREPARED, but will be set during trx_recover_for_mysql()
so that each transaction will only be returned once.

Because init_server_components() is invoking ha_recover() twice,
we must reset the state of the transactions back to TRX_PREPARED
after returning the complete list, so that repeated traversals
will see the complete list again, instead of seeing an empty list.
Without this tweak, the test main.tc_heuristic_recover would hang
in MariaDB 10.1.
2019-04-24 11:46:14 +03:00
Marko Mäkelä
5c3ff5cb93 Merge 10.3 into 10.4 2019-04-02 11:04:54 +03:00
Sergei Golubchik
4e1d3f83b7 Merge branch '10.2' into 10.3 2019-03-29 19:41:41 +01:00
Marko Mäkelä
d0116e10a5 Revert MDEV-18464 and MDEV-12009
This reverts commit 21b2fada7a
and commit 81d71ee6b2.

The MDEV-18464 change introduces a few data race issues. Contrary to
the documentation, the field trx_t::victim is not always being protected
by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems
that KILL QUERY could wrongly avoid acquiring both mutexes when
invoking lock_trx_handle_wait_low(), in case another thread had
already set trx->victim=true.

We also revert MDEV-12009, because it should depend on the MDEV-18464
fix being present.
2019-03-28 12:39:50 +02:00
Thirunarayanan Balathandayuthapani
0623cc7c16 MDEV-19051 Avoid unnecessary writing MLOG_INDEX_LOAD
1) Avoid writing of MLOG_INDEX_LOAD redo log record during inplace
alter table when the table is empty and also for spatial index.

2) Avoid creation of temporary merge file for spatial index during
index creation process.
2019-03-28 10:55:18 +02:00
Jan Lindström
21b2fada7a MDEV-18464: Port kill_one_trx fixes from 10.4 to 10.1
Pushed the decision for innodb transaction and system
locking down to lock0lock.cc level. With this,
we can avoid releasing these mutexes for executions
where these mutexes were acquired upfront.

This patch will also fix BF aborting of native threads, e.g.
threads which have declared wsrep_on=OFF. Earlier, we have
used, for innodb trx locks, was_chosen_as_deadlock_victim
flag, for marking inodb transactions, which are victims for
wsrep BF abort. With native threads (wsrep_on==OFF), re-using
was_chosen_as_deadlock_victim flag may lead to inteference
with real deadlock, and to deal with this, the patch has added new
flag for marking wsrep BF aborts only: victim=true

Similar way if replication decides to abort one of the threads
we mark victim by: victim=true

innobase_kill_query
	Remove lock sys and trx mutex handling.

wsrep_innobase_kill_one_trx
	Mark victim trx with victim=true

trx0trx.h
	Remove trx_abort_t type and abort type variable from
	trx struct. Add victim variable to trx.

wsrep_kill_victim
	Remove abort_type

lock_report_waiters_to_mysql
	Take also trx mutex and mark trx as a victim for
	replication abort.

lock_trx_handle_wait_low
	New low level function to check whether the transaction
	has already been rolled back because it was selected as
	a deadlock victim, or if it has to wait then cancel
	the wait lock.

lock_trx_handle_wait
	If transaction is not marked as victim take lock sys
	and trx mutex before calling lock_trx_handle_wait_low
	and release them after that.

row_search_for_mysql
	Remove lock sys and trx mutex taking and releasing.

trx_rollback_to_savepoint_for_mysql_low
trx_commit_in_memory
	Clean up victim variable.
2019-03-28 07:40:03 +02:00
Brave Galera Crew
36a2a185fe Galera4 2019-01-23 15:30:00 +04:00
Sergey Vojtovich
67dbfe6e9c MDEV-17441 - InnoDB transition to C++11 atomics
trx_t::n_ref transition to Atomic_counter.
2018-12-27 22:46:38 +04:00
Marko Mäkelä
b374246730 Merge 10.3 into 10.4 2018-11-30 11:05:46 +02:00
Marko Mäkelä
0abd2766b1 Merge 10.2 into 10.3
Also, related to MDEV-15522, MDEV-17304, MDEV-17835,
remove the Galera xtrabackup tests, because xtrabackup never worked
with MariaDB Server 10.3 due to InnoDB redo log format changes.
2018-11-30 09:38:56 +02:00
Marko Mäkelä
447e493179 Remove some unnecessary InnoDB #include 2018-11-29 12:53:44 +02:00
Marko Mäkelä
dde2ca4aa1 Merge 10.3 into 10.4 2018-11-19 20:22:33 +02:00
Marko Mäkelä
fd58bb71e2 Merge 10.2 into 10.3 2018-11-19 18:45:53 +02:00
Marko Mäkelä
ff88e4bb8a Remove many redundant #include from InnoDB 2018-11-19 11:42:14 +02:00
Marko Mäkelä
f545e3cfa9 MDEV-15562: Remove dict_table_t::rollback_instant(unsigned n)
On the rollback of changes to SYS_COLUMNS, MDEV-15562 will
break the assumption that the only instantaneous changes to columns
are the addition to the end of the column list.

The function dict_table_t::rollback_instant(unsigned n)
is inherently incompatible with instantly dropping or reordering
columns.

When a change to SYS_COLUMNS is rolled back, we must simply evict
the affected table definition, at the end of the rollback. We cannot
free the table object immediately, because the current transaction
that is being rolled back may be holding a lock on the table and
its metadata record.

dict_table_remove_from_cache_low(): Replaced
by dict_table_remove_from_cache().

dict_table_remove_from_cache(): Add a third parameter keep=false,
so that the table can be freed by the caller.

trx_lock_t::evicted_tables: List of tables on which trx_t::evict_table()
was invoked.

trx_t::evict_table(): Evict a table definition during rollback.

trx_commit_in_memory(): Empty the trx->lock.evicted_tables list
after the locks were released, by freeing the table objects.

row_undo_ins_remove_clust_rec(), row_undo_mod_clust_low():
Invoke trx_t::evict_table() on the affected table if a change to
SYS_COLUMNS is being rolled back.
2018-10-10 12:47:46 +03:00
Marko Mäkelä
1eb2d8f6e8 Merge 10.2 into 10.3 2018-08-16 08:54:58 +03:00