Commit graph

542 commits

Author SHA1 Message Date
Marko Mäkelä
ff88e4bb8a Remove many redundant #include from InnoDB 2018-11-19 11:42:14 +02:00
Marko Mäkelä
59b87e75d0 Fix a comment 2018-11-12 18:06:41 +02:00
Marko Mäkelä
abcd09c95a mtr_t::start(): Remove unused parameters
The parameters bool sync=true, bool read_only=false of mtr_t::start()
were added in
eca5b0fc17
(MySQL 5.7.3).

The parameter read_only was never used anywhere.
The parameter sync was only copied around, and would be returned
by the unused function mtr_t::is_async().

We do not need this dead code in MariaDB.
2018-11-01 10:48:56 +02:00
Marko Mäkelä
dc91ea5bb7 MDEV-12023 Assertion failure sym_node->table != NULL on startup
row_drop_table_for_mysql(): Avoid accessing non-existing dictionary tables.

dict_create_or_check_foreign_constraint_tables(): Add debug instrumentation
for creating and dropping a table before the creation of any non-core
dictionary tables.

trx_purge_add_update_undo_to_history(): Adjust a debug assertion, so that
it will not fail due to the test instrumentation.
2018-10-30 15:53:55 +02:00
Marko Mäkelä
6319c0b541 MDEV-13564: Replace innodb_unsafe_truncate with innodb_safe_truncate
Rename the 10.2-specific configuration option innodb_unsafe_truncate
to innodb_safe_truncate, and invert its value.

The default (for now) is innodb_safe_truncate=OFF, to avoid
disrupting users with an undo and redo log format change within
a Generally Available (GA) release series.
2018-10-11 15:10:13 +03:00
Marko Mäkelä
3448ceb02a MDEV-13564: Implement innodb_unsafe_truncate=ON for compatibility
While MariaDB Server 10.2 is not really guaranteed to be compatible
with Percona XtraBackup 2.4 (for example, the MySQL 5.7 undo log format
change that could be present in XtraBackup, but was reverted from
MariaDB in MDEV-12289), we do not want to disrupt users who have
deployed xtrabackup and MariaDB Server 10.2 in their environments.

With this change, MariaDB 10.2 will continue to use the backup-unsafe
TRUNCATE TABLE code, so that neither the undo log nor the redo log
formats will change in an incompatible way.

Undo tablespace truncation will keep using the redo log only. Recovery
or backup with old code will fail to shrink the undo tablespace files,
but the contents will be recovered just fine.

In the MariaDB Server 10.2 series only, we introduce the configuration
parameter innodb_unsafe_truncate and make it ON by default. To allow
MariaDB Backup (mariabackup) to work properly with TRUNCATE TABLE
operations, use loose_innodb_unsafe_truncate=OFF.

MariaDB Server 10.3.10 and later releases will always use the
backup-safe TRUNCATE TABLE, and this parameter will not be
added there.

recv_recovery_rollback_active(): Skip row_mysql_drop_garbage_tables()
unless innodb_unsafe_truncate=OFF. It is too unsafe to drop orphan
tables if RENAME operations are not transactional within InnoDB.

LOG_HEADER_FORMAT_10_3: Replaces LOG_HEADER_FORMAT_CURRENT.

log_init(), log_group_file_header_flush(),
srv_prepare_to_delete_redo_log_files(),
innobase_start_or_create_for_mysql(): Choose the redo log format
and subformat based on the value of innodb_unsafe_truncate.
2018-10-11 08:17:04 +03:00
Marko Mäkelä
75f8e86f57 MDEV-17158 TRUNCATE is not atomic after MDEV-13564
It turned out that ha_innobase::truncate() would prematurely
commit the transaction already before the completion of the
ha_innobase::create(). All of this must be atomic.

innodb.truncate_crash: Use the correct DEBUG_SYNC point, and
tolerate non-truncation of the table, because the redo log
for the TRUNCATE transaction commit might be flushed due to
some InnoDB background activity.

dict_build_tablespace_for_table(): Merge to the function
dict_build_table_def_step().

dict_build_table_def_step(): If a table is being created during
an already started data dictionary transaction (such as TRUNCATE),
persistently write the table_id to the undo log header before
creating any file. In this way, the recovery of TRUNCATE will be
able to delete the new file before rolling back the rename of
the original table.

dict_table_rename_in_cache(): Add the parameter replace_new_file,
used as part of rolling back a TRUNCATE operation.

fil_rename_tablespace_check(): Add the parameter replace_new.
If the parameter is set and a file identified by new_path exists,
remove a possible tablespace and also the file.

create_table_info_t::create_table_def(): Remove some debug assertions
that no longer hold. During TRUNCATE, the transaction will already
have been started (and performed a rename operation) before the
table is created. Also, remove a call to dict_build_tablespace_for_table().

create_table_info_t::create_table(): Add the parameter create_fk=true.
During TRUNCATE TABLE, do not add FOREIGN KEY constraints to the
InnoDB data dictionary, because they will also not be removed.

row_table_add_foreign_constraints(): If trx=NULL, do not modify
the InnoDB data dictionary, but only load the FOREIGN KEY constraints
from the data dictionary.

ha_innobase::create(): Lock the InnoDB data dictionary cache only
if no transaction was passed by the caller. Unlock it in any case.

innobase_rename_table(): Add the parameter commit = true.
If !commit, do not lock or unlock the data dictionary cache.

ha_innobase::truncate(): Lock the data dictionary before invoking
rename or create, and let ha_innobase::create() unlock it and
also commit or roll back the transaction.

trx_undo_mark_as_dict(): Renamed from trx_undo_mark_as_dict_operation()
and declared global instead of static.

row_undo_ins_parse_undo_rec(): If table_id is set, this must
be rolling back the rename operation in TRUNCATE TABLE, and
therefore replace_new_file=true.
2018-09-10 14:59:58 +03:00
Marko Mäkelä
cf2a4426a2 MDEV-14717 RENAME TABLE in InnoDB is not crash-safe
This is a backport of commit 0bc36758ba
and commit 9eb3fcc9fb.

InnoDB in MariaDB 10.2 appears to only write MLOG_FILE_RENAME2
redo log records during table-rebuilding ALGORITHM=INPLACE operations.
We must write the records for any .ibd file renames, so that the
operations are crash-safe.

If InnoDB is killed during a RENAME TABLE operation, it can happen that
the transaction for updating the data dictionary will be rolled back.
But, nothing will roll back the renaming of the .ibd file
(the MLOG_FILE_RENAME2 only guarantees roll-forward), or for that matter,
the renaming of the dict_table_t::name in the dict_sys cache. We introduce
the undo log record TRX_UNDO_RENAME_TABLE to fix this.

fil_space_for_table_exists_in_mem(): Remove the parameters
adjust_space, table_id and some code that was trying to work around
these deficiencies.

fil_name_write_rename(): Write a MLOG_FILE_RENAME2 record.

dict_table_rename_in_cache(): Invoke fil_name_write_rename().

trx_undo_rec_copy(): Set the first 2 bytes to the length of the
copied undo log record.

trx_undo_page_report_rename(), trx_undo_report_rename():
Write a TRX_UNDO_RENAME_TABLE record with the old table name.

row_rename_table_for_mysql(): Invoke trx_undo_report_rename()
before modifying any data dictionary tables.

row_undo_ins_parse_undo_rec(): Roll back TRX_UNDO_RENAME_TABLE
by invoking dict_table_rename_in_cache(), which will take care
of both renaming the table and the file.

ha_innobase::truncate(): Remove a work-around.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
e67b1070bb MDEV-17049 Enable innodb_undo tests on buildbot
Remove the innodb_undo suite, and move and adapt the tests.
Remove unnecessary restarts, and add innodb_page_size_small.inc
for combinations.

innodb.undo_truncate is the merge of innodb_undo.truncate
and innodb_undo.truncate_multi_client.

Add the global status variable innodb_undo_truncations.
Without this, the test innodb.undo_truncate would occasionally
report that truncation did not happen. The test was only waiting
for the history list length to reach 0, but the undo tablespace
truncation would only take place some time after that.

Undo tablespace truncation will only occasionally occur with
innodb_page_size=32k, and typically never occur (with this amount
of undo log operations) with innodb_page_size=64k. We disable
these combinations.

innodb.undo_truncate_recover was formerly called
innodb_undo.truncate_recover.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
055a3334ad MDEV-13564 Mariabackup does not work with TRUNCATE
Implement undo tablespace truncation via normal redo logging.

Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.

Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.

In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.

ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.

rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.

== TRUNCATE TABLE ==

WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.

In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.

A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.

ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.

ha_innobase::delete(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.

ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.

create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.

row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.

row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().

dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.

row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.

The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.

We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.

== Undo tablespace truncation ==

MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similar to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.

We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.

recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.

namespace undo: Remove some unnecessary declarations.

fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.

fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.

buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.

fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.

fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.

os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.

fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].

recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.

trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.

trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.

recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.

recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.

buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).

trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
9258097fa3 Merge 10.1 into 10.2 2018-08-21 15:20:34 +03:00
Marko Mäkelä
dc7c080369 MDEV-17026 Assertion srv_undo_sources || ... failed on slow shutdown
trx_purge_add_update_undo_to_history(): Relax the too strict assertion
by removing the condition on srv_fast_shutdown (innodb_fast_shutdown).
Rollback is allowed during any form of shutdown.
2018-08-21 12:33:41 +03:00
Marko Mäkelä
fa2a74e08d MDEV-16136: Prevent wrong reuse in trx_reference()
If trx_free() and trx_create_low() were called while a call to
trx_reference() was pending, we could get a reference to a wrong
transaction object.

trx_reference(): Return NULL if the trx->id no longer matches.

lock_trx_release_locks(): Assign trx->id = 0, so that trx_reference()
will not return a reference to this object.

trx_cleanup_at_db_startup(): Assign trx->id = 0.

assert_trx_is_free(): Assert !trx->n_ref. Assert trx->id == 0,
now that it will be cleared as part of a transaction commit.
2018-08-16 05:54:11 +03:00
Marko Mäkelä
6583506570 Rename lock_pool_t to lock_list
Also, replace reverse iteration with forward iteration.

lock_table_has(): Remove a redundant condition.
2018-08-16 05:53:50 +03:00
Marko Mäkelä
3ce8a0fc49 MDEV-16136: Simplify trx_lock_t memory management
Allocate trx->lock.rec_pool and trx->lock.table_pool directly from trx_t.
Remove unnecessary use of std::vector.

In order to do this, move some definitions from lock0priv.h to
lock0types.h, so that ib_lock_t will not be an opaque type.
2018-08-16 05:53:38 +03:00
Marko Mäkelä
814ae57daf Merge 10.1 into 10.2 2018-08-03 13:02:56 +03:00
Marko Mäkelä
769f6d2db7 Fix -Wclass-memaccess in WSREP,InnoDB,XtraDB 2018-08-03 12:21:13 +03:00
Marko Mäkelä
0d3972c6be Merge 10.0 into 10.1 2018-08-03 12:03:10 +03:00
Marko Mäkelä
9dfef6e29b Fix -Wclass-memaccess warnings in InnoDB,XtraDB 2018-08-03 11:53:57 +03:00
Sergey Vojtovich
a1b2336199 Optimized away excessive condition
trx_set_rw_mode() is never called for read-only transactions, this is guarded
by callers.

Removing this condition from critical section immediately gives 5% scalability
improvement in OLTP index updates benchmark.
2018-08-03 08:32:17 +03:00
Oleksandr Byelkin
865e807125 Merge branch '10.0' into 10.1 2018-07-31 11:58:29 +02:00
Marko Mäkelä
d17e9a02c4 Fix InnoDB/XtraDB warnings by GCC 8.2.0 2018-07-30 14:05:24 +03:00
Marko Mäkelä
73af075366 Reduce the number of rw_lock_own() calls
Replace pairs of rw_lock_own() calls with
calls to rw_lock_own_flagged().

These calls are only present in debug builds.
2018-07-23 17:49:01 +03:00
Marko Mäkelä
fb5d579462 MDEV-13987 Hang in FLUSH TABLES...FOR EXPORT
trx_purge_stop(): Release purge_sys->latch before attempting to
wake up the purge threads, so that they can actually wake up.
This is a regression of commit a13a636c74
which attempted to fix MDEV-11802 by ensuring that srv_purge_wakeup()
will actually wait until all purge threads wake up.
Due to the purge_sys->latch, the threads might never wake up,
because some purge threads could end up waiting for purge_sys->latch
in trx_undo_prev_version_build() while holding dict_operation_lock
in shared mode. This in turn would block any DDL operations, the
InnoDB master thread, and InnoDB shutdown.
2018-05-14 18:19:20 +03:00
Marko Mäkelä
7b5543b21d MDEV-15030 Add ASAN instrumentation to trx_t Pool
Pool::mem_free(): Poison the freed memory. Assert that it was
fully initialized, because the reuse of trx_t objects will
assume that the objects were previously initialized.

Pool::~Pool(), Pool::get(): Unpoison the allocated memory,
and mark it initialized.

trx_free(): After invoking Pool::mem_free(), unpoison
trx_t::mutex and trx_t::undo_mutex, because MutexMonitor
will access these even for freed trx_t objects.
2018-04-24 20:33:27 +03:00
Thirunarayanan Balathandayuthapani
211842dd86 MDEV-15374 Server hangs and aborts with long semaphore wait or assertion `len < ((ulint) srv_page_size)' fails in trx_undo_rec_copy upon ROLLBACK on temporary table
Problem:
=======
InnoDB cleans all temporary undo logs during commit. During rollback
of secondary index entry, InnoDB tries to build the previous version
of clustered index. It leads to access of freed undo page during
previous transaction commit and it leads to undo log corruption.

Solution:
=========
During rollback, temporary undo logs should not try to build
the previous version of the record.
2018-04-23 11:22:58 +05:30
Thirunarayanan Balathandayuthapani
341edddc3d MDEV-15826 Purge attempts to free BLOB page after BEGIN;INSERT;UPDATE;ROLLBACK
- During rollback, redo segments priorities over no-redo rollback
segments and it leads to failure of redo rollback segment undo
logs truncation.
2018-04-18 12:39:39 +05:30
Vicențiu Ciorbaru
45e6d0aebf Merge branch '10.1' into 10.2 2018-04-10 17:43:18 +03:00
Marko Mäkelä
8eff803a1b Revert "MDEV-14705: Do not rollback on InnoDB shutdown"
This reverts commit 76ec37f522.

This behaviour change will be done separately in:
MDEV-15832 With innodb_fast_shutdown=3, skip the rollback
of connected transactions
2018-04-10 08:55:20 +03:00
Daniel Black
1479273cdb MDEV-14705: slow innodb startup/shutdown can exceed systemd timeout
Use systemd EXTEND_TIMEOUT_USEC to advise systemd of progress

Move towards progress measures rather than pure time based measures.

Progress reporting at numberious shutdown/startup locations incuding:
* For innodb_fast_shutdown=0 trx_roll_must_shutdown() for rolling back incomplete transactions.
* For merging the change buffer (in srv_shutdown(bool ibuf_merge))
* For purging history, srv_do_purge

Thanks Marko for feedback and suggestions.
2018-04-06 09:58:14 +03:00
Marko Mäkelä
76ec37f522 MDEV-14705: Do not rollback on InnoDB shutdown
row_undo_step(): If fast shutdown has been requested, abort the
rollback of any non-DDL transactions. Starting with MDEV-12323,
we aborted the rollback of recovered transactions. These
transactions would be rolled back on subsequent server startup.

trx_roll_report_progress(): Renamed from trx_roll_must_shutdown(),
now that the shutdown check has been moved to the only caller.
2018-04-06 09:58:14 +03:00
Marko Mäkelä
bd7ed1b923 MDEV-13935 INSERT stuck at state Unlocking tables
Revert the dead code for MySQL 5.7 multi-master replication (GCS),
also known as
WL#6835: InnoDB: GCS Replication: Deterministic Deadlock Handling
(High Prio Transactions in InnoDB).

Also, make innodb_lock_schedule_algorithm=vats skip SPATIAL INDEX,
because the code does not seem to be compatible with them.

Add FIXME comments to some SPATIAL INDEX locking code. It looks
like Galera write-set replication might not work with SPATIAL INDEX.
2018-03-16 15:50:04 +02:00
Marko Mäkelä
27ea2963fc Dead code removal: sess_t
The session object is not really needed for anything.
We can directly create and free the dummy purge_sys->query->trx.
2018-02-15 10:01:05 +02:00
Marko Mäkelä
d9955b22e9 Merge 10.1 into 10.2 2018-02-13 14:49:47 +02:00
Marko Mäkelä
2202afd541 Merge 10.0 into 10.1 2018-02-13 14:32:17 +02:00
Marko Mäkelä
c051eaba46 MDEV-14988 innodb_read_only tries to modify files if transactions were recovered in COMMITTED state
lock_trx_release_locks(): Relax a debug assertion to allow
recovered TRX_STATE_COMMITTED_IN_MEMORY transactions.

trx_commit_in_memory(): Add DEBUG_SYNC instrumentation.

trx_undo_insert_cleanup(): Skip persistent changes if innodb_read_only
is set. This should only happen when a recovered committed transaction
would be cleaned up at shutdown.
2018-02-13 14:29:32 +02:00
Marko Mäkelä
db25305780 MDEV-14407 Assertion failure during rollback
Rollback attempted to dereference DB_ROLL_PTR=0, which cannot possibly
be a valid undo log pointer. A safe canonical value would be
roll_ptr_t(1) << ROLL_PTR_INSERT_FLAG_POS
which is what was chosen in MDEV-12288.

This bug was reproduced in 10.3 only. Potentially, the problem could
have been introduced by MDEV-11415, which suppresses undo logging for
ALGORITHM=COPY operations. In those operations, we should actually
have written the safe value of DB_ROLL_PTR instead of writing 0.
However, the test in commit 5421e3aee7
demonstrates that access to the rebuilt table by earlier-started
transactions should actually have been refused with ER_TABLE_DEF_CHANGED.

btr_cur_ins_lock_and_undo(): When undo logging is disabled, use the
safe value of DB_ROLL_PTR.

btr_cur_optimistic_insert(): Validate the DB_TRX_ID,DB_ROLL_PTR before
inserting into a clustered index leaf page.

ins_node_t::sys_buf[]: Replaces row_id_buf and trx_id_buf and some
heap usage.

row_ins_alloc_sys_fields(): Initialize ins_node_t::sys_buf[].

trx_undo_page_report_modify(): Assert that the DB_ROLL_PTR is not 0.

trx_undo_get_undo_rec_low(): Assert that the roll_ptr is valid before
trying to dereference it.

dict_index_t::is_primary(): Check if the index is the primary key.
2018-02-08 13:55:57 +02:00
Sergei Golubchik
4771ae4b22 Merge branch 'github/10.1' into 10.2 2018-02-06 14:50:50 +01:00
Sergei Golubchik
d4df7bc9b1 Merge branch 'github/10.0' into 10.1 2018-02-02 10:09:44 +01:00
Marko Mäkelä
97a39ba212 Follow-up to reverting MDEV-6938
Do not call mtr_t::start() with trx_t*.
2018-02-01 18:53:33 +02:00
Marko Mäkelä
67d89e4d7d MDEV-15143 InnoDB: Rollback of trx with id 0 completed
When InnoDB has completed the rollback of a recovered transaction,
it used to display the transaction identifier.

This was broken in MySQL 5.7.2 in
2f5f3cd3ac
which was merged to MariaDB 10.2.2 in
commit 2e814d4702.

trx_rollback_active(): Cache the transaction ID before it will be
reset by transaction commit. Do not display the message if the
rollback was interrupted by shutdown (MDEV-13797, MDEV-12352).
2018-01-31 12:06:46 +02:00
Marko Mäkelä
0ba6aaf030 MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY
If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
a lot of time rolling back writes to the intermediate copy of the table.
To reduce the amount of busy work done, a work-around was introduced in
commit fd069e2bb3 in MySQL 4.1.8 and 5.0.2,
to commit the transaction after every 10,000 inserted rows.

A proper fix would have been to disable the undo logging altogether and
to simply drop the intermediate copy of the table on subsequent server
startup. This is what happens in MariaDB 10.3 with MDEV-14717,MDEV-14585.
In MariaDB 10.2, the intermediate copy of the table would be left behind
with a name starting with the string #sql.

This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
contributed by jixianliang <271365745@qq.com>.

Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
InnoDB must for now keep the undo logging enabled, so that the latest
row can be rolled back in case of an error.

In Galera cluster, the LOAD DATA statement will retain the existing
behaviour and commit the transaction after every 10,000 rows if
the parameter wsrep_load_data_splitting=ON is set. The logic to do
so (the wsrep_load_data_split() function and the call
handler::extra(HA_EXTRA_FAKE_START_STMT)) are joint work
by Ji Xianliang and Marko Mäkelä.

The original fix:

Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date:   Wed Dec 2 16:09:15 2015 +0530

Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY

Problem:

During ALTER TABLE, we commit and restart the transaction for every
10,000 rows, so that the rollback after recovery would not take so long.

Fix:

Suppress the undo logging during copy alter operation. If fts_index is
present then insert directly into fts auxiliary table rather
than doing at commit time.

ha_innobase::num_write_row: Remove the variable.

ha_innobase::write_row(): Remove the hack for committing every 10000 rows.

row_lock_table_for_mysql(): Remove the extra 2 parameters.

lock_get_src_table(), lock_is_table_exclusive(): Remove.

Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
2018-01-30 20:24:23 +02:00
Marko Mäkelä
706ed8552d Revert "MDEV-6928: Add trx pointer to struct mtr_t"
This reverts commit 3486135bb5.

The commit comment ended in the words: "This is needed later."
Apparently the "later" never arrived.
2018-01-29 11:05:17 +02:00
Marko Mäkelä
d04e1d4bdc MDEV-15029 XA COMMIT and XA ROLLBACK operate on freed transaction object
innobase_commit_by_xid(), innobase_rollback_by_xid(): Decrement
the reference count before freeing the transaction object to the pool.
Failure to do so might corrupt the transaction bookkeeping
if trx_create_low() returns the same object to another thread
before we are done with it.

trx_sys_close(): Detach the recovered XA PREPARE transactions from
trx_sys->rw_trx_list before freeing them.
2018-01-22 16:25:37 +02:00
Marko Mäkelä
4f8555f1f6 MDEV-14941 Timeouts on persistent statistics tables caused by MDEV-14511
MDEV-14511 tried to avoid some consistency problems related to InnoDB
persistent statistics. The persistent statistics are being written by
an InnoDB internal SQL interpreter that requires the InnoDB data dictionary
cache to be locked.

Before MDEV-14511, the statistics were written during DDL in separate
transactions, which could unnecessarily reduce performance (each commit
would require a redo log flush) and break atomicity, because the statistics
would be updated separately from the dictionary transaction.

However, because it is unacceptable to hold the InnoDB data dictionary
cache locked while suspending the execution for waiting for a
transactional lock (in the mysql.innodb_index_stats or
mysql.innodb_table_stats tables) to be released, any lock conflict
was immediately be reported as "lock wait timeout".

To fix MDEV-14941, an attempt to reduce these lock conflicts by acquiring
transactional locks on the user tables in both the statistics and DDL
operations was made, but it would still not entirely prevent lock conflicts
on the mysql.innodb_index_stats and mysql.innodb_table_stats tables.

Fixing the remaining problems would require a change that is too intrusive
for a GA release series, such as MariaDB 10.2.

Thefefore, we revert the change MDEV-14511. To silence the
MDEV-13201 assertion, we use the pre-existing flag trx_t::internal.
2018-01-22 08:58:47 +02:00
Marko Mäkelä
6c09a6542e MDEV-14985 innodb_undo_log_truncate may be blocked if transactions were recovered at startup
The field trx_rseg_t::trx_ref_count that was added in WL#6965 in
MySQL 5.7.5 is being incremented twice if a recovered transaction
includes both undo log partitions insert_undo and update_undo.

This reference count is being used in trx_purge(), which invokes
trx_purge_initiate_truncate() to try to truncate an undo tablespace
file. Because of the double-increment, the trx_ref_count would never
reach 0.

It is possible that after the failed truncation attempt, the undo
tablespace would be disabled for logging any new transactions until
the server is restarted (hopefully after committing or rolling back
all transactions, so that no transactions would be recovered
on the next startup).

trx_resurrect_insert(), trx_resurrect_update(): Do not increment
trx_ref_count. Instead, let the caller do that.

trx_lists_init_at_db_start(): Increment rseg->trx_ref_count only
once for each recovered transaction. Adjust comments.
Finally, if innodb_force_recovery prevents the undo log scan,
do not bother iterating the empty lists.
2018-01-18 16:26:09 +02:00
Marko Mäkelä
e9842de20c Merge 10.1 into 10.2 2018-01-11 12:05:57 +02:00
Marko Mäkelä
c15b3d2d41 Merge 10.0 into 10.1 2018-01-11 10:44:05 +02:00
Marko Mäkelä
4c1479545d Merge 5.5 into 10.0 2018-01-11 10:16:52 +02:00
Marko Mäkelä
bdcd7f79e4 MDEV-14916 InnoDB reports warning for "Purge reached the head of the history list"
The warning was originally added in
commit c67663054a
(MySQL 4.1.12, 5.0.3) to trace claimed undo log corruption that
was analyzed in https://lists.mysql.com/mysql/176250
on November 9, 2004.

Originally, the limit was 20,000 undo log headers or transactions,
but in commit 9d6d1902e0
in MySQL 5.5.11 it was increased to 2,000,000.

The message can be triggered when the progress of purge is prevented
by a long-running transaction (or just an idle transaction whose
read view was started a long time ago), by running many transactions
that UPDATE or DELETE some records, then starting another transaction
with a read view, and finally by executing more than 2,000,000
transactions that UPDATE or DELETE records in InnoDB tables. Finally,
when the oldest long-running transaction is completed, purge would
run up to the next-oldest transaction, and there would still be more
than 2,000,000 transactions to purge.

Because the message can be triggered when the database is obviously
not corrupted, it should be removed. Heavy users of InnoDB should be
monitoring the "History list length" in SHOW ENGINE INNODB STATUS;
there is no need to spam the error log.
2018-01-11 09:55:10 +02:00