The algorithm change is based on a MySQL 8.0 fix for
BUG #26818787: ASSERTION: DATA0DATA.IC:430:TUPLE
by Krzysztof Kapuścik
ee606e62bb
If a record had been inserted in place of a delete-marked purgeable
record by modifying that record, and purge was accessing that record
before the off-page columns were written, row_build_index_entry()
would have returned NULL, causing a crash.
row_vers_non_virtual_fields_equal(): Check whether all non-virtual fields
of an index are equal. Replaces row_vers_non_vc_match(). A more complex
version of this function was called row_vers_non_vc_index_entry_match()
in the MySQL 8.0 fix.
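The idea of comparing only the non-virtual fields can be sketched as
follows. This is a minimal illustration, not the InnoDB code: the type
Field and the function non_virtual_fields_equal() are hypothetical
stand-ins for the real tuple structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one field of an index tuple.
struct Field {
    bool is_virtual;    // true for a virtual (not stored) column
    std::string value;  // field contents
};

// Return true if every non-virtual field of a and b holds the same value.
// Virtual fields are skipped, because they are compared elsewhere.
bool non_virtual_fields_equal(const std::vector<Field>& a,
                              const std::vector<Field>& b)
{
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        if (a[i].is_virtual) {
            continue;
        }
        if (a[i].value != b[i].value) {
            return false;
        }
    }
    return true;
}
```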
row_vers_impl_x_locked_low(): This change is not directly related to
the reported problem, but apparently to the removal of the function
row_vers_non_vc_match(). This function checks if a secondary index
record was modified by a transaction that has not been committed yet.
For comparing the non-virtual columns, construct a secondary index
tuple from the table row.
row_vers_vc_matches_cluster(): Replace row_vers_non_vc_match() with
code that is equivalent to the row_vers_non_vc_index_entry_match()
in the MySQL 8.0 fix. Also, deduplicate some code by using goto.
The comment that I made in
commit 06299dddd4
is inaccurate. Replace the comment, and make the assertion
debug-only, because I cannot remember any reports of
it ever failing in these 10 years.
If crypt_block != NULL, the entire object crypt_pfx should be
guaranteed to be initialized, including m_size, which will have been
initialized in allocate_large(), either directly or via
allocate_trace().
When InnoDB has completed the rollback of a recovered transaction,
it used to display the transaction identifier.
This was broken in MySQL 5.7.2 in
2f5f3cd3ac
which was merged to MariaDB 10.2.2 in
commit 2e814d4702.
trx_rollback_active(): Cache the transaction ID before it is
reset by transaction commit. Do not display the message if the
rollback was interrupted by shutdown (MDEV-13797, MDEV-12352).
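A minimal sketch of the caching pattern, under simplified assumptions:
Trx, commit() and rollback_recovered() are hypothetical names, and the
message text is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified transaction object.
struct Trx {
    std::uint64_t id;
};

// Committing resets the identifier, as described above.
void commit(Trx& trx) { trx.id = 0; }

// Return the message to display, or an empty string if the rollback was
// interrupted by shutdown and nothing should be displayed.
std::string rollback_recovered(Trx& trx, bool interrupted_by_shutdown)
{
    assert(trx.id != 0);
    // Cache the ID first; after commit() it would read as 0.
    const std::uint64_t id = trx.id;
    if (interrupted_by_shutdown) {
        return std::string();
    }
    commit(trx);
    return "Rolled back recovered transaction " + std::to_string(id);
}
```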
If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
a lot of time rolling back writes to the intermediate copy of the table.
To reduce the amount of busy work done, a work-around was introduced in
commit fd069e2bb3 in MySQL 4.1.8 and 5.0.2,
to commit the transaction after every 10,000 inserted rows.
A proper fix would have been to disable the undo logging altogether and
to simply drop the intermediate copy of the table on subsequent server
startup. This is what happens in MariaDB 10.3 with MDEV-14717, MDEV-14585.
In MariaDB 10.2, the intermediate copy of the table would be left behind
with a name starting with the string #sql.
This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
contributed by jixianliang <271365745@qq.com>.
Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
InnoDB must for now keep the undo logging enabled, so that the latest
row can be rolled back in case of an error.
In Galera cluster, the LOAD DATA statement will retain the existing
behaviour and commit the transaction after every 10,000 rows if
the parameter wsrep_load_data_splitting=ON is set. The logic to do
so (the wsrep_load_data_split() function and the call
handler::extra(HA_EXTRA_FAKE_START_STMT)) is joint work
by Ji Xianliang and Marko Mäkelä.
The original fix:
Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date: Wed Dec 2 16:09:15 2015 +0530
Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY
Problem:
During ALTER TABLE, we commit and restart the transaction for every
10,000 rows, so that the rollback after recovery would not take so long.
Fix:
Suppress the undo logging during copy alter operation. If fts_index is
present then insert directly into fts auxiliary table rather
than doing at commit time.
ha_innobase::num_write_row: Remove the variable.
ha_innobase::write_row(): Remove the hack for committing every 10000 rows.
row_lock_table_for_mysql(): Remove the extra 2 parameters.
lock_get_src_table(), lock_is_table_exclusive(): Remove.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
- Galera tests that were not updated with connection change
messages
- Disabled some TokuDB tests that always timed out.
These should be enabled again when we have an option to
specify timeouts per test.
While the bug was reported as a regression of
MDEV-11025 Make number of page cleaner threads variable dynamic
in MariaDB Server 10.3, the code that MariaDB Server 10.2
inherited from MySQL 5.7.4 (WL#6642) looks prone to similar errors.
pc_flush_slot(): If there is no work to do, reset the is_requested
signal, to avoid potential busy-waiting in
buf_flush_page_cleaner_worker(). If the coordinator thread has shut
down, avoid resetting the is_requested event, to avoid a potential
hang at shutdown if there are multiple worker threads.
ibuf_merge_or_delete_for_page(): Invoke fil_space_acquire_silent()
instead of fil_space_acquire() in order to avoid displaying
a useless message.
We know perfectly well that a tablespace can be dropped while a
change buffer merge is pending, because change buffer merges skip
any transactional locks.
innobase_commit_by_xid(), innobase_rollback_by_xid(): Decrement
the reference count before freeing the transaction object to the pool.
Failure to do so might corrupt the transaction bookkeeping
if trx_create_low() returns the same object to another thread
before we are done with it.
trx_sys_close(): Detach the recovered XA PREPARE transactions from
trx_sys->rw_trx_list before freeing them.
MDEV-14511 tried to avoid some consistency problems related to InnoDB
persistent statistics. The persistent statistics are being written by
an InnoDB internal SQL interpreter that requires the InnoDB data dictionary
cache to be locked.
Before MDEV-14511, the statistics were written during DDL in separate
transactions, which could unnecessarily reduce performance (each commit
would require a redo log flush) and break atomicity, because the statistics
would be updated separately from the dictionary transaction.
However, because it is unacceptable to hold the InnoDB data dictionary
cache locked while suspending the execution for waiting for a
transactional lock (in the mysql.innodb_index_stats or
mysql.innodb_table_stats tables) to be released, any lock conflict
would immediately be reported as "lock wait timeout".
To fix MDEV-14941, an attempt to reduce these lock conflicts by acquiring
transactional locks on the user tables in both the statistics and DDL
operations was made, but it would still not entirely prevent lock conflicts
on the mysql.innodb_index_stats and mysql.innodb_table_stats tables.
Fixing the remaining problems would require a change that is too intrusive
for a GA release series, such as MariaDB 10.2.
Therefore, we revert the change MDEV-14511. To silence the
MDEV-13201 assertion, we use the pre-existing flag trx_t::internal.
The field trx_rseg_t::trx_ref_count that was added in WL#6965 in
MySQL 5.7.5 is being incremented twice if a recovered transaction
includes both undo log partitions insert_undo and update_undo.
This reference count is being used in trx_purge(), which invokes
trx_purge_initiate_truncate() to try to truncate an undo tablespace
file. Because of the double-increment, the trx_ref_count would never
reach 0.
It is possible that after the failed truncation attempt, the undo
tablespace would be disabled for logging any new transactions until
the server is restarted (hopefully after committing or rolling back
all transactions, so that no transactions would be recovered
on the next startup).
trx_resurrect_insert(), trx_resurrect_update(): Do not increment
trx_ref_count. Instead, let the caller do that.
trx_lists_init_at_db_start(): Increment rseg->trx_ref_count only
once for each recovered transaction. Adjust comments.
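The counting rule can be sketched as follows. This is an illustration
only: Rseg and count_recovered() are hypothetical, and the two vectors
stand in for the insert_undo and update_undo lists of one rollback
segment, each entry being the ID of the transaction that owns the log.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified rollback segment.
struct Rseg {
    unsigned trx_ref_count = 0;
};

// Increment the reference count once per recovered transaction, even if
// the transaction owns both an insert_undo and an update_undo log.
void count_recovered(Rseg& rseg,
                     const std::vector<std::uint64_t>& insert_undo_trx_ids,
                     const std::vector<std::uint64_t>& update_undo_trx_ids)
{
    // At startup nothing references the rollback segment yet.
    assert(rseg.trx_ref_count == 0);
    std::set<std::uint64_t> seen;
    for (std::uint64_t id : insert_undo_trx_ids) {
        if (seen.insert(id).second) {
            rseg.trx_ref_count++;
        }
    }
    for (std::uint64_t id : update_undo_trx_ids) {
        // A transaction already seen in the insert list is not counted again.
        if (seen.insert(id).second) {
            rseg.trx_ref_count++;
        }
    }
}
```

With the double increment, the transaction 2 in the lists {1, 2} and
{2, 3} would be counted twice, and the count could not return to 0
after all three transactions have finished.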
Finally, if innodb_force_recovery prevents the undo log scan,
do not bother iterating the empty lists.
The problem was that max_size was accidentally set to 1 in some
cases.
Other things:
- Adjust max_rows if min_rows > max_rows.
- Removed unused variable varchar_length
- Adjusted max_pack_length (safety fix)
innobase_start_or_create_for_mysql(): Only start the data dictionary
and transaction subsystems in normal server startup and during
mariabackup --export.
btr_cur_update_in_place(): Read block->index only once,
so that it cannot change to NULL after the first read.
When block->index != NULL, it must be equal to index.
btr_cur_update_in_place(): The call rw_lock_x_lock(ahi_latch) must
of course be inside the if (ahi_latch) condition. This is a mistake
that I made when backporting the fix-under-development from 10.3.
This race condition is a regression caused by MDEV-12121.
btr_cur_update_in_place(): Determine block->index!=NULL only once
in order to determine whether an adaptive hash index bucket needs
to be exclusively locked and unlocked.
If we evaluated block->index multiple times, and the adaptive hash
index was disabled before we locked the adaptive hash index, then
we would never release the adaptive hash index bucket latch, which
would eventually lead to InnoDB hanging.
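A minimal sketch of the "evaluate once" pattern, with hypothetical
types (Index, Latch, Block) in place of the real InnoDB structures. The
callback simulates the adaptive hash index being disabled concurrently
between the lock and the unlock.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

struct Index {};

// Hypothetical latch that only counts exclusive locks.
struct Latch {
    int x_locks = 0;
    void x_lock() { ++x_locks; }
    void x_unlock() { assert(x_locks > 0); --x_locks; }
};

// Hypothetical block; index may be cleared by another thread at any time.
struct Block {
    std::atomic<Index*> index{nullptr};
};

void update_in_place(Block& block, Latch& ahi_latch,
                     const std::function<void()>& modify)
{
    // Read block.index exactly once, so that lock and unlock always agree.
    const bool hashed = block.index.load() != nullptr;
    if (hashed) {
        ahi_latch.x_lock();
    }
    modify();  // block.index may change here
    if (hashed) {
        ahi_latch.x_unlock();
    }
}
```

If the second condition re-read block.index instead of using the cached
value, the unlock would be skipped whenever the pointer became NULL in
between, and the latch would stay held.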
This is a regression that was introduced in MySQL 5.7.6 in
19855664de
fil_node_open_file(): Use proper 64-bit arithmetic for truncating
size_bytes to a multiple of a file extent size.
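The following sketch shows one way in which 32-bit arithmetic can go
wrong in such a truncation. It is an illustration, not the actual code:
the function names are hypothetical, and the extent size is assumed to
be a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Round size_bytes down to a multiple of extent_size using a 64-bit mask.
std::uint64_t truncate_to_extent(std::uint64_t size_bytes,
                                 std::uint32_t extent_size)
{
    assert(extent_size != 0 && (extent_size & (extent_size - 1)) == 0);
    const std::uint64_t mask = ~(std::uint64_t{extent_size} - 1);
    return size_bytes & mask;
}

// Faulty variant: the mask is computed in 32 bits and then zero-extended,
// so it also clears the upper 32 bits of size_bytes.
std::uint64_t truncate_to_extent_32bit_mask(std::uint64_t size_bytes,
                                            std::uint32_t extent_size)
{
    const std::uint32_t mask = ~(extent_size - 1);
    return size_bytes & mask;
}
```

With the 32-bit mask, any file of 4 GiB or more would be reported with
a size that has lost its upper 32 bits.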
- Make Rdb_binlog_manager::unpack_value not cause a stack overrun
when it is reading invalid data (which it currently does as we in
MariaDB do not store binlog coordinates under BINLOG_INFO_INDEX_NUMBER,
see comments in MDEV-14892 for details).
- We may need to store these coordinates in the future, so instead of
removing the call of this function, let's make it work properly for
all possible inputs.
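A bounds-checked unpacking routine could look roughly like this. The
record layout (a 2-byte big-endian name length, the name, and a 4-byte
big-endian position) is an assumption made for this example only, and
it is not claimed to match the MyRocks format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Parse a hypothetical record; return false instead of reading past the
// end of buf when the input is truncated or invalid.
bool unpack_value(const std::vector<std::uint8_t>& buf,
                  std::string& name, std::uint32_t& pos)
{
    std::size_t off = 0;
    if (buf.size() - off < 2) {
        return false;
    }
    const std::size_t len = (std::size_t{buf[off]} << 8) | buf[off + 1];
    off += 2;
    // The claimed length must fit in what is left of the buffer.
    if (buf.size() - off < len) {
        return false;
    }
    const std::string parsed_name(
        reinterpret_cast<const char*>(buf.data()) + off, len);
    off += len;
    if (buf.size() - off < 4) {
        return false;
    }
    assert(off + 4 <= buf.size());
    // Outputs are assigned only after every check has passed.
    pos = (std::uint32_t{buf[off]} << 24) | (std::uint32_t{buf[off + 1]} << 16)
        | (std::uint32_t{buf[off + 2]} << 8) | std::uint32_t{buf[off + 3]};
    name = parsed_name;
    return true;
}
```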
The warning was originally added in
commit c67663054a
(MySQL 4.1.12, 5.0.3) to trace claimed undo log corruption that
was analyzed in https://lists.mysql.com/mysql/176250
on November 9, 2004.
Originally, the limit was 20,000 undo log headers or transactions,
but in commit 9d6d1902e0
in MySQL 5.5.11 it was increased to 2,000,000.
The message can be triggered when the progress of purge is prevented
by a long-running transaction (or just an idle transaction whose
read view was started a long time ago), by running many transactions
that UPDATE or DELETE some records, then starting another transaction
with a read view, and then by executing more than 2,000,000
transactions that UPDATE or DELETE records in InnoDB tables. Finally,
when the oldest long-running transaction is completed, purge would
run up to the next-oldest transaction, and there would still be more
than 2,000,000 transactions to purge.
Because the message can be triggered when the database is obviously
not corrupted, it should be removed. Heavy users of InnoDB should be
monitoring the "History list length" in SHOW ENGINE INNODB STATUS;
there is no need to spam the error log.
recv_log_recover_10_3(): Determine if a log from MariaDB 10.3 is clean.
recv_find_max_checkpoint(): Allow startup with a clean 10.3 redo log.
srv_prepare_to_delete_redo_log_files(): When starting up with a 10.3 log,
display a "Downgrading redo log" message instead of "Upgrading".
The XtraDB option innodb_track_changed_pages causes
the function log_group_read_log_seg() to be invoked
even when recv_sys==NULL, leading to the SIGSEGV.
This regression was caused by
MDEV-11027 InnoDB log recovery is too noisy
innodb/buf_LRU_get_free_block
Add debug instrumentation to produce an error message about
no free pages. Print the error message only once and do not
enable the InnoDB monitor.
xtradb/buf_LRU_get_free_block
Add debug instrumentation to produce an error message about
no free pages. Print the error message only once and do not
enable the InnoDB monitor. Remove code that does not seem to
be used.
innodb-lru-force-no-free-page.test
New test case that forces the desired error message to be produced.
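Printing a message only once can be sketched with an atomic flag. This
is one possible approach, not necessarily what the patch does; the
class NoFreePageReporter and the message text are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <vector>

class NoFreePageReporter {
public:
    // Append the message to log on the first call only.
    // Return true if the message was written by this call.
    bool report(std::vector<std::string>& log)
    {
        // exchange() returns the previous value, so only the first
        // caller sees false, even with concurrent callers.
        if (reported_.exchange(true)) {
            return false;
        }
        log.push_back("no free pages in the buffer pool");
        assert(!log.empty());
        return true;
    }

private:
    std::atomic<bool> reported_{false};
};
```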