The merge omitted some InnoDB and XtraDB conflict resolutions,
most notably, failing to merge the fix of MDEV-12173.
ibuf_merge_or_delete_for_page(), lock_rec_block_validate():
Invoke fil_space_acquire_silent() instead of fil_space_acquire().
This fixes MDEV-12173.
wsrep_debug, wsrep_trx_is_aborting(): Removed unused declarations.
_fil_io(): Remove. Instead, declare default parameters for the XtraDB
fil_io().
buf_read_page_low(): Declare default parameters, and clean up some
callers.
os_aio(): Correct the macro that is defined when !UNIV_PFS_IO.
innobase_start_or_create_for_mysql(): Only start the data dictionary
and transaction subsystems in normal server startup and during
mariabackup --export.
recv_log_recover_10_3(): Determine if a log from MariaDB 10.3 is clean.
recv_find_max_checkpoint(): Allow startup with a clean 10.3 redo log.
srv_prepare_to_delete_redo_log_files(): When starting up with a 10.3 log,
display a "Downgrading redo log" message instead of "Upgrading".
row_quiesce_table_start(), row_quiesce_table_complete():
Use the more appropriate predicate srv_undo_sources for skipping
purge control. (This change alone is insufficient; it is possible
that this predicate will change during the call to trx_purge_stop()
or trx_purge_run().)
trx_purge_stop(), trx_purge_run(): Tolerate PURGE_STATE_EXIT.
It is very well possible to initiate shutdown soon after the statement
FLUSH TABLES FOR EXPORT has been submitted to execution.
srv_purge_coordinator_thread(): Ensure that the wait for purge_sys->event
in trx_purge_stop() will terminate when the coordinator thread exits.
ha_print_info(): Remove.
srv_printf_innodb_monitor(): Do not acquire btr_search_latches[]
Add the equivalent functionality that was part of the non-debug
version of ha_print_info().
Starting with MySQL 5.7 (or MariaDB 10.2.2) InnoDB no longer contains
the "table monitor" or "tablespace monitor". The conditions on
srv_print_innodb_tablespace_monitor, srv_print_innodb_table_monitor
never hold. So, the code was dead.
Also, remove a bogus reference to dict_print(), which used to implement
the InnoDB table monitor.
srv_purge_wakeup(): If thd_destructor_proxy has initiated the first
step of shutdown, ensure that all purge threads terminate.
logs_empty_and_mark_files_at_shutdown(): Add a debug assertion.
(The purge threads should have been shut down already before this step.)
Also, MDEV-14317 When ALTER TABLE is aborted, do not write garbage pages to data files
As pointed out by Shaohua Wang, the merge of MDEV-13328 from
MariaDB 10.1 (based on MySQL 5.6) to 10.2 (based on 5.7)
was performed incorrectly.
Let us always pass a non-NULL FlushObserver* when writing
to data files is desired.
FlushObserver::is_partial_flush(): Check if this is a bulk-load
(partial flush of the tablespace).
FlushObserver::is_interrupted(): Check for interrupt status.
buf_LRU_flush_or_remove_pages(): Instead of trx_t*, take
FlushObserver* as a parameter.
buf_flush_or_remove_pages(): Remove the parameters flush, trx.
If observer!=NULL, write out the data pages. Use the new predicate
observer->is_partial() to distinguish a partial tablespace flush
(after bulk-loading) from a full tablespace flush (export).
Return a bool (whether all pages were removed from the flush_list).
buf_flush_dirty_pages(): Remove the parameter trx.
Replace all references in InnoDB and XtraDB error log messages
to bugs.mysql.com with references to https://jira.mariadb.org/.
The original merge
commit 4274d0bf57
was accidentally reverted by the subsequent merge
commit 3b35d745c3
Mariabackup 10.2.7 would delete the redo log files after a successful
--prepare operation. If the user is manually copying the prepared files
instead of using the --copy-back option, it could happen that some old
redo log file would be preserved in the restored location. These old
redo log files could cause corruption of the restored data files when
the server is started up.
We prevent this scenario by creating a "poisoned" redo log file
ib_logfile0 at the end of the --prepare step. The poisoning consists
of simply truncating the file to an empty file. InnoDB will refuse
to start up on an empty redo log file.
copy_back(): Delete all redo log files in the target if the source
file ib_logfile0 is empty. (Previously we did this if the source
file is missing.)
SRV_OPERATION_RESTORE_EXPORT: A new variant of SRV_OPERATION_RESTORE
when the --export option is specified. In this mode, we will keep
deleting all redo log files, instead of truncating the first one.
delete_log_files(): Add a parameter for the first file to delete,
to be passed as 0 or 1.
innobase_start_or_create_for_mysql(): In mariabackup --prepare,
tolerate an empty ib_logfile0 file. Otherwise, require the first
redo log file to be longer than 4 blocks (2048 bytes). Unless
--export was specified, truncate the first log file at the
end of --prepare.
The last parameter to this function is now,"bool is_sparse", like in 10.1
rather than the unused/useless "bool is_readonly", merged from MySQL 5.7
Like in 10.1, this function now supports sparse files, and efficient
platform specific mechanisms for file extension
os_file_set_size() is now consistenly used in all places where
innodb files are extended.
The only universal index in InnoDB was the change buffer.
It suffices to keep the DICT_IBUF flag (which, like DICT_UNIVERSAL,
is not written to any persistent data structure).
This should also fix the MariaDB 10.2.2 bug
MDEV-13826 CREATE FULLTEXT INDEX on encrypted table fails.
MDEV-12634 FIXME: Modify innodb-index-online, innodb-table-online
so that they will write and read merge sort files. InnoDB 5.7
introduced some optimizations to avoid using the files for small tables.
Many collation test results have been adjusted for MDEV-10191.
…porary file
Fixed by removing writing key version to start of every block that
was encrypted. Instead we will use single key version from log_sys
crypt info.
After this MDEV also blocks writen to row log are encrypted and blocks
read from row log aren decrypted if encryption is configured for the
table.
innodb_status_variables[], struct srv_stats_t
Added status variables for merge block and row log block
encryption and decryption amounts.
Removed ROW_MERGE_RESERVE_SIZE define.
row_merge_fts_doc_tokenize
Remove ROW_MERGE_RESERVE_SIZE
row_log_t
Add index, crypt_tail, crypt_head to be used in case of
encryption.
row_log_online_op, row_log_table_close_func
Before writing a block encrypt it if encryption is enabled
row_log_table_apply_ops, row_log_apply_ops
After reading a block decrypt it if encryption is enabled
row_log_allocate
Allocate temporary buffers crypt_head and crypt_tail
if needed.
row_log_free
Free temporary buffers crypt_head and crypt_tail if they
exist.
row_merge_encrypt_buf, row_merge_decrypt_buf
Removed.
row_merge_buf_create, row_merge_buf_write
Remove ROW_MERGE_RESERVE_SIZE
row_merge_build_indexes
Allocate temporary buffer used in decryption and encryption
if needed.
log_tmp_blocks_crypt, log_tmp_block_encrypt, log_temp_block_decrypt
New functions used in block encryption and decryption
log_tmp_is_encrypted
New function to check is encryption enabled.
Added test case innodb-rowlog to force creating a row log and
verify that operations are done using introduced status
variables.
This fixes several InnoDB bugs related to innodb_encrypt_log and
two Mariabackup --backup bugs.
log_crypt(): Properly derive the initialization vector from the
start LSN of each block. Add a debug assertion.
log_crypt_init(): Note that the function should only be used when
creating redo log files and that the information is persisted in
the checkpoint pages.
xtrabackup_copy_log(): Validate data_len.
xtrabackup_backup_func(): Always use the chosen checkpoint buffer.
log_group_write_buf(), log_write_up_to(): Only log_crypt() the redo
log payload, not the padding bytes.
innobase_start_or_create_for_mysql(): Do not invoke log_crypt_init()
or initiate a redo log checkpoint.
recv_find_max_checkpoint(): Return the contents of LOG_CHECKPOINT_NO
to xtrabackup_backup_func() in log_sys->next_checkpoint_no.
There is a race condition in InnoDB startup. A number of
fil_crypt_thread are created by fil_crypt_threads_init(). These threads
may call btr_scrub_complete_space() before btr_scrub_init() was called.
Those too early calls would be accessing an uninitialized scrub_stat_mutex.
innobase_start_or_create_for_mysql(): Invoke btr_scrub_init() before
fil_crypt_threads_init().
fil_crypt_complete_rotate_space(): Only invoke btr_scrub_complete_space()
if scrubbing is enabled. There is no need to update the statistics if
it is not enabled.
For running the Galera tests, the variable my_disable_leak_check
was set to true in order to avoid assertions due to memory leaks
at shutdown.
Some adjustments due to MDEV-13625 (merge InnoDB tests from MySQL 5.6)
were performed. The most notable behaviour changes from 10.0 and 10.1
are the following:
* innodb.innodb-table-online: adjustments for the DROP COLUMN
behaviour change (MDEV-11114, MDEV-13613)
* innodb.innodb-index-online-fk: the removal of a (1,NULL) record
from the result; originally removed in MySQL 5.7 in the
Oracle Bug #16244691 fix
377774689b
* innodb.create-index-debug: disabled due to MDEV-13680
(the MySQL Bug #77497 fix was not merged from 5.6 to 5.7.10)
* innodb.innodb-alter-autoinc: MariaDB 10.2 behaves like MySQL 5.6/5.7,
while MariaDB 10.0 and 10.1 assign different values when
auto_increment_increment or auto_increment_offset are used.
Also MySQL 5.6/5.7 exhibit different behaviour between
LGORITHM=INPLACE and ALGORITHM=COPY, so something needs to be tested
and fixed in both MariaDB 10.0 and 10.2.
* innodb.innodb-wl5980-alter: disabled because it would trigger an
InnoDB assertion failure (MDEV-13668 may need additional effort in 10.2)
Problem was incorrect definition of wsrep_recovery,
trx_sys_update_wsrep_checkpoint and
trx_sys_read_wsrep_checkpoint functions causing
innodb_plugin not to load as there was undefined symbols.
Fixes also MDEV-13488: InnoDB writes CRYPT_INFO even though
encryption is not enabled.
Problem was that we created encryption metadata (crypt_data) for
system tablespace even when no encryption was enabled and too early.
System tablespace can be encrypted only using key rotation.
Test innodb-key-rotation-disable, innodb_encryption, innodb_lotoftables
require adjustment because INFORMATION_SCHEMA INNODB_TABLESPACES_ENCRYPTION
contain row only if tablespace really has encryption metadata.
fil_crypt_set_thread_cnt: Send message to background encryption threads
if they exits when they are ready. This is required to find tablespaces
requiring key rotation if no other changes happen.
fil_crypt_find_space_to_rotate: Decrease the amount of time waiting
when nothing happens to better enable key rotation on startup.
fsp_header_init: Write encryption metadata to page 0 only if tablespace is
encrypted or encryption is disabled by table option.
i_s_dict_fill_tablespaces_encryption : Skip tablespaces that do not
contain encryption metadata. This is required to avoid too early
wait condition trigger in encrypted -> unencrypted state transfer.
open_or_create_data_files: Do not create encryption metadata
by default to system tablespace.
Problem is that page 0 and its possible enrryption information
is not read for undo tablespaces.
fil_crypt_get_latest_key_version(): Do not send event to
encryption threads if event does not yet exists. Seen
on regression testing.
fil_read_first_page: Add new parameter does page belong to
undo tablespace and if it does, we do not read FSP_HEADER.
srv_undo_tablespace_open : Read first page of the tablespace
to get crypt_data if it exists and pass it to fil_space_create.
Tested using innodb_encryption with combinations with
innodb-undo-tablespaces.
The parameter --innodb-sync-debug, which is disabled by default,
aims to find potential deadlocks in InnoDB.
When the parameter is enabled, lots of tests failed. Most of these
failures were due to bogus diagnostics. But, as part of this fix,
we are also fixing a bug in error handling code and removing dead
code, and fixing cases where an uninitialized mutex was being
locked and unlocked.
dict_create_foreign_constraints_low(): Remove an extraneous
mutex_exit() call that could cause corruption in an error handling
path. Also, do not unnecessarily acquire dict_foreign_err_mutex.
Its only purpose is to control concurrent access to
dict_foreign_err_file.
row_ins_foreign_trx_print(): Replace a redundant condition with a
debug assertion.
srv_dict_tmpfile, srv_dict_tmpfile_mutex: Remove. The
temporary file is never being written to or read from.
log_free_check(): Allow SYNC_FTS_CACHE (fts_cache_t::lock)
to be held.
ha_innobase::inplace_alter_table(), row_merge_insert_index_tuples():
Assert that no unexpected latches are being held.
sync_latch_meta_init(): Properly initialize dict_operation_lock_key
at SYNC_DICT_OPERATION. dict_sys->mutex is SYNC_DICT, and
the now-removed SRV_DICT_TMPFILE was wrongly registered at
SYNC_DICT_OPERATION.
buf_block_init(): Correctly register buf_block_t::debug_latch.
It was previously misleadingly reported as LATCH_ID_DICT_FOREIGN_ERR.
latch_level_t: Correct the relative latching order of
SYNC_IBUF_PESS_INSERT_MUTEX,SYNC_INDEX_TREE and
SYNC_FILE_FORMAT_TAG,SYNC_DICT_OPERATION to avoid bogus failures.
row_drop_table_for_mysql(): Avoid accessing btr_defragment_mutex
if the defragmentation thread has not been started. This is the
case during fts_drop_orphaned_tables() in recv_recovery_rollback_active().
fil_space_destroy_crypt_data(): Avoid acquiring fil_crypt_threads_mutex
when it is uninitialized. We may have created crypt_data before the
mutex was created, and the mutex creation would be skipped if
InnoDB startup failed or --innodb-read-only was specified.
The fix broke mariabackup --prepare --incremental.
The restore of an incremental backup starts up (parts of) InnoDB twice.
First, all data files are discovered for applying .delta files. Then,
after the .delta files have been applied, InnoDB will be restarted
more completely, so that the redo log records will be applied via the
buffer pool.
During the first startup, the buffer pool is not initialized, and thus
trx_rseg_get_n_undo_tablespaces() must not be invoked. The apply of
the .delta files will currently assume that the --innodb-undo-tablespaces
option correctly specifies the number of undo tablespace files, just
like --backup does.
The second InnoDB startup of --prepare for applying the redo log will
properly invoke trx_rseg_get_n_undo_tablespaces().
enum srv_operation_mode: Add SRV_OPERATION_RESTORE_DELTA for
distinguishing the apply of .delta files from SRV_OPERATION_RESTORE.
srv_undo_tablespaces_init(): In mariabackup --prepare --incremental,
in the initial SRV_OPERATION_RESTORE_DELTA phase, do not invoke
trx_rseg_get_n_undo_tablespaces() because the buffer pool or the
redo logs are not available. Instead, blindly rely on the parameter
--innodb-undo-tablespaces.
Problem was that dict_sys->size tries to maintain used memory
occupied by the data dictionary table and index objects.
However at least on table objects table->heap size can increase
between when table object is inserted to dict_sys and when
it is removed from dict_sys causing inconsistency on amount
of memory added to and removed from dict_sys->size variable.
Removed unnecessary dict_sys:size variable as it is really
used only for status output.
Introduced dict_sys_get_size function to calculate memory
occupied by the data dictionary table and index objects
that is then used on show engine innodb output.
dict_table_add_to_cache(),
dict_table_rename_in_cache(),
dict_table_remove_from_cache_low(),
dict_index_remove_from_cache_low(),
Remove size calculation.
srv_printf_innodb_monitor(): Use dict_sys_get_size function to
get dictionary memory allocated.
xtradb_internal_hash_tables_fill_table(): Use dict_sys_get_size
function to get dictionary memory allocated.
buf_flush_page_cleaner_coordinator: In the first loop, use an
appropriate termination condition, waiting for !recv_writer_thread_active.
logs_empty_and_mark_files_at_shutdown(): Signal recv_sys->flush_start
in case the recv_writer_thread was never started, or
buf_flush_page_cleaner_coordinator failed to notice its termination.
innobase_start_or_create_for_mysql(): Remove a redundant, unreachable
condition, and properly release resources when aborting startup due to
recv_sys->found_corrupt_log.
This is a regression caused by
commit bb60a832ed
srv_shutdown_all_bg_threads(): If os_thread_count indicates that
no threads are running, do not bother checking thread status.
This avoids a crash when InnoDB startup is aborted before
os_aio_init() has been invoked. (os_aio_all_slots_free() would
dereference AIO::s_reads even though it is NULL.)
InnoDB I/O and buffer pool interfaces and the redo log format
have been changed between MariaDB 10.1 and 10.2, and the backup
code has to be adjusted accordingly.
The code has been simplified, and many memory leaks have been fixed.
Instead of the file name xtrabackup_logfile, the file name ib_logfile0
is being used for the copy of the redo log. Unnecessary InnoDB startup and
shutdown and some unnecessary threads have been removed.
Some help was provided by Vladislav Vaintroub.
Parameters have been cleaned up and aligned with those of MariaDB 10.2.
The --dbug option has been added, so that in debug builds,
--dbug=d,ib_log can be specified to enable diagnostic messages
for processing redo log entries.
By default, innodb_doublewrite=OFF, so that --prepare works faster.
If more crash-safety for --prepare is needed, double buffering
can be enabled.
The parameter innodb_log_checksums=OFF can be used to ignore redo log
checksums in --backup.
Some messages have been cleaned up.
Unless --export is specified, Mariabackup will not deal with undo log.
The InnoDB mini-transaction redo log is not only about user-level
transactions; it is actually about mini-transactions. To avoid confusion,
call it the redo log, not transaction log.
We disable any undo log processing in --prepare.
Because MariaDB 10.2 supports indexed virtual columns, the
undo log processing would need to be able to evaluate virtual column
expressions. To reduce the amount of code dependencies, we will not
process any undo log in prepare.
This means that the --export option must be disabled for now.
This also means that the following options are redundant
and have been removed:
xtrabackup --apply-log-only
innobackupex --redo-only
In addition to disabling any undo log processing, we will disable any
further changes to data pages during --prepare, including the change
buffer merge. This means that restoring incremental backups should
reliably work even when change buffering is being used on the server.
Because of this, preparing a backup will not generate any further
redo log, and the redo log file can be safely deleted. (If the
--export option is enabled in the future, it must generate redo log
when processing undo logs and buffered changes.)
In --prepare, we cannot easily know if a partial backup was used,
especially when restoring a series of incremental backups. So, we
simply warn about any missing files, and ignore the redo log for them.
FIXME: Enable the --export option.
FIXME: Improve the handling of the MLOG_INDEX_LOAD record, and write
a test that initiates a backup while an ALGORITHM=INPLACE operation
is creating indexes or rebuilding a table. An error should be detected
when preparing the backup.
FIXME: In --incremental --prepare, xtrabackup_apply_delta() should
ensure that if FSP_SIZE is modified, the file size will be adjusted
accordingly.
In Mariabackup, we would want the backed-up redo log file size to be
a multiple of 512 bytes, or OS_FILE_LOG_BLOCK_SIZE. However, at startup,
InnoDB would be picky, requiring the file size to be a multiple of
innodb_page_size.
Furthermore, InnoDB would require the parameter to be a multiple of
one megabyte, while the minimum granularity is 512 bytes. Because
the data-file-oriented fil_io() API is being used for writing the
InnoDB redo log, writes will for now require innodb_log_file_size to
be a multiple of the maximum innodb_page_size (65536 bytes).
To complicate matters, InnoDB startup divided srv_log_file_size by
UNIV_PAGE_SIZE, so that initially, the unit was bytes, and later it
was innodb_page_size. We will simplify this and keep srv_log_file_size
in bytes at all times.
innobase_log_file_size: Remove. Remove some obsolete checks against
overflow on 32-bit systems. srv_log_file_size is always 64 bits, and
the maximum size 512GiB in multiples of innodb_page_size always fits
in ulint (which is 32 or 64 bits). 512GiB would be 8,388,608*64KiB or
134,217,728*4KiB.
log_init(): Remove the parameter file_size that was always passed as
srv_log_file_size.
log_set_capacity(): Add a parameter for passing the requested file size.
srv_log_file_size_requested: Declare static in srv0start.cc.
create_log_file(), create_log_files(),
innobase_start_or_create_for_mysql(): Invoke fil_node_create()
with srv_log_file_size expressed in multiples of innodb_page_size.
innobase_start_or_create_for_mysql(): Require the redo log file sizes
to be multiples of 512 bytes.
recv_sys_init(): Remove the parameter.
recv_sys_create(): Merge to recv_sys_init().
recv_sys_mem_free(): Merge to recv_sys_close().
log_mem_free(): Merge to log_shutdown().