Now that MDEV-14717 made RENAME TABLE crash-safe within InnoDB,
it should be safe to drop the #sql- tables within InnoDB during
crash recovery. These tables can be one of two things:
(1) #sql-ib related to deferred DROP TABLE (follow-up to MDEV-13407)
or to table-rebuilding ALTER TABLE...ALGORITHM=INPLACE
(since MDEV-14378, only related to the intermediate copy of a table),
(2) #sql- related to the intermediate copy of a table during
ALTER TABLE...ALGORITHM=COPY
We will not drop tables whose name starts with #sql2, because
the server can be killed during an ALGORITHM=COPY operation at
a point where the original table was renamed to #sql2 but the
finished intermediate copy was not yet renamed from #sql-
to the original table name.
InnoDB in MariaDB 10.2 appears to only write MLOG_FILE_RENAME2
redo log records during table-rebuilding ALGORITHM=INPLACE operations.
We must write the records for any .ibd file renames, so that the
operations are crash-safe.
If InnoDB is killed during a RENAME TABLE operation, it can happen that
the transaction for updating the data dictionary will be rolled back.
But, nothing will roll back the renaming of the .ibd file
(the MLOG_FILE_RENAME2 only guarantees roll-forward), or for that matter,
the renaming of the dict_table_t::name in the dict_sys cache. We introduce
the undo log record TRX_UNDO_RENAME_TABLE to fix this.
fil_space_for_table_exists_in_mem(): Remove the parameters
adjust_space, table_id and some code that was trying to work around
these deficiencies.
fil_name_write_rename(): Write a MLOG_FILE_RENAME2 record.
dict_table_rename_in_cache(): Invoke fil_name_write_rename().
trx_undo_rec_copy(): Set the first 2 bytes to the length of the
copied undo log record.
trx_undo_page_report_rename(), trx_undo_report_rename():
Write a TRX_UNDO_RENAME_TABLE record with the old table name.
row_rename_table_for_mysql(): Invoke trx_undo_report_rename()
before modifying any data dictionary tables.
row_undo_ins_parse_undo_rec(): Roll back TRX_UNDO_RENAME_TABLE
by invoking dict_table_rename_in_cache(), which will take care
of both renaming the table and the file.
The InnoDB background DROP TABLE queue is something that we should
really remove, but are unable to until we remove dict_operation_lock
so that DDL and DML operations can be combined in a single transaction.
Because the queue is not persistent, it is not crash-safe. We should
in some way ensure that the deferred-dropped tables will be dropped
after server restart.
The existence of two separate transactions complicates the error handling
of CREATE TABLE...SELECT. We should really not break locks in DROP TABLE.
Our solution to these problems is to rename the table to a temporary
name, and to drop such-named tables on InnoDB startup. Also, the
queue will use table IDs instead of names from now on.
check-testcase.test: Ignore #sql-ib*.ibd files, because tables may enter
the background DROP TABLE queue shortly before the test finishes.
innodb.drop_table_background: Test CREATE...SELECT and the creation of
tables whose file name starts with #sql-ib.
innodb.alter_crash: Adjust the recovery, now that the #sql-ib tables
will be dropped on InnoDB startup.
row_mysql_drop_garbage_tables(): New function, to drop all #sql-ib tables
on InnoDB startup.
row_drop_table_for_mysql_in_background(): Remove an unnecessary and
misplaced call to log_buffer_flush_to_disk(). (The call should have been
after the transaction commit. We do not care about flushing the redo log
here, because the table would be dropped again at server startup.)
Remove the entry from the list after the table no longer exists.
If server shutdown has been initiated, empty the list without actually
dropping any tables. They will be dropped again on startup.
row_drop_table_for_mysql(): Do not call lock_remove_all_on_table().
Instead, if locks exist, defer the DROP TABLE until they do not exist.
If the table name does not start with #sql-ib, rename it to that prefix
before adding it to the background DROP TABLE queue.
During the user-defined variable defined by the recursive CTE handling procedure
check_dependencies_in_with_clauses that checks dependencies between the tables
that are defined in the CTE and find recursive definitions wasn't called.
The InnoDB background DROP TABLE queue is something that we should
really remove, but are unable to until we remove dict_operation_lock
so that DDL and DML operations can be combined in a single transaction.
Because the queue is not persistent, it is not crash-safe. In stable
versions of MariaDB, we can only try harder to drop all enqueued
tables before server shutdown.
row_mysql_drop_t::table_id: Replaces table_name.
row_drop_tables_for_mysql_in_background():
Do not remove the entry from the list as long as the table exists.
In this way, the table should eventually be dropped.
Moved TOI replication to happen after ACL checking for commands:
SQLCOM_CREATE_EVENT
SQLCOM_ALTER_EVENT
SQLCOM_DROP_EVENT
SQLCOM_CREATE_VIEW
SQLCOM_CREATE_TRIGGER
SQLCOM_DROP_TRIGGER
SQLCOM_INSTALL_PLUGIN
SQLCOM_UNINSTALL_PLUGIN
wrep_sst_common: Setting "-c ''" for my_print_defaults just takes no values from config at all. $MY_PRINT_DEFAULTS is already set at the top of the script to have --defaults-file and --defaults-extra-file. If WSREP_SST_OPT_CONF if set to "--defaults-file=/etc/my.cnf --defaults-extra-file=/etc/my.extra.cnf", then "my_print_defaults -c "" --defaults-file=/etc/my.cnf" succeeds, but if WSREP_SST_OPT_CONF is empty - no default values are taken at all.
wsrep_sst_xtrabackup-v2: innobackupex does not support --defaults-extra-file, so ${WSREP_SST_OPT_CONF} cannot be used as an argument, it has been changed to ${WSREP_SST_OPT_DEFAULT}. Removed --defaults-file= from INNOMOVE line, because WSREP_SST_OPT_CONF already includes it (INNOBACKUP was fine, INNOMOVE - not).
one) and affected result files.
Specifically to rpl_semi_sync_after_sync*, the changed results reflect
a fact that thanks to fixes in the dump thread functionality
there's no longer zombie thread to kill neither such thread represent
a semisync client (so the counter drops).
and specifically the ack receiving functionality.
Semisync is turned to be static instead of plugin so its functions
are invoked at the same points as RUN_HOOKS.
The RUN_HOOKS and the observer interface remain to be removed by later
patch.
Todo:
React on killed status by repl_semisync_master.wait_after_sync(). Currently
Repl_semi_sync_master::commit_trx does not check the killed status.
There were few bugfixes found that are present in mysql and its unclear
whether/how they are covered. Those include:
Bug#15985893: GTID SKIPPED EVENTS ON MASTER CAUSE SEMI SYNC TIME-OUTS
Bug#17932935 CALLING IS_SEMI_SYNC_SLAVE() IN EACH FUNCTION CALL
HAS BAD PERFORMANCE
Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS
RUN_HOOK() is only called if semisync is enabled
As the server can't disable the hooks if something is in progress, I added
a new variable, run_hooks_enabled, that is set the first time semi sync is
used. This means that RUN_HOOK will have no overhead, unless semi sync
master or slave has been enabled once.
Some of the changes was just to get rid of warnings for embedded server
Part of MDEV-13073 AliSQL Optimize performance of semisync
The idea it to use a dedicated lock detecting if there is new data in
the master's binary log instead of the overused LOCK_log.
Changes:
- Use dedicated COND variables for the relay and binary log signaling.
This was needed as we where the old 'update_cond' variable was used
with different mutex's, which could cause deadlocks.
- Relay log uses now COND_relay_log_updated and LOCK_log
- Binary log uses now COND_bin_log_updated and LOCK_binlog_end_pos
- Renamed signal_cnt to relay_signal_cnt (as we now have two signals)
- Added some missing error handling in MYSQL_BIN_LOG::new_file_impl()
- Reformatted some comments with old style
- Renamed m_key_LOCK_binlog_end_pos to key_LOCK_binlog_end_pos
- Changed 'signal_update()' to update_binlog_end_pos() which works for
both relay and binary log
Part of MDEV-13073 AliSQL Optimize performance of semisync
Did the following renames to match other similar variables
key_ss_mutex_LOCK_binlog_ > key_LOCK_bing
key_ss_cond_COND_binlog_send_ -> key_COND_binlog_send
COND_binlog_send_ -> COND_binlog_send
LOCK_binlog_ -> LOCK_binlog
debian/mariadb-server-10.2.install does not install semisync libs.
* The version of tcmalloc is written to the system variable
'version_malloc_library' if tcmalloc is used, similarly to
jemalloc
* Extracted method guess_malloc_library()
The 'data' field in the XA RECOVER resultset changed
to be charset_bin. It seems to me right and also
--binary-as-hex starts working. The XA RECOVER FORMAT='SQL' option
implemented. It returns the XID string that fits to be an argument for the
XA ... statements.