Commit graph

13 commits

Author SHA1 Message Date
Marko Mäkelä
44fd2c4b24 Merge 10.5 into 10.6 2022-09-20 16:53:20 +03:00
Thirunarayanan Balathandayuthapani
4feb9df105 MDEV-29282 atomic.rename_trigger fails occasionally
This reverts part of commit 212994f704
(MDEV-28974) and implements a better fix that works in that special case
while avoiding other failures.

fil_name_process(): Do not rename the tablespace in deferred_spaces;
it already contains the latest name for the space id.

deferred_spaces::create(): In mariadb-backup --prepare,
replace absolute data directory file path with short name
relative to the backup directory and store it as filename while
deferring tablespace file creation.
2022-08-23 08:47:59 +03:00
Thirunarayanan Balathandayuthapani
af552f2903 Disabling atomic.rename_trigger test case because of frequent
failures
2022-08-16 21:33:45 +05:30
Marko Mäkelä
71964c76fe MDEV-25910: Aim to make all InnoDB DDL durable
Before any committed DDL operation returns control to the caller,
we must ensure that it will have been durably committed before the
ddl_log state may be changed, no matter if
innodb_flush_log_at_trx_commit=0 is being used.

Operations that would involve deleting files were already safe,
because the durable write of the FILE_DELETE record that would precede
the file deletion would also have made the commit durable.

Furthermore, when we clean up ADD INDEX stubs that were left behind
by a previous ha_innobase::commit_inplace_alter_table(commit=false)
when MDL could not be acquired, we will use the same interface as
for dropping indexes. This safety measure should be dead code,
because ADD FULLTEXT INDEX is not supported online, and dropping indexes
only involves file deletion for FULLTEXT INDEX.
2021-06-16 09:03:02 +03:00
Monty
e5b6db0179 Speed up atomic test suite by improving wait_until_connected_again.inc
and remove usage of RESET MASTER in loops.

- Remove sleep of 0.1 second that was done even when not needed.
- Don't call include/wait_wsrep_ready.inc if NO_WSREP is defined.
- Added NO_WSREP=1 to all atomic tests.
- Use 'select 1' instead of 'show status' to check is server is up.
- Changed RESET MASTER to FLUSH BINARY LOGS to speed up atomic tests.
  To be able to do this, added a new parameter variable to
  show_events.inc to allow one to specify the name of the binary log
  in the output.
2021-05-24 21:04:40 +03:00
Monty
7762ee5dbe MDEV-25180 Atomic ALTER TABLE
MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
           have default database

The purpose of this task is to ensure that ALTER TABLE is atomic even if
the MariaDB server would be killed at any point of the alter table.
This means that either the ALTER TABLE succeeds (including that triggers,
the status tables and the binary log are updated) or things should be
reverted to their original state.

If the server crashes before the new version is fully up to date and
commited, it will revert to the original table and remove all
temporary files and tables.
If the new version is commited, crash recovery will use the new version,
and update triggers, the status tables and the binary log.
The one execption is ALTER TABLE .. RENAME .. where no changes are done
to table definition. This one will work as RENAME and roll back unless
the whole statement completed, including updating the binary log (if
enabled).

Other changes:
- Added handlerton->check_version() function to allow the ddl recovery
  code to check, in case of inplace alter table, if the table in the
  storage engine is of the new or old version.
- Added handler->table_version() so that an engine can report the current
  version of the table. This should be changed each time the table
  definition changes.
- Added  ha_signal_ddl_recovery_done() and
  handlerton::signal_ddl_recovery_done() to inform all handlers when
  ddl recovery has been done. (Needed by InnoDB).
- Added handlerton call inplace_alter_table_committed, to signal engine
  that ddl_log has been closed for the alter table query.
- Added new handerton flag
  HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
  should call hton->notify_tabledef_changed() during
  mysql_inplace_alter_table. This was required as MyRocks and InnoDB
  needed the call at different times.
- Added function server_uuid_value() to be able to generate a temporary
  xid when ddl recovery writes the query to the binary log. This is
  needed to be able to handle crashes during ddl log recovery.
- Moved freeing of the frm definition to end of mysql_alter_table() to
  remove duplicate code and have a common exit strategy.

-------
InnoDB part of atomic ALTER TABLE
(Implemented by Marko Mäkelä)
innodb_check_version(): Compare the saved dict_table_t::def_trx_id
to determine whether an ALTER TABLE operation was committed.

We must correctly recover dict_table_t::def_trx_id for this to work.
Before purge removes any trace of DB_TRX_ID from system tables, it
will make an effort to load the user table into the cache, so that
the dict_table_t::def_trx_id can be recovered.

ha_innobase::table_version(): return garbage, or the trx_id that would
be used for committing an ALTER TABLE operation.

In InnoDB, table names starting with #sql-ib will remain special:
they will be dropped on startup. This may be revisited later in
MDEV-18518 when we implement proper undo logging and rollback
for creating or dropping multiple tables in a transaction.

Table names starting with #sql will retain some special meaning:
dict_table_t::parse_name() will not consider such names for
MDL acquisition, and dict_table_rename_in_cache() will treat such
names specially when handling FOREIGN KEY constraints.

Simplify InnoDB DROP INDEX.
Prevent purge wakeup

To ensure that dict_table_t::def_trx_id will be recovered correctly
in case the server is killed before ddl_log_complete(), we will block
the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
between ha_innobase::commit_inplace_alter_table(commit=true)
(purge_sys.stop_SYS()) and purge_sys.resume_SYS().
The completion callback purge_sys.resume_SYS() must be between
ddl_log_complete() and MDL release.

--------

MyRocks support for atomic ALTER TABLE
(Implemented by Sergui Petrunia)

Implement these SE API functions:
- ha_rocksdb::table_version()
- hton->check_version = rocksdb_check_versionMyRocks data dictionary
  now stores table version for each table.
  (Absence of table version record is interpreted as table_version=0,
  that is, which means no upgrade changes are needed)
- For inplace alter table of a partitioned table, call the underlying
  handlerton when checking if the table is ok. This assumes that the
  partition engine commits all changes at once.
2021-05-19 22:54:13 +02:00
Monty
ffe7f19fa6 MDEV-24746 Atomic CREATE TRIGGER
The purpose of this task is to ensure that CREATE TRIGGER is atomic

When a trigger is created, we first create a trigger_name.TRN file and then
create or update the table_name.TRG files.
This is done by creating .TRN~ and .TRG~ files and replacing (or creating)
the result files.

The new logic is

- Log CREATE TRIGGER to DDL log, with a marker if old trigger existsted
- If old .TRN or .TRG files exists, make backup copies of these
- Create the new .TRN and .TRG files as before
- Remove the backups

Crash recovery
- If query has been logged to binary log:
  - delete any left over backup files
- else
   - Delete any old .TRN~ or .TRG~ files
   - If there was orignally some triggers (old .TRG file existed)
      - If we crashed before creating all backup files
         - Delete existing backup files
      - else
         - Restore backup files
      - end
   - Delete .TRN and .TRG file (as there was no triggers before

One benefit of the new code is that CREATE OR REPLACE TRIGGER is now
totally atomic even if there existed an old trigger: Either the old
trigger will be replaced or the old one will be left untouched.

Other things:
- If sql_create_definition_file() would fail, there could be memory leaks
  in CREATE TRIGGER, DROP TRIGGER or CREATE OR REPLACE TRIGGER.  This
  is now fixed.
2021-05-19 22:54:13 +02:00
Monty
d494abd175 MDEV-24607 Atomic CREATE VIEW
The logic of the new code is:
- Log CREATE view to DDL log, with a marker if old view existed
- If old view exists (in case of CREATE or REPLACE view), make a copy
  of the old view as view_name.frm-
- Create the new view definition file
- Delete copy of view if it was created.

Crash recovery:
- Delete view_name.frm~ file (Temporary file for view definition)
- If query was logged to binary log
  - Delete copy of view if it exists
- else
   -rename the copy of the view over the .frm file (restoring the
    old definition)

One benefit of the new code is that CREATE OR REPLACE VIEW for an
existing view is no fully atomic: Either the view will be replaced or
the old one will be left unchanged.
2021-05-19 22:54:13 +02:00
Monty
6aa9a552c2 MDEV-24576 Atomic CREATE TABLE
There are a few different cases to consider

Logging of CREATE TABLE and CREATE TABLE ... LIKE
- If REPLACE is used and there was an existing table, DDL log the drop of
  the table.
- If discovery of table is to be done
    - DDL LOG create table
  else
    - DDL log create table (with engine type)
    - create the table
- If table was created
  - Log entry to binary log with xid
  - Mark DDL log completed

Crash recovery:
- If query was in binary log do nothing and exit
- If discoverted table
   - Delete the .frm file
-else
   - Drop created table and frm file
- If table was dropped, write a DROP TABLE statement in binary log

CREATE TABLE ... SELECT required a little more work as when one is using
statement logging the query is written to the binary log before commit is
done.
This was fixed by adding a DROP TABLE to the binary log during crash
recovery if the ddl log entry was not closed. In this case the binary log
will contain:
CREATE TABLE xxx ... SELECT ....
DROP TABLE xxx;

Other things:
- Added debug_crash_here() functionality to Aria to be able to test
  crash in create table between the creation of the .MAI and the .MAD files.
2021-05-19 22:54:13 +02:00
Monty
7a588c30b1 MDEV-24408 Crash-safe DROP DATABASE
Description of how DROP DATABASE works after this patch

- Collect list of tables
- DDL log tables as they are dropped
- DDL log drop database
- Delete db.opt
- Delete data directory
- Log either DROP TABLE or DROP DATABASE to binary log
- De active ddl log entry

This is in line of how things where before (minus ddl logging) except that
we delete db.opt file last to not loose it if DROP DATABASE fails.

On recovery we have to ensure that all dropped tables are logged in
binary log and that they are properly dropped (as with atomic drop
table).
No new tables be dropped as part of recovery.

Recovery of active drop database ddl log entry:

- If drop database was logged to ddl log but was not found in the binary
  log:
  - drop the db.opt file and database directory.
  - Log DROP DATABASE to binary log
- If drop database was not logged to ddl log
  - Update binary log with DROP TABLE of the dropped tables. If table list
    is longer than max_allowed_packet, then the query will be split into
    multiple DROP TABLE/VIEW queries.

Other things:
- Added DDL_LOG_STATE and 'current database' as arguments to
  mysql_rm_table_no_locks(). This was needed to be able to combine
  ddl logging of DROP DATABASE and DROP TABLE and make the generated
  DROP TABLE statements shorter.
- To make the DROP TABLE statement created by ddl log shorter, I changed
  the binlogged query to use current directory and omit the directory
  part for all tables in the current directory.
- Merged some DROP TABLE and DROP VIEW code in ddl logger.  This was done
  to be able get separate DROP VIEW and DROP TABLE statements in the binary
  log.
- Added a 'recovery_state' variable to remember the state of dropped
  tables and views.
- Moved out code that drops database objects (stored procedures) from
  mysql_rm_db_internal() to drop_database_objects() for better code reuse.
- Made mysql_rm_db_internal() global so that could be used by the ddl
  recovery code.
2021-05-19 22:54:13 +02:00
Monty
407e9b78cf MDEV-24395 Atomic DROP TRIGGER
The purpose of this task is to ensure that DROP TRIGGER is atomic.

Description of how atomic drop trigger works:

Logging of DROP TRIGGER
    Log the following information:
    db
    table name
    trigger name
    xid /* Used to check if query was already logged to binary log */
    initial length of the .TRG file
    query if there is space for it, if not log a zero length query.

Recovery operations:
- Delete if exists 'database/trigger_name.TRN~'
  - If this file existed, it means that we crashed before the trigger
    was deleted and there is nothing else to do.
- Get length of .TRG file
  - If file length is unchanged, trigger was not dropped. Nothing else to
    do.
- Log original query to binary, if it was stored in the ddl log. If it was
  not stored (long query string), log the following query to binary log:
  use `database` ; DROP TRIGGER IF EXISTS `trigger_name`
  /* generated by ddl log */;

Other things:
- Added trigger name and DDL_LOG_STATE to drop_trigger()
  Trigger name was added to make the interface more consistent and
  more general.
2021-05-19 22:54:13 +02:00
Monty
e3cfb7c803 MDEV-23844 Atomic DROP TABLE (single table)
Logging logic:
- Log tables, just before they are dropped, to the ddl log
- After the last table for the statement is dropped, log an xid for the
  whole ddl log event

In case of crash:
- Remove first any active DROP TABLE events from the ddl log that matches
  xids found in binary log (this mean the drop was successful and was
  propery logged).
- Loop over all active DROP TABLE events
  - Ensure that the table is completely dropped
- Write a DROP TABLE entry to the binary log with the dropped tables.

Other things:
- Added code to ha_drop_table() to be able to tell the difference if
  a get_new_handler() failed because of out-of-memory or because the
  handler refused/was not able to create a a handler. This was needed
  to get sequences to work as sequences needs a share object to be passed
  to get_new_handler()
- TC_LOG_BINLOG::recover() was changed to always collect Xid's from the
  binary log and always call ddl_log_close_binlogged_events(). This was
  needed to be able to collect DROP TABLE events with embedded Xid's
  (used by ddl log).
- Added a new variable "$grep_script" to binlog filter to be able to find
  only rows that matches a regexp.
- Had to adjust some test that changed because drop statements are a bit
  larger in the binary log than before (as we have to store the xid)

Other things:
- MDEV-25588 Atomic DDL: Binlog query event written upon recovery is corrupt
  fixed (in the original commit).
2021-05-19 22:54:12 +02:00
Monty
47010ccffa MDEV-23842 Atomic RENAME TABLE
- Major rewrite of ddl_log.cc and ddl_log.h
  - ddl_log.cc described in the beginning how the recovery works.
  - ddl_log.log has unique signature and is dynamic. It's easy to
    add more information to the header and other ddl blocks while still
    being able to execute old ddl entries.
  - IO_SIZE for ddl blocks is now dynamic. Can be changed without affecting
    recovery of old logs.
  - Code is more modular and is now usable outside of partition handling.
  - Renamed log file to dll_recovery.log and added option --log-ddl-recovery
    to allow one to specify the path & filename.
- Added ddl_log_entry_phase[], number of phases for each DDL action,
  which allowed me to greatly simply set_global_from_ddl_log_entry()
- Changed how strings are stored in log entries, which allows us to
  store much more information in a log entry.
- ddl log is now always created at start and deleted on normal shutdown.
  This simplices things notable.
- Added probes debug_crash_here() and debug_simulate_error() to simply
  crash testing and allow crash after a given number of times a probe
  is executed. See comments in debug_sync.cc and rename_table.test for
  how this can be used.
- Reverting failed table and view renames is done trough the ddl log.
  This ensures that the ddl log is tested also outside of recovery.
- Added helper function 'handler::needs_lower_case_filenames()'
- Extend binary log with Q_XID events. ddl log handling is using this
  to check if a ddl log entry was logged to the binary log (if yes,
  it will be deleted from the log during ddl_log_close_binlogged_events()
- If a DDL entry fails 3 time, disable it. This is to ensure that if
  we have a crash in ddl recovery code the server will not get stuck
  in a forever crash-restart-crash loop.

mysqltest.cc changes:
- --die will now replace $variables with their values
- $error will contain the error of the last failed statement

storage engine changes:
- maria_rename() was changed to be more robust against crashes during
  rename.
2021-05-19 22:54:12 +02:00