/*
   Copyright (c) 2000, 2019, Oracle and/or its affiliates.
   Copyright (c) 2010, 2021, MariaDB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA
*/

/* External interfaces to ddl log functions */

#ifndef DDL_LOG_INCLUDED
#define DDL_LOG_INCLUDED

enum ddl_log_entry_code
{
  /*
    DDL_LOG_UNKNOWN:
      Here mainly to detect blocks that are all zero.

    DDL_LOG_EXECUTE_CODE:
      Marks a log entry that is to be executed. From this entry a
      linked list of log entries can be found and executed.
    DDL_LOG_ENTRY_CODE:
      An entry to be executed in a linked list from an execute log
      entry.
    DDL_LOG_IGNORE_ENTRY_CODE:
      An entry that is to be ignored.
  */
  DDL_LOG_UNKNOWN= 0,
  DDL_LOG_EXECUTE_CODE= 1,
  DDL_LOG_ENTRY_CODE= 2,
  DDL_LOG_IGNORE_ENTRY_CODE= 3,
  DDL_LOG_ENTRY_CODE_LAST= 4
};

/*
  When adding things below, also add an entry to ddl_log_action_names and
  ddl_log_entry_phases in ddl_log.cc
*/

enum ddl_log_action_code
{
  /*
    The type of action that a DDL_LOG_ENTRY_CODE entry is to
    perform.
  */
  DDL_LOG_UNKNOWN_ACTION= 0,

  /* Delete a .frm file or a table in the partition engine */
  DDL_LOG_DELETE_ACTION= 1,

  /* Rename a .frm file or a table in the partition engine */
  DDL_LOG_RENAME_ACTION= 2,

  /*
    Rename an entity after removing the previous entry with the
    new name, that is, replace this entry.
  */
  DDL_LOG_REPLACE_ACTION= 3,

  /* Exchange two entities by renaming them a -> tmp, b -> a, tmp -> b */
  DDL_LOG_EXCHANGE_ACTION= 4,
  /*
    log do_rename(): Rename of .frm file, table, stat_tables and triggers
  */
  DDL_LOG_RENAME_TABLE_ACTION= 5,
  DDL_LOG_RENAME_VIEW_ACTION= 6,
  DDL_LOG_DROP_INIT_ACTION= 7,
  DDL_LOG_DROP_TABLE_ACTION= 8,
  DDL_LOG_DROP_VIEW_ACTION= 9,
  DDL_LOG_DROP_TRIGGER_ACTION= 10,
  DDL_LOG_DROP_DB_ACTION=11,
  DDL_LOG_CREATE_TABLE_ACTION=12,
  DDL_LOG_CREATE_VIEW_ACTION=13,
  DDL_LOG_DELETE_TMP_FILE_ACTION=14,
  DDL_LOG_CREATE_TRIGGER_ACTION=15,
MDEV-25180 Atomic ALTER TABLE
MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
have default database
The purpose of this task is to ensure that ALTER TABLE is atomic even if
the MariaDB server would be killed at any point of the alter table.
This means that either the ALTER TABLE succeeds (including that triggers,
the status tables and the binary log are updated) or things should be
reverted to their original state.
If the server crashes before the new version is fully up to date and
committed, it will revert to the original table and remove all
temporary files and tables.
If the new version is committed, crash recovery will use the new version,
and update triggers, the status tables and the binary log.
The one exception is ALTER TABLE .. RENAME .. where no changes are done
to table definition. This one will work as RENAME and roll back unless
the whole statement completed, including updating the binary log (if
enabled).
Other changes:
- Added handlerton->check_version() function to allow the ddl recovery
code to check, in case of inplace alter table, if the table in the
storage engine is of the new or old version.
- Added handler->table_version() so that an engine can report the current
version of the table. This should be changed each time the table
definition changes.
- Added ha_signal_ddl_recovery_done() and
handlerton::signal_ddl_recovery_done() to inform all handlers when
ddl recovery has been done. (Needed by InnoDB).
- Added handlerton call inplace_alter_table_committed, to signal engine
that ddl_log has been closed for the alter table query.
- Added new handlerton flag
HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
should call hton->notify_tabledef_changed() during
mysql_inplace_alter_table. This was required as MyRocks and InnoDB
needed the call at different times.
- Added function server_uuid_value() to be able to generate a temporary
xid when ddl recovery writes the query to the binary log. This is
needed to be able to handle crashes during ddl log recovery.
- Moved freeing of the frm definition to end of mysql_alter_table() to
remove duplicate code and have a common exit strategy.
-------
InnoDB part of atomic ALTER TABLE
(Implemented by Marko Mäkelä)
innodb_check_version(): Compare the saved dict_table_t::def_trx_id
to determine whether an ALTER TABLE operation was committed.
We must correctly recover dict_table_t::def_trx_id for this to work.
Before purge removes any trace of DB_TRX_ID from system tables, it
will make an effort to load the user table into the cache, so that
the dict_table_t::def_trx_id can be recovered.
ha_innobase::table_version(): return garbage, or the trx_id that would
be used for committing an ALTER TABLE operation.
In InnoDB, table names starting with #sql-ib will remain special:
they will be dropped on startup. This may be revisited later in
MDEV-18518 when we implement proper undo logging and rollback
for creating or dropping multiple tables in a transaction.
Table names starting with #sql will retain some special meaning:
dict_table_t::parse_name() will not consider such names for
MDL acquisition, and dict_table_rename_in_cache() will treat such
names specially when handling FOREIGN KEY constraints.
Simplify InnoDB DROP INDEX.
Prevent purge wakeup
To ensure that dict_table_t::def_trx_id will be recovered correctly
in case the server is killed before ddl_log_complete(), we will block
the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
between ha_innobase::commit_inplace_alter_table(commit=true)
(purge_sys.stop_SYS()) and purge_sys.resume_SYS().
The completion callback purge_sys.resume_SYS() must be between
ddl_log_complete() and MDL release.
--------
MyRocks support for atomic ALTER TABLE
(Implemented by Sergui Petrunia)
Implement these SE API functions:
- ha_rocksdb::table_version()
- hton->check_version = rocksdb_check_version
MyRocks data dictionary now stores table version for each table.
(Absence of table version record is interpreted as table_version=0,
which means no upgrade changes are needed)
- For inplace alter table of a partitioned table, call the underlying
handlerton when checking if the table is ok. This assumes that the
partition engine commits all changes at once.
  DDL_LOG_ALTER_TABLE_ACTION=16,
  DDL_LOG_STORE_QUERY_ACTION=17,
  DDL_LOG_LAST_ACTION                          /* End marker */
};

/* Number of phases for each ddl_log_action_code */
extern const uchar ddl_log_entry_phases[DDL_LOG_LAST_ACTION];

enum enum_ddl_log_exchange_phase {
  EXCH_PHASE_NAME_TO_TEMP= 0,
  EXCH_PHASE_FROM_TO_NAME= 1,
  EXCH_PHASE_TEMP_TO_FROM= 2,
  EXCH_PHASE_END
};
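
/*
  Illustration only, not part of the server code: a minimal sketch of how
  the three exchange phases could map onto the renames a -> tmp, b -> a,
  tmp -> b. The namespace exchange_sketch, the Store type and the helpers
  rename_entity() and exchange() are hypothetical names made up for this
  example. It also assumes that an interrupted exchange is resumed from
  the last recorded phase, which this header does not state.
*/

```cpp
#include <cassert>
#include <map>
#include <string>

namespace exchange_sketch {

// Mirrors the three rename steps; the numeric values match the enum above.
enum Phase { NAME_TO_TEMP= 0, FROM_TO_NAME= 1, TEMP_TO_FROM= 2, END= 3 };

// Hypothetical stand-in for the file system: entity name -> contents.
typedef std::map<std::string, std::string> Store;

// Rename 'from' to 'to'. Fails if 'from' is missing or 'to' already exists.
inline bool rename_entity(Store &store, const std::string &from,
                          const std::string &to)
{
  Store::iterator it= store.find(from);
  if (it == store.end() || store.count(to))
    return false;
  std::string contents= it->second;
  store.erase(it);
  store[to]= contents;
  return true;
}

// Run the exchange starting at 'phase' and return the phase that was
// reached. A return value other than END names the step that failed.
inline Phase exchange(Store &store, const std::string &name,
                      const std::string &from_name,
                      const std::string &tmp_name, Phase phase)
{
  switch (phase) {
  case NAME_TO_TEMP:                            // a -> tmp
    if (!rename_entity(store, name, tmp_name))
      return NAME_TO_TEMP;
    /* fall through */
  case FROM_TO_NAME:                            // b -> a
    if (!rename_entity(store, from_name, name))
      return FROM_TO_NAME;
    /* fall through */
  case TEMP_TO_FROM:                            // tmp -> b
    if (!rename_entity(store, tmp_name, from_name))
      return TEMP_TO_FROM;
    /* fall through */
  case END:
    break;
  }
  return END;
}

} // namespace exchange_sketch
```

/*
  Starting the sketch at FROM_TO_NAME with the first rename already done
  gives the same end state as running all three steps, which is the reason
  for recording a phase per step.
*/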

enum enum_ddl_log_rename_table_phase {
  DDL_RENAME_PHASE_TRIGGER= 0,
  DDL_RENAME_PHASE_STAT,
  DDL_RENAME_PHASE_TABLE,
  DDL_RENAME_PHASE_END
};

enum enum_ddl_log_drop_table_phase {
  DDL_DROP_PHASE_TABLE=0,
  DDL_DROP_PHASE_TRIGGER,
  DDL_DROP_PHASE_BINLOG,
  DDL_DROP_PHASE_RESET,  /* Reset found list of dropped tables */
  DDL_DROP_PHASE_END
};

enum enum_ddl_log_drop_db_phase {
  DDL_DROP_DB_PHASE_INIT=0,
  DDL_DROP_DB_PHASE_LOG,
  DDL_DROP_DB_PHASE_END
};

enum enum_ddl_log_create_table_phase {
  DDL_CREATE_TABLE_PHASE_INIT=0,
  DDL_CREATE_TABLE_PHASE_LOG,
  DDL_CREATE_TABLE_PHASE_END
};

enum enum_ddl_log_create_view_phase {
  DDL_CREATE_VIEW_PHASE_NO_OLD_VIEW,
  DDL_CREATE_VIEW_PHASE_DELETE_VIEW_COPY,
  DDL_CREATE_VIEW_PHASE_OLD_VIEW_COPIED,
  DDL_CREATE_VIEW_PHASE_END
};

enum enum_ddl_log_create_trigger_phase {
  DDL_CREATE_TRIGGER_PHASE_NO_OLD_TRIGGER,
  DDL_CREATE_TRIGGER_PHASE_DELETE_COPY,
  DDL_CREATE_TRIGGER_PHASE_OLD_COPIED,
  DDL_CREATE_TRIGGER_PHASE_END
};

enum enum_ddl_log_alter_table_phase {
  DDL_ALTER_TABLE_PHASE_INIT,
  DDL_ALTER_TABLE_PHASE_RENAME_FAILED,
  DDL_ALTER_TABLE_PHASE_INPLACE_COPIED,
  DDL_ALTER_TABLE_PHASE_INPLACE,
  DDL_ALTER_TABLE_PHASE_PREPARE_INPLACE,
  DDL_ALTER_TABLE_PHASE_CREATED,
  DDL_ALTER_TABLE_PHASE_COPIED,
  DDL_ALTER_TABLE_PHASE_OLD_RENAMED,
  DDL_ALTER_TABLE_PHASE_UPDATE_TRIGGERS,
  DDL_ALTER_TABLE_PHASE_UPDATE_STATS,
  DDL_ALTER_TABLE_PHASE_UPDATE_BINARY_LOG,
  DDL_ALTER_TABLE_PHASE_END
};


/*
  Flags stored in DDL_LOG_ENTRY.flags
  The flag values can be reused for different commands
*/
#define DDL_LOG_FLAG_ALTER_RENAME         (1 << 0)
#define DDL_LOG_FLAG_ALTER_ENGINE_CHANGED (1 << 1)
#define DDL_LOG_FLAG_ONLY_FRM             (1 << 2)
#define DDL_LOG_FLAG_UPDATE_STAT          (1 << 3)
/*
  Set when using ALTER TABLE on a partitioned table and the table
  engine is not changed
*/
#define DDL_LOG_FLAG_ALTER_PARTITION      (1 << 4)

/*
  Setting ddl_log_entry.phase to this has the same effect as setting
  the phase to the maximum phase (..PHASE_END) for an entry.
*/

#define DDL_LOG_FINAL_PHASE ((uchar) 0xff)
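
/*
  Illustration only, not part of the server code: a minimal sketch of the
  rule described above, where phase 0xff is treated the same as having
  reached the action's ..PHASE_END value. The namespace phase_sketch and
  the helper entry_is_done() are hypothetical; how ddl_log.cc actually
  tests for completion is not shown in this header.
*/

```cpp
#include <cassert>

namespace phase_sketch {

typedef unsigned char uchar;

// Same value as DDL_LOG_FINAL_PHASE above.
const uchar FINAL_PHASE= 0xff;

// Hypothetical helper: true if an entry in 'phase' has no work left.
// 'max_phase' stands for the ..PHASE_END value of the entry's action.
inline bool entry_is_done(uchar phase, uchar max_phase)
{
  return phase == FINAL_PHASE || phase >= max_phase;
}

} // namespace phase_sketch
```

/*
  With this rule a caller can mark any entry as finished without knowing
  how many phases its action has.
*/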

typedef struct st_ddl_log_entry
{
  LEX_CSTRING name;
  LEX_CSTRING from_name;
  LEX_CSTRING handler_name;
  LEX_CSTRING db;
  LEX_CSTRING from_db;
  LEX_CSTRING from_handler_name;
  LEX_CSTRING tmp_name;                   /* frm file or temporary file name */
  LEX_CSTRING extra_name;                 /* Backup table name */
  uchar uuid[MY_UUID_SIZE];               // UUID for new frm file

  ulonglong xid;                          // Xid stored in the binary log
  /*
    unique_id can be used to store a unique number to check current state.
    Currently it is used to store the new size of the frm file, a link to
    another ddl log entry, or a unique version for a storage engine in
    alter table.
    For execute entries this is reused as an execute counter to ensure we
    don't repeat an entry too many times if executing the entry fails.
  */
  ulonglong unique_id;
  uint next_entry;
  uint entry_pos;                         // Set by ddl_log_write_entry()
  uint16 flags;                           // Flags unique for each command
  enum ddl_log_entry_code entry_type;     // Set automatically
  enum ddl_log_action_code action_type;
  /*
    Most actions have only one phase. REPLACE does however have two
    phases. The first phase removes the file with the new name if
    there was one there before and the second phase renames the
    old name to the new name.
  */
  uchar phase;                            // Set automatically
} DDL_LOG_ENTRY;

typedef struct st_ddl_log_memory_entry
{
  uint entry_pos;
  struct st_ddl_log_memory_entry *next_log_entry;
  struct st_ddl_log_memory_entry *prev_log_entry;
  struct st_ddl_log_memory_entry *next_active_log_entry;
} DDL_LOG_MEMORY_ENTRY;


/*
  State of the ddl log during execution of a DDL.

  A ddl log state has one execute entry (main entry pointing to the first
  action entry) and many 'action entries' linked in a list in the order
  they should be executed.
  On recovery the log is parsed and all execute entries will be executed.

  All entries are stored as separate blocks in the ddl recovery file.
*/

typedef struct st_ddl_log_state
{
  /* List of ddl log entries */
  DDL_LOG_MEMORY_ENTRY *list;
  /* One execute entry per list */
  DDL_LOG_MEMORY_ENTRY *execute_entry;
  /*
    Entry used for PHASE updates. Normally same as first in 'list', but in
    case of a query log event, this points to the main event.
  */
  DDL_LOG_MEMORY_ENTRY *main_entry;
  uint16 flags;                                 /* Cache for flags */
  bool is_active() { return list != 0; }
} DDL_LOG_STATE;
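
/*
  Illustration only, not part of the server code: a minimal sketch of the
  layout described above, where one execute entry points to a chain of
  action entries that are processed in order. The namespace chain_sketch,
  the Block type and collect_actions() are hypothetical. The sketch
  assumes that next_entry == 0 ends a chain; the on-disk block format is
  defined in ddl_log.cc, not in this header.
*/

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace chain_sketch {

// Simplified stand-in for one block in the ddl recovery file.
struct Block
{
  int entry_type;       // Values follow ddl_log_entry_code: 0..3
  unsigned next_entry;  // Position of the next block, 0 = end (assumed)
  int action;           // Hypothetical action identifier
};

// Follow the chain that starts at the execute entry at 'execute_pos' and
// return the actions of its DDL_LOG_ENTRY_CODE blocks in chain order.
// Blocks of any other type, for example ignored entries, are skipped.
inline std::vector<int> collect_actions(const std::vector<Block> &blocks,
                                        unsigned execute_pos)
{
  std::vector<int> actions;
  if (execute_pos >= blocks.size() || blocks[execute_pos].entry_type != 1)
    return actions;                             // Not an execute entry

  unsigned pos= blocks[execute_pos].next_entry;
  // Bound the walk so that a corrupt, cyclic chain cannot loop forever.
  for (std::size_t steps= 0;
       pos != 0 && pos < blocks.size() && steps < blocks.size();
       steps++)
  {
    const Block &block= blocks[pos];
    if (block.entry_type == 2)
      actions.push_back(block.action);
    pos= block.next_entry;
  }
  return actions;
}

} // namespace chain_sketch
```

/*
  An all-zero block has entry_type 0 (DDL_LOG_UNKNOWN), so the sketch
  never treats it as an execute entry.
*/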


/* These functions are for recovery */
bool ddl_log_initialize();
void ddl_log_release();
bool ddl_log_close_binlogged_events(HASH *xids);
int ddl_log_execute_recovery();

/* functions for updating the ddl log */
bool ddl_log_write_entry(DDL_LOG_ENTRY *ddl_log_entry,
                         DDL_LOG_MEMORY_ENTRY **active_entry);

bool ddl_log_write_execute_entry(uint first_entry, uint cond_entry,
                                 DDL_LOG_MEMORY_ENTRY** active_entry);
inline
bool ddl_log_write_execute_entry(uint first_entry,
                                 DDL_LOG_MEMORY_ENTRY **active_entry)
{
  return ddl_log_write_execute_entry(first_entry, 0, active_entry);
}
bool ddl_log_disable_execute_entry(DDL_LOG_MEMORY_ENTRY **active_entry);

void ddl_log_complete(DDL_LOG_STATE *ddl_log_state);
bool ddl_log_revert(THD *thd, DDL_LOG_STATE *ddl_log_state);

bool ddl_log_update_phase(DDL_LOG_STATE *entry, uchar phase);
MDEV-25180 Atomic ALTER TABLE
MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
have default database
The purpose of this task is to ensure that ALTER TABLE is atomic even if
the MariaDB server would be killed at any point of the alter table.
This means that either the ALTER TABLE succeeds (including that triggers,
the status tables and the binary log are updated) or things should be
reverted to their original state.
If the server crashes before the new version is fully up to date and
commited, it will revert to the original table and remove all
temporary files and tables.
If the new version is commited, crash recovery will use the new version,
and update triggers, the status tables and the binary log.
The one execption is ALTER TABLE .. RENAME .. where no changes are done
to table definition. This one will work as RENAME and roll back unless
the whole statement completed, including updating the binary log (if
enabled).
Other changes:
- Added handlerton->check_version() function to allow the ddl recovery
code to check, in case of inplace alter table, if the table in the
storage engine is of the new or old version.
- Added handler->table_version() so that an engine can report the current
version of the table. This should be changed each time the table
definition changes.
- Added ha_signal_ddl_recovery_done() and
handlerton::signal_ddl_recovery_done() to inform all handlers when
ddl recovery has been done. (Needed by InnoDB).
- Added handlerton call inplace_alter_table_committed, to signal engine
that ddl_log has been closed for the alter table query.
- Added new handerton flag
HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
should call hton->notify_tabledef_changed() during
mysql_inplace_alter_table. This was required as MyRocks and InnoDB
needed the call at different times.
- Added function server_uuid_value() to be able to generate a temporary
xid when ddl recovery writes the query to the binary log. This is
needed to be able to handle crashes during ddl log recovery.
- Moved freeing of the frm definition to end of mysql_alter_table() to
remove duplicate code and have a common exit strategy.
-------
InnoDB part of atomic ALTER TABLE
(Implemented by Marko Mäkelä)
innodb_check_version(): Compare the saved dict_table_t::def_trx_id
to determine whether an ALTER TABLE operation was committed.
We must correctly recover dict_table_t::def_trx_id for this to work.
Before purge removes any trace of DB_TRX_ID from system tables, it
will make an effort to load the user table into the cache, so that
the dict_table_t::def_trx_id can be recovered.
ha_innobase::table_version(): return garbage, or the trx_id that would
be used for committing an ALTER TABLE operation.
In InnoDB, table names starting with #sql-ib will remain special:
they will be dropped on startup. This may be revisited later in
MDEV-18518 when we implement proper undo logging and rollback
for creating or dropping multiple tables in a transaction.
Table names starting with #sql will retain some special meaning:
dict_table_t::parse_name() will not consider such names for
MDL acquisition, and dict_table_rename_in_cache() will treat such
names specially when handling FOREIGN KEY constraints.
Simplify InnoDB DROP INDEX.
Prevent purge wakeup
To ensure that dict_table_t::def_trx_id will be recovered correctly
in case the server is killed before ddl_log_complete(), we will block
the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
between ha_innobase::commit_inplace_alter_table(commit=true)
(purge_sys.stop_SYS()) and purge_sys.resume_SYS().
The completion callback purge_sys.resume_SYS() must be between
ddl_log_complete() and MDL release.
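The ordering constraint above can be pictured with a toy model. `PurgeSketch` and `commit_alter_sketch` are hypothetical stand-ins, not the real purge_sys or server code. The sketch assumes that stop_SYS() is invoked as part of the engine commit step and only records the order of the steps.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for purge_sys: tracks only whether purge of the
// system tables is currently blocked. Not the real InnoDB class.
struct PurgeSketch
{
  bool sys_stopped= false;
  void stop_SYS()   { sys_stopped= true; }
  void resume_SYS() { assert(sys_stopped); sys_stopped= false; }
};

// Returns the steps in the order described in the text:
// engine commit (purge blocked), ddl_log_complete, resume_SYS,
// and only then the MDL release.
inline std::vector<std::string> commit_alter_sketch(PurgeSketch &purge)
{
  std::vector<std::string> steps;
  purge.stop_SYS();                       // purge of SYS_* history blocked
  steps.push_back("commit_inplace_alter_table(commit=true)");
  steps.push_back("ddl_log_complete");
  purge.resume_SYS();                     // after ddl_log_complete ...
  steps.push_back("resume_SYS");
  steps.push_back("MDL release");         // ... and before MDL release
  return steps;
}
```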
--------
MyRocks support for atomic ALTER TABLE
(Implemented by Sergei Petrunia)
Implement these SE API functions:
- ha_rocksdb::table_version()
- hton->check_version = rocksdb_check_version
MyRocks data dictionary now stores table version for each table.
(Absence of table version record is interpreted as table_version=0,
which means no upgrade changes are needed)
- For inplace alter table of a partitioned table, call the underlying
handlerton when checking if the table is ok. This assumes that the
partition engine commits all changes at once.
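As a rough illustration of the version check used during recovery, the sketch below compares a stored table version with the version the DDL log expects after a committed ALTER TABLE. Everything here is hypothetical: `AlterState` and `check_version_sketch` are not the handlerton API, and treating equality as "committed" is an assumption made for the example. The only behaviour taken from the text is that a missing version record is read as table_version=0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type; the real handlerton API is not modelled here.
enum class AlterState { OLD_VERSION, NEW_VERSION };

// stored_version: version record found in the engine's data dictionary,
//                 or nullptr if no record exists.
// logged_version: version the DDL log expects after a committed ALTER.
inline AlterState check_version_sketch(const uint64_t *stored_version,
                                       uint64_t logged_version)
{
  // A missing record is read as table_version=0, as described for MyRocks.
  const uint64_t current= stored_version ? *stored_version : 0;
  // Assumption: equal versions mean the ALTER TABLE was committed.
  return current == logged_version ? AlterState::NEW_VERSION
                                   : AlterState::OLD_VERSION;
}
```

With this model, recovery would roll forward when the result is NEW_VERSION and revert to the original table otherwise.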
bool ddl_log_add_flag(DDL_LOG_STATE *entry, uint16 flag);
bool ddl_log_update_unique_id(DDL_LOG_STATE *state, ulonglong id);
bool ddl_log_update_xid(DDL_LOG_STATE *state, ulonglong xid);
bool ddl_log_disable_entry(DDL_LOG_STATE *state);
bool ddl_log_increment_phase(uint entry_pos);
void ddl_log_release_memory_entry(DDL_LOG_MEMORY_ENTRY *log_entry);
bool ddl_log_sync();
bool ddl_log_execute_entry(THD *thd, uint first_entry);

void ddl_log_add_entry(DDL_LOG_STATE *state, DDL_LOG_MEMORY_ENTRY *log_entry);
void ddl_log_release_entries(DDL_LOG_STATE *ddl_log_state);
bool ddl_log_rename_table(DDL_LOG_STATE *ddl_state,
                          handlerton *hton,
                          const LEX_CSTRING *org_db,
                          const LEX_CSTRING *org_alias,
                          const LEX_CSTRING *new_db,
                          const LEX_CSTRING *new_alias);
bool ddl_log_rename_view(DDL_LOG_STATE *ddl_state,
                         const LEX_CSTRING *org_db,
                         const LEX_CSTRING *org_alias,
                         const LEX_CSTRING *new_db,
                         const LEX_CSTRING *new_alias);
bool ddl_log_drop_table_init(DDL_LOG_STATE *ddl_state,
                             const LEX_CSTRING *db,
                             const LEX_CSTRING *comment);
bool ddl_log_drop_view_init(DDL_LOG_STATE *ddl_state,
                            const LEX_CSTRING *db);
bool ddl_log_drop_table(DDL_LOG_STATE *ddl_state,
                        handlerton *hton,
                        const LEX_CSTRING *path,
                        const LEX_CSTRING *db,
                        const LEX_CSTRING *table);
bool ddl_log_drop_view(DDL_LOG_STATE *ddl_state,
                       const LEX_CSTRING *path,
                       const LEX_CSTRING *db,
                       const LEX_CSTRING *table);
bool ddl_log_drop_trigger(DDL_LOG_STATE *ddl_state,
                          const LEX_CSTRING *db,
                          const LEX_CSTRING *table,
                          const LEX_CSTRING *trigger_name,
                          const LEX_CSTRING *query);
bool ddl_log_drop_view(DDL_LOG_STATE *ddl_state,
                       const LEX_CSTRING *path,
                       const LEX_CSTRING *db,
                       const LEX_CSTRING *table);
bool ddl_log_drop_db(DDL_LOG_STATE *ddl_state,
                     const LEX_CSTRING *db, const LEX_CSTRING *path);
bool ddl_log_create_table(DDL_LOG_STATE *ddl_state,
                          handlerton *hton,
                          const LEX_CSTRING *path,
                          const LEX_CSTRING *db,
                          const LEX_CSTRING *table,
                          bool only_frm);
bool ddl_log_create_view(DDL_LOG_STATE *ddl_state,
                         const LEX_CSTRING *path,
                         enum_ddl_log_create_view_phase phase);
bool ddl_log_delete_tmp_file(DDL_LOG_STATE *ddl_state,
                             const LEX_CSTRING *path,
                             DDL_LOG_STATE *depending_state);
bool ddl_log_create_trigger(DDL_LOG_STATE *ddl_state,
                            const LEX_CSTRING *db, const LEX_CSTRING *table,
                            const LEX_CSTRING *trigger_name,
                            enum_ddl_log_create_trigger_phase phase);
bool ddl_log_alter_table(DDL_LOG_STATE *ddl_state,
                         handlerton *org_hton,
                         const LEX_CSTRING *db, const LEX_CSTRING *table,
                         handlerton *new_hton,
                         handlerton *partition_underlying_hton,
                         const LEX_CSTRING *new_db,
                         const LEX_CSTRING *new_table,
                         const LEX_CSTRING *frm_path,
                         const LEX_CSTRING *backup_table_name,
                         const LEX_CUSTRING *version,
                         ulonglong table_version,
                         bool is_renamed);
bool ddl_log_store_query(THD *thd, DDL_LOG_STATE *ddl_log_state,
                         const char *query, size_t length);
bool ddl_log_delete_frm(DDL_LOG_STATE *ddl_state, const char *to_path);

extern mysql_mutex_t LOCK_gdl;

#endif /* DDL_LOG_INCLUDED */