Store and maintain xdes pages always. And doesn't verify checksums for
freed pages.
innochecksum can work only with the first space file of multiple ones.
Tell about it and abort in case of not the first file.
fil_ibd_create(): Remove code that should have been removed in
commit 86dc7b4d4c already.
We no longer wrote an initialized page to the file, but we would
still allocate a page image in memory and write it.
xb_space_create_file(): Remove an unnecessary page write.
(This is a functional change for Mariabackup.)
Let us simply refuse an upgrade from earlier versions if the
upgrade procedure was not followed. This simplifies the purge,
commit, and rollback of transactions.
Before upgrading to MariaDB 10.3 or later, a clean shutdown
of the server (with innodb_fast_shutdown=1 or 0) is necessary,
to ensure that any incomplete transactions are rolled back.
The undo log format was changed in MDEV-12288. There is only
one persistent undo log for each transaction.
In commit 1c5ae99194 (MDEV-25666)
we had changed Mariabackup so that it would no longer skip files
whose names start with #sql. This turned out to be wrong.
Because operations on such named files are not protected by any
locks in the server, it is not safe to copy them.
Not copying the files may make the InnoDB data dictionary
inconsistent with the file system. So, we must do something
in InnoDB to adjust for that.
If InnoDB is being started up without the redo log (ib_logfile0)
or with a zero-length log file, we will assume that the server
was restored from a backup, and adjust things as follows:
dict_check_sys_tables(), fil_ibd_open(): Do not complain about
missing #sql files if they would be dropped a little later.
dict_stats_update_if_needed(): Never add #sql tables to
the recomputing queue. This avoids a potential race condition when
dropping the garbage tables.
drop_garbage_tables_after_restore(): Try to drop any garbage tables.
innodb_ddl_recovery_done(): Invoke drop_garbage_tables_after_restore()
if srv_start_after_restore (a new flag) was set and we are not in
read-only mode (innodb_read_only=ON or innodb_force_recovery>3).
The tests and dbug_mariabackup_event() instrumentation
were developed by Vladislav Vaintroub, who also reviewed this.
In commit 49e2c8f0a6 (MDEV-25743)
we made dict_sys_t::find() incompatible with the rest of the
table name hash table operations in case the table name contains
non-ASCII octets (using a compatibility mode that facilitates the
upgrade into the MySQL 5.0 filename-safe encoding) and the target
platform implements signed char.
ut_fold_string(): Remove; replace with my_crc32c(). This also makes
table name hash value calculations independent on whether char
is unsigned or signed.
This fixed the MySQL bug# 20338 about misuse of double underscore
prefix __WIN__, which was old MySQL's idea of identifying Windows
Replace it by _WIN32 standard symbol for targeting Windows OS
(both 32 and 64 bit)
Not that connect storage engine is not fixed in this patch (must be
fixed in "upstream" branch)
Many InnoDB data dictionary cache operations require that the
table name be copied so that it will be NUL terminated.
(For example, SYS_TABLES.NAME is not guaranteed to be NUL-terminated.)
dict_table_t::is_garbage_name(): Check if a name belongs to
the background drop table queue.
dict_check_if_system_table_exists(): Remove.
dict_sys_t::load_sys_tables(): Load the non-hard-coded system tables
SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL on startup.
dict_sys_t::create_or_check_sys_tables(): Replaces
dict_create_or_check_foreign_constraint_tables() and
dict_create_or_check_sys_virtual().
dict_sys_t::load_table(): Replaces dict_table_get_low()
and dict_load_table().
dict_sys_t::find_table(): Renamed from get_table().
dict_sys_t::sys_tables_exist(): Check whether all the non-hard-coded
tables SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL exist.
trx_t::has_stats_table_lock(): Moved to dict0stats.cc.
Some error messages will now report table names in the internal
databasename/tablename format, instead of `databasename`.`tablename`.
Changes:
- To detect automatic strlen() I removed the methods in String that
uses 'const char *' without a length:
- String::append(const char*)
- Binary_string(const char *str)
- String(const char *str, CHARSET_INFO *cs)
- append_for_single_quote(const char *)
All usage of append(const char*) is changed to either use
String::append(char), String::append(const char*, size_t length) or
String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
String::append()
- Added overflow argument to escape_string_for_mysql() and
escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
This was needed as most usage of the above functions never tested the
result for -1 and would have given wrong results or crashes in case
of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
Changed all Item_func::func_name()'s to func_name_cstring()'s.
The old Item_func_or_sum::func_name() is now an inline function that
returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
LEX_CSTRING.
- Changed for some functions the name argument from const char * to
to const LEX_CSTRING &:
- Item::Item_func_fix_attributes()
- Item::check_type_...()
- Type_std_attributes::agg_item_collations()
- Type_std_attributes::agg_item_set_converter()
- Type_std_attributes::agg_arg_charsets...()
- Type_handler_hybrid_field_type::aggregate_for_result()
- Type_handler_geometry::check_type_geom_or_binary()
- Type_handler::Item_func_or_sum_illegal_param()
- Predicant_to_list_comparator::add_value_skip_null()
- Predicant_to_list_comparator::add_value()
- cmp_item_row::prepare_comparators()
- cmp_item_row::aggregate_row_elements_for_comparison()
- Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
could be simplified to not use String_space(), thanks to the fixed
my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
- NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
clarify what the function really does.
- Rename of protocol function:
bool store(const char *from, CHARSET_INFO *cs) to
bool store_string_or_null(const char *from, CHARSET_INFO *cs).
This was done to both clarify the difference between this 'store' function
and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.
Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
This patch changes the main name of 3 byte character set from utf8 to
utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default,
so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
The implementation of handlerton::drop_database in InnoDB is
unnecessarily complex. The minimal implementation should check
that no conflicting locks or references exist on the tables,
delete all table metadata in a single transaction, and finally
delete the tablespaces.
Note: DROP DATABASE will delete each individual table that the
SQL layer knows about, one table per transaction.
The handlerton::drop_database is basically a final cleanup step
for removing any garbage that could have been left behind
in InnoDB due to some bug, or not having atomic DDL in the past.
hash_node_t: Remove. Use the proper data type name in pointers.
dict_drop_index_tree(): Do not take the table as a parameter.
Instead, return the tablespace ID if the tablespace should be dropped
(we are dropping a clustered index tree).
fil_delete_tablespace(), fil_system_t::detach(): Return a single
detached file handle. Multi-file tablespaces cannot be deleted
via this interface.
ha_innobase::delete_table(): Remove a work-around for non-atomic DDL
and do not try to drop tables with similar-looking name.
innodb_drop_database(): Complete rewrite.
innobase_drop_database(), dict_get_first_table_name_in_db(),
row_drop_database_for_mysql(), drop_all_foreign_keys_in_db(): Remove.
row_purge_remove_clust_if_poss_low(), row_undo_ins_remove_clust_rec():
If the tablespace is to be deleted, try to evict the table definition
from the cache. Failing that, set dict_table_t::space to nullptr.
lock_release_on_rollback(): On the rollback of CREATE TABLE, release all
locks that the transaction had on the table, to avoid heap-use-after-free.
The functions fil_file_readdir_next_file(), os_file_opendir(),
os_file_closedir() became dead code in the server in MariaDB 10.4.0
with commit 09af00cbde (the removal of
the crash recovery logic for the TRUNCATE TABLE implementation that
was replaced in MDEV-13564).
os_file_opendir(), os_file_closedir(): Define as macros.
During data file creation, InnoDB holds dict_sys mutex, tries to
write page 0 of the file and flushes the file. This not only causing
unnecessary contention but also a deviation from the write-ahead
logging protocol.
The clean sequence of operations is that we first start a dictionary
transaction and write SYS_TABLES and SYS_INDEXES records that identify
the tablespace. Then, we durably write a FILE_CREATE record to the
write-ahead log and create the file.
Recovery should not unnecessarily insist that the first page of each
data file that is referred to by the redo log is valid. It must be
enough that page 0 of the tablespace can be initialized based on the
redo log contents.
We introduce a new data structure deferred_spaces that keeps track
of corrupted-looking files during recovery. The data structure holds
the last LSN of a FILE_ record referring to the data file, the
tablespace identifier, and the last known file name.
There are two scenarios can happen during recovery:
i) Sufficient memory: InnoDB can reconstruct the
tablespace after parsing all redo log records.
ii) Insufficient memory(multiple apply phase): InnoDB should
store the deferred tablespace redo logs even though
tablespace is not present. InnoDB should start constructing
the tablespace when it first encounters deferred tablespace
id.
Mariabackup copies the zero filled ibd file in backup_fix_ddl() as
the extension of .new file. Mariabackup test case does page flushing
when it deals with DDL operation during backup operation.
fil_ibd_create(): Remove the write of page0 and flushing of file
fil_ibd_load(): Return FIL_LOAD_DEFER if the tablespace has
zero filled page0
Datafile: Clean up the error handling, and do not report errors
if we are in the middle of recovery. The caller will check
Datafile::m_defer.
fil_node_t::deferred: Indicates whether the tablespace loading was
deferred during recovery
FIL_LOAD_DEFER: Returned by fil_ibd_load() to indicate that tablespace
file was cannot be loaded.
recv_sys_t::recover_deferred(): Invoke deferred_spaces.create() to
initialize fil_space_t based on buffered metadata and records to
initialize page 0. Ignore the flags in fil_name_t, because they are
intentionally invalid.
fil_name_process(): Update deferred_spaces.
recv_sys_t::parse(): Store the redo log if the tablespace id
is present in deferred spaces
recv_sys_t::recover_low(): Should recover the first page of
the tablespace even though the tablespace instance is not
present
recv_sys_t::apply(): Initialize the deferred tablespace
before applying the deferred tablespace records
recv_validate_tablespace(): Skip the validation for deferred_spaces.
recv_rename_files(): Moved and revised from recv_sys_t::apply().
For deferred-recovery tablespaces, do not attempt to rename the
file if a deferred-recovery tablespace is associated with the name.
recv_recovery_from_checkpoint_start(): Invoke recv_rename_files()
and initialize all deferred tablespaces before applying redo log.
fil_node_t::read_page0(): Skip page0 validation if the tablespace
is deferred
buf_page_create_deferred(): A variant of buf_page_create() when
the fil_space_t is not available yet
This is joint work with Thirunarayanan Balathandayuthapani,
who implemented an initial prototype.
Ever since MDEV-18518 made DDL operations mostly crash-safe inside InnoDB,
it became obvious that Mariabackup might not be entirely safe with regard to
concurrent DDL operations.
check_if_skip_table(): Do not skip files whose name starts with #sql.
We cannot know whether a DDL operation is in progress and the table
might in fact be needed later.
When CMAKE_CROSSCOMPILING_EMULATOR is defined, a cross-compile
can be made, however with native (emulated) execution possible.
This commit takes those points in the build system that
execute built targets natively and allow these to be executed
in a crosscompile if CMAKE_CROSSCOMPILING_EMULATOR is defined.
Closes#1805
SST scripts for Galera should use the new mariabackup interface
instead of the innobackupex interface, which is currently only
supported for compatibility reasons.
This commit converts the SST script for mariabackup to use the
new interface. It does not need separate tests, as any problems
will be seen as failures when running multiple tests for the
mariabackup-based SST.
This patch fixes an issue with launching mariabackup during SST
(when used with Galera), when during bootstrap mariabackup receives
the "--innodb" option, which is incorrectly interpreted as shortcut
for "--innodb-force-recovery". This patch does not require separate
test for mtr, as the problem is visible in general testing on
buildbot.
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.
fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.
There used to be a name_hash, but it had been removed already in
commit a75dbfd718 (MDEV-12266).
We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)
ut_basename_noext(): Remove (unused function).
read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".
Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.
Reviewed by: Vladislav Vaintroub