A proper InnoDB shutdown after aborted startup was introduced
in commit 81b7fe9d38.
Also related to this is MDEV-11985, making read-only shutdown more robust.
If startup was aborted, there may exist recovered transactions that were
not rolled back. Relax the assertions accordingly.
buf_page_is_checksum_valid_crc32()
buf_page_is_checksum_valid_innodb()
buf_page_is_checksum_valid_none():
Use ULINTPF instead of %lu and %u for ib_uint32_t
fil_space_verify_crypt_checksum():
Check that page is really empty if checksum and
LSN are zero.
fil_space_verify_crypt_checksum():
Correct the comment to be more agurate.
buf0buf.h:
Remove unnecessary is_corrupt variable from
buf_page_t structure.
recv_writer_thread(): Do not assign recv_writer_thread_active=true
in order to avoid a race condition with
recv_recovery_from_checkpoint_finish().
recv_init_crash_recovery(): Assign recv_writer_thread_active=true
before creating recv_writer_thread.
recv_writer_thread(): Do not assign recv_writer_thread_active=true
in order to avoid a race condition with
recv_recovery_from_checkpoint_finish().
recv_init_crash_recovery_spaces(): Assign recv_writer_thread_active=true
before creating recv_writer_thread.
InnoDB can wrongly ignore the end of data files when using
innodb_page_size=32k or innodb_page_size=64k. These page sizes
use an allocation extent size of 2 or 4 megabytes, not 1 megabyte.
This issue does not affect MariaDB Server 10.2, which is using
the correct WL#5757 code from MySQL 5.7.
That said, it does not make sense to ignore the tail of data files.
The next time the data file needs to be extended, it would be extended
to a multiple of the extent size, once the size exceeds one extent.
Encryption stores used key_version to
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (offset 26)
field. Spatial indexes store RTREE Split Sequence Number
(FIL_RTREE_SPLIT_SEQ_NUM) in the same field. Both values
can't be stored in same field. Thus, current encryption
implementation does not support encrypting spatial indexes.
fil_space_encrypt(): Do not encrypt page if page type is
FIL_PAGE_RTREE (this is required for background
encryption innodb-encrypt-tables=ON).
create_table_info_t::check_table_options() Do not allow creating
table with ENCRYPTED=YES if table contains spatial index.
recv_log_format_0_recover(): Invoke log_decrypt_after_read() after
reading the old-format redo log buffer.
With this change, we will upgrade to an encrypted redo log that
is misleadingly carrying a MySQL 5.7.9 compatible format tag while
the log blocks (other than the header and the checkpoint blocks)
are in an incompatible, encrypted format.
That needs to be fixed by introducing a new redo log format tag that
indicates that the entire redo log is encrypted.
LOG_CHECKPOINT_ARRAY_END, LOG_CHECKPOINT_SIZE: Remove.
Change some error messages to refer to MariaDB 10.2.2 instead of
MySQL 5.7.9.
recv_find_max_checkpoint_0(): Do not abort when decrypting one of the
checkpoint pages fails.
after failed ADD UNIQUE INDEX
check_col_exists_in_indexes(): Add the parameter only_committed.
When considering committed indexes, evaluate index->is_committed().
Else, evaluate index->to_be_dropped.
rollback_inplace_alter_table(): Invoke check_col_exists_in_indexes()
with only_committed=true.
Galera disallow-writes feature was lost in InnoDB 5.7 merge
to 10.2. This patch restores this feature and fixes test
failure on test galera.galera_var_innodb_disallow_writes.
compatibility problems
Pages that are encrypted contain post encryption checksum on
different location that normal checksum fields. Therefore,
we should before decryption check this checksum to avoid
unencrypting corrupted pages. After decryption we can use
traditional checksum check to detect if page is corrupted
or unencryption was done using incorrect key.
Pages that are page compressed do not contain any checksum,
here we need to fist unencrypt, decompress and finally
use tradional checksum check to detect page corruption
or that we used incorrect key in unencryption.
buf0buf.cc: buf_page_is_corrupted() mofified so that
compressed pages are skipped.
buf0buf.h, buf_block_init(), buf_page_init_low():
removed unnecessary page_encrypted, page_compressed,
stored_checksum, valculated_checksum fields from
buf_page_t
buf_page_get_gen(): use new buf_page_check_corrupt() function
to detect corrupted pages.
buf_page_check_corrupt(): If page was not yet decrypted
check if post encryption checksum still matches.
If page is not anymore encrypted, use buf_page_is_corrupted()
traditional checksum method.
If page is detected as corrupted and it is not encrypted
we print corruption message to error log.
If page is still encrypted or it was encrypted and now
corrupted, we will print message that page is
encrypted to error log.
buf_page_io_complete(): use new buf_page_check_corrupt()
function to detect corrupted pages.
buf_page_decrypt_after_read(): Verify post encryption
checksum before tring to decrypt.
fil0crypt.cc: fil_encrypt_buf() verify post encryption
checksum and ind fil_space_decrypt() return true
if we really decrypted the page.
fil_space_verify_crypt_checksum(): rewrite to use
the method used when calculating post encryption
checksum. We also check if post encryption checksum
matches that traditional checksum check does not
match.
fil0fil.ic: Add missed page type encrypted and page
compressed to fil_get_page_type_name()
Note that this change does not yet fix innochecksum tool,
that will be done in separate MDEV.
Fix test failures caused by buf page corruption injection.
If InnoDB is started in innodb_read_only mode such that
recovered incomplete transactions exist at startup
(but the redo logs are clean), an assertion will fail at shutdown,
because there would exist some non-prepared transactions.
logs_empty_and_mark_files_at_shutdown(): Do not wait for incomplete
transactions to finish if innodb_read_only or innodb_force_recovery>=3.
Wait for purge to finish in only one place.
trx_sys_close(): Relax the assertion that would fail first.
trx_free_prepared(): Also free recovered TRX_STATE_ACTIVE transactions
if innodb_read_only or innodb_force_recovery>=3.
Also, revert my earlier fix to MySQL 5.7 because this fix is more generic:
Bug#20874411 INNODB SHUTDOWN HANGS IF INNODB_FORCE_RECOVERY>=3
SKIPPED ANY ROLLBACK
trx_undo_fake_prepared(): Remove.
trx_sys_any_active_transactions(): Revert the changes.
Datafile::validate_for_recovery(): Remove a redundant error message.
An error is already reported by Datafile::open_read_write() if the
file cannot be opened.
Also, do not assign SEARCH_ABORT, so that the full test will be executed
even if one step fails.
Remove the debug parameter innodb_force_recovery_crash that was
introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
to resize the redo log on startup.
Let innodb.log_file_size actually start up the server, but ensure
that the InnoDB storage engine refuses to start up in each of the
scenarios.
If InnoDB is started in innodb_read_only mode such that
recovered incomplete transactions exist at startup
(but the redo logs are clean), an assertion will fail at shutdown,
because there would exist some non-prepared transactions.
logs_empty_and_mark_files_at_shutdown(): Do not wait for incomplete
transactions to finish if innodb_read_only or innodb_force_recovery>=3.
Wait for purge to finish in only one place.
trx_sys_close(): Relax the assertion that would fail first.
trx_free_prepared(): Also free recovered TRX_STATE_ACTIVE transactions
if innodb_read_only or innodb_force_recovery>=3.
srv_release_threads(): Actually wait for the threads to resume
from suspension. On CentOS 5 and possibly other platforms,
os_event_set() may be lost.
srv_resume_thread(): A counterpart of srv_suspend_thread().
Optionally wait for the event to be set, optionally with a timeout,
and then release the thread from suspension.
srv_free_slot(): Unconditionally suspend the thread. It is always
in resumed state when this function is entered.
srv_active_wake_master_thread_low(): Only call os_event_set().
srv_purge_coordinator_suspend(): Use srv_resume_thread() instead
of the complicated logic.
srv_release_threads(): Actually wait for the threads to resume
from suspension. On CentOS 5 and possibly other platforms,
os_event_set() may be lost.
srv_resume_thread(): A counterpart of srv_suspend_thread().
Optionally wait for the event to be set, optionally with a timeout,
and then release the thread from suspension.
srv_free_slot(): Unconditionally suspend the thread. It is always
in resumed state when this function is entered.
srv_active_wake_master_thread_low(): Only call os_event_set().
srv_purge_coordinator_suspend(): Use srv_resume_thread() instead
of the complicated logic.
Remove the dependency on unzip. Instead, generate the InnoDB files
with perl.
log_block_checksum_is_ok(): Correct the error message.
recv_scan_log_recs(): Remove the duplicated error message for
log block checksum mismatch.
innobase_start_or_create_for_mysql(): If the server is in read-only
mode or if innodb_force_recovery>=3, do not try to modify the system
tablespace. (If the doublewrite buffer or the non-core system tables
do not exist, do not try to create them.)
innodb_shutdown(): Relax a debug assertion. If the system tablespace
did not contain a doublewrite buffer and if we started up in
innodb_read_only mode or with innodb_force_recovery>=3, it will not
be created.
dict_create_or_check_sys_tablespace(): Set the flag
srv_sys_tablespaces_open when the tables exist.
This fixes memory leaks in tests that cause InnoDB startup to fail.
buf_pool_free_instance(): Also free buf_pool->flush_rbt, which would
normally be freed when crash recovery finishes.
fil_node_close_file(), fil_space_free_low(), fil_close_all_files():
Relax some debug assertions to tolerate !srv_was_started.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Changed the return type to void. Do not assume that all subsystems
were started.
que_init(), que_close(): Remove (empty functions).
srv_init(), srv_general_init(): Remove as global functions.
srv_free(): Allow srv_sys=NULL.
srv_get_active_thread_type(): Only return SRV_PURGE if purge really
is running.
srv_shutdown_all_bg_threads(): Do not reset srv_start_state. It will
be needed by innodb_shutdown().
innobase_start_or_create_for_mysql(): Always call srv_boot() so that
innodb_shutdown() can assume that it was called. Make more subsystems
dependent on SRV_START_STATE_STAT.
srv_shutdown_bg_undo_sources(): Require SRV_START_STATE_STAT.
trx_sys_close(): Do not assume purge_sys!=NULL. Do not call
buf_dblwr_free(), because the doublewrite buffer can exist while
the transaction system does not.
logs_empty_and_mark_files_at_shutdown(): Do a faster shutdown if
!srv_was_started.
recv_sys_close(): Invoke dblwr.pages.clear() which would normally
be invoked by buf_dblwr_process().
recv_recovery_from_checkpoint_start(): Always release log_sys->mutex.
row_mysql_close(): Allow the subsystem not to exist.
Remove the debug parameter innodb_force_recovery_crash that was
introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
to resize the redo log on startup.
Let innodb.log_file_size actually start up the server, but ensure
that the InnoDB storage engine refuses to start up in each of the
scenarios.
on shutdown it might happen that
1. the server starts killing THDs
2. it sets thd->killed in srv_purge_coordinator
3. srv_purge_coordinator notices that and tells srv_workers to exit
4. srv_worker will notice that and will start exiting,
... assert here ...
5. server sets thd->killed in worker threads
that is, it might happen that the assert is tested before
srv_worker's THD got the kill signal.
this fixes various random crashes (on this assertion) on shutdown
in tests
1. wait for thd_destructor_proxy thread to start after creating it.
this ensures that the thread is ready to receive a shutdown signal
whenever we want to send it.
2. join it at shutdown, this guarantees that no innodb THD will exist
after innobase_end().
this fixes crashes and memory leaks in main.mysqld_option_err
(were innodb was started and then immediately shut down).
InnoDB would refuse to start up if there is a mismatch on
the size of the system tablespace files. However, before this
check is conducted, the system tablespace may already have been
heavily modified.
InnoDB should perform the size check as early as possible.
recv_recovery_from_checkpoint_finish():
Move the recv_apply_hashed_log_recs() call to
innobase_start_or_create_for_mysql().
innobase_start_or_create_for_mysql(): Test the mutex functionality
before doing anything else. Use a compile_time_assert() for a
sizeof() constraint. Check the size of the system tablespace as
early as possible.
recv_scan_log_recs(): Remember if redo log apply is needed,
even if starting up in innodb_read_only mode.
recv_recovery_from_checkpoint_start_func(): Refuse
innodb_read_only startup if redo log apply is needed.
crashes server
This bug is the result of merging the Oracle MySQL follow-up fix
BUG#22963169 MYSQL CRASHES ON CREATE FULLTEXT INDEX
without merging the base bug fix:
Bug#79475 Insert a token of 84 4-bytes chars into fts index causes
server crash.
Unlike the above mentioned fixes in MySQL, our fix will not change
the storage format of fulltext indexes in InnoDB or XtraDB
when a character encoding with mbmaxlen=2 or mbmaxlen=3
and the length of a word is between 128 and 84*mbmaxlen bytes.
The Oracle fix would allocate 2 length bytes for these cases.
Compatibility with other MySQL and MariaDB releases is ensured by
persisting the used maximum length in the SYS_COLUMNS table in the
InnoDB data dictionary.
This fix also removes some unnecessary strcmp() calls when checking
for the legacy default collation my_charset_latin1
(my_charset_latin1.name=="latin1_swedish_ci").
fts_create_one_index_table(): Store the actual length in bytes.
This metadata will be written to the SYS_COLUMNS table.
fts_zip_initialize(): Initialize only the first byte of the buffer.
Actually the code should not even care about this first byte, because
the length is set as 0.
FTX_MAX_WORD_LEN: Define as HA_FT_MAXCHARLEN * 4 aka 336 bytes,
not as 254 bytes.
row_merge_create_fts_sort_index(): Set the actual maximum length of the
column in bytes, similar to fts_create_one_index_table().
row_merge_fts_doc_tokenize(): Remove the redundant parameter word_dtype.
Use the actual maximum length of the column. Calculate the extra_size
in the same way as row_merge_buf_encode() does.
InnoDB would refuse to start up if there is a mismatch on
the size of the system tablespace files. However, before this
check is conducted, the system tablespace may already have been
heavily modified.
InnoDB should perform the size check as early as possible.
recv_recovery_from_checkpoint_finish():
Move the recv_apply_hashed_log_recs() call to
innobase_start_or_create_for_mysql().
innobase_start_or_create_for_mysql(): Test the mutex functionality
before doing anything else. Use a compile_time_assert() for a
sizeof() constraint. Check the size of the system tablespace as
early as possible.
recv_scan_log_recs(): Remember if redo log apply is needed,
even if starting up in innodb_read_only mode.
recv_recovery_from_checkpoint_start_func(): Refuse
innodb_read_only startup if redo log apply is needed.
Both dict_foreign_find_index and dict_foreign_qualify_index
did not consider virtual columns as possible foreign key
columns and there was assertion to disable virtual columns.
Fixed by also looking referencing and referenced column
from virtual columns if needed.
In the merge of MDEV-11623 from 10.1 to 10.2, we added a call to
fsp_flags_try_adjust() which causes the data file to be opened
via the buffer pool while fil_ibd_open() is holding another open
handle to the file.