Always read full page 0 to determine does tablespace contain
encryption metadata. Tablespaces that are page compressed or
page compressed and encrypted do not compare checksum as
it does not exists. For encrypted tables use checksum
verification written for encrypted tables and normal tables
use normal method.
buf_page_is_checksum_valid_crc32
buf_page_is_checksum_valid_innodb
buf_page_is_checksum_valid_none
Modify Innochecksum logging to file to avoid compilation
warnings.
fil0crypt.cc fil0crypt.h
Modify to be able to use in innochecksum compilation and
move fil_space_verify_crypt_checksum to end of the file.
Add innochecksum logging to file.
univ.i
Add innochecksum strict_verify, log_file and cur_page_num
variables as extern.
page_zip_verify_checksum
Add innochecksum logging to file and remove unnecessary code.
innochecksum.cc
Lot of changes most notable able to read encryption
metadata from page 0 of the tablespace.
Added test case where we corrupt intentionally
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (encryption key version)
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION+4 (post encryption checksum)
FIL_DATA+10 (data)
Always read full page 0 to determine does tablespace contain
encryption metadata. Tablespaces that are page compressed or
page compressed and encrypted do not compare checksum as
it does not exists. For encrypted tables use checksum
verification written for encrypted tables and normal tables
use normal method.
buf_page_is_checksum_valid_crc32
buf_page_is_checksum_valid_innodb
buf_page_is_checksum_valid_none
Add Innochecksum logging to file
buf_page_is_corrupted
Remove ib_logf and page_warn_strict_checksum
calls in innochecksum compilation. Add innochecksum
logging to file.
fil0crypt.cc fil0crypt.h
Modify to be able to use in innochecksum compilation and
move fil_space_verify_crypt_checksum to end of the file.
Add innochecksum logging to file.
univ.i
Add innochecksum strict_verify, log_file and cur_page_num
variables as extern.
page_zip_verify_checksum
Add innochecksum logging to file.
innochecksum.cc
Lot of changes most notable able to read encryption
metadata from page 0 of the tablespace.
Added test case where we corrupt intentionally
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (encryption key version)
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION+4 (post encryption checksum)
FIL_DATA+10 (data)
Problem appears to be that the function fsp_flags_try_adjust()
is being unconditionally invoked on every .ibd file on startup.
Based on performance investigation also the top function
fsp_header_get_crypt_offset() needs to addressed.
Ported implementation of fsp_header_get_encryption_offset()
function from 10.2 to fsp_header_get_crypt_offset().
Introduced a new function fil_crypt_read_crypt_data()
to read page 0 if it is not yet read.
fil_crypt_find_space_to_rotate(): Now that page 0 for every .ibd
file is not read on startup we need to check has page 0 read
from space that we investigate for key rotation, if it is not read
we read it.
fil_space_crypt_get_status(): Now that page 0 for every .ibd
file is not read on startup here also we need to read page 0
if it is not yet read it. This is needed
as tests use IS query to wait until background encryption
or decryption has finished and this function is used to
produce results.
fil_crypt_thread(): Add is_stopping condition for tablespace
so that we do not rotate pages if usage of tablespace should
be stopped. This was needed for failure seen on regression
testing.
fil_space_create: Remove page_0_crypt_read and extra
unnecessary info output.
fil_open_single_table_tablespace(): We call fsp_flags_try_adjust
only when when no errors has happened and server was not started
on read only mode and tablespace validation was requested or
flags contain other table options except low order bits to
FSP_FLAGS_POS_PAGE_SSIZE position.
fil_space_t::page_0_crypt_read removed.
Added test case innodb-first-page-read to test startup when
encryption is on and when encryption is off to check that not
for all tables page 0 is read on startup.
Problem was that checksum check resulted false positives that page is
both not encrypted and encryted when checksum_algorithm was
strict_none.
Encrypton checksum will use only crc32 regardless of setting.
buf_zip_decompress: If compression fails report a error message
containing the space name if available (not available during import).
And note if space could be encrypted.
buf_page_get_gen: Do not assert if decompression fails,
instead unfix the page and return NULL to upper layer.
fil_crypt_calculate_checksum: Use only crc32 method.
fil_space_verify_crypt_checksum: Here we need to check
crc32, innodb and none method for old datafiles.
fil_space_release_for_io: Allow null space.
encryption.innodb-compressed-blob is now run with crc32 and none
combinations.
Note that with none and strict_none method there is not really
a way to detect page corruptions and page corruptions after
decrypting the page with incorrect key.
New test innodb-checksum-algorithm to test different checksum
algorithms with encrypted, row compressed and page compressed
tables.
In commit 360a4a0372
some debug assertions were introduced to the page flushing code
in XtraDB. Add these assertions to InnoDB as well, and adjust
the InnoDB shutdown so that these assertions will not fail.
logs_empty_and_mark_files_at_shutdown(): Advance
srv_shutdown_state from the first phase SRV_SHUTDOWN_CLEANUP
only after no page-dirtying activity is possible
(well, except by srv_master_do_shutdown_tasks(), which will be
fixed separately in MDEV-12052).
rotate_thread_t::should_shutdown(): Already exit the key rotation
threads at the first phase of shutdown (SRV_SHUTDOWN_CLEANUP).
page_cleaner_sleep_if_needed(): Do not sleep during shutdown.
This change is originally from XtraDB.
Use uint32_t for the encryption key_id.
When filling unsigned integer values into INFORMATION_SCHEMA tables,
use the method Field::store(longlong, bool unsigned)
instead of using Field::store(double).
Fix also some miscellanous type mismatch related to ulint (size_t).
When MySQL 5.7.9 was merged to MariaDB 10.2.2, an important
debug assertion was omitted from mlog_write_initial_log_record_low().
mlog_write_initial_log_record_low(): Put back the assertion
mtr_t::is_named_space().
fil_crypt_start_encrypting_space(), fil_crypt_rotate_page():
Call mtr_t::set_named_space() before modifying any pages.
fsp_flags_try_adjust(): Call mtr_t::set_named_space(). This additional
breakage was introduced in the merge of MDEV-11623 from 10.1. It was
not caught because of the missing debug assertion in
mlog_write_initial_log_record_low().
Remove some suppressions from the encryption.innodb-redo-badkey test.
This is a backport from 10.2 where it fixes the
cmake -DWITH_ASAN test failure that was mentioned
in commit f9cc391863
(merging MDEV-12253 from 10.1 to 10.2).
fil_parse_write_crypt_data(): If the tablespace is not found,
invoke fil_space_destroy_crypt_data(&crypt_data) to properly
free the created object.
This fixes the cmake -DWITH_ASAN test failure that was mentioned
in commit f9cc391863 (merging
MDEV-12253 from 10.1 to 10.2).
fil_parse_write_crypt_data(): If the tablespace is not found,
invoke fil_space_destroy_crypt_data(&crypt_data) to properly
free the created object.
With this, the test encryption.innodb-redo-badkey still reports
"Missing MLOG_FILE_NAME or MLOG_FILE_DELETE before MLOG_CHECKPOINT"
but does not fail. The misleading message should be corrected,
maybe as part of MDEV-12699.
This only merges MDEV-12253, adapting it to MDEV-12602 which is already
present in 10.2 but not yet in the 10.1 revision that is being merged.
TODO: Error handling in crash recovery needs to be improved.
If a page cannot be decrypted (or read), we should cleanly abort
the startup. If innodb_force_recovery is specified, we should
ignore the problematic page and apply redo log to other pages.
Currently, the test encryption.innodb-redo-badkey randomly fails
like this (the last messages are from cmake -DWITH_ASAN):
2017-05-05 10:19:40 140037071685504 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1635994
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Missing MLOG_FILE_NAME or MLOG_FILE_DELETE before MLOG_CHECKPOINT for tablespace 1
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[2201] with error Data structure corruption
2017-05-05 10:19:41 140037071685504 [Note] InnoDB: Starting shutdown...
i=================================================================
==5226==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x612000018588 in thread T0
#0 0x736750 in operator delete(void*) (/mariadb/server/build/sql/mysqld+0x736750)
#1 0x1e4833f in LatchCounter::~LatchCounter() /mariadb/server/storage/innobase/include/sync0types.h:599:4
#2 0x1e480b8 in LatchMeta<LatchCounter>::~LatchMeta() /mariadb/server/storage/innobase/include/sync0types.h:786:17
#3 0x1e35509 in sync_latch_meta_destroy() /mariadb/server/storage/innobase/sync/sync0debug.cc:1622:3
#4 0x1e35314 in sync_check_close() /mariadb/server/storage/innobase/sync/sync0debug.cc:1839:2
#5 0x1dfdc18 in innodb_shutdown() /mariadb/server/storage/innobase/srv/srv0start.cc:2888:2
#6 0x197e5e6 in innobase_init(void*) /mariadb/server/storage/innobase/handler/ha_innodb.cc:4475:3
This regression was caused by MDEV-12467 encryption.create_or_replace
hangs during DROP TABLE, where if table->is_stopping() (i.e. when
tablespace is dropped) background key rotation thread calls
fil_crypt_complete_rotate_space to release space and stop rotation.
However, that function does not decrease number of rotating
threads if table->is_stopping() is true.
innochecksum uses global variables. great, let's use them all the
way down, instead of passing them as arguments to innodb internals,
conditionally modifying function prototypes with #ifdefs
|| kv >= crypt_data->min_key_version,
encryption.innodb_encryption_tables failed in buildbot.
Now that key_version is not stored when page is read to
buf_page_t::key_version but always read from actual page
this assertion is not always valid.
This fixes a regression caused by MDEV-12428.
When we introduced a variant of fil_space_acquire() that could
increment space->n_pending_ops after space->stop_new_ops was set,
the logic of fil_check_pending_operations() was broken.
fil_space_t::n_pending_ios: A new field to track read or write
access from the buffer pool routines immediately before a block
write or after a block read in the file system.
fil_space_acquire_for_io(), fil_space_release_for_io(): Similar
to fil_space_acquire_silent() and fil_space_release(), but
modify fil_space_t::n_pending_ios instead of fil_space_t::n_pending_ops.
Adjust a number of places accordingly, and remove some redundant
tablespace lookups.
The following parts of this fix differ from the 10.2 version of this fix:
buf_page_get_corrupt(): Add a tablespace parameter.
In 10.2, we already had a two-phase process of freeing fil_space objects
(first, fil_space_detach(), then release fil_system->mutex, and finally
free the fil_space and fil_node objects).
fil_space_free_and_mutex_exit(): Renamed from fil_space_free().
Detach the tablespace from the fil_system cache, release the
fil_system->mutex, and then wait for space->n_pending_ios to reach 0,
to avoid accessing freed data in a concurrent thread.
During the wait, future calls to fil_space_acquire_for_io() will
not find this tablespace, and the count can only be decremented to 0,
at which point it is safe to free the objects.
fil_node_free_part1(), fil_node_free_part2(): Refactored from
fil_node_free().
This fixes a regression caused by MDEV-12428.
When we introduced a variant of fil_space_acquire() that could
increment space->n_pending_ops after space->stop_new_ops was set,
the logic of fil_check_pending_operations() was broken.
fil_space_t::n_pending_ios: A new field to track read or write
access from the buffer pool routines immediately before a block
write or after a block read in the file system.
fil_space_acquire_for_io(), fil_space_release_for_io(): Similar
to fil_space_acquire_silent() and fil_space_release(), but
modify fil_space_t::n_pending_ios instead of fil_space_t::n_pending_ops.
fil_space_free_low(): Wait for space->n_pending_ios to reach 0,
to avoid accessing freed data in a concurrent thread. Future
calls to fil_space_acquire_for_io() will not find this tablespace,
because it will already have been detached from fil_system.
Adjust a number of places accordingly, and remove some redundant
tablespace lookups.
FIXME: buf_page_check_corrupt() should take a tablespace from
fil_space_acquire_for_io() as a parameter. This will be done
in the 10.1 version of this patch and merged from there.
That depends on MDEV-12253, which has not been merged from 10.1 yet.
Problem was that bpage was referenced after it was already freed
from LRU. Fixed by adding a new variable encrypted that is
passed down to buf_page_check_corrupt() and used in
buf_page_get_gen() to stop processing page read.
This patch should also address following test failures and
bugs:
MDEV-12419: IMPORT should not look up tablespace in
PageConverter::validate(). This is now removed.
MDEV-10099: encryption.innodb_onlinealter_encryption fails
sporadically in buildbot
MDEV-11420: encryption.innodb_encryption-page-compression
failed in buildbot
MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8
Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
and replaced these with dict_table_t::file_unreadable. Table
ibd file is missing if fil_get_space(space_id) returns NULL
and encrypted if not. Removed dict_table_t::is_corrupted field.
Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(),
buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().
Added test cases when enrypted page could be read while doing
redo log crash recovery. Also added test case for row compressed
blobs.
btr_cur_open_at_index_side_func(),
btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
NULL.
buf_page_get_zip(): Issue error if page read fails.
buf_page_get_gen(): Use dberr_t for error detection and
do not reference bpage after we hare freed it.
buf_mark_space_corrupt(): remove bpage from LRU also when
it is encrypted.
buf_page_check_corrupt(): @return DB_SUCCESS if page has
been read and is not corrupted,
DB_PAGE_CORRUPTED if page based on checksum check is corrupted,
DB_DECRYPTION_FAILED if page post encryption checksum matches but
after decryption normal page checksum does not match. In read
case only DB_SUCCESS is possible.
buf_page_io_complete(): use dberr_t for error handling.
buf_flush_write_block_low(),
buf_read_ahead_random(),
buf_read_page_async(),
buf_read_ahead_linear(),
buf_read_ibuf_merge_pages(),
buf_read_recv_pages(),
fil_aio_wait():
Issue error if page read fails.
btr_pcur_move_to_next_page(): Do not reference page if it is
NULL.
Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
that will return true if tablespace exists and pages read from
tablespace are not corrupted or page decryption failed.
Removed buf_page_t::key_version. After page decryption the
key version is not removed from page frame. For unencrypted
pages, old key_version is removed at buf_page_encrypt_before_write()
dict_stats_update_transient_for_index(),
dict_stats_update_transient()
Do not continue if table decryption failed or table
is corrupted.
dict0stats.cc: Introduced a dict_stats_report_error function
to avoid code duplication.
fil_parse_write_crypt_data():
Check that key read from redo log entry is found from
encryption plugin and if it is not, refuse to start.
PageConverter::validate(): Removed access to fil_space_t as
tablespace is not available during import.
Fixed error code on innodb.innodb test.
Merged test cased innodb-bad-key-change5 and innodb-bad-key-shutdown
to innodb-bad-key-change2. Removed innodb-bad-key-change5 test.
Decreased unnecessary complexity on some long lasting tests.
Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
fil_get_first_space(), fil_get_next_space(),
fil_get_first_space_safe(), fil_get_next_space_safe()
functions.
fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
accessed from row compressed tables. Fixed out of page frame
bug for row compressed tables in
fil_space_verify_crypt_checksum() found using ASAN. Incorrect
function was called for compressed table.
Added new tests for discard, rename table and drop (we should allow them
even when page decryption fails). Alter table rename is not allowed.
Added test for restart with innodb-force-recovery=1 when page read on
redo-recovery cant be decrypted. Added test for corrupted table where
both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION is corrupted.
Adjusted the test case innodb_bug14147491 so that it does not anymore
expect crash. Instead table is just mostly not usable.
fil0fil.h: fil_space_acquire_low is not visible function
and fil_space_acquire and fil_space_acquire_silent are
inline functions. FilSpace class uses fil_space_acquire_low
directly.
recv_apply_hashed_log_recs() does not return anything.
fil_crypt_thread(): Do invoke fil_crypt_complete_rotate_space()
when the tablespace is about to be dropped. Also, remove a redundant
check whether rotate_thread_t::space is NULL. It can only become
NULL when fil_crypt_find_space_to_rotate() returns false, and in
that case we would already have terminated the loop.
fil_crypt_find_page_to_rotate(): Remove a redundant check for
space->crypt_data == NULL. Once encryption metadata has been
created for a tablespace, it cannot be removed without dropping
the entire tablespace.
Alias the InnoDB ulint and lint data types to size_t and ssize_t,
which are the standard names for the machine-word-width data types.
Correspondingly, define ULINTPF as "%zu" and introduce ULINTPFx as "%zx".
In this way, better compiler warnings for type mismatch are possible.
Furthermore, use PRIu64 for that 64-bit format, and define
the feature macro __STDC_FORMAT_MACROS to enable it on Red Hat systems.
Fix some errors in error messages, and replace some error messages
with assertions.
Most notably, an IMPORT TABLESPACE error message in InnoDB was
displaying the number of columns instead of the mismatching flags.
fil_crypt_thread(): Do invoke fil_crypt_complete_rotate_space()
when the tablespace is about to be dropped. Also, remove a redundant
check whether rotate_thread_t::space is NULL. It can only become
NULL when fil_crypt_find_space_to_rotate() returns false, and in
that case we would already have terminated the loop.
fil_crypt_find_page_to_rotate(): Remove a redundant check for
space->crypt_data == NULL. Once encryption metadata has been
created for a tablespace, it cannot be removed without dropping
the entire tablespace.
Fixes compile error that highlights problem:
/source/storage/innobase/fil/fil0crypt.cc: In function 'void fil_crypt_rotate_page(const key_state_t*, rotate_thread_t*)':
/source/storage/innobase/fil/fil0crypt.cc:1770:15: error: ISO C++ forbids comparison between pointer and integer [-fpermissive]
if (space == TRX_SYS_SPACE && offset == TRX_SYS_PAGE_NO) {
Signed-off-by: Daniel Black <daniel.black@au.ibm.com>
MDEV-11581: Mariadb starts InnoDB encryption threads
when key has not changed or data scrubbing turned off
Background: Key rotation is based on background threads
(innodb-encryption-threads) periodically going through
all tablespaces on fil_system. For each tablespace
current used key version is compared to max key age
(innodb-encryption-rotate-key-age). This process
naturally takes CPU. Similarly, in same time need for
scrubbing is investigated. Currently, key rotation
is fully supported on Amazon AWS key management plugin
only but InnoDB does not have knowledge what key
management plugin is used.
This patch re-purposes innodb-encryption-rotate-key-age=0
to disable key rotation and background data scrubbing.
All new tables are added to special list for key rotation
and key rotation is based on sending a event to
background encryption threads instead of using periodic
checking (i.e. timeout).
fil0fil.cc: Added functions fil_space_acquire_low()
to acquire a tablespace when it could be dropped concurrently.
This function is used from fil_space_acquire() or
fil_space_acquire_silent() that will not print
any messages if we try to acquire space that does not exist.
fil_space_release() to release a acquired tablespace.
fil_space_next() to iterate tablespaces in fil_system
using fil_space_acquire() and fil_space_release().
Similarly, fil_space_keyrotation_next() to iterate new
list fil_system->rotation_list where new tables.
are added if key rotation is disabled.
Removed unnecessary functions fil_get_first_space_safe()
fil_get_next_space_safe()
fil_node_open_file(): After page 0 is read read also
crypt_info if it is not yet read.
btr_scrub_lock_dict_func()
buf_page_check_corrupt()
buf_page_encrypt_before_write()
buf_merge_or_delete_for_page()
lock_print_info_all_transactions()
row_fts_psort_info_init()
row_truncate_table_for_mysql()
row_drop_table_for_mysql()
Use fil_space_acquire()/release() to access fil_space_t.
buf_page_decrypt_after_read():
Use fil_space_get_crypt_data() because at this point
we might not yet have read page 0.
fil0crypt.cc/fil0fil.h: Lot of changes. Pass fil_space_t* directly
to functions needing it and store fil_space_t* to rotation state.
Use fil_space_acquire()/release() when iterating tablespaces
and removed unnecessary is_closing from fil_crypt_t. Use
fil_space_t::is_stopping() to detect when access to
tablespace should be stopped. Removed unnecessary
fil_space_get_crypt_data().
fil_space_create(): Inform key rotation that there could
be something to do if key rotation is disabled and new
table with encryption enabled is created.
Remove unnecessary functions fil_get_first_space_safe()
and fil_get_next_space_safe(). fil_space_acquire()
and fil_space_release() are used instead. Moved
fil_space_get_crypt_data() and fil_space_set_crypt_data()
to fil0crypt.cc.
fsp_header_init(): Acquire fil_space_t*, write crypt_data
and release space.
check_table_options()
Renamed FIL_SPACE_ENCRYPTION_* TO FIL_ENCRYPTION_*
i_s.cc: Added ROTATING_OR_FLUSHING field to
information_schema.innodb_tablespace_encryption
to show current status of key rotation.
fil_parse_write_crypt_data(): Correct the comparison operator.
This was broken in commit 498f4a825b
which removed a signed/unsigned mismatch in these comparisons.
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong. Change some parameters to this type.
Use size_t in a few more places.
Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.
When applying the unary ~ operator to enum constants, explictly
cast the result to an unsigned type, because enum constants can
be treated as signed.
In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
buf_page_is_checksum_valid_crc32()
buf_page_is_checksum_valid_innodb()
buf_page_is_checksum_valid_none():
Use ULINTPF instead of %lu and %u for ib_uint32_t
fil_space_verify_crypt_checksum():
Check that page is really empty if checksum and
LSN are zero.
fil_space_verify_crypt_checksum():
Correct the comment to be more agurate.
buf0buf.h:
Remove unnecessary is_corrupt variable from
buf_page_t structure.
Encryption stores used key_version to
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (offset 26)
field. Spatial indexes store RTREE Split Sequence Number
(FIL_RTREE_SPLIT_SEQ_NUM) in the same field. Both values
can't be stored in same field. Thus, current encryption
implementation does not support encrypting spatial indexes.
fil_space_encrypt(): Do not encrypt page if page type is
FIL_PAGE_RTREE (this is required for background
encryption innodb-encrypt-tables=ON).
create_table_info_t::check_table_options() Do not allow creating
table with ENCRYPTED=YES if table contains spatial index.
compatibility problems
Pages that are encrypted contain post encryption checksum on
different location that normal checksum fields. Therefore,
we should before decryption check this checksum to avoid
unencrypting corrupted pages. After decryption we can use
traditional checksum check to detect if page is corrupted
or unencryption was done using incorrect key.
Pages that are page compressed do not contain any checksum,
here we need to fist unencrypt, decompress and finally
use tradional checksum check to detect page corruption
or that we used incorrect key in unencryption.
buf0buf.cc: buf_page_is_corrupted() mofified so that
compressed pages are skipped.
buf0buf.h, buf_block_init(), buf_page_init_low():
removed unnecessary page_encrypted, page_compressed,
stored_checksum, valculated_checksum fields from
buf_page_t
buf_page_get_gen(): use new buf_page_check_corrupt() function
to detect corrupted pages.
buf_page_check_corrupt(): If page was not yet decrypted
check if post encryption checksum still matches.
If page is not anymore encrypted, use buf_page_is_corrupted()
traditional checksum method.
If page is detected as corrupted and it is not encrypted
we print corruption message to error log.
If page is still encrypted or it was encrypted and now
corrupted, we will print message that page is
encrypted to error log.
buf_page_io_complete(): use new buf_page_check_corrupt()
function to detect corrupted pages.
buf_page_decrypt_after_read(): Verify post encryption
checksum before tring to decrypt.
fil0crypt.cc: fil_encrypt_buf() verify post encryption
checksum and ind fil_space_decrypt() return true
if we really decrypted the page.
fil_space_verify_crypt_checksum(): rewrite to use
the method used when calculating post encryption
checksum. We also check if post encryption checksum
matches that traditional checksum check does not
match.
fil0fil.ic: Add missed page type encrypted and page
compressed to fil_get_page_type_name()
Note that this change does not yet fix innochecksum tool,
that will be done in separate MDEV.
Fix test failures caused by buf page corruption injection.
InnoDB shutdown failed to properly take fil_crypt_thread() into account.
The encryption threads were signalled to shut down together with other
non-critical tasks. This could be much too early in case of slow shutdown,
which could need minutes to complete the purge. Furthermore, InnoDB
failed to wait for the fil_crypt_thread() to actually exit before
proceeding to the final steps of shutdown, causing the race conditions.
Furthermore, the log_scrub_thread() was shut down way too early.
Also it should remain until the SRV_SHUTDOWN_FLUSH_PHASE.
fil_crypt_threads_end(): Remove. This would cause the threads to
be terminated way too early.
srv_buf_dump_thread_active, srv_dict_stats_thread_active,
lock_sys->timeout_thread_active, log_scrub_thread_active,
srv_monitor_active, srv_error_monitor_active: Remove a race condition
between startup and shutdown, by setting these in the startup thread
that creates threads, not in each created thread. In this way, once the
flag is cleared, it will remain cleared during shutdown.
srv_n_fil_crypt_threads_started, fil_crypt_threads_event: Declare in
global rather than static scope.
log_scrub_event, srv_log_scrub_thread_active, log_scrub_thread():
Declare in static rather than global scope. Let these be created by
log_init() and freed by log_shutdown().
rotate_thread_t::should_shutdown(): Do not shut down before the
SRV_SHUTDOWN_FLUSH_PHASE.
srv_any_background_threads_are_active(): Remove. These checks now
exist in logs_empty_and_mark_files_at_shutdown().
logs_empty_and_mark_files_at_shutdown(): Shut down the threads in
the proper order. Keep fil_crypt_thread() and log_scrub_thread() alive
until SRV_SHUTDOWN_FLUSH_PHASE, and check that they actually terminate.
10.1 is merged into 10.2 now. Two issues are left to fix:
(1) encryption.innochecksum test
(2) read_page0 vs page_0_crypt_read
(1) innochecksum tool did not compile after merge because
buf_page_is_corrupted uses fil_crypt_t that has been changed.
extra/CMakeLists.txt: Added fil/fil0crypt.cc as dependency
as we need to use fil_crypt_verify_checksum for encrypted pages.
innochecksum.cc: If we think page is encrypted i.e.
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION != 0 we call
fil_crypt_verify_checksum() function to compare calculated
checksum to stored checksum calculated after encryption
(this is stored on different offset i.e.
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION + 4).
If checksum does not match we call normal buf_page_is_corrupted
to compare calculated checksum to stored checksum.
fil0crypt.cc: add #ifdef UNIV_INNOCHECKSUM to be able to compile
this file for innochecksum tool.
(2) read_page0 is not needed and thus removed.
after aborted InnoDB startup
This bug was repeatable by starting MariaDB 10.2 with an
invalid option, such as --innodb-flush-method=foo.
It is not repeatable in MariaDB 10.1 in the same way, but the
problem exists already there.
after aborted InnoDB startup
This bug was repeatable by starting MariaDB 10.2 with an
invalid option, such as --innodb-flush-method=foo.
It is not repeatable in MariaDB 10.1 in the same way, but the
problem exists already there.