With a big buffer pool that contains many data pages,
DISCARD TABLESPACE took a long time, because it would scan the
entire buffer pool to remove any pages that belong to the tablespace.
With a large buffer pool, this would take a lot of time, especially
when the table-to-discard is empty.
The minimum amount of work that DISCARD TABLESPACE must do is to
remove the pages of the to-be-discarded table from the
buf_pool->flush_list because any writes to the data file must be
prevented before the file is deleted.
If DISCARD TABLESPACE does not evict the pages from the buffer pool,
then IMPORT TABLESPACE must do it, because we must prevent pre-DISCARD,
not-yet-evicted pages from being mistaken for pages of the imported
tablespace.
It would not be a useful fix to simply move the buffer pool scan to
the IMPORT TABLESPACE step. What we can do is to actively evict those
pages that could be mistaken for imported pages. In this way, when
importing a small table into a big buffer pool, the import should
still run relatively fast.
Import is bypassing the buffer pool when reading pages for the
adjustment phase. In the adjustment phase, if a page exists in
the buffer pool, we could replace it with the page from the imported
file. Unfortunately I did not get this to work properly, so instead
we will simply evict any matching page from the buffer pool.
buf_page_get_gen(): Implement BUF_EVICT_IF_IN_POOL, a new mode
where the requested page will be evicted if it is found. There
must be no unwritten changes for the page.
buf_remove_t: Remove. Instead, use trx!=NULL to signify that a write
to file is desired, and use a separate parameter bool drop_ahi.
buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Replace buf_remove_t.
buf_LRU_remove_pages(), buf_LRU_remove_all_pages(): Remove.
PageConverter::m_mtr: A dummy mini-transaction buffer
PageConverter::PageConverter(): Complete the member initialization list.
PageConverter::operator()(): Evict any 'shadow' pages from the
buffer pool so that pre-existing (garbage) pages cannot be mistaken
for pages that exist in the being-imported file.
row_discard_tablespace(): Remove a bogus comment that seems to
refer to IMPORT TABLESPACE, not DISCARD TABLESPACE.
Problem is that page 0 and its possible enrryption information
is not read for undo tablespaces.
fil_crypt_get_latest_key_version(): Do not send event to
encryption threads if event does not yet exists. Seen
on regression testing.
fil_read_first_page: Add new parameter does page belong to
undo tablespace and if it does, we do not read FSP_HEADER.
srv_undo_tablespace_open : Read first page of the tablespace
to get crypt_data if it exists and pass it to fil_space_create.
Tested using innodb_encryption with combinations with
innodb-undo-tablespaces.
dict_table_t::thd: Remove. This was only used by btr_root_block_get()
for reporting decryption failures, and it was only assigned by
ha_innobase::open(), and never cleared. This could mean that if a
connection is closed, the pointer would become stale, and the server
could crash while trying to report the error. It could also mean
that an error is being reported to the wrong client. It is better
to use current_thd in this case, even though it could mean that if
the code is invoked from an InnoDB background operation, there would
be no connection to which to send the error message.
Remove dict_table_t::crypt_data and dict_table_t::page_0_read.
These fields were never read.
fil_open_single_table_tablespace(): Remove the parameter "table".
Problem appears to be that the function fsp_flags_try_adjust()
is being unconditionally invoked on every .ibd file on startup.
Based on performance investigation also the top function
fsp_header_get_crypt_offset() needs to addressed.
Ported implementation of fsp_header_get_encryption_offset()
function from 10.2 to fsp_header_get_crypt_offset().
Introduced a new function fil_crypt_read_crypt_data()
to read page 0 if it is not yet read.
fil_crypt_find_space_to_rotate(): Now that page 0 for every .ibd
file is not read on startup we need to check has page 0 read
from space that we investigate for key rotation, if it is not read
we read it.
fil_space_crypt_get_status(): Now that page 0 for every .ibd
file is not read on startup here also we need to read page 0
if it is not yet read it. This is needed
as tests use IS query to wait until background encryption
or decryption has finished and this function is used to
produce results.
fil_crypt_thread(): Add is_stopping condition for tablespace
so that we do not rotate pages if usage of tablespace should
be stopped. This was needed for failure seen on regression
testing.
fil_space_create: Remove page_0_crypt_read and extra
unnecessary info output.
fil_open_single_table_tablespace(): We call fsp_flags_try_adjust
only when when no errors has happened and server was not started
on read only mode and tablespace validation was requested or
flags contain other table options except low order bits to
FSP_FLAGS_POS_PAGE_SSIZE position.
fil_space_t::page_0_crypt_read removed.
Added test case innodb-first-page-read to test startup when
encryption is on and when encryption is off to check that not
for all tables page 0 is read on startup.
Problem was that FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION field that for
encrypted pages even in system datafiles should contain key_version
except very first page (0:0) is after encryption overwritten with
flush lsn.
Ported WL#7990 Repurpose FIL_PAGE_FLUSH_LSN to 10.1
The field FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION is consulted during
InnoDB startup.
At startup, InnoDB reads the FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
from the first page of each file in the InnoDB system tablespace.
If there are multiple files, the minimum and maximum LSN can differ.
These numbers are passed to InnoDB startup.
Having the number in other files than the first file of the InnoDB
system tablespace is not providing much additional value. It is
conflicting with other use of the field, such as on InnoDB R-tree
index pages and encryption key_version.
This worklog will stop writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION to
other files than the first file of the InnoDB system tablespace
(page number 0:0) when system tablespace is encrypted. If tablespace
is not encrypted we continue writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
to all first pages of system tablespace to avoid unnecessary
warnings on downgrade.
open_or_create_data_files(): pass only one flushed_lsn parameter
xb_load_tablespaces(): pass only one flushed_lsn parameter.
buf_page_create(): Improve comment about where
FIL_PAGE_FIL_FLUSH_LSN_OR_KEY_VERSION is set.
fil_write_flushed_lsn(): A new function, merged from
fil_write_lsn_and_arch_no_to_file() and
fil_write_flushed_lsn_to_data_files().
Only write to the first page of the system tablespace (page 0:0)
if tablespace is encrypted, or write all first pages of system
tablespace and invoke fil_flush_file_spaces(FIL_TYPE_TABLESPACE)
afterwards.
fil_read_first_page(): read flush_lsn and crypt_data only from
first datafile.
fil_open_single_table_tablespace(): Remove output of LSN, because it
was only valid for the system tablespace and the undo tablespaces, not
user tablespaces.
fil_validate_single_table_tablespace(): Remove output of LSN.
checkpoint_now_set(): Use fil_write_flushed_lsn and output
a error if operation fails.
Remove lsn variable from fsp_open_info.
recv_recovery_from_checkpoint_start(): Remove unnecessary second
flush_lsn parameter.
log_empty_and_mark_files_at_shutdown(): Use fil_writte_flushed_lsn
and output error if it fails.
open_or_create_data_files(): Pass only one flushed_lsn variable.
Significantly reduce the amount of InnoDB, XtraDB and Mariabackup
code changes by defining pfs_os_file_t as something that is
transparently compatible with os_file_t.
This fixes a regression caused by MDEV-12428.
When we introduced a variant of fil_space_acquire() that could
increment space->n_pending_ops after space->stop_new_ops was set,
the logic of fil_check_pending_operations() was broken.
fil_space_t::n_pending_ios: A new field to track read or write
access from the buffer pool routines immediately before a block
write or after a block read in the file system.
fil_space_acquire_for_io(), fil_space_release_for_io(): Similar
to fil_space_acquire_silent() and fil_space_release(), but
modify fil_space_t::n_pending_ios instead of fil_space_t::n_pending_ops.
Adjust a number of places accordingly, and remove some redundant
tablespace lookups.
The following parts of this fix differ from the 10.2 version of this fix:
buf_page_get_corrupt(): Add a tablespace parameter.
In 10.2, we already had a two-phase process of freeing fil_space objects
(first, fil_space_detach(), then release fil_system->mutex, and finally
free the fil_space and fil_node objects).
fil_space_free_and_mutex_exit(): Renamed from fil_space_free().
Detach the tablespace from the fil_system cache, release the
fil_system->mutex, and then wait for space->n_pending_ios to reach 0,
to avoid accessing freed data in a concurrent thread.
During the wait, future calls to fil_space_acquire_for_io() will
not find this tablespace, and the count can only be decremented to 0,
at which point it is safe to free the objects.
fil_node_free_part1(), fil_node_free_part2(): Refactored from
fil_node_free().
These changes are comparable to Percona's modifications in innodb in the
Percona Xtrabackup repository.
- If functions are used in backup as well as in innodb, make them non-static.
- Define IS_XTRABACKUP() macro for special handling of innodb running
inside backup.
- Extend some functions for backup.
fil_space_for_table_exists_in_mem() gets additional parameter
'remove_from_data_dict_if_does_not_exist', for partial backups
fil_load_single_table_tablespaces() gets an optional parameter predicate
which tells whether to load tablespace based on database or table name,
also for partial backups.
srv_undo_tablespaces_init() gets an optional parameter 'backup_mode'
- Allow single redo log file (for backup "prepare")
- Do not read doublewrite buffer pages in backup, they are outdated
- Add function fil_remove_invalid_table_from_data_dict(), to remove non-existing
tables from data dictionary in case of partial backups.
- On Windows, fix file share modes when opening tablespaces,
to allow mariabackup to read tablespaces while server is online.
- Avoid access to THDVARs in backup, because innodb plugin is not loaded,
and THDVAR would crash in this case.
Problem was that bpage was referenced after it was already freed
from LRU. Fixed by adding a new variable encrypted that is
passed down to buf_page_check_corrupt() and used in
buf_page_get_gen() to stop processing page read.
This patch should also address following test failures and
bugs:
MDEV-12419: IMPORT should not look up tablespace in
PageConverter::validate(). This is now removed.
MDEV-10099: encryption.innodb_onlinealter_encryption fails
sporadically in buildbot
MDEV-11420: encryption.innodb_encryption-page-compression
failed in buildbot
MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8
Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
and replaced these with dict_table_t::file_unreadable. Table
ibd file is missing if fil_get_space(space_id) returns NULL
and encrypted if not. Removed dict_table_t::is_corrupted field.
Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(),
buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().
Added test cases when enrypted page could be read while doing
redo log crash recovery. Also added test case for row compressed
blobs.
btr_cur_open_at_index_side_func(),
btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
NULL.
buf_page_get_zip(): Issue error if page read fails.
buf_page_get_gen(): Use dberr_t for error detection and
do not reference bpage after we hare freed it.
buf_mark_space_corrupt(): remove bpage from LRU also when
it is encrypted.
buf_page_check_corrupt(): @return DB_SUCCESS if page has
been read and is not corrupted,
DB_PAGE_CORRUPTED if page based on checksum check is corrupted,
DB_DECRYPTION_FAILED if page post encryption checksum matches but
after decryption normal page checksum does not match. In read
case only DB_SUCCESS is possible.
buf_page_io_complete(): use dberr_t for error handling.
buf_flush_write_block_low(),
buf_read_ahead_random(),
buf_read_page_async(),
buf_read_ahead_linear(),
buf_read_ibuf_merge_pages(),
buf_read_recv_pages(),
fil_aio_wait():
Issue error if page read fails.
btr_pcur_move_to_next_page(): Do not reference page if it is
NULL.
Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
that will return true if tablespace exists and pages read from
tablespace are not corrupted or page decryption failed.
Removed buf_page_t::key_version. After page decryption the
key version is not removed from page frame. For unencrypted
pages, old key_version is removed at buf_page_encrypt_before_write()
dict_stats_update_transient_for_index(),
dict_stats_update_transient()
Do not continue if table decryption failed or table
is corrupted.
dict0stats.cc: Introduced a dict_stats_report_error function
to avoid code duplication.
fil_parse_write_crypt_data():
Check that key read from redo log entry is found from
encryption plugin and if it is not, refuse to start.
PageConverter::validate(): Removed access to fil_space_t as
tablespace is not available during import.
Fixed error code on innodb.innodb test.
Merged test cased innodb-bad-key-change5 and innodb-bad-key-shutdown
to innodb-bad-key-change2. Removed innodb-bad-key-change5 test.
Decreased unnecessary complexity on some long lasting tests.
Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
fil_get_first_space(), fil_get_next_space(),
fil_get_first_space_safe(), fil_get_next_space_safe()
functions.
fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
accessed from row compressed tables. Fixed out of page frame
bug for row compressed tables in
fil_space_verify_crypt_checksum() found using ASAN. Incorrect
function was called for compressed table.
Added new tests for discard, rename table and drop (we should allow them
even when page decryption fails). Alter table rename is not allowed.
Added test for restart with innodb-force-recovery=1 when page read on
redo-recovery cant be decrypted. Added test for corrupted table where
both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION is corrupted.
Adjusted the test case innodb_bug14147491 so that it does not anymore
expect crash. Instead table is just mostly not usable.
fil0fil.h: fil_space_acquire_low is not visible function
and fil_space_acquire and fil_space_acquire_silent are
inline functions. FilSpace class uses fil_space_acquire_low
directly.
recv_apply_hashed_log_recs() does not return anything.
buf_flush_write_block_low(): Acquire the tablespace reference once,
and pass it to lower-level functions. This is only a start; further
calls may be removed later.
Also, some MDEV-11738/MDEV-11581 post-push fixes.
In MariaDB 10.1, there is no fil_space_t::is_being_truncated field,
and the predicates fil_space_t::stop_new_ops and fil_space_t::is_stopping()
are interchangeable. I requested the fil_space_t::is_stopping() to be added
in the review, but some added checks for fil_space_t::stop_new_ops were
not replaced with calls to fil_space_t::is_stopping().
buf_page_decrypt_after_read(): In this low-level I/O operation, we must
look up the tablespace if it exists, even though future I/O operations
have been blocked on it due to a pending DDL operation, such as DROP TABLE
or TRUNCATE TABLE or other table-rebuilding operations (ALTER, OPTIMIZE).
Pass a parameter to fil_space_acquire_low() telling that we are performing
a low-level I/O operation and the fil_space_t::is_stopping() status should
be ignored.
MDEV-11581: Mariadb starts InnoDB encryption threads
when key has not changed or data scrubbing turned off
Background: Key rotation is based on background threads
(innodb-encryption-threads) periodically going through
all tablespaces on fil_system. For each tablespace
current used key version is compared to max key age
(innodb-encryption-rotate-key-age). This process
naturally takes CPU. Similarly, in same time need for
scrubbing is investigated. Currently, key rotation
is fully supported on Amazon AWS key management plugin
only but InnoDB does not have knowledge what key
management plugin is used.
This patch re-purposes innodb-encryption-rotate-key-age=0
to disable key rotation and background data scrubbing.
All new tables are added to special list for key rotation
and key rotation is based on sending a event to
background encryption threads instead of using periodic
checking (i.e. timeout).
fil0fil.cc: Added functions fil_space_acquire_low()
to acquire a tablespace when it could be dropped concurrently.
This function is used from fil_space_acquire() or
fil_space_acquire_silent() that will not print
any messages if we try to acquire space that does not exist.
fil_space_release() to release a acquired tablespace.
fil_space_next() to iterate tablespaces in fil_system
using fil_space_acquire() and fil_space_release().
Similarly, fil_space_keyrotation_next() to iterate new
list fil_system->rotation_list where new tables.
are added if key rotation is disabled.
Removed unnecessary functions fil_get_first_space_safe()
fil_get_next_space_safe()
fil_node_open_file(): After page 0 is read read also
crypt_info if it is not yet read.
btr_scrub_lock_dict_func()
buf_page_check_corrupt()
buf_page_encrypt_before_write()
buf_merge_or_delete_for_page()
lock_print_info_all_transactions()
row_fts_psort_info_init()
row_truncate_table_for_mysql()
row_drop_table_for_mysql()
Use fil_space_acquire()/release() to access fil_space_t.
buf_page_decrypt_after_read():
Use fil_space_get_crypt_data() because at this point
we might not yet have read page 0.
fil0crypt.cc/fil0fil.h: Lot of changes. Pass fil_space_t* directly
to functions needing it and store fil_space_t* to rotation state.
Use fil_space_acquire()/release() when iterating tablespaces
and removed unnecessary is_closing from fil_crypt_t. Use
fil_space_t::is_stopping() to detect when access to
tablespace should be stopped. Removed unnecessary
fil_space_get_crypt_data().
fil_space_create(): Inform key rotation that there could
be something to do if key rotation is disabled and new
table with encryption enabled is created.
Remove unnecessary functions fil_get_first_space_safe()
and fil_get_next_space_safe(). fil_space_acquire()
and fil_space_release() are used instead. Moved
fil_space_get_crypt_data() and fil_space_set_crypt_data()
to fil0crypt.cc.
fsp_header_init(): Acquire fil_space_t*, write crypt_data
and release space.
check_table_options()
Renamed FIL_SPACE_ENCRYPTION_* TO FIL_ENCRYPTION_*
i_s.cc: Added ROTATING_OR_FLUSHING field to
information_schema.innodb_tablespace_encryption
to show current status of key rotation.
This is a non-functional change.
On a related note, the calls fil_system_enter() and fil_system_exit()
are often used in an unsafe manner. The fix of MDEV-11738 should
introduce fil_space_acquire() and remove potential race conditions.
The function trx_purge_stop() was calling os_event_reset(purge_sys->event)
before calling rw_lock_x_lock(&purge_sys->latch). The os_event_set()
call in srv_purge_coordinator_suspend() is protected by that X-latch.
It would seem a good idea to consistently protect both os_event_set()
and os_event_reset() calls with a common mutex or rw-lock in those
cases where os_event_set() and os_event_reset() are used
like condition variables, tied to changes of shared state.
For each os_event_t, we try to document the mutex or rw-lock that is
being used. For some events, frequent calls to os_event_set() seem to
try to avoid hangs. Some events are never waited for infinitely, only
timed waits, and os_event_set() is used for early termination of these
waits.
os_aio_simulated_put_read_threads_to_sleep(): Define as a null macro
on other systems than Windows. TODO: remove this altogether and disable
innodb_use_native_aio on Windows.
os_aio_segment_wait_events[]: Initialize only if innodb_use_native_aio=0.
MariaDB 10.0/MySQL 5.6 using innodb-page-size!=16K
The storage format of FSP_SPACE_FLAGS was accidentally broken
already in MariaDB 10.1.0. This fix is bringing the format in
line with other MySQL and MariaDB release series.
Please refer to the comments that were added to fsp0fsp.h
for details.
This is an INCOMPATIBLE CHANGE that affects users of
page_compression and non-default innodb_page_size. Upgrading
to this release will correct the flags in the data files.
If you want to downgrade to earlier MariaDB 10.1.x, please refer
to the test innodb.101_compatibility how to reset the
FSP_SPACE_FLAGS in the files.
NOTE: MariaDB 10.1.0 to 10.1.20 can misinterpret
uncompressed data files with innodb_page_size=4k or 64k as
compressed innodb_page_size=16k files, and then probably fail
when trying to access the pages. See the comments in the
function fsp_flags_convert_from_101() for detailed analysis.
Move PAGE_COMPRESSION to FSP_SPACE_FLAGS bit position 16.
In this way, compressed innodb_page_size=16k tablespaces will not
be mistaken for uncompressed ones by MariaDB 10.1.0 to 10.1.20.
Derive PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR from the
dict_table_t::flags when the table is available, in
fil_space_for_table_exists_in_mem() or fil_open_single_table_tablespace().
During crash recovery, fil_load_single_table_tablespace() will use
innodb_compression_level for the PAGE_COMPRESSION_LEVEL.
FSP_FLAGS_MEM_MASK: A bitmap of the memory-only fil_space_t::flags
that are not to be written to FSP_SPACE_FLAGS. Currently, these will
include PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR.
Introduce the macro FSP_FLAGS_PAGE_SSIZE(). We only support
one innodb_page_size for the whole instance.
When creating a dummy tablespace for the redo log, use
fil_space_t::flags=0. The flags are never written to the redo log files.
Remove many FSP_FLAGS_SET_ macros.
dict_tf_verify_flags(): Remove. This is basically only duplicating
the logic of dict_tf_to_fsp_flags(), used in a debug assertion.
fil_space_t::mark: Remove. This flag was not used for anything.
fil_space_for_table_exists_in_mem(): Remove the unnecessary parameter
mark_space, and add a parameter for table flags. Check that
fil_space_t::flags match the table flags, and adjust the (memory-only)
flags based on the table flags.
fil_node_open_file(): Remove some redundant or unreachable conditions,
do not use stderr for output, and avoid unnecessary server aborts.
fil_user_tablespace_restore_page(): Convert the flags, so that the
correct page_size will be used when restoring a page from the
doublewrite buffer.
fil_space_get_page_compressed(), fsp_flags_is_page_compressed(): Remove.
It suffices to have fil_space_is_page_compressed().
FSP_FLAGS_WIDTH_DATA_DIR, FSP_FLAGS_WIDTH_PAGE_COMPRESSION_LEVEL,
FSP_FLAGS_WIDTH_ATOMIC_WRITES: Remove, because these flags do not
exist in the FSP_SPACE_FLAGS but only in memory.
fsp_flags_try_adjust(): New function, to adjust the FSP_SPACE_FLAGS
in page 0. Called by fil_open_single_table_tablespace(),
fil_space_for_table_exists_in_mem(), innobase_start_or_create_for_mysql()
except if --innodb-read-only is active.
fsp_flags_is_valid(ulint): Reimplement from the scratch, with
accurate comments. Do not display any details of detected
inconsistencies, because the output could be confusing when
dealing with MariaDB 10.1.x data files.
fsp_flags_convert_from_101(ulint): Convert flags from buggy
MariaDB 10.1.x format, or return ULINT_UNDEFINED if the flags
cannot be in MariaDB 10.1.x format.
fsp_flags_match(): Check the flags when probing files.
Implemented based on fsp_flags_is_valid()
and fsp_flags_convert_from_101().
dict_check_tablespaces_and_store_max_id(): Do not access the
page after committing the mini-transaction.
IMPORT TABLESPACE fixes:
AbstractCallback::init(): Convert the flags.
FetchIndexRootPages::operator(): Check that the tablespace flags match the
table flags. Do not attempt to convert tablespace flags to table flags,
because the conversion would necessarily be lossy.
PageConverter::update_header(): Write back the correct flags.
This takes care of the flags in IMPORT TABLESPACE.
fil_space_t::recv_size: New member: recovered tablespace size in pages;
0 if no size change was read from the redo log,
or if the size change was implemented.
fil_space_set_recv_size(): New function for setting space->recv_size.
innodb_data_file_size_debug: A debug parameter for setting the system
tablespace size in recovery even when the redo log does not contain
any size changes. It is hard to write a small test case that would
cause the system tablespace to be extended at the critical moment.
recv_parse_log_rec(): Note those tablespaces whose size is being changed
by the redo log, by invoking fil_space_set_recv_size().
innobase_init(): Correct an error message, and do not require a larger
innodb_buffer_pool_size when starting up with a smaller innodb_page_size.
innobase_start_or_create_for_mysql(): Allow startup with any initial
size of the ibdata1 file if the autoextend attribute is set. Require
the minimum size of fixed-size system tablespaces to be 640 pages,
not 10 megabytes. Implement innodb_data_file_size_debug.
open_or_create_data_files(): Round the system tablespace size down
to pages, not to full megabytes, (Our test truncates the system
tablespace to more than 800 pages with innodb_page_size=4k.
InnoDB should not imagine that it was truncated to 768 pages
and then overwrite good pages in the tablespace.)
fil_flush_low(): Refactored from fil_flush().
fil_space_extend_must_retry(): Refactored from
fil_extend_space_to_desired_size().
fil_mutex_enter_and_prepare_for_io(): Extend the tablespace if
fil_space_set_recv_size() was called.
The test case has been successfully run with all the
innodb_page_size values 4k, 8k, 16k, 32k, 64k.
The changes are deliberately kept minimal
- some functions are made global instead of static (they will be used in
xtrabackup later on)
- functions got additional parameter, deliberately unused for now :
fil_load_single_tablespaces
srv_undo_tablespaces_init
- Global variables added, also unused for now :
srv_archive_recovery
srv_archive_recovery_limit_lsn
srv_apply_log_only
srv_backup_mode
srv_close_files
- To make xtrabackup link with sql.lib on Windows, added some missing
source files to sql.lib
- Fixed os_thread_ret_t to be DWORD on Windows
Analysis: By design InnoDB was reading first page of every .ibd file
at startup to find out is tablespace encrypted or not. This is
because tablespace could have been encrypted always, not
encrypted newer or encrypted based on configuration and this
information can be find realible only from first page of .ibd file.
Fix: Do not read first page of every .ibd file at startup. Instead
whenever tablespace is first time accedded we will read the first
page to find necessary information about tablespace encryption
status.
TODO: Add support for SYS_TABLEOPTIONS where all table options
encryption information included will be stored.
Analysis: Problem was that in fil_read_first_page we do find that
table has encryption information and that encryption service
or used key_id is not available. But, then we just printed
fatal error message that causes above assertion.
Fix: When we open single table tablespace if it has encryption
information (crypt_data) store this crypt data to the table
structure. When we open a table and we find out that tablespace
is not available, check has table a encryption information
and from there is encryption service or used key_id is not available.
If it is, add additional warning for SQL-layer.
Analysis: Problem was that when a new tablespace is created a default
encryption info is also created and stored to the tablespace. Later a
new encryption information was created with correct key_id but that
does not affect on IV.
Fix: Push encryption mode and key_id to lower levels and create
correct encryption info when a new tablespace is created.
This fix does not contain test case because, currently incorrect
encryption key causes page corruption and a lot of error messages
to error log causing mtr to fail.
Analysis: Problem is that both encrypted tables and compressed tables use
FIL header offset FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION to store
required metadata. Furhermore, for only compressed tables currently
code skips compression.
Fixes:
- Only encrypted pages store key_version to FIL header offset FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION,
no need to fix
- Only compressed pages store compression algorithm to FIL header offset FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION,
no need to fix as they have different page type FIL_PAGE_PAGE_COMPRESSED
- Compressed and encrypted pages now use a new page type FIL_PAGE_PAGE_COMPRESSED_ENCRYPTED and
key_version is stored on FIL header offset FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION and compression
method is stored after FIL header similar way as compressed size, so that first
FIL_PAGE_COMPRESSED_SIZE is stored followed by FIL_PAGE_COMPRESSION_METHOD
- Fix buf_page_encrypt_before_write function to really compress pages if compression is enabled
- Fix buf_page_decrypt_after_read function to really decompress pages if compression is used
- Small style fixes
Analysis: Problem was that we did try to read from tablespace
that was being dropped.
Fixed by introducing a new function to find a tablespace only
if it is not being dropped currently and adding this check
before trying to read pages from tablespace.
Analysis: At fil_spage_get_space there is small change that space
is found from tablespace list but we have not yet created node
for it (and added it to space->chain) and as we hold fil_system
mutex here fil_node_create can't continue.
Fixed by allowing UT_LIST_GET_LEN(space->chain) == 0|| 1 and
introducint two new functions that access filespace list
and before returning space check that node is also created.
Step 2:
-- Introduce temporal memory array to buffer pool where to allocate
temporary memory for encryption/compression
-- Rename PAGE_ENCRYPTION -> ENCRYPTION
-- Rename PAGE_ENCRYPTION_KEY -> ENCRYPTION_KEY
-- Rename innodb_default_page_encryption_key -> innodb_default_encryption_key
-- Allow enable/disable encryption for tables by changing
ENCRYPTION to enum having values DEFAULT, ON, OFF
-- In create table store crypt_data if ENCRYPTION is ON or OFF
-- Do not crypt tablespaces having ENCRYPTION=OFF
-- Store encryption mode to crypt_data and redo-log
Step 1:
-- Remove page encryption from dictionary (per table
encryption will be handled by storing crypt_data to page 0)
-- Remove encryption/compression from os0file and all functions
before that (compression will be added to buf0buf.cc)
-- Use same CRYPT_SCHEME_1 for all encryption methods
-- Do some code cleanups to confort InnoDB coding style