after aborted InnoDB startup
This bug was repeatable by starting MariaDB 10.2 with an
invalid option, such as --innodb-flush-method=foo.
It is not repeatable in MariaDB 10.1 in the same way, but the
problem exists already there.
The InnoDB source code contains quite a few references to a closed-source
hot backup tool which was originally called InnoDB Hot Backup (ibbackup)
and later incorporated in MySQL Enterprise Backup.
The open source backup tool XtraBackup uses the full database for recovery.
So, the references to UNIV_HOTBACKUP are only cluttering the source code.
fil_space_t::recv_size: New member: recovered tablespace size in pages;
0 if no size change was read from the redo log,
or if the size change was implemented.
fil_space_set_recv_size(): New function for setting space->recv_size.
innodb_data_file_size_debug: A debug parameter for setting the system
tablespace size in recovery even when the redo log does not contain
any size changes. It is hard to write a small test case that would
cause the system tablespace to be extended at the critical moment.
recv_parse_log_rec(): Note those tablespaces whose size is being changed
by the redo log, by invoking fil_space_set_recv_size().
innobase_init(): Correct an error message, and do not require a larger
innodb_buffer_pool_size when starting up with a smaller innodb_page_size.
innobase_start_or_create_for_mysql(): Allow startup with any initial
size of the ibdata1 file if the autoextend attribute is set. Require
the minimum size of fixed-size system tablespaces to be 640 pages,
not 10 megabytes. Implement innodb_data_file_size_debug.
open_or_create_data_files(): Round the system tablespace size down
to pages, not to full megabytes, (Our test truncates the system
tablespace to more than 800 pages with innodb_page_size=4k.
InnoDB should not imagine that it was truncated to 768 pages
and then overwrite good pages in the tablespace.)
fil_flush_low(): Refactored from fil_flush().
fil_space_extend_must_retry(): Refactored from
fil_extend_space_to_desired_size().
fil_mutex_enter_and_prepare_for_io(): Extend the tablespace if
fil_space_set_recv_size() was called.
The test case has been successfully run with all the
innodb_page_size values 4k, 8k, 16k, 32k, 64k.
Problem was that for encryption we use temporary scratch area for
reading and writing tablespace pages. But if page was not really
decrypted the correct updated page was not moved to scratch area
that was then written. This can happen e.g. for page 0 as it is
newer encrypted even if encryption is enabled and as we write
the contents of old page 0 to tablespace it contained naturally
incorrect space_id that is then later noted and error message
was written. Updated page with correct space_id was lost.
If tablespace is encrypted we use additional
temporary scratch area where pages are read
for decrypting readptr == crypt_io_buffer != io_buffer.
Destination for decryption is a buffer pool block
block->frame == dst == io_buffer that is updated.
Pages that did not require decryption even when
tablespace is marked as encrypted are not copied
instead block->frame is set to src == readptr.
If tablespace was encrypted we copy updated page to
writeptr != io_buffer. This fixes above bug.
For encryption we again use temporary scratch area
writeptr != io_buffer == dst
that is then written to the tablespace
(1) For normal tables src == dst == writeptr
ut_ad(!encrypted && !page_compressed ?
src == dst && dst == writeptr + (i * size):1);
(2) For page compressed tables src == dst == writeptr
ut_ad(page_compressed && !encrypted ?
src == dst && dst == writeptr + (i * size):1);
(3) For encrypted tables src != dst != writeptr
ut_ad(encrypted ?
src != dst && dst != writeptr + (i * size):1);
Replace all exit() calls in InnoDB with abort() [possibly via ut_a()].
Calling exit() in a multi-threaded program is problematic also for
the reason that other threads could see corrupted data structures
while some data structures are being cleaned up by atexit() handlers
or similar.
In the long term, all these calls should be replaced with something
that returns an error all the way up the call stack.
fil_tablespace_iterate(): Call fil_space_destroy_crypt_data() to
invoke mutex_free() for the mutex_create() that was done in
fil_space_read_crypt_data(). Also, remember to free
iter.crypt_io_buffer.
The failure to call mutex_free() would cause sync_latch_meta_destroy()
to access freed memory on shutdown. This affected the IMPORT of
encrypted tablespaces.
fil_space_crypt_cleanup(): Call mutex_free() to pair with
fil_space_crypt_init().
fil_space_destroy_crypt_data(): Call mutex_free() to pair with
fil_space_create_crypt_data() and fil_space_read_crypt_data().
fil_crypt_threads_cleanup(): Call mutex_free() to pair with
fil_crypt_threads_init().
fil_space_free_low(): Invoke fil_space_destroy_crypt_data().
fil_close(): Invoke fil_space_crypt_cleanup(), just like
fil_init() invoked fil_space_crypt_init().
Datafile::shutdown(): Set m_crypt_info=NULL without dereferencing
the pointer. The object will be freed along with the fil_space_t
in fil_space_free_low().
Remove some unnecessary conditions (ut_free(NULL) is OK).
srv_shutdown_all_bg_threads(): Shut down the encryption threads
by calling fil_crypt_threads_end().
srv_shutdown_bg_undo_sources(): Do not prematurely call
fil_crypt_threads_end(). Many pages can still be written by
change buffer merge, rollback of incomplete transactions, and
purge, especially in slow shutdown (innodb_fast_shutdown=0).
innobase_shutdown_for_mysql(): Call fil_crypt_threads_cleanup()
also when innodb_read_only=1, because the threads will have been
created also in that case.
sync_check_close(): Re-enable the invocation of sync_latch_meta_destroy()
to free the mutex list.
Make some global fil_crypt_ variables static.
fil_close(): Call mutex_free(&fil_system->mutex) also in InnoDB, not
only in XtraDB. In InnoDB, sync_close() was called before fil_close().
innobase_shutdown_for_mysql(): Call fil_close() before sync_close(),
similar to XtraDB shutdown.
fil_space_crypt_cleanup(): Call mutex_free() to pair with
fil_space_crypt_init().
fil_crypt_threads_cleanup(): Call mutex_free() to pair with
fil_crypt_threads_init().
MySQL 5.7 supports only one shared temporary tablespace.
MariaDB 10.2 does not support any other shared InnoDB tablespaces than
the two predefined tablespaces: the persistent InnoDB system tablespace
(default file name ibdata1) and the temporary tablespace
(default file name ibtmp1).
InnoDB is unnecessarily allocating a tablespace ID for the predefined
temporary tablespace on every startup, and it is in several places
testing whether a tablespace ID matches this dynamically generated ID.
We should use a compile-time constant to reduce code size and to avoid
unnecessary updates to the DICT_HDR page at every startup.
Using a hard-coded tablespace ID will should make it easier to remove the
TEMPORARY flag from FSP_SPACE_FLAGS in MDEV-11202.
Reduce the number of calls to encryption_get_key_get_latest_version
when doing key rotation with two different methods:
(1) We need to fetch key information when tablespace not yet
have a encryption information, invalid keys are handled now
differently (see below). There was extra call to detect
if key_id is not found on key rotation.
(2) If key_id is not found from encryption plugin, do not
try fetching new key_version for it as it will fail anyway.
We store return value from encryption_get_key_get_latest_version
call and if it returns ENCRYPTION_KEY_VERSION_INVALID there
is no need to call it again.
WL#7682 in MySQL 5.7 introduced the possibility to create light-weight
temporary tables in InnoDB. These are called 'intrinsic temporary tables'
in InnoDB, and in MySQL 5.7, they can be created by the optimizer for
sorting or buffering data in query processing.
In MariaDB 10.2, the optimizer temporary tables cannot be created in
InnoDB, so we should remove the dead code and related data structures.
MySQL 5.7 introduced WL#7943: InnoDB: Implement Information_Schema.Files
to provide a long-term alternative for accessing tablespace metadata.
The INFORMATION_SCHEMA.INNODB_* views are considered internal interfaces
that are subject to change or removal between releases. So, users should
refer to I_S.FILES instead of I_S.INNODB_SYS_TABLESPACES to fetch metadata
about CREATE TABLESPACE.
Because MariaDB 10.2 does not support CREATE TABLESPACE or
CREATE TABLE…TABLESPACE for InnoDB, it does not make sense to support
I_S.FILES either. So, let MariaDB 10.2 omit the code that was added in
MySQL 5.7. After this change, I_S.FILES will report the empty result,
unless some other storage engine in MariaDB 10.2 implements the interface.
(The I_S.FILES interface was originally created for the NDB Cluster.)
Two problems:
(1) When pushing warning to sql-layer we need to check that thd != NULL
to avoid NULL-pointer reference.
(2) At tablespace key rotation if used key_id is not found from
encryption plugin tablespace should not be rotated.
Analysis: By design InnoDB was reading first page of every .ibd file
at startup to find out is tablespace encrypted or not. This is
because tablespace could have been encrypted always, not
encrypted newer or encrypted based on configuration and this
information can be find realible only from first page of .ibd file.
Fix: Do not read first page of every .ibd file at startup. Instead
whenever tablespace is first time accedded we will read the first
page to find necessary information about tablespace encryption
status.
TODO: Add support for SYS_TABLEOPTIONS where all table options
encryption information included will be stored.
Contains also:
MDEV-10549 mysqld: sql/handler.cc:2692: int handler::ha_index_first(uchar*): Assertion `table_share->tmp_table != NO_TMP_TABLE || m_lock_type != 2' failed. (branch bb-10.2-jan)
Unlike MySQL, InnoDB still uses THR_LOCK in MariaDB
MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
enable tests that were fixed in MDEV-10549
MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
fix main.innodb_mysql_sync - re-enable online alter for partitioned innodb tables
Contains also
MDEV-10547: Test multi_update_innodb fails with InnoDB 5.7
The failure happened because 5.7 has changed the signature of
the bool handler::primary_key_is_clustered() const
virtual function ("const" was added). InnoDB was using the old
signature which caused the function not to be used.
MDEV-10550: Parallel replication lock waits/deadlock handling does not work with InnoDB 5.7
Fixed mutexing problem on lock_trx_handle_wait. Note that
rpl_parallel and rpl_optimistic_parallel tests still
fail.
MDEV-10156 : Group commit tests fail on 10.2 InnoDB (branch bb-10.2-jan)
Reason: incorrect merge
MDEV-10550: Parallel replication can't sync with master in InnoDB 5.7 (branch bb-10.2-jan)
Reason: incorrect merge
MySQL 5.6 do not work with MariaDB 10.1
Analysis: Problem is that tablespace flags bit DATA_DIR
is on different position on MySQL 5.6 compared to
MariaDB 10.1.
Fix: If we detect that there is difference between dictionary
flags and tablespace flags we remove DATA_DIR flag and compare
again. Remote tablespace is tried to locate even in case
when DATA_DIR flag is not set.
Analysis: When pages in doublewrite buffer are analyzed compressed
pages do not have correct checksum.
Fix: Decompress page before checksum is compared. If decompression
fails we still check checksum and corrupted pages are found.
If decompression succeeds, page now contains the original
checksum.
Problem was that link file (.isl) is also opened using O_DIRECT
mode and if this fails the whole create table fails on internal
error.
Fixed by not using O_DIRECT on link files as they are used only
on create table and startup and do not contain real data.
O_DIRECT failures are successfully ignored for data files
if O_DIRECT is not supported by file system on used
data directory.
Make sure that on decrypt we do not try to reference
NULL pointer and if page contains undefined
FIL_PAGE_FILE_FLUSH_LSN field on when page is not
the first page or page is not in system tablespace,
clear it.
There was two problems. Firstly, if page in ibuf is encrypted but
decrypt failed we should not allow InnoDB to start because
this means that system tablespace is encrypted and not usable.
Secondly, if page decrypt is detected we should return false
from buf_page_decrypt_after_read.