Commit graph

284 commits

Author SHA1 Message Date
Marko Mäkelä
cd694d76ce Merge 10.0 into 10.1 2017-09-06 15:32:56 +03:00
Marko Mäkelä
38ca9be4de MDEV-13684 InnoDB race condition between fil_crypt_thread and btr_scrub_init
There is a race condition in InnoDB startup. A number of
fil_crypt_thread are created by fil_crypt_threads_init(). These threads
may call btr_scrub_complete_space() before btr_scrub_init() was called.
Those too early calls would be accessing an uninitialized scrub_stat_mutex.

innobase_start_or_create_for_mysql(): Invoke btr_scrub_init() before
fil_crypt_threads_init().

fil_crypt_complete_rotate_space(): Only invoke btr_scrub_complete_space()
if scrubbing is enabled. There is no need to update the statistics if
it is not enabled.
2017-08-31 11:08:43 +03:00
Jan Lindström
352d27ce36 MDEV-13557: Startup failure, unable to decrypt ibdata1
Fixes also MDEV-13488: InnoDB writes CRYPT_INFO even though
encryption is not enabled.

Problem was that we created encryption metadata (crypt_data) for
system tablespace even when no encryption was enabled and too early.
System tablespace can be encrypted only using key rotation.

Test innodb-key-rotation-disable, innodb_encryption, innodb_lotoftables
require adjustment because INFORMATION_SCHEMA INNODB_TABLESPACES_ENCRYPTION
contain row only if tablespace really has encryption metadata.

fil_crypt_set_thread_cnt: Send message to background encryption threads
if they exits when they are ready. This is required to find tablespaces
requiring key rotation if no other changes happen.

fil_crypt_find_space_to_rotate: Decrease the amount of time waiting
when nothing happens to better enable key rotation on startup.

fsp_header_init: Write encryption metadata to page 0 only if tablespace is
encrypted or encryption is disabled by table option.

i_s_dict_fill_tablespaces_encryption : Skip tablespaces that do not
contain encryption metadata. This is required to avoid too early
wait condition trigger in encrypted -> unencrypted state transfer.

open_or_create_data_files: Do not create encryption metadata
by default to system tablespace.
2017-08-29 14:23:34 +03:00
Jan Lindström
61096ff214 MDEV-13591: InnoDB: Database page corruption on disk or a failed file read and assertion failure
Problem is that page 0 and its possible enrryption information
is not read for undo tablespaces.

fil_crypt_get_latest_key_version(): Do not send event to
encryption threads if event does not yet exists. Seen
on regression testing.

fil_read_first_page: Add new parameter does page belong to
undo tablespace and if it does, we do not read FSP_HEADER.

srv_undo_tablespace_open : Read first page of the tablespace
to get crypt_data if it exists and pass it to fil_space_create.

Tested using innodb_encryption with combinations with
innodb-undo-tablespaces.
2017-08-28 09:49:30 +03:00
Marko Mäkelä
97f9d3c080 MDEV-13167 InnoDB key rotation is not skipping unused pages
In key rotation, we must initialize unallocated but previously
initialized pages, so that if encryption is enabled on a table,
all clear-text data for the page will eventually be overwritten.
But we should not rotate keys on pages that were never allocated
after the data file was created.

According to the latching order rules, after acquiring the
tablespace latch, no page latches of previously allocated user pages
may be acquired. So, key rotation should check the page allocation
status after acquiring the page latch, not before. But, the latching
order rules also prohibit accessing pages that were not allocated first,
and then acquiring the tablespace latch. Such behaviour would indeed
result in a deadlock when running the following tests:
encryption.innodb_encryption-page-compression
encryption.innodb-checksum-algorithm

Because the key rotation is accessing potentially unallocated pages, it
cannot reliably check if these pages were allocated. It can only check
the page header. If the page number is zero, we can assume that the
page is unallocated.

fil_crypt_rotate_page(): Detect uninitialized pages by FIL_PAGE_OFFSET.
Page 0 is never encrypted, and on other pages that are initialized,
FIL_PAGE_OFFSET must contain the page number.

fil_crypt_is_page_uninitialized(): Remove. It suffices to check the
page number field in fil_crypt_rotate_page().
2017-08-23 10:12:39 +03:00
Jan Lindström
8b019f87dd MDEV-11939: innochecksum mistakes a file for an encrypted one (page 0 invalid)
Always read full page 0 to determine does tablespace contain
encryption metadata. Tablespaces that are page compressed or
page compressed and encrypted do not compare checksum as
it does not exists. For encrypted tables use checksum
verification written for encrypted tables and normal tables
use normal method.

buf_page_is_checksum_valid_crc32
buf_page_is_checksum_valid_innodb
buf_page_is_checksum_valid_none
	Add Innochecksum logging to file

buf_page_is_corrupted
        Remove ib_logf and page_warn_strict_checksum
        calls in innochecksum compilation. Add innochecksum
        logging to file.

fil0crypt.cc fil0crypt.h
        Modify to be able to use in innochecksum compilation and
	move fil_space_verify_crypt_checksum to end of the file.
	Add innochecksum logging to file.

univ.i
        Add innochecksum strict_verify, log_file and cur_page_num
        variables as extern.

page_zip_verify_checksum
        Add innochecksum logging to file.

innochecksum.cc
        Lot of changes most notable able to read encryption
        metadata from page 0 of the tablespace.

Added test case where we corrupt intentionally
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (encryption key version)
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION+4 (post encryption checksum)
FIL_DATA+10 (data)
2017-08-03 08:29:36 +03:00
Marko Mäkelä
e555540ab6 MDEV-13105 InnoDB fails to load a table with PAGE_COMPRESSION_LEVEL after upgrade from 10.1.20
When using innodb_page_size=16k, InnoDB tables
that were created in MariaDB 10.1.0 to 10.1.20 with
PAGE_COMPRESSED=1 and
PAGE_COMPRESSION_LEVEL=2 or PAGE_COMPRESSION_LEVEL=3
would fail to load.

fsp_flags_is_valid(): When using innodb_page_size=16k, use a
more strict check for .ibd files, with the assumption that
nobody would try to use different-page-size files.
2017-07-05 14:55:56 +03:00
Marko Mäkelä
58f87a41bd Remove some fields from dict_table_t
dict_table_t::thd: Remove. This was only used by btr_root_block_get()
for reporting decryption failures, and it was only assigned by
ha_innobase::open(), and never cleared. This could mean that if a
connection is closed, the pointer would become stale, and the server
could crash while trying to report the error. It could also mean
that an error is being reported to the wrong client. It is better
to use current_thd in this case, even though it could mean that if
the code is invoked from an InnoDB background operation, there would
be no connection to which to send the error message.

Remove dict_table_t::crypt_data and dict_table_t::page_0_read.
These fields were never read.

fil_open_single_table_tablespace(): Remove the parameter "table".
2017-06-15 12:41:02 +03:00
Marko Mäkelä
3005cebc96 Post-push fix for MDEV-12610 MariaDB start is slow
fil_crypt_read_crypt_data(): Remove an unnecessary
acquisition of fil_system->mutex. Remove a duplicated condition
from the callers.
2017-06-12 17:10:56 +03:00
Jan Lindström
58c56dd7f8 MDEV-12610: MariaDB start is slow
Problem appears to be that the function fsp_flags_try_adjust()
is being unconditionally invoked on every .ibd file on startup.
Based on performance investigation also the top function
fsp_header_get_crypt_offset() needs to addressed.

Ported implementation of fsp_header_get_encryption_offset()
function from 10.2 to fsp_header_get_crypt_offset().

Introduced a new function fil_crypt_read_crypt_data()
to read page 0 if it is not yet read.

fil_crypt_find_space_to_rotate(): Now that page 0 for every .ibd
file is not read on startup we need to check has page 0 read
from space that we investigate for key rotation, if it is not read
we read it.

fil_space_crypt_get_status(): Now that page 0 for every .ibd
file is not read on startup here also we need to read page 0
if it is not yet read it. This is needed
as tests use IS query to wait until background encryption
or decryption has finished and this function is used to
produce results.

fil_crypt_thread(): Add is_stopping condition for tablespace
so that we do not rotate pages if usage of tablespace should
be stopped. This was needed for failure seen on regression
testing.

fil_space_create: Remove page_0_crypt_read and extra
unnecessary info output.

fil_open_single_table_tablespace(): We call fsp_flags_try_adjust
only when when no errors has happened and server was not started
on read only mode and tablespace validation was requested or
flags contain other table options except low order bits to
FSP_FLAGS_POS_PAGE_SSIZE position.

fil_space_t::page_0_crypt_read removed.

Added test case innodb-first-page-read to test startup when
encryption is on and when encryption is off to check that not
for all tables page 0 is read on startup.
2017-06-09 13:15:39 +03:00
Marko Mäkelä
fbeb9489cd Cleanup of MDEV-12600: crash during install_db with innodb_page_size=32K and ibdata1=3M
The doublewrite buffer pages must fit in the first InnoDB system
tablespace data file. The checks that were added in the initial patch
(commit 112b21da37)
were at too high level and did not cover all cases.

innodb.log_data_file_size: Test all innodb_page_size combinations.

fsp_header_init(): Never return an error. Move the change buffer creation
to the only caller that needs to do it.

btr_create(): Clean up the logic. Remove the error log messages.

buf_dblwr_create(): Try to return an error on non-fatal failure.
Check that the first data file is big enough for creating the
doublewrite buffers.

buf_dblwr_process(): Check if the doublewrite buffer is available.
Display the message only if it is available.

recv_recovery_from_checkpoint_start_func(): Remove a redundant message
about FIL_PAGE_FILE_FLUSH_LSN mismatch when crash recovery has already
been initiated.

fil_report_invalid_page_access(): Simplify the message.

fseg_create_general(): Do not emit messages to the error log.

innobase_init(): Revert the changes.

trx_rseg_create(): Refactor (no functional change).
2017-06-08 11:55:47 +03:00
Jan Lindström
112b21da37 MDEV-12600: crash during install_db with innodb_page_size=32K and ibdata1=3M;
Problem was that all doublewrite buffer pages must fit to first
system datafile.

Ported commit 27a34df7882b1f8ed283f22bf83e8bfc523cbfde
Author: Shaohua Wang <shaohua.wang@oracle.com>
Date:   Wed Aug 12 15:55:19 2015 +0800

    BUG#21551464 - SEGFAULT WHILE INITIALIZING DATABASE WHEN
    INNODB_DATA_FILE SIZE IS SMALL

To 10.1 (with extended error printout).

btr_create(): If ibuf header page allocation fails report error and
return FIL_NULL. Similarly if root page allocation fails return a error.

dict_build_table_def_step: If fsp_header_init fails return
error code.

fsp_header_init: returns true if header initialization succeeds
and false if not.

fseg_create_general: report error if segment or page allocation fails.

innobase_init: If first datafile is smaller than 3M and could not
contain all doublewrite buffer pages report error and fail to
initialize InnoDB plugin.

row_truncate_table_for_mysql: report error if fsp header init
fails.

srv_init_abort: New function to report database initialization errors.

srv_undo_tablespaces_init, innobase_start_or_create_for_mysql: If
database initialization fails report error and abort.

trx_rseg_create: If segment header creation fails return.
2017-06-01 14:07:48 +03:00
Jan Lindström
6b6987154a MDEV-12114: install_db shows corruption for rest encryption and innodb_checksum_algorithm=strict_none
Problem was that checksum check resulted false positives that page is
both not encrypted and encryted when checksum_algorithm was
strict_none.

Encrypton checksum will use only crc32 regardless of setting.

buf_zip_decompress: If compression fails report a error message
containing the space name if available (not available during import).
And note if space could be encrypted.

buf_page_get_gen: Do not assert if decompression fails,
instead unfix the page and return NULL to upper layer.

fil_crypt_calculate_checksum: Use only crc32 method.

fil_space_verify_crypt_checksum: Here we need to check
crc32, innodb and none method for old datafiles.

fil_space_release_for_io: Allow null space.

encryption.innodb-compressed-blob is now run with crc32 and none
combinations.

Note that with none and strict_none method there is not really
a way to detect page corruptions and page corruptions after
decrypting the page with incorrect key.

New test innodb-checksum-algorithm to test different checksum
algorithms with encrypted, row compressed and page compressed
tables.
2017-06-01 14:07:48 +03:00
Jan Lindström
1af8bf39ca MDEV-12113: install_db shows corruption for rest encryption with innodb_data_file_path=ibdata1:3M;
Problem was that FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION field that for
encrypted pages even in system datafiles should contain key_version
except very first page (0:0) is after encryption overwritten with
flush lsn.

Ported WL#7990 Repurpose FIL_PAGE_FLUSH_LSN to 10.1
The field FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION is consulted during
InnoDB startup.

At startup, InnoDB reads the FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
from the first page of each file in the InnoDB system tablespace.
If there are multiple files, the minimum and maximum LSN can differ.
These numbers are passed to InnoDB startup.

Having the number in other files than the first file of the InnoDB
system tablespace is not providing much additional value. It is
conflicting with other use of the field, such as on InnoDB R-tree
index pages and encryption key_version.

This worklog will stop writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION to
other files than the first file of the InnoDB system tablespace
(page number 0:0) when system tablespace is encrypted. If tablespace
is not encrypted we continue writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
to all first pages of system tablespace to avoid unnecessary
warnings on downgrade.

open_or_create_data_files(): pass only one flushed_lsn parameter

xb_load_tablespaces(): pass only one flushed_lsn parameter.

buf_page_create(): Improve comment about where
FIL_PAGE_FIL_FLUSH_LSN_OR_KEY_VERSION is set.

fil_write_flushed_lsn(): A new function, merged from
fil_write_lsn_and_arch_no_to_file() and
fil_write_flushed_lsn_to_data_files().
Only write to the first page of the system tablespace (page 0:0)
if tablespace is encrypted, or write all first pages of system
tablespace and invoke fil_flush_file_spaces(FIL_TYPE_TABLESPACE)
afterwards.

fil_read_first_page(): read flush_lsn and crypt_data only from
first datafile.

fil_open_single_table_tablespace(): Remove output of LSN, because it
was only valid for the system tablespace and the undo tablespaces, not
user tablespaces.

fil_validate_single_table_tablespace(): Remove output of LSN.

checkpoint_now_set(): Use fil_write_flushed_lsn and output
a error if operation fails.

Remove lsn variable from fsp_open_info.

recv_recovery_from_checkpoint_start(): Remove unnecessary second
flush_lsn parameter.

log_empty_and_mark_files_at_shutdown(): Use fil_writte_flushed_lsn
and output error if it fails.

open_or_create_data_files(): Pass only one flushed_lsn variable.
2017-06-01 14:07:48 +03:00
Jan Lindström
90c52e5291 MDEV-12615: InnoDB page compression method snappy mostly does not compress pages
Snappy compression method require that output buffer
used for compression is bigger than input buffer.
Similarly lzo require additional work memory buffer.
Increase the allocated buffer accordingly.

buf_tmp_buffer_t: removed unnecessary lzo_mem, crypt_buf_free and
comp_buf_free.

buf_pool_reserve_tmp_slot: use alligned_alloc and if snappy
available allocate size based on snappy_max_compressed_length and
if lzo is available increase buffer by LZO1X_1_15_MEM_COMPRESS.

fil_compress_page: Remove unneeded lzo mem (we use same buffer)
and if output buffer is not yet allocated allocate based similarly
as above.

Decompression does not require additional work area.

    Modify test to use same test as other compression method tests.
2017-05-20 21:51:34 +03:00
Marko Mäkelä
a4d4a5fe82 After-merge fix for MDEV-11638
In commit 360a4a0372
some debug assertions were introduced to the page flushing code
in XtraDB. Add these assertions to InnoDB as well, and adjust
the InnoDB shutdown so that these assertions will not fail.

logs_empty_and_mark_files_at_shutdown(): Advance
srv_shutdown_state from the first phase SRV_SHUTDOWN_CLEANUP
only after no page-dirtying activity is possible
(well, except by srv_master_do_shutdown_tasks(), which will be
fixed separately in MDEV-12052).

rotate_thread_t::should_shutdown(): Already exit the key rotation
threads at the first phase of shutdown (SRV_SHUTDOWN_CLEANUP).

page_cleaner_sleep_if_needed(): Do not sleep during shutdown.
This change is originally from XtraDB.
2017-05-20 08:41:34 +03:00
Marko Mäkelä
65e1399e64 Merge 10.0 into 10.1
Significantly reduce the amount of InnoDB, XtraDB and Mariabackup
code changes by defining pfs_os_file_t as something that is
transparently compatible with os_file_t.
2017-05-20 08:41:20 +03:00
Marko Mäkelä
13a350ac29 Merge 10.0 into 10.1 2017-05-19 12:29:37 +03:00
Vicențiu Ciorbaru
45898c2092 Merge remote-tracking branch 'origin/10.0' into 10.0 2017-05-18 15:45:55 +03:00
Jan Lindström
f302a3cf9d MDEV-12593: InnoDB page compression should use lz4_compress_default if
available

lz4.cmake: Check if shared or static lz4 library has LZ4_compress_default
function and if it has define HAVE_LZ4_COMPRESS_DEFAULT.

fil_compress_page: If HAVE_LZ4_COMPRESS_DEFAULT is defined use
LZ4_compress_default function for compression if not use
LZ4_compress_limitedOutput function.

Introduced a innodb-page-compression.inc file for page compression
tests that will also search .ibd file to verify that pages
are compressed (i.e. used search string is not found). Modified
page compression tests to use this file.

Note that snappy method is not included because of MDEV-12615
InnoDB page compression method snappy mostly does not compress pages
that will be fixed on different commit.
2017-05-18 09:29:44 +03:00
Marko Mäkelä
e22d86a3eb fil_create_new_single_table_tablespace(): Correct a bogus nonnull attribute
The parameter path can be passed as NULL.
This error was reported by GCC 7.1.0 when compiling
CMAKE_BUILD_TYPE=Debug with -O3.
2017-05-17 13:49:51 +03:00
Vicențiu Ciorbaru
d8b45b0c00 Merge branch 'merge-xtradb-5.6' into 10.0 2017-05-17 12:11:12 +03:00
Marko Mäkelä
956d2540c4 Remove redundant UT_LIST_INIT() calls
The macro UT_LIST_INIT() zero-initializes the UT_LIST_NODE.
There is no need to call this macro on a buffer that has
already been zero-initialized by mem_zalloc() or mem_heap_zalloc()
or similar.

For some reason, the statement UT_LIST_INIT(srv_sys->tasks) in
srv_init() caused a SIGSEGV on server startup when compiling with
GCC 7.1.0 for AMD64 using -O3. The zero-initialization was attempted
by the instruction movaps %xmm0,0x50(%rax), while the proper offset
of srv_sys->tasks would seem to have been 0x48.
2017-05-17 10:33:49 +03:00
Marko Mäkelä
febe88198e Make some variables const in fil_iterate()
This is a non-functional change to make it slightly easier
to read the code. We seem to have some bugs in this
IMPORT TABLESPACE code; see MDEV-12396.
2017-05-17 08:54:16 +03:00
Vicențiu Ciorbaru
360a4a0372 5.6.36-82.0 2017-05-16 14:16:11 +03:00
Marko Mäkelä
d7cfe2c4f3 MDEV-12253 post-fix: Do not leak memory in crash recovery
This is a backport from 10.2 where it fixes the
cmake -DWITH_ASAN test failure that was mentioned
in commit f9cc391863
(merging MDEV-12253 from 10.1 to 10.2).

fil_parse_write_crypt_data(): If the tablespace is not found,
invoke fil_space_destroy_crypt_data(&crypt_data) to properly
free the created object.
2017-05-09 14:36:17 +03:00
Jan Lindström
acce1f37c2 MDEV-12624: encryption.innodb_encryption_tables fails in buildbot with timeout
This regression was caused by MDEV-12467 encryption.create_or_replace
hangs during DROP TABLE, where if table->is_stopping() (i.e. when
tablespace is dropped) background key rotation thread calls
fil_crypt_complete_rotate_space to release space and stop rotation.
However, that function does not decrease number of rotating
threads if table->is_stopping() is true.
2017-05-02 08:09:16 +03:00
Jan Lindström
935a1c676e MDEV-12623: InnoDB: Failing assertion: kv == 0
|| kv >= crypt_data->min_key_version,
encryption.innodb_encryption_tables failed in buildbot.

Now that key_version is not stored when page is read to
buf_page_t::key_version but always read from actual page
this assertion is not always valid.
2017-04-29 10:05:39 +03:00
Marko Mäkelä
b82c602db5 MDEV-12602 InnoDB: Failing assertion: space->n_pending_ops == 0
This fixes a regression caused by MDEV-12428.
When we introduced a variant of fil_space_acquire() that could
increment space->n_pending_ops after space->stop_new_ops was set,
the logic of fil_check_pending_operations() was broken.

fil_space_t::n_pending_ios: A new field to track read or write
access from the buffer pool routines immediately before a block
write or after a block read in the file system.

fil_space_acquire_for_io(), fil_space_release_for_io(): Similar
to fil_space_acquire_silent() and fil_space_release(), but
modify fil_space_t::n_pending_ios instead of fil_space_t::n_pending_ops.

Adjust a number of places accordingly, and remove some redundant
tablespace lookups.

The following parts of this fix differ from the 10.2 version of this fix:

buf_page_get_corrupt(): Add a tablespace parameter.

In 10.2, we already had a two-phase process of freeing fil_space objects
(first, fil_space_detach(), then release fil_system->mutex, and finally
free the fil_space and fil_node objects).

fil_space_free_and_mutex_exit(): Renamed from fil_space_free().
Detach the tablespace from the fil_system cache, release the
fil_system->mutex, and then wait for space->n_pending_ios to reach 0,
to avoid accessing freed data in a concurrent thread.
During the wait, future calls to fil_space_acquire_for_io() will
not find this tablespace, and the count can only be decremented to 0,
at which point it is safe to free the objects.

fil_node_free_part1(), fil_node_free_part2(): Refactored from
fil_node_free().
2017-04-28 14:12:52 +03:00
Vladislav Vaintroub
ecb25df21b Xtrabackup 2.3.8 2017-04-27 19:12:42 +02:00
Vladislav Vaintroub
9c4b7cad27 MDEV-9566 Prepare xtradb for xtrabackup
These changes are comparable to Percona's modifications in innodb in the
Percona Xtrabackup repository.

- If functions are used in backup as well as in innodb, make them non-static.

- Define IS_XTRABACKUP() macro for special handling of innodb running
  inside backup.

- Extend some functions for backup.
  fil_space_for_table_exists_in_mem() gets additional parameter
  'remove_from_data_dict_if_does_not_exist', for partial backups

  fil_load_single_table_tablespaces() gets an optional parameter predicate
  which tells whether to load tablespace based on database or table name,
  also for partial backups.

  srv_undo_tablespaces_init() gets an optional parameter 'backup_mode'

- Allow single redo log file (for backup "prepare")

- Do not read doublewrite buffer pages in backup, they are outdated

- Add function fil_remove_invalid_table_from_data_dict(), to remove non-existing
   tables from data dictionary in case of partial backups.

- On Windows, fix file share modes when opening tablespaces,
to allow mariabackup to read tablespaces while server is online.

- Avoid access to THDVARs in backup, because innodb plugin is not loaded,
and THDVAR would crash in this case.
2017-04-27 19:12:39 +02:00
Jan Lindström
765a43605a MDEV-12253: Buffer pool blocks are accessed after they have been freed
Problem was that bpage was referenced after it was already freed
from LRU. Fixed by adding a new variable encrypted that is
passed down to buf_page_check_corrupt() and used in
buf_page_get_gen() to stop processing page read.

This patch should also address following test failures and
bugs:

MDEV-12419: IMPORT should not look up tablespace in
PageConverter::validate(). This is now removed.

MDEV-10099: encryption.innodb_onlinealter_encryption fails
sporadically in buildbot

MDEV-11420: encryption.innodb_encryption-page-compression
failed in buildbot

MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8

Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
and replaced these with dict_table_t::file_unreadable. Table
ibd file is missing if fil_get_space(space_id) returns NULL
and encrypted if not. Removed dict_table_t::is_corrupted field.

Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(),
buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().

Added test cases when enrypted page could be read while doing
redo log crash recovery. Also added test case for row compressed
blobs.

btr_cur_open_at_index_side_func(),
btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
NULL.

buf_page_get_zip(): Issue error if page read fails.

buf_page_get_gen(): Use dberr_t for error detection and
do not reference bpage after we hare freed it.

buf_mark_space_corrupt(): remove bpage from LRU also when
it is encrypted.

buf_page_check_corrupt(): @return DB_SUCCESS if page has
been read and is not corrupted,
DB_PAGE_CORRUPTED if page based on checksum check is corrupted,
DB_DECRYPTION_FAILED if page post encryption checksum matches but
after decryption normal page checksum does not match. In read
case only DB_SUCCESS is possible.

buf_page_io_complete(): use dberr_t for error handling.

buf_flush_write_block_low(),
buf_read_ahead_random(),
buf_read_page_async(),
buf_read_ahead_linear(),
buf_read_ibuf_merge_pages(),
buf_read_recv_pages(),
fil_aio_wait():
        Issue error if page read fails.

btr_pcur_move_to_next_page(): Do not reference page if it is
NULL.

Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
that will return true if tablespace exists and pages read from
tablespace are not corrupted or page decryption failed.
Removed buf_page_t::key_version. After page decryption the
key version is not removed from page frame. For unencrypted
pages, old key_version is removed at buf_page_encrypt_before_write()

dict_stats_update_transient_for_index(),
dict_stats_update_transient()
        Do not continue if table decryption failed or table
        is corrupted.

dict0stats.cc: Introduced a dict_stats_report_error function
to avoid code duplication.

fil_parse_write_crypt_data():
        Check that key read from redo log entry is found from
        encryption plugin and if it is not, refuse to start.

PageConverter::validate(): Removed access to fil_space_t as
tablespace is not available during import.

Fixed error code on innodb.innodb test.

Merged test cased innodb-bad-key-change5 and innodb-bad-key-shutdown
to innodb-bad-key-change2.  Removed innodb-bad-key-change5 test.
Decreased unnecessary complexity on some long lasting tests.

Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
fil_get_first_space(), fil_get_next_space(),
fil_get_first_space_safe(), fil_get_next_space_safe()
functions.

fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
accessed from row compressed tables. Fixed out of page frame
bug for row compressed tables in
fil_space_verify_crypt_checksum() found using ASAN. Incorrect
function was called for compressed table.

Added new tests for discard, rename table and drop (we should allow them
even when page decryption fails). Alter table rename is not allowed.
Added test for restart with innodb-force-recovery=1 when page read on
redo-recovery cant be decrypted. Added test for corrupted table where
both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION is corrupted.

Adjusted the test case innodb_bug14147491 so that it does not anymore
expect crash. Instead table is just mostly not usable.

fil0fil.h: fil_space_acquire_low is not visible function
and fil_space_acquire and fil_space_acquire_silent are
inline functions. FilSpace class uses fil_space_acquire_low
directly.

recv_apply_hashed_log_recs() does not return anything.
2017-04-26 15:19:16 +03:00
Marko Mäkelä
200ef51344 Fix a compilation error 2017-04-21 18:29:50 +03:00
Marko Mäkelä
996c7d5cb5 MDEV-12545 Reduce the amount of fil_space_t lookups
buf_flush_write_block_low(): Acquire the tablespace reference once,
and pass it to lower-level functions. This is only a start; further
calls may be removed later.
2017-04-21 17:47:23 +03:00
Marko Mäkelä
555e52f3bc MDEV-12467 encryption.create_or_replace hangs during DROP TABLE
fil_crypt_thread(): Do invoke fil_crypt_complete_rotate_space()
when the tablespace is about to be dropped. Also, remove a redundant
check whether rotate_thread_t::space is NULL. It can only become
NULL when fil_crypt_find_space_to_rotate() returns false, and in
that case we would already have terminated the loop.

fil_crypt_find_page_to_rotate(): Remove a redundant check for
space->crypt_data == NULL. Once encryption metadata has been
created for a tablespace, it cannot be removed without dropping
the entire tablespace.
2017-04-21 17:47:15 +03:00
Marko Mäkelä
aafaf05a47 Fix some InnoDB type mismatch introduced in 10.1
On 64-bit Windows, sizeof(ulint)!=sizeof(ulong).
2017-04-21 17:43:35 +03:00
Marko Mäkelä
3a6af51a8a Do not crash XtraDB when fil_space_acquire() fails
This reverts part of commit 50eb40a2a8
which backported the code from MariaDB 10.2. The XtraDB version of
the code included a ut_error statement (aborting the process) when
a tablespace is not found. Luckily this change was not part of a
release; MariaDB 10.1.22 had been released some days earlier.
2017-04-21 13:10:28 +03:00
Daniel Black
a7bb9e8fdb xtradb: fil_crypt_rotate_page, space_id should be compared to TRX_SYS_SPACE not space
like 9a218f4fb8 fil_crypt_rotate_page
 - space_id should be compared to TRX_SYS_SPACE not space

Signed-off-by: Daniel Black <daniel.black@au.ibm.com>
2017-04-05 16:29:08 +10:00
Marko Mäkelä
9505c96839 MDEV-12428 SIGSEGV in buf_page_decrypt_after_read() during DDL
Also, some MDEV-11738/MDEV-11581 post-push fixes.

In MariaDB 10.1, there is no fil_space_t::is_being_truncated field,
and the predicates fil_space_t::stop_new_ops and fil_space_t::is_stopping()
are interchangeable. I requested the fil_space_t::is_stopping() to be added
in the review, but some added checks for fil_space_t::stop_new_ops were
not replaced with calls to fil_space_t::is_stopping().

buf_page_decrypt_after_read(): In this low-level I/O operation, we must
look up the tablespace if it exists, even though future I/O operations
have been blocked on it due to a pending DDL operation, such as DROP TABLE
or TRUNCATE TABLE or other table-rebuilding operations (ALTER, OPTIMIZE).
Pass a parameter to fil_space_acquire_low() telling that we are performing
a low-level I/O operation and the fil_space_t::is_stopping() status should
be ignored.
2017-04-03 22:09:28 +03:00
Jan Lindström
b1ec35b903 Add assertions when key rotation list is used. 2017-03-16 17:30:13 +02:00
Jan Lindström
50eb40a2a8 MDEV-11738: Mariadb uses 100% of several of my 8 cpus doing nothing
MDEV-11581: Mariadb starts InnoDB encryption threads
when key has not changed or data scrubbing turned off

Background: Key rotation is based on background threads
(innodb-encryption-threads) periodically going through
all tablespaces on fil_system. For each tablespace
current used key version is compared to max key age
(innodb-encryption-rotate-key-age). This process
naturally takes CPU. Similarly, in same time need for
scrubbing is investigated. Currently, key rotation
is fully supported on Amazon AWS key management plugin
only but InnoDB does not have knowledge what key
management plugin is used.

This patch re-purposes innodb-encryption-rotate-key-age=0
to disable key rotation and background data scrubbing.
All new tables are added to special list for key rotation
and key rotation is based on sending a event to
background encryption threads instead of using periodic
checking (i.e. timeout).

fil0fil.cc: Added functions fil_space_acquire_low()
to acquire a tablespace when it could be dropped concurrently.
This function is used from fil_space_acquire() or
fil_space_acquire_silent() that will not print
any messages if we try to acquire space that does not exist.
fil_space_release() to release a acquired tablespace.
fil_space_next() to iterate tablespaces in fil_system
using fil_space_acquire() and fil_space_release().
Similarly, fil_space_keyrotation_next() to iterate new
list fil_system->rotation_list where new tables.
are added if key rotation is disabled.
Removed unnecessary functions fil_get_first_space_safe()
fil_get_next_space_safe()

fil_node_open_file(): After page 0 is read read also
crypt_info if it is not yet read.

btr_scrub_lock_dict_func()
buf_page_check_corrupt()
buf_page_encrypt_before_write()
buf_merge_or_delete_for_page()
lock_print_info_all_transactions()
row_fts_psort_info_init()
row_truncate_table_for_mysql()
row_drop_table_for_mysql()
    Use fil_space_acquire()/release() to access fil_space_t.

buf_page_decrypt_after_read():
    Use fil_space_get_crypt_data() because at this point
    we might not yet have read page 0.

fil0crypt.cc/fil0fil.h: Lot of changes. Pass fil_space_t* directly
to functions needing it and store fil_space_t* to rotation state.
Use fil_space_acquire()/release() when iterating tablespaces
and removed unnecessary is_closing from fil_crypt_t. Use
fil_space_t::is_stopping() to detect when access to
tablespace should be stopped. Removed unnecessary
fil_space_get_crypt_data().

fil_space_create(): Inform key rotation that there could
be something to do if key rotation is disabled and new
table with encryption enabled is created.
Remove unnecessary functions fil_get_first_space_safe()
and fil_get_next_space_safe(). fil_space_acquire()
and fil_space_release() are used instead. Moved
fil_space_get_crypt_data() and fil_space_set_crypt_data()
to fil0crypt.cc.

fsp_header_init(): Acquire fil_space_t*, write crypt_data
and release space.

check_table_options()
	Renamed FIL_SPACE_ENCRYPTION_* TO FIL_ENCRYPTION_*

i_s.cc: Added ROTATING_OR_FLUSHING field to
information_schema.innodb_tablespace_encryption
to show current status of key rotation.
2017-03-14 16:23:10 +02:00
Marko Mäkelä
9dc10d5851 Merge 10.0 into 10.1 2017-03-13 19:17:34 +02:00
Marko Mäkelä
b28adb6a62 Fix an error introduced in the previous commit.
fil_parse_write_crypt_data(): Correct the comparison operator.
This was broken in commit 498f4a825b
which removed a signed/unsigned mismatch in these comparisons.
2017-03-09 15:09:44 +02:00
Marko Mäkelä
498f4a825b Fix InnoDB/XtraDB compilation warnings on 32-bit builds. 2017-03-09 08:54:07 +02:00
Marko Mäkelä
68d632bc5a Replace some functions with macros.
This is a non-functional change.

On a related note, the calls fil_system_enter() and fil_system_exit()
are often used in an unsafe manner. The fix of MDEV-11738 should
introduce fil_space_acquire() and remove potential race conditions.
2017-03-06 10:02:01 +02:00
Marko Mäkelä
adc91387e3 Merge 10.0 into 10.1 2017-03-03 13:27:12 +02:00
Marko Mäkelä
29c776cfd1 MDEV-11520: Retry posix_fallocate() after EINTR.
The function posix_fallocate() as well as the Linux system call
fallocate() can return EINTR when the operation was interrupted
by a signal. In that case, keep retrying the operation, except
if InnoDB shutdown has been initiated.
2017-03-03 12:03:33 +02:00
Marko Mäkelä
ec4cf111c0 MDEV-11520 after-merge fix for 10.1: Use sparse files.
If page_compression (introduced in MariaDB Server 10.1) is enabled,
the logical action is to not preallocate space to the data files,
but to only logically extend the files with zeroes.

fil_create_new_single_table_tablespace(): Create smaller files for
ROW_FORMAT=COMPRESSED tables, but adhere to the minimum file size of
4*innodb_page_size.

fil_space_extend_must_retry(), os_file_set_size(): On Windows,
use SetFileInformationByHandle() and FILE_END_OF_FILE_INFO,
which depends on bumping _WIN32_WINNT to 0x0600.
FIXME: The files are not yet set up as sparse, so
this will currently end up physically extending (preallocating)
the files, wasting storage for unused pages.

os_file_set_size(): Add the parameter "bool sparse=false" to declare
that the file is to be extended logically, instead of being preallocated.
The only caller with sparse=true is
fil_create_new_single_table_tablespace().
(The system tablespace cannot be created with page_compression.)

fil_space_extend_must_retry(), os_file_set_size(): Outside Windows,
use ftruncate() to extend files that are supposed to be sparse.
On systems where ftruncate() is limited to files less than 4GiB
(if there are any), fil_space_extend_must_retry() retains the
old logic of physically extending the file.
2017-02-22 22:29:56 +02:00
Marko Mäkelä
e1e920bf63 Merge 10.0 into 10.1 2017-02-22 15:53:05 +02:00
Marko Mäkelä
a0ce92ddc7 MDEV-11520 post-fix
fil_extend_space_to_desired_size(): Use a proper type cast when
computing start_offset for the posix_fallocate() call on 32-bit systems
(where sizeof(ulint) < sizeof(os_offset_t)). This could affect 32-bit
systems when extending files that are at least 4 MiB long.

This bug existed in MariaDB 10.0 before MDEV-11520. In MariaDB 10.1
it had been fixed in MDEV-11556.
2017-02-22 12:32:17 +02:00