Commit graph

217 commits

Author SHA1 Message Date
Marko Mäkelä
5a136d84f3 MDEV-19766: Disable page dump output in debug builds
The test innodb.leaf_page_corrupted_during_recovery
fails on buildbot with

Warning	1406	Data too long for column 'line' at row 10
line
 len 16384; hex ...

because of a page dumps that InnoDB is generating for a corrupted page

Since this test is using debug instrumentation, we will solve the
issue by disabling page dumps in debug builds altogether. Users of
debug builds will likely know how to extract page dumps in other means.

Page dump output could sometimes be useful when diagnosing problems
that users are facing. Hence we will keep the page dump output in
non-debug (release) builds.
2019-07-01 13:27:12 +03:00
Thirunarayanan Balathandayuthapani
723a4b1d78 MDEV-17228 Encrypted temporary tables are not encrypted
- Introduce a new variable called innodb_encrypt_temporary_tables which is
a boolean variable. It decides whether to encrypt the temporary tablespace.
- Encrypts the temporary tablespace based on full checksum format.
- Introduced a new counter to track encrypted and decrypted temporary
tablespace pages.
- Warnings issued if temporary table creation has conflict value with
innodb_encrypt_temporary_tables
- Added a new test case which reads and writes the pages from/to temporary
tablespace.
2019-06-28 19:07:59 +05:30
Thirunarayanan Balathandayuthapani
e9145aab44 MDEV-19435 buf_fix_count > 0 for corrupted page when it exits the LRU list
Problem:
=========
One of the purge thread access the corrupted page and tries to remove from
LRU list. In the mean time, other purge threads are waiting for same page
in buf_wait_for_read(). Assertion(buf_fix_count == 0) fails for the
purge thread which tries to remove the page from LRU list.

Solution:
========
- Set the page id as FIL_NULL to indicate the page is corrupted before
removing the block from LRU list. Acquire hash lock for the particular
page id and wait for the other threads to release buf_fix_count
for the block.

- Added the error check for btr_cur_open() in row_search_on_row_ref().
2019-06-13 16:13:51 +03:00
Thirunarayanan Balathandayuthapani
b4287ec386 MDEV-19541 InnoDB crashes when trying to recover a corrupted page
- Use corrupt page id instead of whole block after releasing it from
LRU list.
2019-06-05 16:36:51 +05:30
Thirunarayanan Balathandayuthapani
79b46ab2a6 MDEV-19541 InnoDB crashes when trying to recover a corrupted page
- Don't apply redo log for the corrupted page when innodb_force_recovery > 0.
- Allow the table to be dropped when index root page is
corrupted when innodb_force_recovery > 0.
2019-05-28 11:55:02 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
c0ac0b8860 Update FSF address 2019-05-11 19:25:02 +03:00
Marko Mäkelä
d315b4ff39 Remove IBUF_COUNT_DEBUG
The compile-time option IBUF_COUNT_DEBUG has not been used for years.
It would only work with up to 3 created .ibd files, with no buffered
changes existing while InnoDB is started up.
2019-04-19 12:44:46 +03:00
Marko Mäkelä
169c00994b MDEV-12699 Improve crash recovery of corrupted data pages
InnoDB crash recovery used to read every data page for which
redo log exists. This is unnecessary for those pages that are
initialized by the redo log. If a newly created page is corrupted,
recovery could unnecessarily fail. It would suffice to reinitialize
the page based on the redo log records.

To add insult to injury, InnoDB crash recovery could hang if it
encountered a corrupted page. We will fix also that problem.
InnoDB would normally refuse to start up if it encounters a
corrupted page on recovery, but that can be overridden by
setting innodb_force_recovery=1.

Data pages are completely initialized by the records
MLOG_INIT_FILE_PAGE2 and MLOG_ZIP_PAGE_COMPRESS.
MariaDB 10.4 additionally recognizes MLOG_INIT_FREE_PAGE,
which notifies that a page has been freed and its contents
can be discarded (filled with zeroes).

The record MLOG_INDEX_LOAD notifies that redo logging has
been re-enabled after being disabled. We can avoid loading
the page if all buffered redo log records predate the
MLOG_INDEX_LOAD record.

For the internal tables of FULLTEXT INDEX, no MLOG_INDEX_LOAD
records were written before commit aa3f7a107c.
Hence, we will skip these optimizations for tables whose
name starts with FTS_.

This is joint work with Thirunarayanan Balathandayuthapani.

fil_space_t::enable_lsn, file_name_t::enable_lsn: The LSN of the
latest recovered MLOG_INDEX_LOAD record for a tablespace.

mlog_init: Page initialization operations discovered during
redo log scanning. FIXME: This really belongs in recv_sys->addr_hash,
and should be removed in MDEV-19176.

recv_addr_state: Add the new state RECV_WILL_NOT_READ to
indicate that according to mlog_init, the page will be
initialized based on redo log record contents.

recv_add_to_hash_table(): Set the RECV_WILL_NOT_READ state
if appropriate. For now, we do not treat MLOG_ZIP_PAGE_COMPRESS
as page initialization. This works around bugs in the crash
recovery of ROW_FORMAT=COMPRESSED tables.

recv_mark_log_index_load(): Process a MLOG_INDEX_LOAD record
by resetting the state to RECV_NOT_PROCESSED and by updating
the fil_name_t::enable_lsn.

recv_init_crash_recovery_spaces(): Copy fil_name_t::enable_lsn
to fil_space_t::enable_lsn.

recv_recover_page(): Add the parameter init_lsn, to ignore
any log records that precede the page initialization.
Add DBUG output about skipped operations.

buf_page_create(): Initialize FIL_PAGE_LSN, so that
recv_recover_page() will not wrongly skip applying
the page-initialization record due to the field containing
some newer LSN as a leftover from a different page.
Do not invoke ibuf_merge_or_delete_for_page() during
crash recovery.

recv_apply_hashed_log_recs(): Remove some unnecessary lookups.
Note if a corrupted page was found during recovery.
After invoking buf_page_create(), do invoke
ibuf_merge_or_delete_for_page() via mlog_init.ibuf_merge()
in the last recovery batch.

ibuf_merge_or_delete_for_page(): Relax a debug assertion.

innobase_start_or_create_for_mysql(): Abort startup if
a corrupted page was found during recovery. Corrupted pages
will not be flagged if innodb_force_recovery is set.
However, the recv_sys->found_corrupt_fs flag can be set
regardless of innodb_force_recovery if file names are found
to be incorrect (for example, multiple files with the same
tablespace ID).
2019-04-17 13:58:41 +03:00
Marko Mäkelä
f120a15b93 MDEV-19212 4GB Limit on large_pages - integer overflow
os_mem_alloc_large(): Invoke the macro ut_2pow_round() with the
correct argument type.

innobase_large_page_size, innobase_use_large_pages,
os_use_large_pages, os_large_page_size: Remove.
Simply refer to opt_large_page_size, my_use_large_pages.
2019-04-08 21:33:49 +03:00
Marko Mäkelä
1d30b7b1d2 MDEV-12699 preparation: Clean up recv_sys
The recv_sys data structures are accessed not only from the thread
that executes InnoDB plugin initialization, but also from the
InnoDB I/O threads, which can invoke recv_recover_page().

Assert that sufficient concurrency control is in place.
Some code was accessing recv_sys data structures without
holding recv_sys->mutex.

recv_recover_page(bpage): Refactor the call from buf_page_io_complete()
into a separate function that performs necessary steps. The
main thread was unnecessarily releasing and reacquiring recv_sys->mutex.

recv_recover_page(block,mtr,recv_addr): Pass more parameters from
the caller. Avoid redundant lookups and computations. Eliminate some
redundant variables.

recv_get_fil_addr_struct(): Assert that recv_sys->mutex is being held.
That was not always the case!

recv_scan_log_recs(): Acquire recv_sys->mutex for the whole duration
of the function. (While we are scanning and buffering redo log records,
no pages can be read in.)

recv_read_in_area(): Properly protect access with recv_sys->mutex.

recv_apply_hashed_log_recs(): Check recv_addr->state only once,
and continuously hold recv_sys->mutex. The mutex will be released
and reacquired inside recv_recover_page() and recv_read_in_area(),
allowing concurrent processing by buf_page_io_complete() in I/O threads.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
1b95118c5f buf_page_get_gen(): Allow BUF_GET_IF_IN_POOL with a dummy page_size
The page_size argument to buf_page_get_gen() only matters when the
page is going to be loaded into the buffer pool. Allow callers to
pass a dummy parameter when using BUF_GET_IF_IN_POOL (which would
return NULL if the block is not in the buffer pool).
2019-04-06 21:25:43 +03:00
Marko Mäkelä
cad56fbaba MDEV-18733 MariaDB slow start after crash recovery
If InnoDB crash recovery was needed, the InnoDB function srv_start()
would invoke extra validation, reading something from every InnoDB
data file. This should be unnecessary now that MDEV-14717 made
RENAME operations crash-safe inside InnoDB (which can be
disabled in MariaDB 10.2 by setting innodb_safe_truncate=OFF).

dict_check_sys_tables(): Skip tables that would be dropped by
row_mysql_drop_garbage_tables(). Perform extra validation only
if innodb_safe_truncate=OFF, innodb_force_recovery=0 and
crash recovery was needed.

dict_load_table_one(): Validate the root page of the table.
In this way, we can deny access to corrupted or mismatching tables
not only after crash recovery, but also after a clean shutdown.
2019-04-03 19:56:03 +03:00
Marko Mäkelä
226ca250ed Merge 10.1 into 10.2 2019-03-26 14:17:19 +02:00
Marko Mäkelä
065ba53ccb MDEV-12711 mariabackup --backup is refused for multi-file system tablespace
Before MDEV-12113 (MariaDB Server 10.1.25), on shutdown InnoDB would write
the current LSN to the first page of each file of the system tablespace.
This is incompatible with MariaDB's InnoDB table encryption, because
encryption repurposed the field for an encryption key ID and checksum.

buf_page_is_corrupted(): For the InnoDB system tablespace, skip
FIL_PAGE_FILE_FLUSH_LSN when checking if a page is all zero,
because the first page of each file in the system tablespace can
contain nonzero bytes in the field.
2019-03-26 13:51:15 +02:00
Marko Mäkelä
15a20367fb buf_page_get_zip(): Deduplicate some code 2019-02-23 10:31:07 +02:00
Marko Mäkelä
2c8d9a4e59 MDEV-10813: Update buf_page_t::buf_fix_count outside mutex
Since MySQL 5.6.16 (and MariaDB Server 10.0.11), changes of
buf_page_t::buf_fix_count are atomic memory operations if
PAGE_ATOMIC_REF_COUNT is defined. Since MySQL 5.7
(and MariaDB Server 10.2.2), the field is always updated
by atomic memory operations.

In a few occurrences, updates of the counter were unnecessarily
surrounded by an acquisition and release of the block mutex
(buf_block_t::mutex or buf_pool_t::zip_mutex). Remove these
unnecessary mutex operations.
2019-02-22 22:56:22 +02:00
Marko Mäkelä
a249e57b68 Merge 10.1 into 10.2
Temporarily disable a test for
commit 2175bfce3e
because fixing it in 10.2 requires updating libmariadb.
2019-02-03 17:22:05 +02:00
Marko Mäkelä
955c7b3222 MDEV-16896 encryption.innodb-checksum-algorithm crashes
buf_page_is_corrupted(): Read the global variable srv_checksum_algorithm
only once in order to avoid a race condition when
SET GLOBAL innodb_checksum_algorithm=...;
is being executed concurrently with this function.
2019-02-03 17:09:20 +02:00
Thirunarayanan Balathandayuthapani
f669cecbe3 MDEV-18415 mariabackup.mdev-14447 test case fails with Table 'test.t' doesn't exist in engine
- Added retry logic if validation of first page fails with checksum
mismatch.
2019-02-01 08:53:50 +02:00
Ian Gilfillan
0c2fc9b3da Update InnoDB urls 2018-12-19 00:07:33 +04:00
Marko Mäkelä
560df47926 Merge 10.1 into 10.2 2018-12-18 16:28:19 +02:00
Thirunarayanan Balathandayuthapani
171271edf8 MDEV-18025 Mariabackup fails to detect corrupted page_compressed=1 tables
Problem:
=======
Mariabackup seems to fail to verify the pages of compressed tables.
The reason is that both fil_space_verify_crypt_checksum() and
buf_page_is_corrupted() will skip the validation for compressed pages.

Fix:
====
Mariabackup should call fil_page_decompress() for compressed and encrypted
compressed page. After that, call buf_page_is_corrupted() to
check the page corruption.
2018-12-18 18:07:17 +05:30
Marko Mäkelä
7d245083a4 Merge 10.1 into 10.2 2018-12-17 20:15:38 +02:00
Marko Mäkelä
8c43f96388 Follow-up to MDEV-12112: corruption in encrypted table may be overlooked
The initial fix only covered a part of Mariabackup.
This fix hardens InnoDB and XtraDB in a similar way, in order
to reduce the probability of mistaking a corrupted encrypted page
for a valid unencrypted one.

This is based on work by Thirunarayanan Balathandayuthapani.

fil_space_verify_crypt_checksum(): Assert that key_version!=0.
Let the callers guarantee that. Now that we have this assertion,
we also know that buf_page_is_zeroes() cannot hold.
Also, remove all diagnostic output and related parameters,
and let the relevant callers emit such messages.
Last but not least, validate the post-encryption checksum
according to the innodb_checksum_algorithm (only accepting
one checksum for the strict variants), and no longer
try to validate the page as if it was unencrypted.

buf_page_is_zeroes(): Move to the compilation unit of the only callers,
and declare static.

xb_fil_cur_read(), buf_page_check_corrupt(): Add a condition before
calling fil_space_verify_crypt_checksum(). This is a non-functional
change.

buf_dblwr_process(): Validate the page only as encrypted or unencrypted,
but not both.
2018-12-17 19:33:44 +02:00
Marko Mäkelä
1a780eefc9 MDEV-17958 Make bug-endian innodb_checksum_algorithm=crc32 optional
In MySQL 5.7, it was noticed that files are not portable between
big-endian and little-endian processor architectures
(such as SPARC and x86), because the original implementation of
innodb_checksum_algorithm=crc32 was not byte order agnostic.

A byte order agnostic implementation of innodb_checksum_algorithm=crc32
was only added to MySQL 5.7, not backported to 5.6. Consequently,
MariaDB Server versions 10.0 and 10.1 only contain the CRC-32C
implementation that works incorrectly on big-endian architectures,
and MariaDB Server 10.2.2 got the byte-order agnostic CRC-32C
implementation from MySQL 5.7.

MySQL 5.7 introduced a "legacy crc32" variant that is functionally
equivalent to the big-endian version of the original crc32 implementation.
Thanks to this variant, old data files can be transferred from big-endian
systems to newer versions.

Introducing new variants of checksum algorithms (without introducing
new names for them, or something on the pages themselves to identify
the algorithm) generally is a bad idea, because each checksum algorithm
is like a lottery ticket. The more algorithms you try, the more likely
it will be for the checksum to match on a corrupted page.

So, essentially MySQL 5.7 weakened innodb_checksum_algorithm=crc32,
and MariaDB 10.2.2 inherited this weakening.

We introduce a build option that together with MDEV-17957
makes innodb_checksum_algorithm=strict_crc32 strict again
by only allowing one variant of the checksum to match.

WITH_INNODB_BUG_ENDIAN_CRC32: A new cmake option for enabling the
bug-compatible "legacy crc32" checksum. This is only enabled on
big-endian systems by default, to facilitate an upgrade from
MariaDB 10.0 or 10.1. Checked by #ifdef INNODB_BUG_ENDIAN_CRC32.

ut_crc32_byte_by_byte: Remove (unused function).

legacy_big_endian_checksum: Remove. This variable seems to have
unnecessarily complicated the logic. When the weakening is enabled,
we must always fall back to the buggy checksum.

buf_page_check_crc32(): A helper function to compute one or
two CRC-32C variants.
2018-12-13 17:57:18 +02:00
Marko Mäkelä
2e5aea4bab Merge 10.1 into 10.2 2018-12-13 15:47:38 +02:00
Marko Mäkelä
621041b676 Merge 10.0 into 10.1
Also, apply the MDEV-17957 changes to encrypted page checksums,
and remove error message output from the checksum function,
because these messages would be useless noise when mariabackup
is retrying reads of corrupted-looking pages, and not that
useful during normal server operation either.

The error messages in fil_space_verify_crypt_checksum()
should be refactored separately.
2018-12-13 13:37:21 +02:00
Thirunarayanan Balathandayuthapani
5f5e73f1fe MDEV-17957 Make Innodb_checksum_algorithm stricter for strict_* values
Problem:

  Innodb_checksum_algorithm checks for all checksum algorithm to
validate the page checksum even though the algorithm is specified as
strict_crc32, strict_innodb, strict_none.

Fix:

   Remove the checks for all checksum algorithm to validate the page
checksum if the algo is specified as strict_* values.
2018-12-13 12:06:14 +02:00
Marko Mäkelä
447e493179 Remove some unnecessary InnoDB #include 2018-11-29 12:53:44 +02:00
Marko Mäkelä
ff88e4bb8a Remove many redundant #include from InnoDB 2018-11-19 11:42:14 +02:00
Eugene Kosov
14be814380 MDEV-17491 micro optimize page_id_t
page_id_t: remove m_fold member

various places: pass page_id_t by value instead of by reference
2018-10-25 18:46:27 +03:00
Sergei Golubchik
8bee7c16c8 compiler warnings (clang 4.0.1 on i386)
extra/mariabackup/fil_cur.cc:361:42: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
extra/mariabackup/fil_cur.cc:376:9: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
sql/handler.cc:6196:45: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
sql/log.cc:1681:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/log.cc:1687:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/wsrep_sst.cc:1388:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
sql/wsrep_sst.cc:232:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
storage/connect/filamdbf.cpp:450:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/filamdbf.cpp:970:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/inihandl.cpp:197:16: warning: address of array 'key->name' will always evaluate to 'true' [-Wpointer-bool-conversion]
storage/innobase/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/buf/buf0buf.cc:5085:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/innobase/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/handler/ha_innodb.cc:18685:7: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
storage/innobase/row/row0mysql.cc:3319:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/row/row0mysql.cc:3327:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/maria/ma_norec.c:35:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_norec.c:42:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_test2.c:1009:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/maria/ma_test2.c:1010:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/mroonga/ha_mroonga.cpp:9189:44: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
storage/mroonga/vendor/groonga/lib/expr.c:4987:22: warning: comparison of constant -1 with expression of type 'grn_operator' is always false [-Wtautological-constant-out-of-range-compare]
storage/xtradb/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/buf/buf0buf.cc:5047:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/xtradb/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3324:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3332:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:120:35: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:96:35: note: expanded from macro 'INFO_TAIL'
2018-09-04 09:19:48 +02:00
Marko Mäkelä
391e60b2db Fix -Wclass-memaccess warnings in InnoDB 2018-08-03 13:06:03 +03:00
Marko Mäkelä
814ae57daf Merge 10.1 into 10.2 2018-08-03 13:02:56 +03:00
Marko Mäkelä
0d3972c6be Merge 10.0 into 10.1 2018-08-03 12:03:10 +03:00
Marko Mäkelä
9dfef6e29b Fix -Wclass-memaccess warnings in InnoDB,XtraDB 2018-08-03 11:53:57 +03:00
Marko Mäkelä
73af075366 Reduce the number of rw_lock_own() calls
Replace pairs of rw_lock_own() calls with
calls to rw_lock_own_flagged().

These calls are only present in debug builds.
2018-07-23 17:49:01 +03:00
Marko Mäkelä
2ca904f0ca MDEV-13103 Deal with page_compressed page corruption
fil_page_decompress(): Replaces fil_decompress_page().
Allow the caller detect errors. Remove
duplicated code. Use the "safe" instead of "fast" variants of
decompression routines.

fil_page_compress(): Replaces fil_compress_page().
The length of the input buffer always was srv_page_size (innodb_page_size).
Remove printouts, and remove the fil_space_t* parameter.

buf_tmp_buffer_t::reserved: Make private; the accessors acquire()
and release() will use atomic memory access.

buf_pool_reserve_tmp_slot(): Make static. Remove the second parameter.
Do not acquire any mutex. Remove the allocation of the buffers.

buf_tmp_reserve_crypt_buf(), buf_tmp_reserve_compression_buf():
Refactored away from buf_pool_reserve_tmp_slot().

buf_page_decrypt_after_read(): Make static, and simplify the logic.
Use the encryption buffer also for decompressing.

buf_page_io_complete(), buf_dblwr_process(): Check more failures.

fil_space_encrypt(): Simplify the debug checks.

fil_space_t::printed_compression_failure: Remove.

fil_get_compression_alg_name(): Remove.

fil_iterate(): Allocate a buffer for compression and decompression
only once, instead of allocating and freeing it for every page
that uses compression, during IMPORT TABLESPACE. Also, validate the
page checksum before decryption, and reduce the scope of some variables.

fil_page_is_index_page(), fil_page_is_lzo_compressed(): Remove (unused).

AbstractCallback::operator()(): Remove the parameter 'offset'.
The check for it in FetchIndexRootPages::operator() was basically
redundant and dead code since the previous refactoring.
2018-06-14 14:23:01 +03:00
Marko Mäkelä
f5eb37129f MDEV-13103 Deal with page_compressed page corruption
fil_page_decompress(): Replaces fil_decompress_page().
Allow the caller detect errors. Remove
duplicated code. Use the "safe" instead of "fast" variants of
decompression routines.

fil_page_compress(): Replaces fil_compress_page().
The length of the input buffer always was srv_page_size (innodb_page_size).
Remove printouts, and remove the fil_space_t* parameter.

buf_tmp_buffer_t::reserved: Make private; the accessors acquire()
and release() will use atomic memory access.

buf_pool_reserve_tmp_slot(): Make static. Remove the second parameter.
Do not acquire any mutex. Remove the allocation of the buffers.

buf_tmp_reserve_crypt_buf(), buf_tmp_reserve_compression_buf():
Refactored away from buf_pool_reserve_tmp_slot().

buf_page_decrypt_after_read(): Make static, and simplify the logic.
Use the encryption buffer also for decompressing.

buf_page_io_complete(), buf_dblwr_process(): Check more failures.

fil_space_encrypt(): Simplify the debug checks.

fil_space_t::printed_compression_failure: Remove.

fil_get_compression_alg_name(): Remove.

fil_iterate(): Allocate a buffer for compression and decompression
only once, instead of allocating and freeing it for every page
that uses compression, during IMPORT TABLESPACE.

fil_node_get_space_id(), fil_page_is_index_page(),
fil_page_is_lzo_compressed(): Remove (unused code).
2018-06-14 13:46:07 +03:00
Marko Mäkelä
72005b7a1c MDEV-13103: Improve 'cannot be decrypted' error message
buf_page_check_corrupt(): Display the file name.
2018-06-13 16:02:40 +03:00
Marko Mäkelä
18934fb583 Merge 10.1 into 10.2 2018-05-29 16:52:12 +03:00
Marko Mäkelä
6aa50bad39 MDEV-16283 ALTER TABLE...DISCARD TABLESPACE still takes long on a large buffer pool
Also fixes MDEV-14727, MDEV-14491
InnoDB: Error: Waited for 5 secs for hash index ref_count (1) to drop to 0
by replacing the flawed wait logic in dict_index_remove_from_cache_low().

On DISCARD TABLESPACE, there is no need to drop the adaptive hash index.
We must drop it on IMPORT TABLESPACE, and eventually on DROP TABLE or
DROP INDEX. As long as the dict_index_t object remains in the cache
and the table remains inaccessible, the adaptive hash index entries
to orphaned pages would not do any harm. They would be dropped when
buffer pool pages are reused for something else.

btr_search_drop_page_hash_when_freed(), buf_LRU_drop_page_hash_batch():
Remove the parameter zip_size, and pass 0 to buf_page_get_gen().

buf_page_get_gen(): Ignore zip_size if mode==BUF_PEEK_IF_IN_POOL.

buf_LRU_drop_page_hash_for_tablespace(): Drop the adaptive hash index
even if the tablespace is inaccessible.

buf_LRU_drop_page_hash_for_tablespace(): New global function, to drop
the adaptive hash index.

buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Remove the parameter drop_ahi.

dict_index_remove_from_cache_low(): Actively drop the adaptive hash index
if entries exist. This should prevent InnoDB hangs on DROP TABLE or
DROP INDEX.

row_import_for_mysql(): Drop any adaptive hash index entries for the table.

row_drop_table_for_mysql(): Drop any adaptive hash index for the table,
except if the table resides in the system tablespace. (DISCARD TABLESPACE
does not apply to the system tablespace, and we do no want to drop the
adaptive hash index for other tables than the one that is being dropped.)

row_truncate_table_for_mysql(): Drop any adaptive hash index entries for
the table, except if the table resides in the system tablespace.
2018-05-29 14:00:20 +03:00
Sachin Agarwal
0da98472b7 Bug #23593654 CRASH IN BUF_BLOCK_FROM_AHI WHEN LARGE PAGES AND AHI ARE ENABLED
Problem:

Fix for Bug #21348684 (#Rb9581) introduced a conditional debug execute
'buf_pool_resize_chunk_null', which causes new chunks memory for 2nd
buffer pool instance is freed.

Buffer pool resize function removes all old chunks entry from
'buf_chunk_map_reg' and add new chunks entry into it. But when
'buf_pool_resize_chunk_null' is set true, 2nd buffer pool
instance's chunk entries are not added into 'buf_chunk_map_reg'.
When purge thread tries to access that buffer chunk, it leads to
debug assertion.

Fix:

Added old chunk entries into 'buf_chunk_map_reg' for 2nd buffer pool
instance when 'buf_pool_resize_chunk_null' debug condition is set to true.

Reviewed by: Jimmy <Jimmy.Yang@oracle.com>
RB: 18664
2018-05-11 19:10:32 +03:00
Sergei Golubchik
9b1824dcd2 Merge branch '10.1' into 10.2 2018-05-10 13:01:42 +02:00
Marko Mäkelä
39d248fa55 MDEV-16092 Crash in encryption.create_or_replace
If the tablespace is dropped or truncated after the
space->is_stopping() check in fil_crypt_get_page_throttle_func(),
we would proceed to request the page, and eventually report a fatal
error.

buf_page_get_gen(): Do not retry reading if mode==BUF_GET_POSSIBLY_FREED.

lock_rec_block_validate(): Be prepared for a NULL return value when
invoking buf_page_get_gen() with mode=BUF_GET_POSSIBLY_FREED.
2018-05-04 22:44:33 +03:00
Thirunarayanan Balathandayuthapani
e27535093d - Follow-up fix to MDEV-15229 2018-03-26 15:48:27 +05:30
Marko Mäkelä
112df06996 MDEV-15529 IMPORT TABLESPACE unnecessarily uses the doublewrite buffer
fil_space_t::atomic_write_supported: Always set this flag for
TEMPORARY TABLESPACE and during IMPORT TABLESPACE. The page
writes during these operations are by definition not crash-safe
because they are not written to the redo log.

fil_space_t::use_doublewrite(): Determine if doublewrite should
be used.

buf_dblwr_update(): Add assertions, and let the caller check whether
doublewrite buffering is desired.

buf_flush_write_block_low(): Disable the doublewrite buffer for
the temporary tablespace and for IMPORT TABLESPACE.

fil_space_set_imported(), fil_node_open_file(), fil_space_create():
Initialize or revise the space->atomic_write_supported flag.

buf_page_io_complete(), buf_flush_write_complete(): Add the parameter
dblwr, to indicate whether doublewrite was used for writes.

buf_dblwr_sync_datafiles(): Remove an unnecessary flush of
persistent tablespaces when flushing temporary tablespaces.
(Move the call to buf_dblwr_flush_buffered_writes().)
2018-03-10 11:54:34 +02:00
Marko Mäkelä
1e4cb8403c buf_page_io_complete(): Minor code cleanup 2018-03-10 11:54:34 +02:00
Sergei Golubchik
49bcc82686 Merge branch '10.1' into 10.2 2018-02-11 13:47:16 +01:00