Commit graph

605 commits

Author SHA1 Message Date
Ian Gilfillan
0c2fc9b3da Update InnoDB urls 2018-12-19 00:07:33 +04:00
Marko Mäkelä
560df47926 Merge 10.1 into 10.2 2018-12-18 16:28:19 +02:00
Thirunarayanan Balathandayuthapani
171271edf8 MDEV-18025 Mariabackup fails to detect corrupted page_compressed=1 tables
Problem:
=======
Mariabackup seems to fail to verify the pages of compressed tables.
The reason is that both fil_space_verify_crypt_checksum() and
buf_page_is_corrupted() will skip the validation for compressed pages.

Fix:
====
Mariabackup should call fil_page_decompress() for compressed and encrypted
compressed page. After that, call buf_page_is_corrupted() to
check the page corruption.
2018-12-18 18:07:17 +05:30
Marko Mäkelä
7d245083a4 Merge 10.1 into 10.2 2018-12-17 20:15:38 +02:00
Marko Mäkelä
8c43f96388 Follow-up to MDEV-12112: corruption in encrypted table may be overlooked
The initial fix only covered a part of Mariabackup.
This fix hardens InnoDB and XtraDB in a similar way, in order
to reduce the probability of mistaking a corrupted encrypted page
for a valid unencrypted one.

This is based on work by Thirunarayanan Balathandayuthapani.

fil_space_verify_crypt_checksum(): Assert that key_version!=0.
Let the callers guarantee that. Now that we have this assertion,
we also know that buf_page_is_zeroes() cannot hold.
Also, remove all diagnostic output and related parameters,
and let the relevant callers emit such messages.
Last but not least, validate the post-encryption checksum
according to the innodb_checksum_algorithm (only accepting
one checksum for the strict variants), and no longer
try to validate the page as if it was unencrypted.

buf_page_is_zeroes(): Move to the compilation unit of the only callers,
and declare static.

xb_fil_cur_read(), buf_page_check_corrupt(): Add a condition before
calling fil_space_verify_crypt_checksum(). This is a non-functional
change.

buf_dblwr_process(): Validate the page only as encrypted or unencrypted,
but not both.
2018-12-17 19:33:44 +02:00
Marko Mäkelä
1a780eefc9 MDEV-17958 Make bug-endian innodb_checksum_algorithm=crc32 optional
In MySQL 5.7, it was noticed that files are not portable between
big-endian and little-endian processor architectures
(such as SPARC and x86), because the original implementation of
innodb_checksum_algorithm=crc32 was not byte order agnostic.

A byte order agnostic implementation of innodb_checksum_algorithm=crc32
was only added to MySQL 5.7, not backported to 5.6. Consequently,
MariaDB Server versions 10.0 and 10.1 only contain the CRC-32C
implementation that works incorrectly on big-endian architectures,
and MariaDB Server 10.2.2 got the byte-order agnostic CRC-32C
implementation from MySQL 5.7.

MySQL 5.7 introduced a "legacy crc32" variant that is functionally
equivalent to the big-endian version of the original crc32 implementation.
Thanks to this variant, old data files can be transferred from big-endian
systems to newer versions.

Introducing new variants of checksum algorithms (without introducing
new names for them, or something on the pages themselves to identify
the algorithm) generally is a bad idea, because each checksum algorithm
is like a lottery ticket. The more algorithms you try, the more likely
it will be for the checksum to match on a corrupted page.

So, essentially MySQL 5.7 weakened innodb_checksum_algorithm=crc32,
and MariaDB 10.2.2 inherited this weakening.

We introduce a build option that together with MDEV-17957
makes innodb_checksum_algorithm=strict_crc32 strict again
by only allowing one variant of the checksum to match.

WITH_INNODB_BUG_ENDIAN_CRC32: A new cmake option for enabling the
bug-compatible "legacy crc32" checksum. This is only enabled on
big-endian systems by default, to facilitate an upgrade from
MariaDB 10.0 or 10.1. Checked by #ifdef INNODB_BUG_ENDIAN_CRC32.

ut_crc32_byte_by_byte: Remove (unused function).

legacy_big_endian_checksum: Remove. This variable seems to have
unnecessarily complicated the logic. When the weakening is enabled,
we must always fall back to the buggy checksum.

buf_page_check_crc32(): A helper function to compute one or
two CRC-32C variants.
2018-12-13 17:57:18 +02:00
Marko Mäkelä
2e5aea4bab Merge 10.1 into 10.2 2018-12-13 15:47:38 +02:00
Marko Mäkelä
621041b676 Merge 10.0 into 10.1
Also, apply the MDEV-17957 changes to encrypted page checksums,
and remove error message output from the checksum function,
because these messages would be useless noise when mariabackup
is retrying reads of corrupted-looking pages, and not that
useful during normal server operation either.

The error messages in fil_space_verify_crypt_checksum()
should be refactored separately.
2018-12-13 13:37:21 +02:00
Thirunarayanan Balathandayuthapani
5f5e73f1fe MDEV-17957 Make Innodb_checksum_algorithm stricter for strict_* values
Problem:

  Innodb_checksum_algorithm checks for all checksum algorithm to
validate the page checksum even though the algorithm is specified as
strict_crc32, strict_innodb, strict_none.

Fix:

   Remove the checks for all checksum algorithm to validate the page
checksum if the algo is specified as strict_* values.
2018-12-13 12:06:14 +02:00
Marko Mäkelä
447e493179 Remove some unnecessary InnoDB #include 2018-11-29 12:53:44 +02:00
Marko Mäkelä
ff88e4bb8a Remove many redundant #include from InnoDB 2018-11-19 11:42:14 +02:00
Marko Mäkelä
b3009059d0 Minor cleanup 2018-10-29 12:05:39 +02:00
Eugene Kosov
14be814380 MDEV-17491 micro optimize page_id_t
page_id_t: remove m_fold member

various places: pass page_id_t by value instead of by reference
2018-10-25 18:46:27 +03:00
Marko Mäkelä
055a3334ad MDEV-13564 Mariabackup does not work with TRUNCATE
Implement undo tablespace truncation via normal redo logging.

Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.

Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.

In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.

ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.

rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.

== TRUNCATE TABLE ==

WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.

In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.

A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.

ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.

ha_innobase::delete(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.

ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.

create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.

row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.

row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().

dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.

row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.

The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.

We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.

== Undo tablespace truncation ==

MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similar to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.

We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.

recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.

namespace undo: Remove some unnecessary declarations.

fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.

fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.

buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.

fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.

fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.

os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.

fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].

recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.

trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.

trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.

recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.

recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.

buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).

trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
2018-09-07 22:10:02 +03:00
Sergei Golubchik
8bee7c16c8 compiler warnings (clang 4.0.1 on i386)
extra/mariabackup/fil_cur.cc:361:42: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
extra/mariabackup/fil_cur.cc:376:9: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
sql/handler.cc:6196:45: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
sql/log.cc:1681:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/log.cc:1687:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/wsrep_sst.cc:1388:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
sql/wsrep_sst.cc:232:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
storage/connect/filamdbf.cpp:450:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/filamdbf.cpp:970:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/inihandl.cpp:197:16: warning: address of array 'key->name' will always evaluate to 'true' [-Wpointer-bool-conversion]
storage/innobase/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/buf/buf0buf.cc:5085:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/innobase/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/handler/ha_innodb.cc:18685:7: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
storage/innobase/row/row0mysql.cc:3319:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/row/row0mysql.cc:3327:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/maria/ma_norec.c:35:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_norec.c:42:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_test2.c:1009:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/maria/ma_test2.c:1010:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/mroonga/ha_mroonga.cpp:9189:44: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
storage/mroonga/vendor/groonga/lib/expr.c:4987:22: warning: comparison of constant -1 with expression of type 'grn_operator' is always false [-Wtautological-constant-out-of-range-compare]
storage/xtradb/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/buf/buf0buf.cc:5047:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/xtradb/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3324:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3332:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:120:35: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:96:35: note: expanded from macro 'INFO_TAIL'
2018-09-04 09:19:48 +02:00
Marko Mäkelä
9258097fa3 Merge 10.1 into 10.2 2018-08-21 15:20:34 +03:00
Marko Mäkelä
45dbd47026 MDEV-17003 service_manager_extend_timeout() being called too often
buf_dump(): Only generate the output when shutdown is in progress.

log_write_up_to(): Only generate the output before actually writing
to the redo log files.

srv_purge_should_exit(): Rate-limit the output, and instead of
displaying the work done, indicate the work that remains to be done
until the completion of the slow shutdown.
2018-08-21 12:33:40 +03:00
Marko Mäkelä
391e60b2db Fix -Wclass-memaccess warnings in InnoDB 2018-08-03 13:06:03 +03:00
Marko Mäkelä
814ae57daf Merge 10.1 into 10.2 2018-08-03 13:02:56 +03:00
Marko Mäkelä
0d3972c6be Merge 10.0 into 10.1 2018-08-03 12:03:10 +03:00
Marko Mäkelä
9dfef6e29b Fix -Wclass-memaccess warnings in InnoDB,XtraDB 2018-08-03 11:53:57 +03:00
Marko Mäkelä
ef3070e997 Merge 10.1 into 10.2 2018-08-02 08:19:57 +03:00
Oleksandr Byelkin
865e807125 Merge branch '10.0' into 10.1 2018-07-31 11:58:29 +02:00
Marko Mäkelä
d17e9a02c4 Fix InnoDB/XtraDB warnings by GCC 8.2.0 2018-07-30 14:05:24 +03:00
Marko Mäkelä
340963351c Use a more precise argument for memset() 2018-07-30 10:39:42 +03:00
Marko Mäkelä
0f90728bc0 MDEV-16809 Allow full redo logging for ALTER TABLE
Introduce the configuration option innodb_log_optimize_ddl
for controlling whether native index creation or table-rebuild
in InnoDB should keep optimizing the redo log
(and writing MLOG_INDEX_LOAD records to ensure that
concurrent backup would fail).

By default, we have innodb_log_optimize_ddl=ON, that is,
the default behaviour that was introduced in MariaDB 10.2.2
(with the merge of InnoDB from MySQL 5.7) will be unchanged.

BtrBulk::m_trx: Replaces m_trx_id. We must be able to check for
KILL QUERY even if !m_flush_observer (innodb_log_optimize_ddl=OFF).

page_cur_insert_rec_write_log(): Declare globally, so that this
can be called from PageBulk::insert().

row_merge_insert_index_tuples(): Remove the unused parameter trx_id.

row_merge_build_indexes(): Enable or disable redo logging based on
the innodb_log_optimize_ddl parameter.

PageBulk::init(), PageBulk::insert(), PageBulk::finish(): Write
redo log records if needed. For ROW_FORMAT=COMPRESSED, redo log
will be written in PageBulk::compress() unless we called
m_mtr.set_log_mode(MTR_LOG_NO_REDO).
2018-07-26 08:44:42 +03:00
Marko Mäkelä
73af075366 Reduce the number of rw_lock_own() calls
Replace pairs of rw_lock_own() calls with
calls to rw_lock_own_flagged().

These calls are only present in debug builds.
2018-07-23 17:49:01 +03:00
Marko Mäkelä
e9f1d8da57 Fix warnings about possibly uninitialized variables 2018-07-05 14:14:31 +03:00
Marko Mäkelä
1b4ac075bf Merge 10.1 into 10.2 2018-06-26 15:39:23 +03:00
Marko Mäkelä
c4eb4bcef6 MDEV-16515 InnoDB: Failing assertion: ++retries < 10000 in file
dict0dict.cc

buf_LRU_drop_page_hash_for_tablespace(): Return whether any adaptive
hash index entries existed. If yes, the caller should keep retrying to
drop the adaptive hash index.

row_import_for_mysql(), row_truncate_table_for_mysql(),
row_drop_table_for_mysql(): Ensure that the adaptive hash index was
entirely dropped for the table.
2018-06-26 11:34:51 +03:00
Marko Mäkelä
2ca904f0ca MDEV-13103 Deal with page_compressed page corruption
fil_page_decompress(): Replaces fil_decompress_page().
Allow the caller detect errors. Remove
duplicated code. Use the "safe" instead of "fast" variants of
decompression routines.

fil_page_compress(): Replaces fil_compress_page().
The length of the input buffer always was srv_page_size (innodb_page_size).
Remove printouts, and remove the fil_space_t* parameter.

buf_tmp_buffer_t::reserved: Make private; the accessors acquire()
and release() will use atomic memory access.

buf_pool_reserve_tmp_slot(): Make static. Remove the second parameter.
Do not acquire any mutex. Remove the allocation of the buffers.

buf_tmp_reserve_crypt_buf(), buf_tmp_reserve_compression_buf():
Refactored away from buf_pool_reserve_tmp_slot().

buf_page_decrypt_after_read(): Make static, and simplify the logic.
Use the encryption buffer also for decompressing.

buf_page_io_complete(), buf_dblwr_process(): Check more failures.

fil_space_encrypt(): Simplify the debug checks.

fil_space_t::printed_compression_failure: Remove.

fil_get_compression_alg_name(): Remove.

fil_iterate(): Allocate a buffer for compression and decompression
only once, instead of allocating and freeing it for every page
that uses compression, during IMPORT TABLESPACE. Also, validate the
page checksum before decryption, and reduce the scope of some variables.

fil_page_is_index_page(), fil_page_is_lzo_compressed(): Remove (unused).

AbstractCallback::operator()(): Remove the parameter 'offset'.
The check for it in FetchIndexRootPages::operator() was basically
redundant and dead code since the previous refactoring.
2018-06-14 14:23:01 +03:00
Marko Mäkelä
f5eb37129f MDEV-13103 Deal with page_compressed page corruption
fil_page_decompress(): Replaces fil_decompress_page().
Allow the caller detect errors. Remove
duplicated code. Use the "safe" instead of "fast" variants of
decompression routines.

fil_page_compress(): Replaces fil_compress_page().
The length of the input buffer always was srv_page_size (innodb_page_size).
Remove printouts, and remove the fil_space_t* parameter.

buf_tmp_buffer_t::reserved: Make private; the accessors acquire()
and release() will use atomic memory access.

buf_pool_reserve_tmp_slot(): Make static. Remove the second parameter.
Do not acquire any mutex. Remove the allocation of the buffers.

buf_tmp_reserve_crypt_buf(), buf_tmp_reserve_compression_buf():
Refactored away from buf_pool_reserve_tmp_slot().

buf_page_decrypt_after_read(): Make static, and simplify the logic.
Use the encryption buffer also for decompressing.

buf_page_io_complete(), buf_dblwr_process(): Check more failures.

fil_space_encrypt(): Simplify the debug checks.

fil_space_t::printed_compression_failure: Remove.

fil_get_compression_alg_name(): Remove.

fil_iterate(): Allocate a buffer for compression and decompression
only once, instead of allocating and freeing it for every page
that uses compression, during IMPORT TABLESPACE.

fil_node_get_space_id(), fil_page_is_index_page(),
fil_page_is_lzo_compressed(): Remove (unused code).
2018-06-14 13:46:07 +03:00
Marko Mäkelä
72005b7a1c MDEV-13103: Improve 'cannot be decrypted' error message
buf_page_check_corrupt(): Display the file name.
2018-06-13 16:02:40 +03:00
Marko Mäkelä
18934fb583 Merge 10.1 into 10.2 2018-05-29 16:52:12 +03:00
Marko Mäkelä
6aa50bad39 MDEV-16283 ALTER TABLE...DISCARD TABLESPACE still takes long on a large buffer pool
Also fixes MDEV-14727, MDEV-14491
InnoDB: Error: Waited for 5 secs for hash index ref_count (1) to drop to 0
by replacing the flawed wait logic in dict_index_remove_from_cache_low().

On DISCARD TABLESPACE, there is no need to drop the adaptive hash index.
We must drop it on IMPORT TABLESPACE, and eventually on DROP TABLE or
DROP INDEX. As long as the dict_index_t object remains in the cache
and the table remains inaccessible, the adaptive hash index entries
to orphaned pages would not do any harm. They would be dropped when
buffer pool pages are reused for something else.

btr_search_drop_page_hash_when_freed(), buf_LRU_drop_page_hash_batch():
Remove the parameter zip_size, and pass 0 to buf_page_get_gen().

buf_page_get_gen(): Ignore zip_size if mode==BUF_PEEK_IF_IN_POOL.

buf_LRU_drop_page_hash_for_tablespace(): Drop the adaptive hash index
even if the tablespace is inaccessible.

buf_LRU_drop_page_hash_for_tablespace(): New global function, to drop
the adaptive hash index.

buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Remove the parameter drop_ahi.

dict_index_remove_from_cache_low(): Actively drop the adaptive hash index
if entries exist. This should prevent InnoDB hangs on DROP TABLE or
DROP INDEX.

row_import_for_mysql(): Drop any adaptive hash index entries for the table.

row_drop_table_for_mysql(): Drop any adaptive hash index for the table,
except if the table resides in the system tablespace. (DISCARD TABLESPACE
does not apply to the system tablespace, and we do no want to drop the
adaptive hash index for other tables than the one that is being dropped.)

row_truncate_table_for_mysql(): Drop any adaptive hash index entries for
the table, except if the table resides in the system tablespace.
2018-05-29 14:00:20 +03:00
Sachin Agarwal
0da98472b7 Bug #23593654 CRASH IN BUF_BLOCK_FROM_AHI WHEN LARGE PAGES AND AHI ARE ENABLED
Problem:

Fix for Bug #21348684 (#Rb9581) introduced a conditional debug execute
'buf_pool_resize_chunk_null', which causes new chunks memory for 2nd
buffer pool instance is freed.

Buffer pool resize function removes all old chunks entry from
'buf_chunk_map_reg' and add new chunks entry into it. But when
'buf_pool_resize_chunk_null' is set true, 2nd buffer pool
instance's chunk entries are not added into 'buf_chunk_map_reg'.
When purge thread tries to access that buffer chunk, it leads to
debug assertion.

Fix:

Added old chunk entries into 'buf_chunk_map_reg' for 2nd buffer pool
instance when 'buf_pool_resize_chunk_null' debug condition is set to true.

Reviewed by: Jimmy <Jimmy.Yang@oracle.com>
RB: 18664
2018-05-11 19:10:32 +03:00
Sergei Golubchik
9b1824dcd2 Merge branch '10.1' into 10.2 2018-05-10 13:01:42 +02:00
Marko Mäkelä
39d248fa55 MDEV-16092 Crash in encryption.create_or_replace
If the tablespace is dropped or truncated after the
space->is_stopping() check in fil_crypt_get_page_throttle_func(),
we would proceed to request the page, and eventually report a fatal
error.

buf_page_get_gen(): Do not retry reading if mode==BUF_GET_POSSIBLY_FREED.

lock_rec_block_validate(): Be prepared for a NULL return value when
invoking buf_page_get_gen() with mode=BUF_GET_POSSIBLY_FREED.
2018-05-04 22:44:33 +03:00
Vicențiu Ciorbaru
45e6d0aebf Merge branch '10.1' into 10.2 2018-04-10 17:43:18 +03:00
Marko Mäkelä
3498a656c9 MDEV-14705: Follow-up fixes
buf_flush_remove(): Disable the output for now, because we
certainly do not want this after every page flush on shutdown.
It must be rate-limited somehow. There already is a timeout
extension for waiting the page cleaner to exit in
logs_empty_and_mark_files_at_shutdown().

log_write_up_to(): Use correct format.

srv_purge_should_exit(): Move the timeout extension to the
appropriate place, from one of the callers.
2018-04-06 12:29:25 +03:00
Daniel Black
1479273cdb MDEV-14705: slow innodb startup/shutdown can exceed systemd timeout
Use systemd EXTEND_TIMEOUT_USEC to advise systemd of progress

Move towards progress measures rather than pure time based measures.

Progress reporting at numberious shutdown/startup locations incuding:
* For innodb_fast_shutdown=0 trx_roll_must_shutdown() for rolling back incomplete transactions.
* For merging the change buffer (in srv_shutdown(bool ibuf_merge))
* For purging history, srv_do_purge

Thanks Marko for feedback and suggestions.
2018-04-06 09:58:14 +03:00
Marko Mäkelä
6cccef21a6 MDEV-15720 ib_buffer_pool unnecessarily includes the temporary tablespace
The purpose of the InnoDB buffer pool dump is to allow InnoDB to be
restarted with the same persistent data pages in the buffer pool.

The InnoDB temporary tablespace that was introduced in MariaDB 10.2.2
is always reinitialized on restart. Therefore, it does not make sense
to attempt to dump or restore any pages of the temporary tablespace.
2018-03-29 13:22:16 +03:00
Thirunarayanan Balathandayuthapani
e27535093d - Follow-up fix to MDEV-15229 2018-03-26 15:48:27 +05:30
Vicențiu Ciorbaru
82aeb6b596 Merge branch '10.1' into 10.2 2018-03-21 10:36:49 +02:00
Vicențiu Ciorbaru
24b353162f Merge branch '10.0-galera' into 10.1 2018-03-19 15:21:01 +02:00
Daniel Black
8b54c31486 MDEV-8743: where O_CLOEXEC is available, use for innodb buf_dump
As this is the only moderately critical fopened for writing file,
create an alternate path to use open and fdopen for non-glibc platforms
that support O_CLOEXEC (BSDs).

Tested on Linux (by modifing the GLIBC defination) to take this
alternate path:

$ cd /proc/23874
$ more fdinfo/71
pos:    0
flags:  02100001
mnt_id: 24
$ ls -la fd/71
l-wx------. 1 dan dan 64 Mar 14 13:30 fd/71 -> /dev/shm/var_auto_i7rl/mysqld.1/data/ib_buffer_pool.incomplete
2018-03-15 12:07:43 +02:00
Thirunarayanan Balathandayuthapani
76ae6e725d MDEV-15384 buf_flush_LRU_list_batch() always reports n->flushed=0, n->evicted=0
- buf_flush_LRU_list_batch() initializes the count to zero and updates them
correctly.
2018-03-13 15:25:38 +05:30
Marko Mäkelä
71f9cc1221 MDEV-15554 InnoDB page_cleaner shutdown sometimes hangs
buf_flush_page_cleaner_coordinator(): Signal the worker threads
to exit while waiting for them to exit. Apparently, signals are
sometimes lost, causing shutdown to occasionally hang when
multiple page cleaners (and buffer pool instances) are used,
that is, when innodb_buffer_pool_size is at least 1 GiB.

buf_flush_page_cleaner_close(): Merge with the only caller.
2018-03-13 09:41:42 +02:00
Marko Mäkelä
112df06996 MDEV-15529 IMPORT TABLESPACE unnecessarily uses the doublewrite buffer
fil_space_t::atomic_write_supported: Always set this flag for
TEMPORARY TABLESPACE and during IMPORT TABLESPACE. The page
writes during these operations are by definition not crash-safe
because they are not written to the redo log.

fil_space_t::use_doublewrite(): Determine if doublewrite should
be used.

buf_dblwr_update(): Add assertions, and let the caller check whether
doublewrite buffering is desired.

buf_flush_write_block_low(): Disable the doublewrite buffer for
the temporary tablespace and for IMPORT TABLESPACE.

fil_space_set_imported(), fil_node_open_file(), fil_space_create():
Initialize or revise the space->atomic_write_supported flag.

buf_page_io_complete(), buf_flush_write_complete(): Add the parameter
dblwr, to indicate whether doublewrite was used for writes.

buf_dblwr_sync_datafiles(): Remove an unnecessary flush of
persistent tablespaces when flushing temporary tablespaces.
(Move the call to buf_dblwr_flush_buffered_writes().)
2018-03-10 11:54:34 +02:00
Marko Mäkelä
54765aaa4d MDEV-15524 Do not disable page checksums for temporary tables
buf_flush_init_for_writing(): Remove the parameter skip_checksum.
2018-03-10 11:54:34 +02:00