Commit graph

176 commits

Author SHA1 Message Date
Marko Mäkelä
e7029e864f Merge 10.3 into 10.4 2019-04-17 15:59:30 +03:00
Marko Mäkelä
250799f961 Merge 10.2 into 10.3 2019-04-17 15:26:17 +03:00
Marko Mäkelä
169c00994b MDEV-12699 Improve crash recovery of corrupted data pages
InnoDB crash recovery used to read every data page for which
redo log exists. This is unnecessary for those pages that are
initialized by the redo log. If a newly created page is corrupted,
recovery could unnecessarily fail. It would suffice to reinitialize
the page based on the redo log records.

To add insult to injury, InnoDB crash recovery could hang if it
encountered a corrupted page. We will fix also that problem.
InnoDB would normally refuse to start up if it encounters a
corrupted page on recovery, but that can be overridden by
setting innodb_force_recovery=1.

Data pages are completely initialized by the records
MLOG_INIT_FILE_PAGE2 and MLOG_ZIP_PAGE_COMPRESS.
MariaDB 10.4 additionally recognizes MLOG_INIT_FREE_PAGE,
which notifies that a page has been freed and its contents
can be discarded (filled with zeroes).

The record MLOG_INDEX_LOAD notifies that redo logging has
been re-enabled after being disabled. We can avoid loading
the page if all buffered redo log records predate the
MLOG_INDEX_LOAD record.

For the internal tables of FULLTEXT INDEX, no MLOG_INDEX_LOAD
records were written before commit aa3f7a107c.
Hence, we will skip these optimizations for tables whose
name starts with FTS_.

This is joint work with Thirunarayanan Balathandayuthapani.

fil_space_t::enable_lsn, file_name_t::enable_lsn: The LSN of the
latest recovered MLOG_INDEX_LOAD record for a tablespace.

mlog_init: Page initialization operations discovered during
redo log scanning. FIXME: This really belongs in recv_sys->addr_hash,
and should be removed in MDEV-19176.

recv_addr_state: Add the new state RECV_WILL_NOT_READ to
indicate that according to mlog_init, the page will be
initialized based on redo log record contents.

recv_add_to_hash_table(): Set the RECV_WILL_NOT_READ state
if appropriate. For now, we do not treat MLOG_ZIP_PAGE_COMPRESS
as page initialization. This works around bugs in the crash
recovery of ROW_FORMAT=COMPRESSED tables.

recv_mark_log_index_load(): Process a MLOG_INDEX_LOAD record
by resetting the state to RECV_NOT_PROCESSED and by updating
the fil_name_t::enable_lsn.

recv_init_crash_recovery_spaces(): Copy fil_name_t::enable_lsn
to fil_space_t::enable_lsn.

recv_recover_page(): Add the parameter init_lsn, to ignore
any log records that precede the page initialization.
Add DBUG output about skipped operations.

buf_page_create(): Initialize FIL_PAGE_LSN, so that
recv_recover_page() will not wrongly skip applying
the page-initialization record due to the field containing
some newer LSN as a leftover from a different page.
Do not invoke ibuf_merge_or_delete_for_page() during
crash recovery.

recv_apply_hashed_log_recs(): Remove some unnecessary lookups.
Note if a corrupted page was found during recovery.
After invoking buf_page_create(), do invoke
ibuf_merge_or_delete_for_page() via mlog_init.ibuf_merge()
in the last recovery batch.

ibuf_merge_or_delete_for_page(): Relax a debug assertion.

innobase_start_or_create_for_mysql(): Abort startup if
a corrupted page was found during recovery. Corrupted pages
will not be flagged if innodb_force_recovery is set.
However, the recv_sys->found_corrupt_fs flag can be set
regardless of innodb_force_recovery if file names are found
to be incorrect (for example, multiple files with the same
tablespace ID).
2019-04-17 13:58:41 +03:00
Marko Mäkelä
304ae942f7 MDEV-15528 preparation: Write MLOG_INIT_FREE_PAGE
When freeing a file page, write a MLOG_INIT_FREE_PAGE record.
This allows us to avoid page flush and instead punch holes later,
in the page flushing. To implement that, we may want to make
buf_page_t::file_page_was_freed available in non-debug builds.

Crash recovery can choose to ignore or apply the record.

In BtrBulk::finish() we must not write this record, because
redo logging is being disabled for the page.
2019-04-08 22:00:17 +03:00
Marko Mäkelä
edd1a53a55 Merge 10.3 into 10.4 2019-04-08 22:00:07 +03:00
Marko Mäkelä
ee7a4f4462 MDEV-12266: Pass fil_space_t* to fseg_free_page()
fseg_free_page_func(): Avoid an unnecessary tablespace ID lookup.
The callers should pass the tablespace that they already know.
2019-04-08 21:38:43 +03:00
Marko Mäkelä
6b6fa3cdb1 MDEV-18644: Support full_crc32 for page_compressed
This is a follow-up task to MDEV-12026, which introduced
innodb_checksum_algorithm=full_crc32 and a simpler page format.
MDEV-12026 did not enable full_crc32 for page_compressed tables,
which we will be doing now.

This is joint work with Thirunarayanan Balathandayuthapani.

For innodb_checksum_algorithm=full_crc32 we change the
page_compressed format as follows:

FIL_PAGE_TYPE: The most significant bit will be set to indicate
page_compressed format. The least significant bits will contain
the compressed page size, rounded up to a multiple of 256 bytes.

The checksum will be stored in the last 4 bytes of the page
(whether it is the full page or a page_compressed page whose
size is determined by FIL_PAGE_TYPE), covering all preceding
bytes of the page. If encryption is used, then the page will
be encrypted between compression and computing the checksum.
For page_compressed, FIL_PAGE_LSN will not be repeated at
the end of the page.

FSP_SPACE_FLAGS (already implemented as part of MDEV-12026):
We will store the innodb_compression_algorithm that may be used
to compress pages. Previously, the choice of algorithm was written
to each compressed data page separately, and one would be unable
to know in advance which compression algorithm(s) are used.

fil_space_t::full_crc32_page_compressed_len(): Determine if the
page_compressed algorithm of the tablespace needs to know the
exact length of the compressed data. If yes, we will reserve and
write an extra byte for this right before the checksum.

buf_page_is_compressed(): Determine if a page uses page_compressed
(in any innodb_checksum_algorithm).

fil_page_decompress(): Pass also fil_space_t::flags so that the
format can be determined.

buf_page_is_zeroes(): Check if a page is full of zero bytes.

buf_page_full_crc32_is_corrupted(): Renamed from
buf_encrypted_full_crc32_page_is_corrupted(). For full_crc32,
we always simply validate the checksum to the page contents,
while the physical page size is explicitly specified by an
unencrypted part of the page header.

buf_page_full_crc32_size(): Determine the size of a full_crc32 page.

buf_dblwr_check_page_lsn(): Make this a debug-only function, because
it involves potentially costly lookups of fil_space_t.

create_table_info_t::check_table_options(),
ha_innobase::check_if_supported_inplace_alter(): Do allow the creation
of SPATIAL INDEX with full_crc32 also when page_compressed is used.

commit_cache_norebuild(): Preserve the compression algorithm when
updating the page_compression_level.

dict_tf_to_fsp_flags(): Set the flags for page compression algorithm.
FIXME: Maybe there should be a table option page_compression_algorithm
and a session variable to back it?
2019-03-18 14:08:43 +02:00
Marko Mäkelä
ea25bdc135 Do not write MLOG_IBUF_BITMAP_INIT
Use ibuf_bitmap_page_init() only during recovery.

fsp_fill_free_list(): Initialize the FIL_PAGE_TYPE using MLOG_2BYTES.
The page contents will already have been zeroed out by
MLOG_INIT_FILE_PAGE2.

ibuf_bitmap_init_apply(): Replaces ibuf_parse_bitmap_init().
2019-02-08 19:10:33 +02:00
Marko Mäkelä
0a1c3477bf MDEV-18493 Remove page_size_t
MySQL 5.7 introduced the class page_size_t and increased the size of
buffer pool page descriptors by introducing this object to them.

Maybe the intention of this exercise was to prepare for a future
where the buffer pool could accommodate multiple page sizes.
But that future never arrived, not even in MySQL 8.0. It is much
easier to manage a pool of a single page size, and typically all
storage devices of an InnoDB instance benefit from using the same
page size.

Let us remove page_size_t from MariaDB Server. This will make it
easier to remove support for ROW_FORMAT=COMPRESSED (or make it a
compile-time option) in the future, just by removing various
occurrences of zip_size.
2019-02-07 12:21:35 +02:00
Sergey Vojtovich
c6a00544ff MDEV-17441 - InnoDB transition to C++11 atomics
ibuf_t::n_merged_ops and ibuf_t::n_discarded_ops transition to
Atomic_counter.
2018-12-29 14:09:31 +04:00
Sergey Vojtovich
dbd40edfe6 MDEV-17441 - InnoDB transition to C++11 atomics
ibuf_t::n_merges transition to Atomic_counter.
2018-12-27 22:46:38 +04:00
Marko Mäkelä
b5763ecd01 Merge 10.3 into 10.4 2018-12-18 11:33:53 +02:00
Marko Mäkelä
45531949ae Merge 10.2 into 10.3 2018-12-18 09:15:41 +02:00
Marko Mäkelä
7d245083a4 Merge 10.1 into 10.2 2018-12-17 20:15:38 +02:00
Marko Mäkelä
7dcbc33db5 Merge 10.3 into 10.4 2018-11-26 17:20:07 +02:00
Marko Mäkelä
06e5f28f9f MDEV-12266: Remove a level of pointer indirection
Replace table->space->id with table->space_id.
2018-11-22 17:10:26 +02:00
Marko Mäkelä
dde2ca4aa1 Merge 10.3 into 10.4 2018-11-19 20:22:33 +02:00
Marko Mäkelä
fd58bb71e2 Merge 10.2 into 10.3 2018-11-19 18:45:53 +02:00
Marko Mäkelä
ff88e4bb8a Remove many redundant #include from InnoDB 2018-11-19 11:42:14 +02:00
Marko Mäkelä
074c684099 Merge 10.3 into 10.4 2018-11-06 16:24:16 +02:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Eugene Kosov
14be814380 MDEV-17491 micro optimize page_id_t
page_id_t: remove m_fold member

various places: pass page_id_t by value instead of by reference
2018-10-25 18:46:27 +03:00
Marko Mäkelä
09af00cbde MDEV-13564: Remove old crash-upgrade logic in 10.4
Stop supporting the additional *trunc.log files that were
introduced via MySQL 5.7 to MariaDB Server 10.2 and 10.3.

DB_TABLESPACE_TRUNCATED: Remove.

purge_sys.truncate: A new structure to track undo tablespace
file truncation.

srv_start(): Remove the call to buf_pool_invalidate(). It is
no longer necessary, given that we no longer access things in
ways that violate the ARIES protocol. This call was originally
added for innodb_file_format, and it may later have been necessary
for the proper function of the MySQL 5.7 TRUNCATE recovery, which
we are now removing.

trx_purge_cleanse_purge_queue(): Take the undo tablespace as a
parameter.

trx_purge_truncate_history(): Rewrite everything mostly in a
single function, replacing references to undo::Truncate.

recv_apply_hashed_log_recs(): If any redo log is to be applied,
and if the log_sys.log.subformat indicates that separately
logged truncate may have been used, refuse to proceed except if
innodb_force_recovery is set. We will still refuse crash-upgrade
if TRUNCATE TABLE was logged. Undo tablespace truncation would
only be logged in undo*trunc.log files, which we are no longer
checking for.
2018-09-11 21:32:15 +03:00
Marko Mäkelä
5a1868b58d MDEV-13564 Mariabackup does not work with TRUNCATE
This is a merge from 10.2, but the 10.2 version of this will not
be pushed into 10.2 yet, because the 10.2 version would include
backports of MDEV-14717 and MDEV-14585, which would introduce
a crash recovery regression: Tables could be lost on
table-rebuilding DDL operations, such as ALTER TABLE,
OPTIMIZE TABLE or this new backup-friendly TRUNCATE TABLE.
The test innodb.truncate_crash occasionally loses the table due to
the following bug:

MDEV-17158 log_write_up_to() sometimes fails
2018-09-07 22:15:06 +03:00
Marko Mäkelä
055a3334ad MDEV-13564 Mariabackup does not work with TRUNCATE
Implement undo tablespace truncation via normal redo logging.

Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.

Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.

In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.

ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.

rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.

== TRUNCATE TABLE ==

WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.

In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.

A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.

ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.

ha_innobase::delete(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.

ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.

create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.

row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.

row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().

dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.

row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.

The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.

We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.

== Undo tablespace truncation ==

MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similar to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.

We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.

recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.

namespace undo: Remove some unnecessary declarations.

fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.

fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.

buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.

fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.

fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.

os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.

fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].

recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.

trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.

trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.

recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.

recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.

buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).

trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
0121d5a790 Merge 10.2 into 10.3 2018-06-18 15:43:59 +03:00
Marko Mäkelä
2ca904f0ca MDEV-13103 Deal with page_compressed page corruption
fil_page_decompress(): Replaces fil_decompress_page().
Allow the caller detect errors. Remove
duplicated code. Use the "safe" instead of "fast" variants of
decompression routines.

fil_page_compress(): Replaces fil_compress_page().
The length of the input buffer always was srv_page_size (innodb_page_size).
Remove printouts, and remove the fil_space_t* parameter.

buf_tmp_buffer_t::reserved: Make private; the accessors acquire()
and release() will use atomic memory access.

buf_pool_reserve_tmp_slot(): Make static. Remove the second parameter.
Do not acquire any mutex. Remove the allocation of the buffers.

buf_tmp_reserve_crypt_buf(), buf_tmp_reserve_compression_buf():
Refactored away from buf_pool_reserve_tmp_slot().

buf_page_decrypt_after_read(): Make static, and simplify the logic.
Use the encryption buffer also for decompressing.

buf_page_io_complete(), buf_dblwr_process(): Check more failures.

fil_space_encrypt(): Simplify the debug checks.

fil_space_t::printed_compression_failure: Remove.

fil_get_compression_alg_name(): Remove.

fil_iterate(): Allocate a buffer for compression and decompression
only once, instead of allocating and freeing it for every page
that uses compression, during IMPORT TABLESPACE. Also, validate the
page checksum before decryption, and reduce the scope of some variables.

fil_page_is_index_page(), fil_page_is_lzo_compressed(): Remove (unused).

AbstractCallback::operator()(): Remove the parameter 'offset'.
The check for it in FetchIndexRootPages::operator() was basically
redundant and dead code since the previous refactoring.
2018-06-14 14:23:01 +03:00
Marko Mäkelä
f5eb37129f MDEV-13103 Deal with page_compressed page corruption
fil_page_decompress(): Replaces fil_decompress_page().
Allow the caller detect errors. Remove
duplicated code. Use the "safe" instead of "fast" variants of
decompression routines.

fil_page_compress(): Replaces fil_compress_page().
The length of the input buffer always was srv_page_size (innodb_page_size).
Remove printouts, and remove the fil_space_t* parameter.

buf_tmp_buffer_t::reserved: Make private; the accessors acquire()
and release() will use atomic memory access.

buf_pool_reserve_tmp_slot(): Make static. Remove the second parameter.
Do not acquire any mutex. Remove the allocation of the buffers.

buf_tmp_reserve_crypt_buf(), buf_tmp_reserve_compression_buf():
Refactored away from buf_pool_reserve_tmp_slot().

buf_page_decrypt_after_read(): Make static, and simplify the logic.
Use the encryption buffer also for decompressing.

buf_page_io_complete(), buf_dblwr_process(): Check more failures.

fil_space_encrypt(): Simplify the debug checks.

fil_space_t::printed_compression_failure: Remove.

fil_get_compression_alg_name(): Remove.

fil_iterate(): Allocate a buffer for compression and decompression
only once, instead of allocating and freeing it for every page
that uses compression, during IMPORT TABLESPACE.

fil_node_get_space_id(), fil_page_is_index_page(),
fil_page_is_lzo_compressed(): Remove (unused code).
2018-06-14 13:46:07 +03:00
Marko Mäkelä
b52bb6eb82 MDEV-16469 SET GLOBAL innodb_change_buffering has no effect
When type of the settable global variable innodb_change_buffering was
changed from string to ENUM in MDEV-12218, the "shadow" variable ibuf_use
stopped being updated as a result of SET GLOBAL innodb_change_buffering.
Only on InnoDB startup, the parameter innodb_change_buffering would
take effect.

ibuf_use: Remove, and use the global variable innodb_change_buffering.
2018-06-12 12:49:42 +03:00
Marko Mäkelä
ba43914ec4 Replace dict_table_is_temporary(table) with table->is_temporary() 2018-05-12 22:12:12 +03:00
Marko Mäkelä
e9f2609747 Merge 10.2 into 10.3 2018-05-09 16:52:45 +03:00
Marko Mäkelä
0ea625b325 MDEV-15916 Change buffer crash during TRUNCATE or DROP TABLE
ibuf_restore_pos(): Do not issue any messages if the tablespace
is being dropped or truncated. In MariaDB 10.2, TRUNCATE TABLE
would not change the tablespace ID, and therefore the tablespace
would be found, even though TRUNCATE is in progress.

Furthermore, do not commit suicide if restoring the change buffer
cursor fails. The worst that could happen is that a secondary index
becomes corrupted due to incomplete change buffer merge.
2018-05-09 12:10:02 +03:00
Marko Mäkelä
2b27ac8282 Fix many -Wunused-parameter
Remove unused InnoDB function parameters and functions.

i_s_sys_virtual_fill_table(): Do not allocate heap memory.

mtr_is_block_fix(): Replace with mtr_memo_contains().

mtr_is_page_fix(): Replace with mtr_memo_contains_page().
2018-05-01 16:52:19 +03:00
Marko Mäkelä
9801715cb0 Use compile_time_assert() in InnoDB
Replace most use of #error. Some checks were impossible to
evaluate in the preprocessor due to the use of named
integer constants or enumerations.
2018-04-30 18:22:52 +03:00
Marko Mäkelä
715e4f4320 MDEV-12218 Clean up InnoDB parameter validation
Bind more InnoDB parameters directly to MYSQL_SYSVAR and
remove "shadow variables".

innodb_change_buffering: Declare as ENUM, not STRING.

innodb_flush_method: Declare as ENUM, not STRING.

innodb_log_buffer_size: Bind directly to srv_log_buffer_size,
without rounding it to a multiple of innodb_page_size.

LOG_BUFFER_SIZE: Remove.

SysTablespace::normalize_size(): Renamed from normalize().

innodb_init_params(): A new function to initialize and validate
InnoDB startup parameters.

innodb_init(): Renamed from innobase_init(). Invoke innodb_init_params()
before actually trying to start up InnoDB.

srv_start(bool): Renamed from innobase_start_or_create_for_mysql().
Added the input parameter create_new_db.

SRV_ALL_O_DIRECT_FSYNC: Define only for _WIN32.

xb_normalize_init_values(): Merge to innodb_init_param().
2018-04-29 09:41:42 +03:00
Marko Mäkelä
9ed2b2b2b8 Do not divide or multiply by srv_page_size
Instead, shift by srv_page_size_shift.
2018-04-28 20:52:22 +03:00
Marko Mäkelä
a90100d756 Replace univ_page_size and UNIV_PAGE_SIZE
Try to use one variable (srv_page_size) for innodb_page_size.

Also, replace UNIV_PAGE_SIZE_SHIFT with srv_page_size_shift.
2018-04-28 20:45:45 +03:00
Marko Mäkelä
ba19764209 Fix most -Wsign-conversion in InnoDB
Change innodb_buffer_pool_size, innodb_fill_factor to unsigned.
2018-04-28 20:45:45 +03:00
Marko Mäkelä
de942c9f61 MDEV-15983 Reduce fil_system.mutex contention further
fil_space_t::n_pending_ops, n_pending_ios: Use a combination of
fil_system.mutex and atomic memory access for protection.

fil_space_t::release(): Replaces fil_space_release().
Does not acquire fil_system.mutex.

fil_space_t::release_for_io(): Replaces fil_space_release_for_io().
Does not acquire fil_system.mutex.
2018-04-23 13:15:54 +03:00
Marko Mäkelä
4cad42392a MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.

dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.

There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.

There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.

FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.

mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.

fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.

fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.

dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.

dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.

dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.

fil_ibd_open(): Return the tablespace.

fil_space_t::set_imported(): Replaces fil_space_set_imported().

truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.

row_truncate_prepare(): Merge to its only caller.

row_drop_table_from_cache(): Assert that the table is persistent.

dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.

row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-29 22:02:05 +03:00
Marko Mäkelä
428e02895b MDEV-12266: Remove dict_index_t::table_name
Replace index->table_name with index->table->name.
2018-03-29 20:47:37 +03:00
Marko Mäkelä
604fea1ad6 MDEV-12266: Remove dict_index_t::space
We can rely on the dict_table_t::space. All indexes of a table object
are always in the same tablespace. (For fulltext indexes, the data is
located in auxiliary tables, and these will continue to have their own
table objects, separate from the main table.)
2018-03-29 20:47:37 +03:00
Marko Mäkelä
2ac8b1a907 MDEV-12266: Add fil_system.sys_space, temp_space
Add fil_system_t::sys_space, fil_system_t::temp_space.
These will replace lookups for TRX_SYS_SPACE or SRV_TMP_SPACE_ID.

mtr_t::m_undo_space, mtr_t::m_sys_space: Remove.

mtr_t::set_sys_modified(): Remove.

fil_space_get_type(), fil_space_get_n_reserved_extents(): Remove.

fsp_header_get_tablespace_size(), fsp_header_inc_size():
Merge to the only caller, innobase_start_or_create_for_mysql().
2018-03-29 19:18:11 +03:00
Eugene Kosov
af2d260b37 merge btr_page_get_level_low() and btr_page_get_level() 2018-02-16 21:44:51 +03:00
Marko Mäkelä
32170f8c6d Add page_has_prev(), page_has_next(), page_has_siblings()
Until now, InnoDB inefficiently compared the aligned fields
FIL_PAGE_PREV, FIL_PAGE_NEXT to the byte-order-agnostic value FIL_NULL.
2018-02-08 22:34:21 +02:00
Marko Mäkelä
f69a3b2e92 After-merge fix for commit d4df7bc9b1
The merge omitted some InnoDB and XtraDB conflict resolutions,
most notably, failing to merge the fix of MDEV-12173.

ibuf_merge_or_delete_for_page(), lock_rec_block_validate():
Invoke fil_space_acquire_silent() instead of fil_space_acquire().
This fixes MDEV-12173.

wsrep_debug, wsrep_trx_is_aborting(): Removed unused declarations.

_fil_io(): Remove. Instead, declare default parameters for the XtraDB
fil_io().

buf_read_page_low(): Declare default parameters, and clean up some
callers.

os_aio(): Correct the macro that is defined when !UNIV_PFS_IO.
2018-02-02 19:57:59 +02:00
Sergei Golubchik
d4df7bc9b1 Merge branch 'github/10.0' into 10.1 2018-02-02 10:09:44 +01:00
Marko Mäkelä
9875d5c3e1 Merge bb-10.2-ext into 10.3 2018-01-24 14:00:33 +02:00
Marko Mäkelä
431607237d MDEV-12173 "Error: trying to do an operation on a dropped tablespace"
InnoDB is issuing a 'noise' message that is not a sign of abnormal
operation. The only issuers of it are the debug function
lock_rec_block_validate() and the change buffer merge.
While the error should ideally never occur in transactional locking,
we happen to know that DISCARD TABLESPACE and TRUNCATE TABLE and
possibly DROP TABLE are breaking InnoDB table locks.

When it comes to the change buffer merge, the message simply is useless
noise. We know perfectly well that a tablespace can be dropped while a
change buffer merge is pending. And the code is prepared to handle that,
which is demonstrated by the fact that whenever the message was issued,
InnoDB did not crash.

fil_inc_pending_ops(): Remove the parameter print_err.
2018-01-22 16:58:13 +02:00
Marko Mäkelä
29eeb527fd MDEV-12173 "[Warning] Trying to access missing tablespace"
ibuf_merge_or_delete_for_page(): Invoke fil_space_acquire_silent()
instead of fil_space_acquire() in order to avoid displaying
a useless message.

We know perfectly well that a tablespace can be dropped while a
change buffer merge is pending, because change buffer merges skip
any transactional locks.
2018-01-22 16:53:33 +02:00