Remove srv_win_file_flush_method
- Rename srv_unix_file_flush_method to srv_file_flush_method, and
rename constants to remove UNIX from them, i.e SRV_UNIX_FSYNC=>SRV_FSYNC
- Add SRV_ALL_O_DIRECT_FSYNC corresponding to current Windows default
(no buffering for either log or data, flush on both log and data)
- change os_file_open on Windows to behave identically to Unix wrt
O_DIRECT and O_DSYNC settings. map O_DIRECT to FILE_FLAG_NO_BUFFERING and
O_DSYNC to FILE_FLAG_WRITE_THROUGH
- remove various #ifdef _WIN32
Provide more useful progress reporting of crash recovery.
recv_sys_t::progress_time: The time of the last report.
recv_sys_t::report(ib_time_t): Determine whether progress should
be reported.
recv_scan_print_counter: Remove.
log_group_read_log_seg(): After after each I/O request, invoke
recv_sys_t::report() and report progress if needed.
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned), and rename the parameter to last_batch.
At the start of each batch, if there are pages to be recovered,
issue a message.
Because the default build configuration of the server will remain
at -DWITH_INNODB_AHI=ON, we want to test the instrumentation.
We make and revert the test adjustments in separate commits on purpose,
so that this commit can be easily reverted later if the default
build configuration is changed to -DWITH_INNODB_AHI=OFF.
The InnoDB adaptive hash index is sometimes degrading the performance of
InnoDB, and it is sometimes disabled to get more consistent performance.
We should have a compile-time option to disable the adaptive hash index.
Let us introduce two options:
OPTION(WITH_INNODB_AHI "Include innodb_adaptive_hash_index" ON)
OPTION(WITH_INNODB_ROOT_GUESS "Cache index root block descriptors" ON)
where WITH_INNODB_AHI always implies WITH_INNODB_ROOT_GUESS.
As part of this change, the misleadingly named function
trx_search_latch_release_if_reserved(trx) will be replaced with the macro
trx_assert_no_search_latch(trx) that will be empty unless
BTR_CUR_HASH_ADAPT is defined (cmake -DWITH_INNODB_AHI=ON).
We will also remove the unused column
INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_TIMEOUT.
In MariaDB Server 10.1, it used to reflect the value of
trx_t::search_latch_timeout which could be adjusted during
row_search_for_mysql(). In 10.2, there is no such field.
Other than the removal of the unused column TRX_ADAPTIVE_HASH_TIMEOUT,
this is an almost non-functional change to the server when using the
default build options.
Some tests are adjusted so that they will work with both
-DWITH_INNODB_AHI=ON and -DWITH_INNODB_AHI=OFF. The test
innodb.innodb_monitor has been renamed to innodb.monitor
in order to track MySQL 5.7, and the duplicate tests
sys_vars.innodb_monitor_* are removed.
This fixes MySQL Bug#80788 in MariaDB 10.2.5.
When I made the InnoDB crash recovery more robust by implementing
WL#7142, I also introduced an extra redo log scan pass that can be
shortened.
This fix will slightly extend the InnoDB redo log format that I
introduced in MySQL 5.7.9 by writing the start LSN of the MLOG_CHECKPOINT
mini-transaction to the end of the log checkpoint page, so that recovery
can jump straight to it without scanning all the preceding redo log.
LOG_CHECKPOINT_END_LSN: At the end of the checkpoint page, the start LSN
of the MLOG_CHECKPOINT mini-transaction. Previously, these bytes were
written as 0.
log_write_checkpoint_info(), log_group_checkpoint(): Add the parameter
end_lsn for writing LOG_CHECKPOINT_END_LSN.
log_checkpoint(): Remember the LSN at which the MLOG_CHECKPOINT
mini-transaction is starting (or at which the redo log ends on
shutdown).
recv_init_crash_recovery(): Remove.
recv_group_scan_log_recs(): Add the parameter checkpoint_lsn.
recv_recovery_from_checkpoint_start(): Read LOG_CHECKPOINT_END_LSN
and if it is set, start the first scan from it instead of the
checkpoint LSN. Improve some messages and remove bogus assertions.
recv_parse_log_recs(): Do not skip DBUG_PRINT("ib_log") for some
file-level redo log records.
recv_parse_or_apply_log_rec_body(): If we have not parsed all redo
log between the checkpoint and the corresponding MLOG_CHECKPOINT
record, defer the check for MLOG_FILE_DELETE or MLOG_FILE_NAME records
to recv_init_crash_recovery_spaces().
recv_init_crash_recovery_spaces(): Refuse recovery if
MLOG_FILE_NAME or MLOG_FILE_DELETE records are missing.
MDEV-7618 introduced configuration parameter innodb_instrument_semaphores
in MariaDB Server 10.1. The parameter seems to only affect the rw-lock
X-latch acquisition. Extra fields are added to rw_lock_t to remember one
X-latch holder or waiter. These fields are not being consulted or reported
anywhere. This is basically only adding code bloat.
If the intention is to debug hangs or deadlocks, we have better tools for
that in the debug server, and for the non-debug server, core dumps can
reveal a lot. For example, the mini-transaction memo records the
currently held buffer block or index rw-locks, to be released at
mtr_t::commit().
The configuration parameter innodb_instrument_semaphores will be
deprecated in 10.2.5 and removed in 10.3.0.
rw_lock_t: Remove the members lock_name, file_name, line, thread_id
which did not affect any output.
to tables in the system tablespace
This is a regression caused by MDEV-11585, which accidentally
changed Tablespace::is_undo_tablespace() in an incorrect way,
causing the InnoDB system tablespace to be reported as a dedicated
undo tablespace, for which the change buffer is not applicable.
Tablespace::is_undo_tablespace(): Remove. There were only 2
calls from the function buf_page_io_complete(). Replace those
calls as appropriate.
Also, merge changes to tablespace import/export tests from
MySQL 5.7, and clean up the tests a little further, allowing
them to be run with any innodb_page_size.
Remove duplicated error injection instrumentation for the
import/export tests. In MySQL 5.7, the error injection label
buf_page_is_corrupt_failure was renamed to
buf_page_import_corrupt_failure.
fil_space_extend_must_retry(): Correct a debug assertion
(tablespaces can be extended during IMPORT), and remove a
TODO comment about compressed temporary tables that was
already addressed in MDEV-11816.
dict_build_tablespace_for_table(): Correct a comment that
no longer holds after MDEV-11816, and assert that
ROW_FORMAT=COMPRESSED can only be used in .ibd files.
Note: At least one test is unstable, failing with the following:
./mtr --mysqld=--innodb-purge-threads=9 --big-test --no-reorder \
galera.galera_parallel_autoinc_largetrx galera.galera_var_slave_threads
The result difference is dependent on innodb_purge_threads.
buf_dump(): Correct the printf format passed to buf_dump_status()
to match the argument types.
Revert the changes to storage/xtradb. XtraDB is not being compiled
for 10.2. The unused copy that we have in the 10.2 branch is only
getting merges from 10.1.
Disable the test sys_vars.innodb_buffer_pool_dump_pct_function
because it is unstable on buildbot.
Revert the MDEV-4396 tweak to innodb.innodb_bug14676111.
We must fix the root cause instead.
Allow gcol.innodb_virtual_purge to run on a non-debug build
(If wait_innodb_all_purged.inc is used in a non-debug test,
it will have no effect.)
Add the test innodb.index_merge_threshold from MySQL 5.7.
The test innodb.log_file_size_checkpoint was originally added to
MySQL 5.7 by me in a bug fix, to fix the interaction of WL#6494
(redo log resizing, introduced in MySQL 5.6) and WL#7142
(data file discovery based on MLOG_FILE_NAME records,
introduced in MySQL 5.7):
commit 70f9ef4e1220827132b50275ca7272f2bcca1864
Author: Marko Mäkelä <marko.makela@oracle.com>
Date: Wed May 21 13:31:29 2014 +0300
Bug#18755095 REDO LOG SIZE CHANGE AFTER CRASH RESULTS IN CHECKPOINT AGE
ERROR MESSAGE
This is a regression from fixing
Bug#18730524 REPEATED KILL+RESTART FAILS DUE TO MISSING MLOG_FILE_NAME
RECORD
innobase_start_or_create_for_mysql(): Invoke fil_names_clear() before
creating the "checkpoint" when changing redo log files.
Approved by Jimmy Yang on IM.
The relevant part of the test is that fil_names_clear() is invoked to
emit an MLOG_CHECKPOINT record before the redo log files are deleted.
In case the server is killed before ib_logfile0 has been deleted,
the old (not-yet-resized) redo log will be treated as valid. We do not
need to create a large number of tables for that.
Write only one encryption key to the checkpoint page.
Use 4 bytes of nonce. Encrypt more of each redo log block,
only skipping the 4-byte field LOG_BLOCK_HDR_NO which the
initialization vector is derived from.
Issue notes, not warning messages for rewriting the redo log files.
recv_recovery_from_checkpoint_finish(): Do not generate any redo log,
because we must avoid that before rewriting the redo log files, or
otherwise a crash during a redo log rewrite (removing or adding
encryption) may end up making the database unrecoverable.
Instead, do these tasks in innobase_start_or_create_for_mysql().
Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some
unreachable code and duplicated error messages for log corruption.
LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo
log format.
log_group_t::is_encrypted(), log_t::is_encrypted(): Determine
if the redo log is in encrypted format.
recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED.
srv_prepare_to_delete_redo_log_files(): Display NOTE messages about
adding or removing encryption. Do not issue warnings for redo log
resizing any more.
innobase_start_or_create_for_mysql(): Rebuild the redo logs also when
the encryption changes.
innodb_log_checksums_func_update(): Always use the CRC-32C checksum
if innodb_encrypt_log. If needed, issue a warning
that innodb_encrypt_log implies innodb_log_checksums.
log_group_write_buf(): Compute the checksum on the encrypted
block contents, so that transmission errors or incomplete blocks can be
detected without decrypting.
Rewrite most of the redo log encryption code. Only remember one
encryption key at a time (but remember up to 5 when upgrading from the
MariaDB 10.1 format.)
The InnoDB redo log consists of a list of files that logically form
a bigger file, as if the individual files were concatenated together.
The first file will always be written on redo log checkpoint, because
the two checkpoint pages are at the start of the single logical
redo log file.
There is no technical reason why InnoDB requires at least 2 files
to exist. Let us reduce the minimum number to 1. In that way,
restoring from backups will become easier, since InnoDB can directly
deal with a single backed-up redo log file.
the test restarts the server, giving it 60 seconds to shutdown
and then killing it mercilessly.
make sure the server closes all MyISAM tables before shutdown,
as we cannot reliably expect it to make the deadline.
JOIN_CACHE's were initialized in check_join_cache_usage()
from make_join_readinfo(). After that make_join_readinfo() was looking
whether it's possible to use keyread. Later, after make_join_readinfo(),
optimizer decided whether to use filesort. And even later, at the
execution time, from join_read_first(), keyread was actually enabled.
The problem is, that if a query uses a vcol, base columns that it
depends on are automatically added to the read_set - because they're
needed to calculate the vcol. But if we're doing keyread, vcol is taken
from the index, not calculated, and base columns do not need to be
in the read set (even should not be - as they aren't getting values).
The bug was that JOIN_CACHE used read_set with base columns,
they were not read because of keyread, so it was caching garbage.
So read_set is only known after the keyread was decided. And after the
filesort was decided, as filesort doesn't use keyread. But
check_join_cache_usage() needs to be done in make_join_readinfo(),
as the code below depends on these checks,
Fix: keep JOIN_CACHE checks where they were, but move initialization
down to the very end of JOIN::optimize_inner. If keyread was enabled,
update the read_set to include only columns that are part of the index.
Copy the keyread logic from join_read_first() to happen at optimize time.
move TABLE::key_read into handler. Because in index merge and DS-MRR
there can be many handlers per table, and some of them use
key read while others don't. "keyread" is really per handler,
not per TABLE property.
Oracle introduced a Memcached plugin interface to the InnoDB
storage engine in MySQL 5.6. That interface is essentially a
fork of Memcached development snapshot 1.6.0-beta1 of an old
development branch 'engine-pu'.
To my knowledge, there have not been any updates to the Memcached code
between MySQL 5.6 and 5.7; only bug fixes and extensions related to
the Oracle modifications.
The Memcached plugin is not part of the MariaDB Server. Therefore it
does not make sense to include the InnoDB interfaces for the Memcached
plugin, or to have any related configuration parameters:
innodb_api_bk_commit_interval
innodb_api_disable_rowlock
innodb_api_enable_binlog
innodb_api_enable_mdl
innodb_api_trx_level
Removing this code in one commit makes it possible to easily restore
it, in case it turns out to be needed later.
The fix from MDEV-10866 was insufficient.
Attempt 2 at fixing this.
binlog.binlog_row_ctype_ucs 'row' w18 [ fail ]
Test ended at 2017-02-13 10:36:57
CURRENT_TEST: binlog.binlog_row_ctype_ucs
--- /mariadb/mysql-test/suite/binlog/r/binlog_row_ctype_ucs.result 2017-02-06 09:29:43.116183650 +1100
+++ /mariadb/mysql-test/suite/binlog/r/binlog_row_ctype_ucs.reject 2017-02-13 10:36:56.984056229 +1100
@@ -71,21 +71,21 @@
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at #
-#700101 6:46:40 server id # end_log_pos # CRC32 XXX Start: binlog v 4, server v #.##.## created 700101 6:46:40
+#170213 10:36:56 server id # end_log_pos # CRC32 XXX Start: binlog v 4, server v #.##.## created 170213 10:36:56
# at #
-#700101 6:46:40 server id # end_log_pos # CRC32 XXX Gtid list [#-#-#]
+#170213 10:36:56 server id # end_log_pos # CRC32 XXX Gtid list [#-#-#]
# at #
-#700101 6:46:40 server id # end_log_pos # CRC32 XXX Binlog checkpoint master-bin.000002
+#170213 10:36:56 server id # end_log_pos # CRC32 XXX Binlog checkpoint master-bin.000002
# at #
-#700101 6:46:40 server id # end_log_pos # CRC32 XXX Binlog checkpoint master-bin.000003
+#170213 10:36:56 server id # end_log_pos # CRC32 XXX Binlog checkpoint master-bin.000003
# at #
-#700101 6:46:40 server id # end_log_pos # CRC32 XXX GTID #-#-# ddl
+#170213 10:36:56 server id # end_log_pos # CRC32 XXX GTID #-#-# ddl
/*!100101 SET @@session.skip_parallel_replication=0*//*!*/;
/*!100001 SET @@session.gtid_domain_id=#*//*!*/;
/*!100001 SET @@session.server_id=#*//*!*/;
/*!100001 SET @@session.gtid_seq_no=#*//*!*/;
# at #
-#700101 6:46:40 server id # end_log_pos # CRC32 XXX Query thread_id=# exec_time=# error_code=0
+#170213 10:36:56 server id # end_log_pos # CRC32 XXX Query thread_id=# exec_time=# error_code=0
use `test`/*!*/;
SET TIMESTAMP=XXX/*!*/;
SET @@session.pseudo_thread_id=#/*!*/;
Signed-off-by: Daniel Black <daniel.black@au.ibm.com>
A proper InnoDB shutdown after aborted startup was introduced
in commit 81b7fe9d38.
Also related to this is MDEV-11985, making read-only shutdown more robust.
If startup was aborted, there may exist recovered transactions that were
not rolled back. Relax the assertions accordingly.
Before killing the server, ensure that the redo log for the
incomplete transaction is flushed, so that the AUTO_INCREMENT
sequence will always be updated. Usually the INSERT
transaction would not have persisted the sequence before the
server was killed, but sometimes it could happen, causing
result mismatch.
Note: This test used to be called innodb_fts.innodb_fts_misc_debug.
Do not effectively set DEBUG_DBUG='d' by setting DEBUG_DBUG='-d,...'.
Instead, restore the saved value of DEBUG_DBUG.
Also, split the test innodb_fts.innodb_fts_misc_debug into
innodb_fts.crash_recovery and innodb_fts.misc_debug, and enable
these tests for --valgrind, the latter test for --embedded,
and the former tests for the non-debug server.
Encryption stores used key_version to
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (offset 26)
field. Spatial indexes store RTREE Split Sequence Number
(FIL_RTREE_SPLIT_SEQ_NUM) in the same field. Both values
can't be stored in same field. Thus, current encryption
implementation does not support encrypting spatial indexes.
fil_space_encrypt(): Do not encrypt page if page type is
FIL_PAGE_RTREE (this is required for background
encryption innodb-encrypt-tables=ON).
create_table_info_t::check_table_options() Do not allow creating
table with ENCRYPTED=YES if table contains spatial index.
recv_log_format_0_recover(): Invoke log_decrypt_after_read() after
reading the old-format redo log buffer.
With this change, we will upgrade to an encrypted redo log that
is misleadingly carrying a MySQL 5.7.9 compatible format tag while
the log blocks (other than the header and the checkpoint blocks)
are in an incompatible, encrypted format.
That needs to be fixed by introducing a new redo log format tag that
indicates that the entire redo log is encrypted.
LOG_CHECKPOINT_ARRAY_END, LOG_CHECKPOINT_SIZE: Remove.
Change some error messages to refer to MariaDB 10.2.2 instead of
MySQL 5.7.9.
recv_find_max_checkpoint_0(): Do not abort when decrypting one of the
checkpoint pages fails.
If InnoDB is started in innodb_read_only mode such that
recovered incomplete transactions exist at startup
(but the redo logs are clean), an assertion will fail at shutdown,
because there would exist some non-prepared transactions.
logs_empty_and_mark_files_at_shutdown(): Do not wait for incomplete
transactions to finish if innodb_read_only or innodb_force_recovery>=3.
Wait for purge to finish in only one place.
trx_sys_close(): Relax the assertion that would fail first.
trx_free_prepared(): Also free recovered TRX_STATE_ACTIVE transactions
if innodb_read_only or innodb_force_recovery>=3.
Also, revert my earlier fix to MySQL 5.7 because this fix is more generic:
Bug#20874411 INNODB SHUTDOWN HANGS IF INNODB_FORCE_RECOVERY>=3
SKIPPED ANY ROLLBACK
trx_undo_fake_prepared(): Remove.
trx_sys_any_active_transactions(): Revert the changes.
Rewrite the test so that the main server is restarted, instead of
--exec $MYSQLD_CMD. In this way, the test can be run with Valgrind
and with any --mysqld=--innodb-page-size.
Also remove the workaround --skip-innodb-use-native-aio. It should
not be needed when we are inheriting the server parameters from
the test environment.
Datafile::validate_for_recovery(): Remove a redundant error message.
An error is already reported by Datafile::open_read_write() if the
file cannot be opened.
Also, do not assign SEARCH_ABORT, so that the full test will be executed
even if one step fails.