Oracle introduced a Memcached plugin interface to the InnoDB
storage engine in MySQL 5.6. That interface is essentially a
fork of Memcached development snapshot 1.6.0-beta1 of an old
development branch 'engine-pu'.
To my knowledge, there have not been any updates to the Memcached code
between MySQL 5.6 and 5.7; only bug fixes and extensions related to
the Oracle modifications.
The Memcached plugin is not part of the MariaDB Server. Therefore it
does not make sense to include the InnoDB interfaces for the Memcached
plugin, or to have any related configuration parameters:
innodb_api_bk_commit_interval
innodb_api_disable_rowlock
innodb_api_enable_binlog
innodb_api_enable_mdl
innodb_api_trx_level
Removing this code in one commit makes it possible to easily restore
it, in case it turns out to be needed later.
log_crypt_101_read_checkpoint(): Read the encryption information
from a MariaDB 10.1 checkpoint page.
log_crypt_101_read_block(): Attempt to decrypt a MariaDB 10.1
redo log page.
recv_log_format_0_recover(): Only attempt decryption on checksum
mismatch. NOTE: With the MariaDB 10.1 innodb_encrypt_log format,
we can actually determine from the cleartext portion of the redo log
whether the redo log is empty. We do not really have to decrypt the
redo log here, if we did not want to determine if the checksum is valid.
innodb_shutdown(), trx_sys_close(): Startup may be aborted between
purge_sys and trx_sys creation. Therefore, purge_sys must be freed
independently of trx_sys.
innobase_start_or_create_for_mysql(): Remember to free purge_queue if
it was not yet attached to purge_sys.
This change add WITH_ROCKSDB_{LZ4,BZIP2,ZSTD,snappy} CMake variables
that can be set to ON/OFF/AUTO.
If variable has default value AUTO, rocksdb links with corresponding
compression library. OFF disables compiling/linking with specific compression
library, ON forces compiling with it (cmake would throw error if library
is not available)
Support for ZLIB is added unconditionally, as it is always there.
A proper InnoDB shutdown after aborted startup was introduced
in commit 81b7fe9d38.
Also related to this is MDEV-11985, making read-only shutdown more robust.
If startup was aborted, there may exist recovered transactions that were
not rolled back. Relax the assertions accordingly.
buf_page_is_checksum_valid_crc32()
buf_page_is_checksum_valid_innodb()
buf_page_is_checksum_valid_none():
Use ULINTPF instead of %lu and %u for ib_uint32_t
fil_space_verify_crypt_checksum():
Check that page is really empty if checksum and
LSN are zero.
fil_space_verify_crypt_checksum():
Correct the comment to be more agurate.
buf0buf.h:
Remove unnecessary is_corrupt variable from
buf_page_t structure.
recv_writer_thread(): Do not assign recv_writer_thread_active=true
in order to avoid a race condition with
recv_recovery_from_checkpoint_finish().
recv_init_crash_recovery(): Assign recv_writer_thread_active=true
before creating recv_writer_thread.
recv_writer_thread(): Do not assign recv_writer_thread_active=true
in order to avoid a race condition with
recv_recovery_from_checkpoint_finish().
recv_init_crash_recovery_spaces(): Assign recv_writer_thread_active=true
before creating recv_writer_thread.
InnoDB can wrongly ignore the end of data files when using
innodb_page_size=32k or innodb_page_size=64k. These page sizes
use an allocation extent size of 2 or 4 megabytes, not 1 megabyte.
This issue does not affect MariaDB Server 10.2, which is using
the correct WL#5757 code from MySQL 5.7.
That said, it does not make sense to ignore the tail of data files.
The next time the data file needs to be extended, it would be extended
to a multiple of the extent size, once the size exceeds one extent.
Encryption stores used key_version to
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (offset 26)
field. Spatial indexes store RTREE Split Sequence Number
(FIL_RTREE_SPLIT_SEQ_NUM) in the same field. Both values
can't be stored in same field. Thus, current encryption
implementation does not support encrypting spatial indexes.
fil_space_encrypt(): Do not encrypt page if page type is
FIL_PAGE_RTREE (this is required for background
encryption innodb-encrypt-tables=ON).
create_table_info_t::check_table_options() Do not allow creating
table with ENCRYPTED=YES if table contains spatial index.
recv_log_format_0_recover(): Invoke log_decrypt_after_read() after
reading the old-format redo log buffer.
With this change, we will upgrade to an encrypted redo log that
is misleadingly carrying a MySQL 5.7.9 compatible format tag while
the log blocks (other than the header and the checkpoint blocks)
are in an incompatible, encrypted format.
That needs to be fixed by introducing a new redo log format tag that
indicates that the entire redo log is encrypted.
LOG_CHECKPOINT_ARRAY_END, LOG_CHECKPOINT_SIZE: Remove.
Change some error messages to refer to MariaDB 10.2.2 instead of
MySQL 5.7.9.
recv_find_max_checkpoint_0(): Do not abort when decrypting one of the
checkpoint pages fails.
after failed ADD UNIQUE INDEX
check_col_exists_in_indexes(): Add the parameter only_committed.
When considering committed indexes, evaluate index->is_committed().
Else, evaluate index->to_be_dropped.
rollback_inplace_alter_table(): Invoke check_col_exists_in_indexes()
with only_committed=true.
Galera disallow-writes feature was lost in InnoDB 5.7 merge
to 10.2. This patch restores this feature and fixes test
failure on test galera.galera_var_innodb_disallow_writes.
commit d1bb19b8f751875472211312c8e810143a7ba4b6
Author: Manuel Ung <mung@fb.com>
Date: Fri Feb 3 11:50:34 2017 -0800
Add cardinality stats to information schema
Summary: This adds cardinality stats to the INFORMATION_SCHEMA.ROCKSDB_INDEX_FILE_MAP table. This is the only missing user collected properties from SST files that we don't expose, which is useful for debugging cardinality bugs.
Reviewed By: hermanlee
Differential Revision: D4509156
fbshipit-source-id: 2d3918a
compatibility problems
Pages that are encrypted contain post encryption checksum on
different location that normal checksum fields. Therefore,
we should before decryption check this checksum to avoid
unencrypting corrupted pages. After decryption we can use
traditional checksum check to detect if page is corrupted
or unencryption was done using incorrect key.
Pages that are page compressed do not contain any checksum,
here we need to fist unencrypt, decompress and finally
use tradional checksum check to detect page corruption
or that we used incorrect key in unencryption.
buf0buf.cc: buf_page_is_corrupted() mofified so that
compressed pages are skipped.
buf0buf.h, buf_block_init(), buf_page_init_low():
removed unnecessary page_encrypted, page_compressed,
stored_checksum, valculated_checksum fields from
buf_page_t
buf_page_get_gen(): use new buf_page_check_corrupt() function
to detect corrupted pages.
buf_page_check_corrupt(): If page was not yet decrypted
check if post encryption checksum still matches.
If page is not anymore encrypted, use buf_page_is_corrupted()
traditional checksum method.
If page is detected as corrupted and it is not encrypted
we print corruption message to error log.
If page is still encrypted or it was encrypted and now
corrupted, we will print message that page is
encrypted to error log.
buf_page_io_complete(): use new buf_page_check_corrupt()
function to detect corrupted pages.
buf_page_decrypt_after_read(): Verify post encryption
checksum before tring to decrypt.
fil0crypt.cc: fil_encrypt_buf() verify post encryption
checksum and ind fil_space_decrypt() return true
if we really decrypted the page.
fil_space_verify_crypt_checksum(): rewrite to use
the method used when calculating post encryption
checksum. We also check if post encryption checksum
matches that traditional checksum check does not
match.
fil0fil.ic: Add missed page type encrypted and page
compressed to fil_get_page_type_name()
Note that this change does not yet fix innochecksum tool,
that will be done in separate MDEV.
Fix test failures caused by buf page corruption injection.
If InnoDB is started in innodb_read_only mode such that
recovered incomplete transactions exist at startup
(but the redo logs are clean), an assertion will fail at shutdown,
because there would exist some non-prepared transactions.
logs_empty_and_mark_files_at_shutdown(): Do not wait for incomplete
transactions to finish if innodb_read_only or innodb_force_recovery>=3.
Wait for purge to finish in only one place.
trx_sys_close(): Relax the assertion that would fail first.
trx_free_prepared(): Also free recovered TRX_STATE_ACTIVE transactions
if innodb_read_only or innodb_force_recovery>=3.
Also, revert my earlier fix to MySQL 5.7 because this fix is more generic:
Bug#20874411 INNODB SHUTDOWN HANGS IF INNODB_FORCE_RECOVERY>=3
SKIPPED ANY ROLLBACK
trx_undo_fake_prepared(): Remove.
trx_sys_any_active_transactions(): Revert the changes.
Datafile::validate_for_recovery(): Remove a redundant error message.
An error is already reported by Datafile::open_read_write() if the
file cannot be opened.
Also, do not assign SEARCH_ABORT, so that the full test will be executed
even if one step fails.
Remove the debug parameter innodb_force_recovery_crash that was
introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
to resize the redo log on startup.
Let innodb.log_file_size actually start up the server, but ensure
that the InnoDB storage engine refuses to start up in each of the
scenarios.
If InnoDB is started in innodb_read_only mode such that
recovered incomplete transactions exist at startup
(but the redo logs are clean), an assertion will fail at shutdown,
because there would exist some non-prepared transactions.
logs_empty_and_mark_files_at_shutdown(): Do not wait for incomplete
transactions to finish if innodb_read_only or innodb_force_recovery>=3.
Wait for purge to finish in only one place.
trx_sys_close(): Relax the assertion that would fail first.
trx_free_prepared(): Also free recovered TRX_STATE_ACTIVE transactions
if innodb_read_only or innodb_force_recovery>=3.
srv_release_threads(): Actually wait for the threads to resume
from suspension. On CentOS 5 and possibly other platforms,
os_event_set() may be lost.
srv_resume_thread(): A counterpart of srv_suspend_thread().
Optionally wait for the event to be set, optionally with a timeout,
and then release the thread from suspension.
srv_free_slot(): Unconditionally suspend the thread. It is always
in resumed state when this function is entered.
srv_active_wake_master_thread_low(): Only call os_event_set().
srv_purge_coordinator_suspend(): Use srv_resume_thread() instead
of the complicated logic.
srv_release_threads(): Actually wait for the threads to resume
from suspension. On CentOS 5 and possibly other platforms,
os_event_set() may be lost.
srv_resume_thread(): A counterpart of srv_suspend_thread().
Optionally wait for the event to be set, optionally with a timeout,
and then release the thread from suspension.
srv_free_slot(): Unconditionally suspend the thread. It is always
in resumed state when this function is entered.
srv_active_wake_master_thread_low(): Only call os_event_set().
srv_purge_coordinator_suspend(): Use srv_resume_thread() instead
of the complicated logic.
Remove the dependency on unzip. Instead, generate the InnoDB files
with perl.
log_block_checksum_is_ok(): Correct the error message.
recv_scan_log_recs(): Remove the duplicated error message for
log block checksum mismatch.
innobase_start_or_create_for_mysql(): If the server is in read-only
mode or if innodb_force_recovery>=3, do not try to modify the system
tablespace. (If the doublewrite buffer or the non-core system tables
do not exist, do not try to create them.)
innodb_shutdown(): Relax a debug assertion. If the system tablespace
did not contain a doublewrite buffer and if we started up in
innodb_read_only mode or with innodb_force_recovery>=3, it will not
be created.
dict_create_or_check_sys_tablespace(): Set the flag
srv_sys_tablespaces_open when the tables exist.
This fixes memory leaks in tests that cause InnoDB startup to fail.
buf_pool_free_instance(): Also free buf_pool->flush_rbt, which would
normally be freed when crash recovery finishes.
fil_node_close_file(), fil_space_free_low(), fil_close_all_files():
Relax some debug assertions to tolerate !srv_was_started.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Changed the return type to void. Do not assume that all subsystems
were started.
que_init(), que_close(): Remove (empty functions).
srv_init(), srv_general_init(): Remove as global functions.
srv_free(): Allow srv_sys=NULL.
srv_get_active_thread_type(): Only return SRV_PURGE if purge really
is running.
srv_shutdown_all_bg_threads(): Do not reset srv_start_state. It will
be needed by innodb_shutdown().
innobase_start_or_create_for_mysql(): Always call srv_boot() so that
innodb_shutdown() can assume that it was called. Make more subsystems
dependent on SRV_START_STATE_STAT.
srv_shutdown_bg_undo_sources(): Require SRV_START_STATE_STAT.
trx_sys_close(): Do not assume purge_sys!=NULL. Do not call
buf_dblwr_free(), because the doublewrite buffer can exist while
the transaction system does not.
logs_empty_and_mark_files_at_shutdown(): Do a faster shutdown if
!srv_was_started.
recv_sys_close(): Invoke dblwr.pages.clear() which would normally
be invoked by buf_dblwr_process().
recv_recovery_from_checkpoint_start(): Always release log_sys->mutex.
row_mysql_close(): Allow the subsystem not to exist.