This should also fix the MariaDB 10.2.2 bug
MDEV-13826 CREATE FULLTEXT INDEX on encrypted table fails.
MDEV-12634 FIXME: Modify innodb-index-online, innodb-table-online
so that they will write and read merge sort files. InnoDB 5.7
introduced some optimizations to avoid using the files for small tables.
Many collation test results have been adjusted for MDEV-10191.
The messages are getting lost because they are written with fprintf()
but without fflush(), so abort() would lose buffered text.
Applied fix from 10.2, which makes in_logf() use
sql_print_information(), which in turn does proper flush after each message
…porary file
Fixed by removing writing key version to start of every block that
was encrypted. Instead we will use single key version from log_sys
crypt info.
After this MDEV also blocks writen to row log are encrypted and blocks
read from row log aren decrypted if encryption is configured for the
table.
innodb_status_variables[], struct srv_stats_t
Added status variables for merge block and row log block
encryption and decryption amounts.
Removed ROW_MERGE_RESERVE_SIZE define.
row_merge_fts_doc_tokenize
Remove ROW_MERGE_RESERVE_SIZE
row_log_t
Add index, crypt_tail, crypt_head to be used in case of
encryption.
row_log_online_op, row_log_table_close_func
Before writing a block encrypt it if encryption is enabled
row_log_table_apply_ops, row_log_apply_ops
After reading a block decrypt it if encryption is enabled
row_log_allocate
Allocate temporary buffers crypt_head and crypt_tail
if needed.
row_log_free
Free temporary buffers crypt_head and crypt_tail if they
exist.
row_merge_encrypt_buf, row_merge_decrypt_buf
Removed.
row_merge_buf_create, row_merge_buf_write
Remove ROW_MERGE_RESERVE_SIZE
row_merge_build_indexes
Allocate temporary buffer used in decryption and encryption
if needed.
log_tmp_blocks_crypt, log_tmp_block_encrypt, log_temp_block_decrypt
New functions used in block encryption and decryption
log_tmp_is_encrypted
New function to check is encryption enabled.
Added test case innodb-rowlog to force creating a row log and
verify that operations are done using introduced status
variables.
For running the Galera tests, the variable my_disable_leak_check
was set to true in order to avoid assertions due to memory leaks
at shutdown.
Some adjustments due to MDEV-13625 (merge InnoDB tests from MySQL 5.6)
were performed. The most notable behaviour changes from 10.0 and 10.1
are the following:
* innodb.innodb-table-online: adjustments for the DROP COLUMN
behaviour change (MDEV-11114, MDEV-13613)
* innodb.innodb-index-online-fk: the removal of a (1,NULL) record
from the result; originally removed in MySQL 5.7 in the
Oracle Bug #16244691 fix
377774689b
* innodb.create-index-debug: disabled due to MDEV-13680
(the MySQL Bug #77497 fix was not merged from 5.6 to 5.7.10)
* innodb.innodb-alter-autoinc: MariaDB 10.2 behaves like MySQL 5.6/5.7,
while MariaDB 10.0 and 10.1 assign different values when
auto_increment_increment or auto_increment_offset are used.
Also MySQL 5.6/5.7 exhibit different behaviour between
LGORITHM=INPLACE and ALGORITHM=COPY, so something needs to be tested
and fixed in both MariaDB 10.0 and 10.2.
* innodb.innodb-wl5980-alter: disabled because it would trigger an
InnoDB assertion failure (MDEV-13668 may need additional effort in 10.2)
Fixes also MDEV-13488: InnoDB writes CRYPT_INFO even though
encryption is not enabled.
Problem was that we created encryption metadata (crypt_data) for
system tablespace even when no encryption was enabled and too early.
System tablespace can be encrypted only using key rotation.
Test innodb-key-rotation-disable, innodb_encryption, innodb_lotoftables
require adjustment because INFORMATION_SCHEMA INNODB_TABLESPACES_ENCRYPTION
contain row only if tablespace really has encryption metadata.
fil_crypt_set_thread_cnt: Send message to background encryption threads
if they exits when they are ready. This is required to find tablespaces
requiring key rotation if no other changes happen.
fil_crypt_find_space_to_rotate: Decrease the amount of time waiting
when nothing happens to better enable key rotation on startup.
fsp_header_init: Write encryption metadata to page 0 only if tablespace is
encrypted or encryption is disabled by table option.
i_s_dict_fill_tablespaces_encryption : Skip tablespaces that do not
contain encryption metadata. This is required to avoid too early
wait condition trigger in encrypted -> unencrypted state transfer.
open_or_create_data_files: Do not create encryption metadata
by default to system tablespace.
Assertions failed due to incorrect handling of the --tc-heuristic-recover
option when InnoDB is in read-only mode either due to innodb_read_only=1
or innodb_force_recovery>3. InnoDB failed to refuse a XA COMMIT or
XA ROLLBACK operation, and there were errors in the error handling in
the upper layer.
This was fixed by making InnoDB XA operations respect the
high_level_read_only flag. The InnoDB part of the fix and
parts of the test main.tc_heuristic_recover were provided
by Marko Mäkelä.
LOCK_log mutex lock/unlock had to be added to fix MDEV-13438.
The measure is confirmed by mysql sources as well.
For testing of the conflicting option combination, mysql-test-run is
made to export a new $MYSQLD_LAST_CMD. It holds the very last value
generated by mtr.mysqld_start(). Even though the options have been
also always stored in $mysqld->{'started_opts'} there were no access
to them beyond the automatic server restart by mtr through the expect
file interface.
Effectively therefore $MYSQLD_LAST_CMD represents a more general
interface to $mysqld->{'started_opts'} which can be used in wider
scopes including server launch with incompatible options.
Notice another existing method to restart the server with incompatible
options relying on $MYSQLD_CMD is is aware of $mysqld->{'started_opts'}
(the actual options that the server is launched by mtr). In order to use
this method they would have to be provided manually.
NOTE: When merging to 10.2, the file search_pattern_in_file++.inc
should be replaced with the pre-existing search_pattern_in_file.inc.
Problem was that dict_sys->size tries to maintain used memory
occupied by the data dictionary table and index objects.
However at least on table objects table->heap size can increase
between when table object is inserted to dict_sys and when
it is removed from dict_sys causing inconsistency on amount
of memory added to and removed from dict_sys->size variable.
Removed unnecessary dict_sys:size variable as it is really
used only for status output.
Introduced dict_sys_get_size function to calculate memory
occupied by the data dictionary table and index objects
that is then used on show engine innodb output.
dict_table_add_to_cache(),
dict_table_rename_in_cache(),
dict_table_remove_from_cache_low(),
dict_index_remove_from_cache_low(),
Remove size calculation.
srv_printf_innodb_monitor(): Use dict_sys_get_size function to
get dictionary memory allocated.
xtradb_internal_hash_tables_fill_table(): Use dict_sys_get_size
function to get dictionary memory allocated.
In all InnoDB row formats, the pointers or lengths stored in the record
header can be at most 14 bits, that is, count up to 16383.
In ROW_FORMAT=REDUNDANT, this limits the maximum possible record length
to 16383 bytes. In other ROW_FORMAT, it could merely limit the maximum
length of variable-length fields.
When MySQL 5.7 introduced innodb_page_size=32k and 64k, the maximum
record length was limited to 16383 bytes (I hope 16383, not 16384,
to be able to distinguish from a record whose length is 0 bytes).
This change is present in MariaDB Server 10.2.
btr_cur_optimistic_update(): Restrict maximum record size to 16K-1
for REDUNDANT and 64K page size.
dict_index_too_big_for_tree(): The maximum allowed record size
is half a B-tree page or 16K(-1 for REDUNDANT) for 64K page size.
convert_error_code_to_mysql(): Fix error message to print
correct limits.
my_error_innodb(): Fix error message to print correct limits.
page_zip_rec_needs_ext() : record size was already restricted to 16K.
Restrict REDUNDANT to 16K-1.
rem0rec.h: Introduce REDUNDANT_REC_MAX_DATA_SIZE (16K-1)
and COMPRESSED_REC_MAX_DATA_SIZE (16K).
The option innodb_log_compressed_pages was contributed by
Facebook to MySQL 5.6. It was disabled in the 5.6.10 GA release
due to problems that were fixed in 5.6.11, which is when the
option was enabled.
The option was set to innodb_log_compressed_pages=ON by default
(disabling the feature), because safety was considered more
important than speed. The option innodb_log_compressed_pages=OFF
can *CORRUPT* ROW_FORMAT=COMPRESSED tables on crash recovery
if the zlib deflate function is behaving differently (producing
a different amount of compressed data) from how it behaved
when the redo log records were written (prior to the crash recovery).
In MDEV-6935, the default value was changed to
innodb_log_compressed_pages=OFF. This is inherently unsafe, because
there are very many different environments where MariaDB can be
running, using different zlib versions. While zlib can decompress
data just fine, there are no guarantees that different versions will
always compress the same data to the exactly same size. To avoid
problems related to zlib upgrades or version mismatch, we must
use a safe default setting.
This will reduce the write performance for users of
ROW_FORMAT=COMPRESSED tables. If you configure
innodb_log_compressed_pages=ON, please make sure that you will
always cleanly shut down InnoDB before upgrading the server
or zlib.
The sole purpose of handlerton::release_temporary_latches and its wrapper
function was to release the InnoDB adaptive hash index latch
(btr_search_latch).
When the btr_search_latch was split into an array of latches
in MySQL 5.7.8 as part of the Oracle Bug#20985298 fix, the "caching"
of the latch across storage engine API calls was removed. As part of that,
the function trx_search_latch_release_if_reserved() was changed to an
assertion and the function trx_reserve_search_latch_if_not_reserved()
was removed, and handlerton::release_temporary_latches() practically
became a no-op.
Note: MDEV-12121 replaced the function
trx_search_latch_release_if_reserved()
with the more appropriately named macro trx_assert_no_search_latch().
innodb.table_flags: Adjust the test case. Due to the MDEV-12873 fix
in 10.2, the corrupted flags for table test.td would be converted,
and a tablespace flag mismatch will occur when trying to open the file.
dict_table_t::thd: Remove. This was only used by btr_root_block_get()
for reporting decryption failures, and it was only assigned by
ha_innobase::open(), and never cleared. This could mean that if a
connection is closed, the pointer would become stale, and the server
could crash while trying to report the error. It could also mean
that an error is being reported to the wrong client. It is better
to use current_thd in this case, even though it could mean that if
the code is invoked from an InnoDB background operation, there would
be no connection to which to send the error message.
Remove dict_table_t::crypt_data and dict_table_t::page_0_read.
These fields were never read.
fil_open_single_table_tablespace(): Remove the parameter "table".
When the server is started in innodb_read_only mode, there cannot be
any writes to persistent InnoDB/XtraDB files. Just like the creation
of buf_flush_page_cleaner_thread is skipped in this case, also
the creation of the XtraDB-specific buf_flush_lru_manager_thread
should be skipped.
When a slow shutdown is performed soon after spawning some work for
background threads that can create or commit transactions, it is possible
that new transactions are started or committed after the purge has finished.
This is violating the specification of innodb_fast_shutdown=0, namely that
the purge must be completed. (None of the history of the recent transactions
would be purged.)
Also, it is possible that the purge threads would exit in slow shutdown
while there exist active transactions, such as recovered incomplete
transactions that are being rolled back. Thus, the slow shutdown could
fail to purge some undo log that becomes purgeable after the transaction
commit or rollback.
srv_undo_sources: A flag that indicates if undo log can be generated
or the persistent, whether by background threads or by user SQL.
Even when this flag is clear, active transactions that already exist
in the system may be committed or rolled back.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Do not return an error code; the operation never fails.
Clear the srv_undo_sources flag, and also ensure that the background
DROP TABLE queue is empty.
srv_purge_should_exit(): Do not allow the purge to exit if
srv_undo_sources are active or the background DROP TABLE queue is not
empty, or in slow shutdown, if any active transactions exist
(and are being rolled back).
srv_purge_coordinator_thread(): Remove some previous workarounds
for this bug.
innobase_start_or_create_for_mysql(): Set buf_page_cleaner_is_active
and srv_dict_stats_thread_active directly. Set srv_undo_sources before
starting the purge subsystem, to prevent immediate shutdown of the purge.
Create dict_stats_thread and fts_optimize_thread immediately
after setting srv_undo_sources, so that shutdown can use this flag to
determine if these subsystems were started.
dict_stats_shutdown(): Shut down dict_stats_thread. Backported from 10.2.
srv_shutdown_table_bg_threads(): Remove (unused).
The doublewrite buffer pages must fit in the first InnoDB system
tablespace data file. The checks that were added in the initial patch
(commit 112b21da37)
were at too high level and did not cover all cases.
innodb.log_data_file_size: Test all innodb_page_size combinations.
fsp_header_init(): Never return an error. Move the change buffer creation
to the only caller that needs to do it.
btr_create(): Clean up the logic. Remove the error log messages.
buf_dblwr_create(): Try to return an error on non-fatal failure.
Check that the first data file is big enough for creating the
doublewrite buffers.
buf_dblwr_process(): Check if the doublewrite buffer is available.
Display the message only if it is available.
recv_recovery_from_checkpoint_start_func(): Remove a redundant message
about FIL_PAGE_FILE_FLUSH_LSN mismatch when crash recovery has already
been initiated.
fil_report_invalid_page_access(): Simplify the message.
fseg_create_general(): Do not emit messages to the error log.
innobase_init(): Revert the changes.
trx_rseg_create(): Refactor (no functional change).
Problem was that all doublewrite buffer pages must fit to first
system datafile.
Ported commit 27a34df7882b1f8ed283f22bf83e8bfc523cbfde
Author: Shaohua Wang <shaohua.wang@oracle.com>
Date: Wed Aug 12 15:55:19 2015 +0800
BUG#21551464 - SEGFAULT WHILE INITIALIZING DATABASE WHEN
INNODB_DATA_FILE SIZE IS SMALL
To 10.1 (with extended error printout).
btr_create(): If ibuf header page allocation fails report error and
return FIL_NULL. Similarly if root page allocation fails return a error.
dict_build_table_def_step: If fsp_header_init fails return
error code.
fsp_header_init: returns true if header initialization succeeds
and false if not.
fseg_create_general: report error if segment or page allocation fails.
innobase_init: If first datafile is smaller than 3M and could not
contain all doublewrite buffer pages report error and fail to
initialize InnoDB plugin.
row_truncate_table_for_mysql: report error if fsp header init
fails.
srv_init_abort: New function to report database initialization errors.
srv_undo_tablespaces_init, innobase_start_or_create_for_mysql: If
database initialization fails report error and abort.
trx_rseg_create: If segment header creation fails return.
Problem was that FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION field that for
encrypted pages even in system datafiles should contain key_version
except very first page (0:0) is after encryption overwritten with
flush lsn.
Ported WL#7990 Repurpose FIL_PAGE_FLUSH_LSN to 10.1
The field FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION is consulted during
InnoDB startup.
At startup, InnoDB reads the FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
from the first page of each file in the InnoDB system tablespace.
If there are multiple files, the minimum and maximum LSN can differ.
These numbers are passed to InnoDB startup.
Having the number in other files than the first file of the InnoDB
system tablespace is not providing much additional value. It is
conflicting with other use of the field, such as on InnoDB R-tree
index pages and encryption key_version.
This worklog will stop writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION to
other files than the first file of the InnoDB system tablespace
(page number 0:0) when system tablespace is encrypted. If tablespace
is not encrypted we continue writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
to all first pages of system tablespace to avoid unnecessary
warnings on downgrade.
open_or_create_data_files(): pass only one flushed_lsn parameter
xb_load_tablespaces(): pass only one flushed_lsn parameter.
buf_page_create(): Improve comment about where
FIL_PAGE_FIL_FLUSH_LSN_OR_KEY_VERSION is set.
fil_write_flushed_lsn(): A new function, merged from
fil_write_lsn_and_arch_no_to_file() and
fil_write_flushed_lsn_to_data_files().
Only write to the first page of the system tablespace (page 0:0)
if tablespace is encrypted, or write all first pages of system
tablespace and invoke fil_flush_file_spaces(FIL_TYPE_TABLESPACE)
afterwards.
fil_read_first_page(): read flush_lsn and crypt_data only from
first datafile.
fil_open_single_table_tablespace(): Remove output of LSN, because it
was only valid for the system tablespace and the undo tablespaces, not
user tablespaces.
fil_validate_single_table_tablespace(): Remove output of LSN.
checkpoint_now_set(): Use fil_write_flushed_lsn and output
a error if operation fails.
Remove lsn variable from fsp_open_info.
recv_recovery_from_checkpoint_start(): Remove unnecessary second
flush_lsn parameter.
log_empty_and_mark_files_at_shutdown(): Use fil_writte_flushed_lsn
and output error if it fails.
open_or_create_data_files(): Pass only one flushed_lsn variable.
Significantly reduce the amount of InnoDB, XtraDB and Mariabackup
code changes by defining pfs_os_file_t as something that is
transparently compatible with os_file_t.
This fixes warnings that were emitted when running InnoDB test
suites on a debug server that was compiled with GCC 7.1.0 using
the flags -O3 -fsanitize=undefined.
thd_requested_durability(): XtraDB can call this with trx->mysql_thd=NULL.
Remove the function in InnoDB, because it is not used there.
calc_row_difference(): Do not call memcmp(o_ptr, NULL, 0).
innobase_index_name_is_reserved(): This can be called with
key_info=NULL, num_of_keys=0.
innobase_dropping_foreign(), innobase_check_foreigns_low(),
innobase_check_foreigns(): This can be called with
drop_fk=NULL, n_drop_fk=0.
rec_convert_dtuple_to_rec_comp(): Do not invoke memcpy(end, NULL, 0).
The parameter thr of the function btr_cur_optimistic_insert()
is not declared as nonnull, but GCC 7.1.0 with -O3 is wrongly
optimizing away the first part of the condition
UNIV_UNLIKELY(thr && thr_get_trx(thr)->fake_changes)
when the function is being called by row_merge_insert_index_tuples()
with thr==NULL.
The fake_changes is an XtraDB addition. This GCC bug only appears
to have an impact on XtraDB, not InnoDB.
We work around the problem by not attempting to dereference thr
when both BTR_NO_LOCKING_FLAG and BTR_NO_UNDO_LOG_FLAG are set
in the flags. Probably BTR_NO_LOCKING_FLAG alone should suffice.
btr_cur_optimistic_insert(), btr_cur_pessimistic_insert(),
btr_cur_pessimistic_update(): Correct comments that disagree with
usage and with nonnull attributes. No other parameter than thr can
actually be NULL.
row_ins_duplicate_error_in_clust(): Remove an unused parameter.
innobase_is_fake_change(): Unused function; remove.
ibuf_insert_low(), row_log_table_apply(), row_log_apply(),
row_undo_mod_clust_low():
Because we will be passing BTR_NO_LOCKING_FLAG | BTR_NO_UNDO_LOG_FLAG
in the flags, the trx->fake_changes flag will be treated as false,
which is the right thing to do at these low-level operations
(change buffer merge, ALTER TABLE…LOCK=NONE, or ROLLBACK).
This might be fixing actual XtraDB bugs.
Other callers that pass these two flags are also passing thr=NULL,
implying fake_changes=false. (Some callers in ROLLBACK are passing
BTR_NO_LOCKING_FLAG and a nonnull thr. In these callers, fake_changes
better be false, to avoid corruption.)
Do not silence uncertain cases, or fix any bugs.
The only functional change should be that ha_federated::extra()
is not calling DBUG_PRINT to report an unhandled case for
HA_EXTRA_PREPARE_FOR_DROP.
Do not silence uncertain cases, or fix any bugs.
The only functional change should be that ha_federated::extra()
is not calling DBUG_PRINT to report an unhandled case for
HA_EXTRA_PREPARE_FOR_DROP.