fails with ERROR_INVALID_FUNCTION
This DeviceIoControl failure seems to happen on different boxes from
time to time, and there is not much the user can do about it.
Instead of an error, log a single INFO message, so that it does not
disturb users much.
row_ins_check_foreign_constraint(): On timeout,
return DB_LOCK_WAIT_TIMEOUT instead of DB_LOCK_WAIT,
so that the lock wait will be properly terminated.
Also, replace some redundant assignments.
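A minimal sketch of that error-code mapping, with an illustrative enum
standing in for InnoDB's dberr_t (not the actual function):

  enum class err_t { SUCCESS, LOCK_WAIT, LOCK_WAIT_TIMEOUT };

  static err_t fk_check_result_sketch(bool must_wait, bool timed_out)
  {
      if (!must_wait) {
          return err_t::SUCCESS;
      }
      /* Returning LOCK_WAIT here after a timeout would leave the lock
      wait improperly terminated; report the timeout directly. */
      return timed_out ? err_t::LOCK_WAIT_TIMEOUT : err_t::LOCK_WAIT;
  }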
It looks like this bug was introduced in MySQL 5.7.8 by:
commit a97f6b91227c7e0fc3151cfe5421891e79c12d19
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Tue Jun 9 16:02:31 2015 +0530
Bug #20953265 INNODB: FAILING ASSERTION: RESULT != FTS_INVALID
MDEV-13498 is a performance regression that was introduced in MariaDB 10.2.2
by commit fec844aca8
which introduced some Galera-specific conditions that were being
evaluated even if the write-set replication was not enabled.
MDEV-13246 Stale rows despite ON DELETE CASCADE constraint
is a correctness regression that was introduced by the same commit.
In particular, the subcondition
!(parent && que_node_get_type(parent) == QUE_NODE_UPDATE)
which is equivalent to
!parent || que_node_get_type(parent) != QUE_NODE_UPDATE
makes little sense. If parent==NULL, the evaluation would proceed to the
std::find() expression, which would dereference parent. Because no SIGSEGV
was observed related to this, we can conclude that parent!=NULL always
holds. But then, the condition would be equivalent to
que_node_get_type(parent) != QUE_NODE_UPDATE
which would not make sense either, because the std::find() expression
is actually assuming the opposite when casting parent to upd_node_t*.
It looks like this condition never worked properly, or that
it was never properly tested, or both.
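A small self-contained check of the claimed equivalence, with plain
booleans standing in for the real que_node_t test (illustration only,
not InnoDB code):

  #include <cassert>

  int main()
  {
      const bool vals[] = {false, true};
      for (bool parent_nonnull : vals) {
          for (bool parent_is_update : vals) {
              /* De Morgan: !(a && b) is the same as (!a || !b) */
              const bool original  = !(parent_nonnull && parent_is_update);
              const bool rewritten = !parent_nonnull || !parent_is_update;
              assert(original == rewritten);
          }
      }
      return 0;
  }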
wsrep_must_process_fk(): Helper function to check if FOREIGN KEY
constraints need to be processed. Only evaluate the costly std::find()
expression when write-set replication is enabled.
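A hedged sketch of the short-circuit idea; the real helper operates on
InnoDB query graph and trx objects, which are reduced to simple
stand-ins here. Because && evaluates left to right, the std::find()
scan is skipped entirely unless write-set replication is enabled:

  #include <algorithm>
  #include <vector>

  static bool must_process_fk_sketch(bool wsrep_on,
                                     const std::vector<int>& updated_cols,
                                     int col)
  {
      /* The costly scan runs only when wsrep_on is true. */
      return wsrep_on
          && std::find(updated_cols.begin(), updated_cols.end(), col)
             != updated_cols.end();
  }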
Also, rely on operator<<(std::ostream&, const id_name_t&) and
operator<<(std::ostream&, const table_name_t&) for pretty-printing
index and table names.
row_upd_sec_index_entry(): Add !wsrep_thd_is_BF() to the condition.
This is applying part of "Galera MW-369 FK fixes"
f37b79c6da
that is described by the following part of the commit comment:
additionally: skipping wsrep_row_upd_check_foreign_constraint if thd has
BF, essentially is applier or replaying
This FK check would be needed only for populating parent row FK keys
in write set, so no use for appliers
trx_set_rw_mode(): Check the flag high_level_read_only instead
of testing srv_force_recovery (innodb_force_recovery) directly.
There is no need to prevent the creation of read-write transactions
if innodb_force_recovery=3 is used. Yes, in that mode any recovered
incomplete transactions will not be rolled back, but these transactions
will continue to hold locks on the records that they have modified.
If the new read-write transactions hit conflicts with already existing
(possibly recovered) transactions, the lock wait timeout mechanism
will work just fine.
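A minimal sketch of the changed check; the boolean parameter stands in
for the real global flag, and this is not the actual trx_set_rw_mode():

  static bool rw_trx_allowed_sketch(bool high_level_read_only)
  {
      /* Refuse read-write transactions only in true read-only mode;
      innodb_force_recovery=3 alone no longer blocks them. */
      return !high_level_read_only;
  }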
buf_page_io_complete(): Do not test bpage for NULL, because
it is declared (and always passed) as nonnull.
buf_flush_batch(): Remove the constant local variable count=0.
fil_ibd_load(): Use magic comment to suppress -Wimplicit-fallthrough.
ut_stage_alter_t::inc(ulint): Disable references to an unused parameter.
lock_queue_validate(), sync_array_find_thread(), rbt_check_ordering():
Define only in debug builds.
Only a relevant subset of the InnoDB changes was merged.
In particular, two follow-up bug fixes for the bugs that
were introduced in 5.7.18 but not MariaDB 10.2.7 were omitted.
Because MariaDB 10.2.7 omitted the risky change
Bug#23481444 OPTIMISER CALL ROW_SEARCH_MVCC() AND READ THE INDEX
APPLIED BY UNCOMMITTED ROWS
we do not need the follow-up fixes that were introduced in
MySQL 5.6.37 and MySQL 5.7.19:
Bug#25175249 ASSERTION: (TEMPL->IS_VIRTUAL && !FIELD) || ...
Bug#25793677 INNODB: FAILING ASSERTION: CLUST_TEMPL_FOR_SEC || LEN
Analysis:
=========
During ALTER TABLE rebuild, InnoDB fails to apply the concurrent
insert log. If an insert log record spans blocks, the apply phase
tries to access the next block without fetching it.
Fix:
====
During virtual column parsing, check whether the record spans blocks
before accessing the virtual column information.
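A hedged sketch of such a boundary check, using plain pointers rather
than the actual row-log parser:

  #include <cstddef>

  /* Returns true when the virtual column information of the record
  fits inside the current block; otherwise the caller must fetch the
  next block before parsing it. */
  static bool vcol_info_fits_in_block_sketch(const unsigned char* vcol_start,
                                             std::size_t vcol_len,
                                             const unsigned char* block_end)
  {
      return vcol_len <= static_cast<std::size_t>(block_end - vcol_start);
  }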
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 16243
Analysis:
========
During ALTER TABLE rebuild, InnoDB fails to apply the concurrent
delete log.
Parsing and validation of a merge record happen while applying the
log operation on a table. The validation goes wrong for virtual
columns: it assumes that the virtual column information cannot be at
the end of the merge record.
Fix:
====
The virtual column information can in fact be at the end of the
merge record; row_log_table_delete() writes it at the end.
Reviewed-by: Satya Bodapati <satya.bodapati@oracle.com>
RB: 16155
MariaDB 10.2 never contained the Oracle change
Bug#23481444 OPTIMISER CALL ROW_SEARCH_MVCC() AND READ THE
INDEX APPLIED BY UNCOMMITTED ROWS
because it was considered risky for a GA release and incomplete.
Remove the references that were added when merging MySQL 5.6.36
to MariaDB 10.0.31, 10.1.24, and 10.2.7.
Analysis:
========
(1) During TRUNCATE of a file_per_table tablespace, dict_operation_lock
is released before the eviction of the tablespace's dirty pages from
the buffer pool. After the eviction, we try to re-acquire
dict_operation_lock (a higher-level latch) while already holding a
lower-level latch (index->lock). This violates the latching order.
(2) A deadlock is possible if a child table is being truncated while
it holds its index lock, and at the same time a cascading DML
operation acquires dict_operation_lock and waits for that index lock.
Fix:
====
1) Release the index locks before releasing the dict operation lock,
as illustrated in the sketch below.
2) Ignore the cascading DML operation on the parent table, for the
cascading foreign key, if the child table is truncated or is in the
process of being truncated.
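A hedged illustration of the latching rule behind fix (1), with
std::mutex standing in for dict_operation_lock and index->lock; the
point is simply that the lower-level latch must not be held when the
higher-level one is (re-)acquired:

  #include <mutex>

  std::mutex dict_operation_lock_sketch;   /* higher-level latch */
  std::mutex index_lock_sketch;            /* lower-level latch */

  static void truncate_step_sketch()
  {
      index_lock_sketch.lock();
      /* ... evict dirty pages of the tablespace ... */
      index_lock_sketch.unlock();          /* drop the lower latch first */
      dict_operation_lock_sketch.lock();   /* only then take the higher one */
      /* ... dictionary operation ... */
      dict_operation_lock_sketch.unlock();
  }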
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Reviewed-by: Kevin Lewis <kevin.lewis@oracle.com>
RB: 16122
Issue
====
The issue is that the info message that InnoDB prints when a table is
created with a reference to a nonexistent table fills up the log,
because it is printed for every insert when foreign_key_checks is
disabled.
Fix
===
The fix is to display the message only if foreign_key_checks is
enabled.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Problem:
=======
Offsets allocate memory from row_heap even for the traversal of
deleted rows during table rebuild.
Solution:
=========
Empty the row_heap even for deleted records, so that the offsets do
not allocate memory every time.
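A hedged illustration of the reuse pattern, with a std::vector
standing in for the row_heap that backs the offsets:

  #include <cstddef>
  #include <vector>

  static void apply_rows_sketch(std::size_t n_rows)
  {
      std::vector<unsigned> offsets_scratch;   /* stands in for row_heap */
      for (std::size_t i = 0; i < n_rows; i++) {
          /* "Empty the heap" for every record, deleted or not, so the
          scratch memory is reused instead of growing on each row. */
          offsets_scratch.clear();
          offsets_scratch.push_back(static_cast<unsigned>(i));
          /* ... apply or skip the (possibly deleted) record ... */
      }
  }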
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 15694
Cherry-pick the commit from MySQL 5.7.19, and adapt the test case:
commit 45c933ac19c73a3e9c756a87ee1ba18ba1ac564c
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date: Tue Mar 21 10:31:43 2017 +0530
Bug #25189192 ERRORS WHEN RESTARTING MYSQL AFTER RENAME TABLE.
PROBLEM
While renaming a table, InnoDB does not update the InnoDB dictionary
table INNODB_SYS_DATAFILES in case the database changes as part of
the rename. Hence, on a restart the server log shows errors that it
could not find the table under its old (pre-rename) path, even though
the table has actually been renamed. The errors would only vanish if
we update the system tablespace.
FIX
Update the InnoDB dictionary table with the new path in the case
where not only the table but also the database holding the table
changes.
Reviewed-by: Jimmy Yang <Jimmy.Yang@oracle.com>
RB: 15751
Revert the following change, because Memcached is not present
in MariaDB Server. We had better avoid adding dead code.
commit d9bc5e03d788b958ce8c76e157239953db60adb2
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date: Thu May 18 14:31:01 2017 +0530
Bug #24605783 MYSQL GOT SIGNAL 6 ASSERTION FAILURE
The thd_destructor_proxy detects that no transactions are active and
starts srv_shutdown_bg_undo_sources(), but fails to take into account
that new transactions can still start, especially by the slave but
also by other threads. In addition, there is no mutex protection when
checking for active transactions, so this is not safe.
We relax the failing InnoDB debug assertion by allowing the execution
of user transactions after the purge thread has been shut down.
FIXME: If innodb_fast_shutdown=0, we should somehow guarantee that no
new transactions can start after thd_destructor_proxy observed that
trx_sys_any_active_transactions() did not hold.
row_update_for_mysql(): Remove the wrapper function and
rename the function from row_update_for_mysql_using_upd_graph().
Remove the unused parameter mysql_rec.
If the latest InnoDB redo log checkpoint was stored in the
first checkpoint slot and not the second one, InnoDB would
incorrectly set log_sys->log.lsn to the previous checkpoint.
It is possible that this logic error did not exist before
commit 86927cc712, which
removed traces of multiple InnoDB redo logs, to prepare for
MDEV-12548 (Mariabackup for MariaDB 10.2). In the worst case,
this error could mean that InnoDB unnecessarily fails to
recover from redo log when the last-but-one checkpoint was
overwritten, but the last checkpoint is intact.
recv_find_max_checkpoint(), recv_find_max_checkpoint_0():
Do not overwrite the fields of log_sys->log with the information
of an older checkpoint.
recv_find_max_checkpoint(): Do not return DB_SUCCESS on an error.
recv_recovery_from_checkpoint_start(): Return early if the log is
in a version-tagged format but not in the latest format. (In this case,
the log must be logically empty, and there is nothing to apply.)
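A hedged sketch of the slot-selection logic, with simplified types
rather than the actual recv_find_max_checkpoint() signature: only the
newest valid checkpoint may populate the output, so an older slot can
never overwrite a newer one, and finding no valid slot is an error
rather than silent success.

  #include <cstdint>
  #include <optional>

  struct checkpoint_sketch { std::uint64_t lsn; std::uint64_t offset; };

  static std::optional<checkpoint_sketch>
  pick_latest_checkpoint_sketch(const std::optional<checkpoint_sketch>& slot1,
                                const std::optional<checkpoint_sketch>& slot2)
  {
      const std::optional<checkpoint_sketch> slots[] = {slot1, slot2};
      std::optional<checkpoint_sketch> best;
      for (const auto& slot : slots) {
          if (slot && (!best || slot->lsn > best->lsn)) {
              best = slot;   /* keep only the most recent valid checkpoint */
          }
      }
      return best;           /* empty result means error, not success */
  }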
Always read the full page 0 to determine whether the tablespace
contains encryption metadata. Tablespaces that are page compressed,
or page compressed and encrypted, do not get their checksum compared,
because it does not exist. For encrypted tables, use the checksum
verification written for encrypted tables; for normal tables, use the
normal method.
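A hedged sketch of that verification dispatch; the enum and helper are
stand-ins, not the actual fil/buf routines:

  enum class page_type_sketch { PAGE_COMPRESSED, ENCRYPTED, NORMAL };

  static bool verify_page_sketch(page_type_sketch type,
                                 bool encrypted_checksum_ok,
                                 bool normal_checksum_ok)
  {
      switch (type) {
      case page_type_sketch::PAGE_COMPRESSED:
          return true;   /* no checksum is stored, nothing to compare */
      case page_type_sketch::ENCRYPTED:
          return encrypted_checksum_ok;
      case page_type_sketch::NORMAL:
      default:
          return normal_checksum_ok;
      }
  }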
buf_page_is_checksum_valid_crc32
buf_page_is_checksum_valid_innodb
buf_page_is_checksum_valid_none
Modify Innochecksum logging to file to avoid compilation
warnings.
fil0crypt.cc fil0crypt.h
Modify so that they can be used in the innochecksum compilation, and
move fil_space_verify_crypt_checksum to the end of the file.
Add innochecksum logging to file.
univ.i
Add innochecksum strict_verify, log_file and cur_page_num
variables as extern.
page_zip_verify_checksum
Add innochecksum logging to file and remove unnecessary code.
innochecksum.cc
Many changes; most notably, the ability to read encryption
metadata from page 0 of the tablespace.
Added a test case where we intentionally corrupt:
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (encryption key version)
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION+4 (post encryption checksum)
FIL_DATA+10 (data)
The problem was introduced with the InnoDB 5.7 merge, in the code
related to avoiding an extra fsync at the end of commit when the
binlog is enabled. The MariaDB method for this was removed, but the
replacement MySQL method based on thd_get_durability_property() is
not functional in MariaDB.
This commit reverts the offending parts of the merge and adds a test
case, to fix the problem for InnoDB. However, other storage engines
are likely to have a similar problem.
The debug flag recv_no_log_write prohibits writes of redo log records for
modifying page data. The debug assertion was failing when fil_names_clear()
was writing the informative MLOG_FILE_NAME and MLOG_CHECKPOINT records
which do not modify any data.
log_reserve_and_open(), log_write_low(): Remove the debug assertion.
log_pad_current_log_block(), mtr_write_log(),
mtr_t::Command::prepare_write(): Add the debug assertion.
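A hedged sketch of where the assertion now lives; recv_no_log_write is
the real flag name, but the two functions below are stand-ins, not the
listed InnoDB functions. The check stays on the paths used by ordinary
page-modifying mini-transactions, while the low-level append path used
for the informational records no longer asserts:

  #include <cassert>

  bool recv_no_log_write_sketch = false;  /* stands in for recv_no_log_write */

  static void write_page_change_redo_sketch()
  {
      /* Ordinary mini-transactions that modify page data still assert. */
      assert(!recv_no_log_write_sketch);
      /* ... append the page-modifying redo record ... */
  }

  static void write_file_name_record_sketch()
  {
      /* No assertion: MLOG_FILE_NAME/MLOG_CHECKPOINT modify no page data. */
      /* ... append the informational record ... */
  }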
During InnoDB startup, change buffer merge operations are prohibited
before recv_apply_hashed_log_recs(true), which performs the last phase
of redo log apply. Before this call, ibuf_init_at_db_start() would be
invoked, and it could trigger the debug assertion.
ibuf_init_at_db_start(): Do not declare the mini-transaction as
"inside change buffer", because nothing is being written in the
mini-transaction. The purpose of this function is only to initialize
the memory data structures from the persistent data structures.
This is basically a port of WL#6045 (Improve Innochecksum) with some
code refactoring of innochecksum.
Added the page0size.h include from 10.2 to keep the 10.1 and 10.2
innochecksum as identical as possible.
Added page 0 checksum checking; if that fails, the whole test fails.
Always read the full page 0 to determine whether the tablespace
contains encryption metadata. Tablespaces that are page compressed,
or page compressed and encrypted, do not get their checksum compared,
because it does not exist. For encrypted tables, use the checksum
verification written for encrypted tables; for normal tables, use the
normal method.
buf_page_is_checksum_valid_crc32
buf_page_is_checksum_valid_innodb
buf_page_is_checksum_valid_none
Add Innochecksum logging to file
buf_page_is_corrupted
Remove ib_logf and page_warn_strict_checksum
calls in innochecksum compilation. Add innochecksum
logging to file.
fil0crypt.cc fil0crypt.h
Modify so that they can be used in the innochecksum compilation, and
move fil_space_verify_crypt_checksum to the end of the file.
Add innochecksum logging to file.
univ.i
Add innochecksum strict_verify, log_file and cur_page_num
variables as extern.
page_zip_verify_checksum
Add innochecksum logging to file.
innochecksum.cc
Many changes; most notably, the ability to read encryption
metadata from page 0 of the tablespace.
Added a test case where we intentionally corrupt:
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (encryption key version)
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION+4 (post encryption checksum)
FIL_DATA+10 (data)
Following the merge from 5.6.36, this merge also rejects changes that
collided with the rejection of 6ca4f693c1ce472e2b1bf7392607c2d1124b4293.
We had initially rejected 6ca4f693c1ce472e2b1bf7392607c2d1124b4293
because it introduced a new storage engine API method.
The problem was that dict_sys->size tries to track the memory
occupied by the data dictionary table and index objects.
However, at least for table objects, the table->heap size can grow
between the time a table object is inserted into dict_sys and the
time it is removed from dict_sys, causing an inconsistency in the
amount of memory added to and removed from the dict_sys->size
variable.
Removed the unnecessary dict_sys->size variable, as it is really
used only for status output.
Introduced the dict_sys_get_size function to calculate the memory
occupied by the data dictionary table and index objects; it is then
used in the SHOW ENGINE INNODB STATUS output.
dict_table_add_to_cache(),
dict_table_rename_in_cache(),
dict_table_remove_from_cache_low(),
dict_index_remove_from_cache_low(),
Remove size calculation.
srv_printf_innodb_monitor(): Use dict_sys_get_size function to
get dictionary memory allocated.
xtradb_internal_hash_tables_fill_table(): Use dict_sys_get_size
function to get dictionary memory allocated.
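A hedged sketch of the on-demand calculation behind
dict_sys_get_size(); the structures are simplified stand-ins for the
dictionary cache:

  #include <cstddef>
  #include <vector>

  struct table_sketch {
      std::size_t table_heap_size;   /* memory held by the table object */
      std::size_t index_heap_size;   /* memory held by its index objects */
  };

  static std::size_t dict_sys_get_size_sketch(
      const std::vector<table_sketch>& cached_tables)
  {
      std::size_t size = 0;
      for (const auto& t : cached_tables) {
          size += t.table_heap_size + t.index_heap_size;
      }
      return size;   /* computed on demand, so it cannot drift */
  }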
Crashes with innodb_page_size=64K; does not crash with page sizes
<= 32K.
The problem was that when a BLOB record that was previously < 16K is
enlarged by an update so that its length exceeds 16K, it should be
stored externally. However, that was not enforced when the page size
is 64K (note that 16K+1 < 64K/2, i.e. half of the B-tree leaf page).
btr_cur_optimistic_update(): limit the maximum record size to 16K,
or to 16K-1 in the REDUNDANT row format.
In all InnoDB row formats, the pointers or lengths stored in the record
header can be at most 14 bits, that is, count up to 16383.
In ROW_FORMAT=REDUNDANT, this limits the maximum possible record length
to 16383 bytes. In other row formats, it could merely limit the maximum
length of variable-length fields.
When MySQL 5.7 introduced innodb_page_size=32k and 64k, the maximum
record length was limited to 16383 bytes (I hope 16383, not 16384,
to be able to distinguish from a record whose length is 0 bytes).
This change is present in MariaDB Server 10.2.
btr_cur_optimistic_update(): Restrict maximum record size to 16K-1
for REDUNDANT and 64K page size.
dict_index_too_big_for_tree(): The maximum allowed record size
is half a B-tree page or 16K(-1 for REDUNDANT) for 64K page size.
convert_error_code_to_mysql(): Fix error message to print
correct limits.
my_error_innodb(): Fix error message to print correct limits.
page_zip_rec_needs_ext() : record size was already restricted to 16K.
Restrict REDUNDANT to 16K-1.
rem0rec.h: Introduce REDUNDANT_REC_MAX_DATA_SIZE (16K-1)
and COMPRESSED_REC_MAX_DATA_SIZE (16K).
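The arithmetic behind these limits as a small self-contained check;
the constant names follow the commit text with a _SKETCH suffix, and
the values are the plain 14-bit arithmetic, not copied from rem0rec.h:

  #include <cassert>

  constexpr unsigned REDUNDANT_REC_MAX_DATA_SIZE_SKETCH  = (1u << 14) - 1;
  constexpr unsigned COMPRESSED_REC_MAX_DATA_SIZE_SKETCH = 1u << 14;

  int main()
  {
      /* A 14-bit length field counts up to 2^14 - 1 = 16383 bytes. */
      assert(REDUNDANT_REC_MAX_DATA_SIZE_SKETCH  == 16383);   /* 16K - 1 */
      assert(COMPRESSED_REC_MAX_DATA_SIZE_SKETCH == 16384);   /* 16K */
      return 0;
  }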
This is a bogus debug assertion failure that should be possible
starting with MariaDB 10.2.2 (which merged WL#7142 via MySQL 5.7.9).
While generating page-change redo log records is strictly out of the
question during certain parts of crash recovery,
fil_names_clear() is only emitting informational MLOG_FILE_NAME
and MLOG_CHECKPOINT records to guarantee that if the server is killed
during or soon after the crash recovery, subsequent crash recovery
will be possible.
The metadata buffer that fil_names_clear() is flushing to the redo log
is being filled by recv_init_crash_recovery_spaces(), right before
starting to apply redo log, by invoking fil_names_dirty() on every
discovered tablespace for which there are changes to apply.
When it comes to Mariabackup (xtrabackup --prepare), it is strictly out
of the question to generate any redo log whatsoever, because that could
break the restore of incremental backups by causing LSN deviation.
So, the fil_names_dirty() call must be skipped when restoring backups.
recv_recovery_from_checkpoint_start(): Do not invoke fil_names_clear()
when restoring a backup.
mtr_t::commit_checkpoint(): Remove the failing assertion. The only
caller is fil_names_clear(), and it must be called by
recv_recovery_from_checkpoint_start() for normal server startup to be
crash-safe. The debug assertion in mtr_t::commit() will still
catch rogue redo log writes.
The option innodb_log_compressed_pages was contributed by
Facebook to MySQL 5.6. It was disabled in the 5.6.10 GA release
due to problems that were fixed in 5.6.11, which is when the
option was enabled.
The option was set to innodb_log_compressed_pages=ON by default
(disabling the feature), because safety was considered more
important than speed. The option innodb_log_compressed_pages=OFF
can *CORRUPT* ROW_FORMAT=COMPRESSED tables on crash recovery
if the zlib deflate function is behaving differently (producing
a different amount of compressed data) from how it behaved
when the redo log records were written (prior to the crash recovery).
In MDEV-6935, the default value was changed to
innodb_log_compressed_pages=OFF. This is inherently unsafe, because
there are very many different environments where MariaDB can be
running, using different zlib versions. While zlib can decompress
data just fine, there are no guarantees that different versions will
always compress the same data to the exactly same size. To avoid
problems related to zlib upgrades or version mismatch, we must
use a safe default setting.
This will reduce the write performance for users of
ROW_FORMAT=COMPRESSED tables. If you configure
innodb_log_compressed_pages=ON, please make sure that you will
always cleanly shut down InnoDB before upgrading the server
or zlib.
Since MariaDB 10.2.2, temporary table metadata is not written
to the InnoDB data dictionary tables. Therefore,
the DICT_TF2_TEMPORARY flag cannot be set in SYS_TABLES,
except if there exist orphan temporary tables that were created
before MariaDB 10.2.2.
trx_resurrect_table_locks(): Do not skip temporary tables.
If a resurrected transaction modified a temporary table that was
created before MariaDB 10.2.2, that table would be treated
internally as a persistent table. It is safer to resurrect
locks than to skip the table, because the table would be modified
on transaction rollback.
buf_flush_page_cleaner_coordinator: In the first loop, use an
appropriate termination condition, waiting for !recv_writer_thread_active.
logs_empty_and_mark_files_at_shutdown(): Signal recv_sys->flush_start
in case the recv_writer_thread was never started, or
buf_flush_page_cleaner_coordinator failed to notice its termination.
innobase_start_or_create_for_mysql(): Remove a redundant, unreachable
condition, and properly release resources when aborting startup due to
recv_sys->found_corrupt_log.
When opening a 10.1 (or older) table that has virtual columns:
1. don't error out if it has vcols over autoinc columns;
   just issue a warning.
2. set the vcol type properly.
3. in InnoDB: use table->s->stored_fields instead of table->s->fields,
   because that is what was stored in the InnoDB data dictionary.