The predicate page_is_root(), which was added in MariaDB Server 10.2.2,
is based on a wrong assumption.
Under some circumstances, InnoDB can transform B-trees into a degenerate
state where a non-leaf page has no sibling pages. Because of this,
we cannot assume that a page that has no siblings is the root page.
This bug will be tracked as MDEV-19022.
Because of the bug that may affect many InnoDB data files, we must remove
and replace the wrong predicate. Using the wrong predicate can cause
corruption. A leaf page is not allowed to be empty except if it is the
root page, and the entire table is empty.
row_ins_foreign_fill_virtual(): Construct update->old_vrow
with ROW_COPY_DATA instead of ROW_COPY_POINTERS. With the latter,
the object would be pointing to a buffer pool page frame. That page
frame can become stale and invalid as soon as
row_ins_foreign_check_on_constraint() invokes mtr_t::commit().
Most of the time, the pointer target is not going to be overwritten
by anything, and everything appears to work correctly.
Buffer pool page replacement is highly unlikely, and any pessimistic
operation that would overwrite the old location of the record is only
slightly more likely. It is not known whether there is an actual bug.
This came up while diagnosing MDEV-18879 in MariaDB 10.3.
row_ins_foreign_check_on_constraint(): When constructing
cascade->historical_row for tables WITH SYSTEM VERSIONING,
use the appropriate mode ROW_COPY_DATA, because the pointers
will be stale after mtr_commit() is invoked.
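
The difference between copying pointers and copying data can be
illustrated with a minimal sketch. All names below (PageFrame, FieldRef,
FieldCopy, copy_pointers, copy_data, overwrite_frame) are hypothetical
stand-ins and not InnoDB types; the frame is kept alive and merely
overwritten, to imitate a page frame being reused after the
mini-transaction commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a buffer pool page frame (not an InnoDB type).
struct PageFrame {
  char bytes[8];
};

// Analogous to copying pointers: the field only refers to the frame.
struct FieldRef {
  const char* data;
  std::size_t len;
  std::string value() const { return std::string(data, len); }
};

// Analogous to copying data: the field owns its bytes.
struct FieldCopy {
  std::string bytes;
  std::string value() const { return bytes; }
};

inline FieldRef copy_pointers(const PageFrame& f, std::size_t len) {
  return FieldRef{f.bytes, len};
}

inline FieldCopy copy_data(const PageFrame& f, std::size_t len) {
  return FieldCopy{std::string(f.bytes, len)};
}

// Imitates the frame contents changing once the page latch is released.
// src must point to at least sizeof f.bytes bytes.
inline void overwrite_frame(PageFrame& f, const char* src) {
  std::memcpy(f.bytes, src, sizeof f.bytes);
}
```

After overwrite_frame(), a FieldRef observes the new bytes, while a
FieldCopy still holds the value that was read under the latch.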
row_drop_tables_for_mysql_in_background(): Copy the table name
before closing the table handle, to avoid heap-use-after-free if
another thread succeeds in dropping the table before
row_drop_table_for_mysql_in_background() completes the table name lookup.
dict_mem_create_temporary_tablename(): With innodb_safe_truncate=ON
(the default), generate a simple, unique, collision-free table name
using only the id, no pseudorandom component. This is safe, because
on startup, we will drop any #sql tables that might exist in InnoDB.
This is a backport from 10.3. It should have been backported already
as part of backporting MDEV-14717 and MDEV-14585, which were prerequisites
for the MDEV-13564 backup-friendly TRUNCATE TABLE.
This seems to reduce the chance of table creation failures in
ha_innobase::truncate().
ha_innobase::truncate(): Do not invoke close(), but instead
mimic it, so that we can restore to the original table handle
in case opening the truncated copy of the table failed.
In 10.3, all records will be processed by purge due to MDEV-12288.
However, the insert undo records do not contain a transaction identifier.
row_purge_parse_undo_rec(): Use node->trx_id=TRX_ID_MAX for the
insert undo records. We cannot skip table lookups for these records
after DISCARD TABLESPACE other than by 'detaching' the table from
the undo logs by updating SYS_TABLES.ID on both DISCARD TABLESPACE
and IMPORT TABLESPACE.
Also, remove a redundant condition that was introduced
in the merge commit 814205f306.
row_merge_create_fts_sort_index(): Initialize dict_col_t in
an unambiguous way. GCC 6 and later appear to be able to optimize
away the memset() that is part of mem_heap_zalloc() in the
placement new call. Let us avoid using placement new in order
to ensure that the objects will actually be initialized.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388
https://gcc.gnu.org/ml/gcc/2016-02/msg00207.html
While the latter reference hints that the optimization is only
applicable to non-POD types (and dict_col_t does not define
any member functions before 10.2), it is most consistent to
use the same initialization across all versions.
purge_node_t::in_progress: Replaces purge_node_t::done.
Only present in debug builds.
purge_node_t::start(): Moved from the start of row_purge_step().
purge_node_t::end(): Replaces row_purge_end().
trx_purge_attach_undo_recs(): Omit a check from non-debug builds.
If a table has been dropped, rebuilt, or its tablespace has been
discarded or the table is corrupted, it does not make sense to
look up that table again while purging old undo log records.
purge_node_t::purge_node_t(): Replaces row_purge_node_create().
que_common_t::que_common_t(): Constructor.
row_import_update_index_root(): Remove the constant parameter
dict_locked=true, and update the table->def_trx_id in the cache.
purge_node_t::unavailable_table_id: The latest unavailable table ID,
to avoid future lookups.
purge_node_t::def_trx_id: The latest modification of the table
identified by unavailable_table_id, or TRX_ID_MAX.
purge_node_t::is_skipped(): Determine if a table should be skipped.
purge_node_t::skip(): Note that a table should be skipped.
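
A minimal sketch of how such a skip cache could behave follows. The
struct name purge_skip_cache, the constant SKETCH_TRX_ID_MAX and the
exact comparison in is_skipped() are assumptions made for illustration;
only the member names unavailable_table_id, def_trx_id, is_skipped()
and skip() come from the description above.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t table_id_t;
typedef std::uint64_t trx_id_t;

// Placeholder for "no upper limit"; the real TRX_ID_MAX may differ.
const trx_id_t SKETCH_TRX_ID_MAX = UINT64_MAX;

// Hypothetical reduction of the purge node to its skip-related state.
struct purge_skip_cache {
  // The latest table ID that was found to be unavailable.
  // 0 is assumed never to be a valid table ID.
  table_id_t unavailable_table_id = 0;
  // The latest modification of that table, or SKETCH_TRX_ID_MAX.
  trx_id_t def_trx_id = 0;

  // Note that undo records of the table up to 'limit' should be skipped.
  void skip(table_id_t id, trx_id_t limit) {
    unavailable_table_id = id;
    def_trx_id = limit;
  }

  // Assumed rule: skip a record if it belongs to the remembered table
  // and was written no later than the remembered modification.
  bool is_skipped(table_id_t id, trx_id_t rec_trx_id) const {
    return id == unavailable_table_id && rec_trx_id <= def_trx_id;
  }
};
```

Remembering only the most recent unavailable table keeps the check to
two comparisons per undo log record, at the cost of repeating a lookup
when records of several unavailable tables are interleaved.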
row_merge_create_index_graph(): Relay the internal state
from dict_create_index_step(). Our caller should free the index
only if it was not copied, added to the cache, and freed.
row_merge_create_index(): Free the index template if it was
not added to the cache. This is a safer variant of the logic
that was introduced in 65070beffd in 10.2.
prepare_inplace_alter_table_dict(): Add additional fault injection
to exercise a code path where we have already added an index
to the cache.
row_mysql_handle_errors(): Correct the wrong error handling for
the code DB_FOREIGN_EXCEED_MAX_CASCADE that was introduced in
c0923d396a
commit 35f5429eda
Author: Jimmy Yang <jimmy.yang@oracle.com>
Date: Wed Oct 6 06:55:34 2010 -0700
Manual port Bug #54582 "stack overflow when opening many tables
linked with foreign keys at once" from mysql-5.1-security to
mysql-5.5-security again.
rb://391 approved by Heikki
No known test case exists for repeating the bug before MariaDB 10.2.
The scenario should be that DB_FOREIGN_EXCEED_MAX_CASCADE is returned,
then InnoDB wrongly skips the rollback to the start of the current
row operation, and finally the SQL layer commits the transaction.
Normally the SQL layer would roll back either the entire transaction or
to the start of the statement. In the faulty scenario, InnoDB would
leave the transaction in an inconsistent state, and the SQL layer could
commit the transaction.
Fix the warnings issued by GCC 8 -Wstringop-truncation
and -Wstringop-overflow in InnoDB and XtraDB.
This work is motivated by Jan Lindström. The patch mainly differs
from his original one as follows:
(1) We remove explicit initialization of stack-allocated string buffers.
The minimum amount of initialization that is needed is a terminating
NUL character.
(2) GCC issues a warning for invoking strncpy(dest, src, sizeof dest)
because if strlen(src) >= sizeof dest, there would be no terminating
NUL byte in dest. We avoid this problem by invoking strncpy() with
a limit that is 1 less than the buffer size, and by always writing
NUL to the last byte of the buffer.
(3) We replace strncpy() with memcpy() or strcpy() in those cases
when the result is functionally equivalent.
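
The pattern described in (2) can be sketched as follows. The helper
name copy_bounded is hypothetical and is not part of InnoDB or XtraDB;
the patch applies the same idea inline at each call site.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy src into a fixed-size buffer, always leaving it NUL-terminated.
// copy_bounded is a hypothetical helper, not an InnoDB function.
template <std::size_t N>
void copy_bounded(char (&dest)[N], const char* src)
{
  static_assert(N > 0, "destination must hold at least the NUL byte");
  // The limit is 1 less than the buffer size, so strncpy() never
  // writes to the last byte of the buffer.
  std::strncpy(dest, src, N - 1);
  // Always terminate, whether or not src was truncated.
  dest[N - 1] = '\0';
}
```

With a 4-byte destination and the source "abcdefgh", the buffer ends up
holding "abc" followed by NUL, instead of four bytes with no terminator.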
Note: fts_fetch_index_words() never deals with len==UNIV_SQL_NULL.
This was enforced by an assertion that limits the maximum length
to FTS_MAX_WORD_LEN. Also, the encoding that InnoDB uses for
the compressed fulltext index is not byte-order agnostic, that is,
InnoDB data files that use FULLTEXT INDEX are not portable between
big-endian and little-endian systems.
row_merge_create_fts_sort_index(): Initialize dict_col_t.
This fixes an access to uninitialized dict_col_t::ind when a debug
assertion in MariaDB 10.4 invokes is_dropped() in
rec_get_converted_size_comp_prefix_low(). Older MariaDB versions
seem to be unaffected by the uninitialized values, but it should
not hurt to initialize everything.
When importing a tablespace, we must initialize dummy DEFAULT NULL
values for any instantly added columns in order to avoid a debug
assertion failure when PageConverter::update_records() invokes
rec_get_offsets(). Finally, when the operation completes, we must
evict and reload the table definition, so that the correct
default values for instantly added columns will be loaded.
ha_innobase::discard_or_import_tablespace(): On successful
IMPORT TABLESPACE, evict and reload the table definition,
so that btr_cur_instant_init() will load the correct metadata.
PageConverter::update_index_page(): Fill in dummy DEFAULT NULL values
for instantly added columns. These will be replaced upon the
completion of the operation by evicting and reloading the metadata.
row_discard_tablespace(): Invoke dict_table_t::remove_instant().
After DISCARD TABLESPACE, the table is no longer in "instant ALTER"
format, because there is no data file attached.
An uninitialized buffer is passed to row_sel_store_mysql_rec(), but
InnoDB may not initialize all of it. This appears to be harmless in most
cases, but not always.
The partially initialized buffer was later passed to
ha_innobase::write_row(), which would read garbage NULL bit values for
virtual columns, leading to unpredictable behavior.
No test case for MariaDB 10.2 was found.
The test case for MariaDB 10.3 involves partitioning,
system versioning and the TRASH_ALLOC fill pattern 0xA5.
The test case depends very much on the number and layout of columns:
consider the byte 0xA5 being interpreted as a NULL bit mask.
row_sel_store_mysql_rec(): always initialize virtual columns NULL bit
Closes #1144
wsrep_certification_rules: Define as a weak global symbol.
While there are separate _embedded.a for statically
linked storage engine plugins, there is only one ha_innodb.so
which is supposed to work with both values of WITH_WSREP.
The merge from 10.0-galera introduced a reference to a global
variable that is only defined when the server is built WITH_WSREP.
We must define that symbol as weak global, so that when
a dynamically linked InnoDB or XtraDB is used with the embedded
server (which never includes write-set replication patches),
the variable will be read as 0, instead of causing a failure to
load the InnoDB or XtraDB plugin.
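
A minimal sketch of a weak definition, assuming GCC or Clang (the
attribute is a compiler extension, not standard C++). The names
sketch_certification_rules and sketch_optimized_rules are hypothetical
and only stand in for the real variable and its use.

```cpp
#include <cassert>

// Weak definition: if another object file provides a strong definition
// of the same symbol, the linker uses that one. Otherwise this
// zero-initialized definition is used, so the variable reads as 0.
__attribute__((weak)) unsigned long sketch_certification_rules;

// Hypothetical reader of the setting.
inline bool sketch_optimized_rules()
{
  return sketch_certification_rules != 0;
}
```

When nothing else defines the symbol, as in a server built without
write-set replication, the reader simply sees the default value 0
rather than an unresolved reference at load time.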
When innobase_allocate_row_for_vcol() returns true (for failure),
it may already have invoked mem_heap_create(). However, some callers
would fail to invoke mem_heap_free().
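
The leak and its remedy can be sketched with hypothetical stand-ins
(Heap, heap_create, heap_free, allocate_row, caller_fixed); none of
these are the InnoDB functions, and the live_heaps counter exists only
to make the leak observable.

```cpp
#include <cassert>

// Hypothetical stand-ins for a memory heap and its create/free calls.
struct Heap { int unused; };
static int live_heaps = 0;

inline Heap* heap_create() { ++live_heaps; return new Heap(); }
inline void heap_free(Heap* h) { --live_heaps; delete h; }

// Like the function described above: returns true on failure, but may
// already have created *heap by then.
inline bool allocate_row(Heap** heap, bool simulate_failure)
{
  *heap = heap_create();
  return simulate_failure;
}

// A caller that frees the heap on the failure path as well.
inline bool caller_fixed(bool simulate_failure)
{
  Heap* heap = nullptr;
  if (allocate_row(&heap, simulate_failure)) {
    if (heap) {
      heap_free(heap);  // without this, the heap would be leaked
    }
    return true;
  }
  // ... the row would be used here ...
  heap_free(heap);
  return false;
}
```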
PROBLEM
-------
1. We are inserting a base column entry for which the function that
generates the virtual column produces an invalid value, but we go ahead
and insert it because of the IGNORE keyword.
2. We then delete this record, making it delete-marked in InnoDB.
If we try to insert another record with the same PK as the deleted
record and the record has not been purged, then we try to remove the
delete mark from this record and build an update vector with the
previous and updated values. While calculating the value of the
virtual column, we get an error from the server that the value cannot
be calculated from the base column.
InnoDB assumes that innobase_get_computed_value() always returns
a valid value for the base column present in the row. The failure of
this call was not handled, so we were crashing.
FIX
This assertion would fail when a secondary index record for an
instantly added column was accessed.
It is unclear to me why this code path is executed so rarely.
I was unable to cover it even when using FORCE INDEX.
row_sel_sec_rec_is_for_clust_rec(): Remove the assertion, and use
the proper function rec_get_nth_cfield().
row_sel_store_mysql_field_func(): Simply use rec_get_nth_cfield()
instead of duplicating its logic.
row_drop_table_for_mysql(): Fix a regression introduced in MDEV-16515.
Similar to the follow-up fixes MDEV-16647 and MDEV-17470, we must make
the internal tables of FULLTEXT INDEX immune to kills, to avoid noise
and resource leakage on DROP TABLE or ALTER TABLE. (Orphan internal tables
would be dropped at the next InnoDB startup only.)
Orphan #sql* tables may remain after ALTER TABLE
was interrupted by timeout or KILL or client disconnect.
This is a regression caused by MDEV-16515.
Similar to temporary tables (MDEV-16647), we had better ignore the
KILL when dropping the original table in the final part of ALTER TABLE.
Closes #1020
This fixes a regression that was introduced in MySQL 5.6.6
in an error handling code path, in the following change:
commit 024f363d6b5f09b20d1bba411af55be95c7398d3
Author: kevin.lewis@oracle.com <>
Date: Fri Jun 15 09:01:42 2012 -0500
Bug #14169459 INNODB; DROP TABLE DOES NOT DELETE THE IBD FILE
FOR A TEMPORARY TABLE.
The initial fix only covered a part of Mariabackup.
This fix hardens InnoDB and XtraDB in a similar way, in order
to reduce the probability of mistaking a corrupted encrypted page
for a valid unencrypted one.
This is based on work by Thirunarayanan Balathandayuthapani.
fil_space_verify_crypt_checksum(): Assert that key_version!=0.
Let the callers guarantee that. Now that we have this assertion,
we also know that buf_page_is_zeroes() cannot hold.
Also, remove all diagnostic output and related parameters,
and let the relevant callers emit such messages.
Last but not least, validate the post-encryption checksum
according to the innodb_checksum_algorithm (only accepting
one checksum for the strict variants), and no longer
try to validate the page as if it was unencrypted.
buf_page_is_zeroes(): Move to the compilation unit of the only callers,
and declare static.
xb_fil_cur_read(), buf_page_check_corrupt(): Add a condition before
calling fil_space_verify_crypt_checksum(). This is a non-functional
change.
buf_dblwr_process(): Validate the page only as encrypted or unencrypted,
but not both.
A static analysis tool suggested that in the function
row_merge_read_clustered_index(), ut_free(nonnull) could
be invoked twice for nonnull!=NULL. While a manual review
of the code disproved this, it should not hurt to clean up
the code so that the static analysis tool will not complain.
index_tuple_info_t::insert(), row_mtuple_cmp(): Remove the
parameter mtr_committed, which duplicated !mtr->is_active().
row_merge_read_clustered_index(): Initialize row_heap = NULL.
Remove a duplicated call mem_heap_empty(row_heap) that was
inadvertently added in commit cb1e76e4de.
Replace a "goto func_exit" with "break", to get consistent error
handling for both failures to create or write a temporary file.
end_of_index: Assign row_heap=NULL and nonnull=NULL to prevent
double freeing.
func_exit: Check for row_heap!=NULL before invoking mem_heap_free().
Closes #959
There was a race condition in the error handling of ALTER TABLE when
the table contains FULLTEXT INDEX.
During the error handling of an erroneous ALTER TABLE statement,
when InnoDB would drop the internally created tables for FULLTEXT INDEX,
it could happen that one of the hidden tables was being concurrently
accessed by a background thread. Because of this, InnoDB would defer
the drop operation to the background.
However, related to MDEV-13564 backup-safe TRUNCATE TABLE and its
prerequisite MDEV-14585, we had to make the background drop table queue
crash-safe by renaming the table to a temporary name before enqueueing it.
This renaming was introduced in a follow-up of the MDEV-13407 fix.
As part of this rename operation, we were unnecessarily parsing the
current SQL statement, because the same rename operation could also be
executed as part of ALTER TABLE via ha_innobase::rename_table().
If an ALTER TABLE statement was being refused due to incorrectly formed
FOREIGN KEY constraint, then it could happen that the renaming of the hidden
internal tables for FULLTEXT INDEX could also fail, triggering a host of
error log messages, and causing a subsequent table-rebuilding ALTER TABLE
operation to fail due to the tablespace already existing.
innobase_rename_table(), row_rename_table_for_mysql(): Add the parameter
use_fk for suppressing the parsing of FOREIGN KEY constraints. It
will only be passed as use_fk=true by ha_innobase::rename_table(),
which can be invoked as part of ALTER TABLE...ALGORITHM=COPY.
Also, related to MDEV-15522, MDEV-17304, MDEV-17835,
remove the Galera xtrabackup tests, because xtrabackup never worked
with MariaDB Server 10.3 due to InnoDB redo log format changes.
dict_create_add_foreigns_to_dictionary(): Do not commit the transaction.
The operation can still fail in dict_load_foreigns(), and we want
to be able to roll back the transaction.
create_table_info_t::create_table(): Never reset m_drop_before_rollback,
and never commit the transaction. We use a single point of rollback
in ha_innobase::create(). Merge the logic from
row_table_add_foreign_constraints().
The error handling in the MDEV-13564 TRUNCATE TABLE was broken
when an error occurred during table creation.
row_create_index_for_mysql(): Do not drop the table on error.
fts_create_one_common_table(), fts_create_one_index_table():
Do drop the table on error.
create_index(), create_table_info_t::create_table():
Let the caller handle the index creation errors.
ha_innobase::create(): If create_table_info_t::create_table()
fails, drop the incomplete table, roll back the transaction,
and finally return an error to the caller.
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
The relevant InnoDB/XtraDB fixes up to 5.6.42 had already
been applied to MariaDB in commit 30c3d6db32.
Revert some changes that appeared in
the merge commit 87d852f102.
thd_rpl_stmt_based(): A new predicate to check if statement-based
replication is active. (This can also hold when replication is not
in use, but binlog is.)
que_thr_stop(), row_ins_duplicate_error_in_clust(),
row_ins_sec_index_entry_low(), row_ins(): On a duplicate key error,
only lock all index records when statement-based replication is in use.
row_drop_table_for_mysql(): Avoid accessing non-existing dictionary tables.
dict_create_or_check_foreign_constraint_tables(): Add debug instrumentation
for creating and dropping a table before the creation of any non-core
dictionary tables.
trx_purge_add_update_undo_to_history(): Adjust a debug assertion, so that
it will not fail due to the test instrumentation.
row_build_index_entry_low(): ext does not contain virtual columns.
row_upd_store_v_row(): Copy virtual column values.
This is based on the following fix in MySQL 5.7.24:
commit 4ec2158bec73f1582501c4b3e3de250fed9edc9a
Author: Sachin Agarwal <sachin.z.agarwal@oracle.com>
Date: Fri Aug 24 14:44:13 2018 +0530
Bug #27968952 INNODB CRASH/CORRUPTION WITH TEXT PREFIX INDEXES
Problem:
There are two problems:
1. If there is one secondary index on an externally
stored column and another secondary index on a virtual column (whose
base column is not externally stored), then while updating the secondary
index on the virtual column, the virtual column data is replaced by
the externally stored column.
2. In a row update operation, node->row contains
a shallow copy of the virtual data fields. While building an update vector
containing all the fields to be modified, we compute the virtual column,
which may cause a change in the virtual data fields in node->row.
In both the above cases, while updating the secondary index on the virtual
column, we could not find the row and hit an explicit assertion inside
ROW_NOT_FOUND.
Fix:
1. Added check if column is virtual then its ext flag should be ZERO
and virtual column data will not be replaced by offset column data.
2. Deep copy of virtual data fields for node->row.
RB: #20382
Reviewed by : Jimmy.Yang@oracle.com
In RENAME TABLE, when an error occurs while renaming FOREIGN KEY
constraint, that error would be overwritten when renaming the
InnoDB internal tables related to FULLTEXT INDEX.
row_rename_table_for_mysql(): Do not attempt to rename the internal
tables if an error already occurred.
This problem was originally reported as Oracle Bug#27545888.
row_ins_check_foreign_constraint(): Do not overwrite hard errors
with the soft error DB_LOCK_WAIT. This prevents an infinite
wait loop when DB_INTERRUPTED was returned. For DB_LOCK_WAIT,
row_insert_for_mysql() would keep invoking row_ins_step() and the
transaction would remain active until the server shutdown is initiated.
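
The rule can be sketched as follows. The enum Err and the functions
is_hard_error() and after_lock_check() are hypothetical and greatly
simplified; they only illustrate that a pending lock wait may replace
a soft state but never a hard error.

```cpp
#include <cassert>

// Hypothetical, reduced set of error codes (not the InnoDB dberr_t).
enum class Err { SUCCESS, LOCK_WAIT, INTERRUPTED, NO_REFERENCED_ROW };

// Anything other than success or a lock wait is treated as a hard error.
inline bool is_hard_error(Err e)
{
  return e != Err::SUCCESS && e != Err::LOCK_WAIT;
}

// Decide what to report after the constraint check.
// 'wait_pending' means that the transaction would have to wait for a lock.
inline Err after_lock_check(Err err, bool wait_pending)
{
  if (wait_pending && !is_hard_error(err)) {
    return Err::LOCK_WAIT;  // soft error: the caller will retry
  }
  return err;               // a hard error is reported as is
}
```

If INTERRUPTED were replaced by LOCK_WAIT here, a caller that retries on
LOCK_WAIT would keep retrying an operation that can no longer succeed.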
On the hidden metadata record, if instant ALTER TABLE was executed
multiple times on the same table, purge could fail to reset the
DB_TRX_ID, DB_ROLL_PTR on the updated metadata record. This is
only a cosmetic failure that was caught (and separately fixed)
in 10.4 during the MDEV-15562 development. The problem was that
occasionally, innodb.instant_alter_crash would fail with a
result difference due to the DB_TRX_ID, DB_ROLL_PTR not having
been reset on the metadata record.
This bug should have no noticeable impact, because the metadata
record is invisible to the SQL layer, and never subjected to
MVCC or locking.
I was unable to repeat the problem on 10.3.
row_purge_parse_undo_rec(): Set node->ref for the metadata record.
This reverts commit 2d4075e1d9
where the debug assertion was added. There seems to be a potential
problem in the purge of indexes that depend on virtual columns.
Ultimately, we should change the InnoDB undo log format so that
all actual secondary index keys are stored there, also for
virtual or spatial indexes. In that way, purge and rollback would
be more straightforward.