In a Ubuntu Xenial build environment, the compiler identified as
g++-5.real (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
seems to be emitting incorrect code for the compilation unit
trx0rec.cc, triggering a bogus-looking AddressSanitizer report
of an invalid read of something in the function trx_undo_rec_get_pars().
This is potentially affecting any larger tests where the InnoDB
purge subsystem is being exercised.
When the optimization level of trx0rec.cc is limited to -O1, no
bogus failure is being reported. With -O2 or -O3, a lot of things
seemed to be inlined in the function, and the disassembly of the
generated code did not make sense to me.
Before MDEV-12113 (MariaDB Server 10.1.25), on shutdown InnoDB would write
the current LSN to the first page of each file of the system tablespace.
This is incompatible with MariaDB's InnoDB table encryption, because
encryption repurposed the field for an encryption key ID and checksum.
buf_page_is_corrupted(): For the InnoDB system tablespace, skip
FIL_PAGE_FILE_FLUSH_LSN when checking if a page is all zero,
because the first page of each file in the system tablespace can
contain nonzero bytes in the field.
The test case for reproducing MDEV-14126 demonstrates that InnoDB can
end up with an index tree where a non-leaf page has only one child page.
The test case innodb.innodb_bug14676111 demonstrates that such pages
are sometimes unavoidable, because InnoDB does not implement any sort
of B-tree rotation.
But, there is no reason to allow a root page with only one child page.
btr_cur_node_ptr_delete(): Replaces btr_node_ptr_delete().
btr_page_get_father(): Declare globally.
btr_discard_only_page_on_level(): Declare with ATTRIBUTE_COLD.
It turns out that this function is not covered by the
innodb.innodb_bug14676111 test case after all.
btr_discard_page(): If the root page ends up having only one child
page, shrink the tree by invoking btr_lift_page_up().
The predicate page_is_root(), which was added in MariaDB Server 10.2.2,
is based on a wrong assumption.
Under some circumstances, InnoDB can transform B-trees into a degenerate
state where a non-leaf page has no sibling pages. Because of this,
we cannot assume that a page that has no siblings is the root page.
This bug will be tracked as MDEV-19022.
Because of the bug that may affect many InnoDB data files, we must remove
and replace the wrong predicate. Using the wrong predicate can cause
corruption. A leaf page is not allowed to be empty except if it is the
root page, and the entire table is empty.
MariaDB before MDEV-5800 in version 10.2.2 did not support
indexed virtual columns. Non-persistent virtual columns were
hidden from storage engines. Only starting with MDEV-5800, InnoDB
would create internal metadata on virtual columns.
Similar to what was done in MDEV-18084 and MDEV-18960, we adjust two more
code paths for the old tables.
ha_innobase::build_template(): Do not invoke
dict_index_contains_col_or_prefix() for virtual columns if InnoDB
does not store the metadata.
innobase_build_col_map(): Relax an assertion about the number of columns.
ha_innobase::omits_virtual_cols(): Renamed from omits_virtual_cols().
Fixed by replacing sprinf by snprintf in ShowValue to avoid
buffer overflow. It nows always use a buffer and returns int.
modified: storage/connect/tabdos.cpp
modified: storage/connect/tabfmt.cpp
modified: storage/connect/value.cpp
modified: storage/connect/value.h
trx_rseg_header_create(): Return the block descriptor, and
remove the redundant trx_rsegf_get_new() call. Apparently
the idea of that call was some kind of encapsulation or
abstraction, to discourage the direct use of the constant TRX_RSEG.
This also removes the trx_purge_initiate_truncate() local
variable rseg_header, which was only used in debug builds.
row_ins_foreign_fill_virtual(): Construct update->old_vrow
with ROW_COPY_DATA instead of ROW_COPY_POINTERS. With the latter,
the object would be pointing to a buffer pool page frame. That page
frame can become stale and invalid as soon as
row_ins_foreign_check_on_constraint() invokes mtr_t::commit().
Most of the time, the pointer target is not going to be overwritten
by anything, and everything appears to work correctly.
Buffer pool page replacement is highly unlikely, and any pessimistic
operation that would overwrite the old location of the record is only
slightly more likely. It is not known whether there is an actual bug.
This came up while diagnosing MDEV-18879 in MariaDB 10.3.
row_ins_foreign_check_on_constraint(): When constructing
cascade->historical_row for tables WITH SYSTEM VERSIONING,
use the appropriate mode ROW_COPY_DATA, because the pointers
will be stale after mtr_commit() is invoked.
MariaDB Server 10.0 and 10.1 support non-indexed virtual columns,
which are hidden from the storage engine. Starting with MDEV-5800
in MariaDB 10.2.2, the virtual columns are visible to storage engines.
calc_row_difference(): Follow up the MDEV-17199 fix, which forgot
to increment num_v when skipping virtual columns in tables that
were created before MariaDB 10.2.2. This caused a corruption of
the update vector when an updated persistent column is preceded
by virtual columns.
The macro innobase_is_v_fld() turns out to be equivalent with
the opposite of Field::stored_in_db(). Remove the macro and
invoke the member function directly.
innodb_base_col_setup_for_stored(): Simplify a condition to only
check Field::vcol_info.
innobase_create_index_def(): Replace some redundant code with
DBUG_ASSERT().
MariaDB before MDEV-5800 in version 10.2.2 did not support
indexed virtual columns. Non-persistent virtual columns were
hidden from storage engines. Only starting with MDEV-5800, InnoDB
would create internal metadata on virtual columns.
On TRUNCATE TABLE, an old .frm file from before MDEV-5800 may be
used as the table schema. When the table is being re-created by
InnoDB, the old schema must be used. That is, we may hide
the existence of virtual columns from InnoDB.
create_table_check_doc_id_col(): Remove the assertion that failed.
This function can actually correctly deal with virtual columns
that could have been created before MariaDB 10.2.2 introduced MDEV-5800.
create_table_info_t::create_table_def(): Do not create metadata for
virtual columns if the table definition was created before MariaDB 10.2.2.
This bug was introduced by MDEV-12288, which made InnoDB use
a single undo log for persistent transactions, instead of
maintaining separate insert_undo and update_undo logs.
trx_undo_reuse_cached(): Initialize the TRX_UNDO_PAGE_TYPE
after reusing a cached undo log page for undo log.
Failure to do so can cause trx_undo_mem_create_at_db_start()
to misclassify new undo log records as TRX_UNDO_INSERT.
This in turn would trigger an assertion failure in
trx_roll_pop_top_rec_of_trx() due to undo==insert.
os_file_fsync_posix(): If fsync() returns a fatal error,
do include errno in the error message.
In the future, we might handle fsync() or write or allocation failures
on InnoDB data files a little more gracefully: flag the affected index
or table as corrupted, and deny any subsequent writes to the table.
If a write to the undo log or redo log fails, an alternative to
killing the server could be to deny any writes to InnoDB tables
until the server has been restarted.
In MDEV-10814, a missing argument caused a later optional argument
(bool true) to be treated as a size. The unmap of this memory occurs
during shutdown and resizing innodb buffer pool. As a result the
memory is lost but still allocated until shutdown is completed.
This patch contains a fix for the MDEV-17262/17243 issues and
new mtr test.
These issues (MDEV-17262/17243) have two reasons:
1) After an intermediate commit, a transaction loses its status
of "transaction that registered in the MySQL for 2pc coordinator"
(in the InnoDB) due to the fact that since version 10.2 the
write_row() function (which located in the ha_innodb.cc) does
not call trx_register_for_2pc(m_prebuilt->trx) during the processing
of split transactions. It is necessary to restore this call inside
the write_row() when an intermediate commit was made (for a split
transaction).
Similarly, we need to set the flag of the started transaction
(m_prebuilt->sql_stat_start) after intermediate commit.
The table->file->extra(HA_EXTRA_FAKE_START_STMT) called from the
wsrep_load_data_split() function (which located in sql_load.cc)
will also do this, but it will be too late. As a result, the call
to the wsrep_append_keys() function from the InnoDB engine may be
lost or function may be called with invalid transaction identifier.
2) If a transaction with the LOAD DATA statement is divided into
logical mini-transactions (of the 10K rows) and binlog is rotated,
then in rare cases due to the wsrep handler re-registration at the
boundary of the split, the last portion of data may be lost. Since
splitting of the LOAD DATA into mini-transactions is technical,
I believe that we should not allow these mini-transactions to fall
into separate binlogs. Therefore, it is necessary to prohibit the
rotation of binlog in the middle of processing LOAD DATA statement.
https://jira.mariadb.org/browse/MDEV-17262 and
https://jira.mariadb.org/browse/MDEV-17243
MDEV-15418 changed InnoDB to use the READ UNCOMMITTED isolation level
when the transaction recovery is disabled by setting
innodb_force_recovery to 5 or 6. Alas, CHECK TABLE would still
internally use REPEATABLE READ. If the server was started without
purge running into completion (and resetting all DB_TRX_ID thanks
to MDEV-12288), this would cause errors about any DB_TRX_ID that
had not been reset to 0.
In 1dc78d35a0beb9620bae1f4841cc07389b425707 the arguments
to a deallocate_large(dontdump=true) was passed a wrong value.
To avoid accidential calling large memory function that have
DODUMP/DONTDUMP options and missing arguments, the functions
have been given distinct names.