row_drop_tables_for_mysql_in_background(): Copy the table name
before closing the table handle, to avoid heap-use-after-free if
another thread succeeds in dropping the table before
row_drop_tables_for_mysql_in_background() completes the table name lookup.
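A minimal sketch of the pattern (illustrative only, not the actual patch; close_table_handle() and lookup_drop_entry() are hypothetical stand-ins for the real calls, and table->name is treated as a plain C string):

    /* Broken: 'name' points into the table object. Once the handle
    is closed, a concurrent DROP TABLE may free that memory, and the
    name lookup then reads freed memory (heap-use-after-free). */
    const char* name = table->name;
    close_table_handle(table);
    lookup_drop_entry(name);

    /* Fixed: take a private copy of the name before closing. */
    char name_copy[512];
    snprintf(name_copy, sizeof name_copy, "%s", table->name);
    close_table_handle(table);
    lookup_drop_entry(name_copy);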
dict_mem_create_temporary_tablename(): With innodb_safe_truncate=ON
(the default), generate a simple, unique, collision-free table name
using only the id, no pseudorandom component. This is safe, because
on startup, we will drop any #sql tables that might exist in InnoDB.
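A hedged sketch of the naming scheme under innodb_safe_truncate=ON (buffer size and the exact format string are illustrative; dbtab/dblen stand for the "dbname/" prefix of the original name):

    /* The name is derived from the database prefix and the table id
    only; table ids are unique, so no pseudorandom suffix is needed. */
    char name[512];
    snprintf(name, sizeof name, "%.*s#sql-ib%llu",
             (int) dblen, dbtab, (unsigned long long) table_id);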
This is a backport from 10.3. It should have been backported already
as part of backporting MDEV-14717, MDEV-14585, which were prerequisites
for the MDEV-13564 backup-friendly TRUNCATE TABLE.
This seems to reduce the chance of table creation failures in
ha_innobase::truncate().
ha_innobase::truncate(): Do not invoke close(), but instead
mimic it, so that we can revert to the original table handle
in case opening the truncated copy of the table fails.
The problem was that we skipped the background persistent statistics
calculation on applier nodes if the thread was marked as high priority
(a.k.a. BF). However, on applier nodes all replicated DDL is executed
as high priority, i.e. BF.
Fixed by allowing the background persistent statistics calculation on
applier nodes even when the thread is marked as BF. This could lead to
BF lock waits, but queries on that node need those statistics.
recv_parse_log_recs(): Do not compare type if ptr==end_ptr
(we have reached the end of the redo log parsing buffer),
because it will not have been correctly initialized in that case.
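The shape of the guard, as a sketch (variable names follow the description; the rest of the parsing loop is omitted, and 'expected_type' is a placeholder for whatever the code compares against):

    if (ptr == end_ptr) {
        /* The parsing buffer is exhausted: 'type' was never
        assigned, so it must not be compared to anything. */
    } else if (type == expected_type) {
        /* ... continue handling the parsed record ... */
    }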
GCC 6 and later can optimize away the memset() that is part of
mem_heap_zalloc() in a placement new call. So, instead of relying
on that kind of initialization, explicitly initialize the necessary
fields in the constructors.
que_common_t::que_common_t(): Initialize more fields in the
default constructor.
purge_vcol_info_t::purge_vcol_info_t(): Initialize all fields in
the default constructor.
purge_node_t::purge_node_t(): Initialize all necessary fields.
References:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388
https://gcc.gnu.org/ml/gcc/2016-02/msg00207.html
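An illustrative sketch (hypothetical type, not from the patch) of why explicit constructor initialization is preferable when objects are placement-new'ed over memory from mem_heap_zalloc():

    struct sample_node_t {
        /* Without these initializers the members are formally
        uninitialized after placement new, and GCC 6+ may drop the
        memset() performed by mem_heap_zalloc() as dead code. */
        sample_node_t() : state(0), table(NULL), index(NULL) {}

        int   state;
        void* table;
        void* index;
    };

    void* mem = mem_heap_zalloc(heap, sizeof(sample_node_t));
    sample_node_t* node = new(mem) sample_node_t();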
row_merge_create_fts_sort_index(): Initialize dict_col_t in
an unambiguous way. GCC 6 and later appear to be able to optimize
away the memset() that is part of mem_heap_zalloc() in the
placement new call. Let us avoid using placement new in order
to ensure that the objects will actually be initialized.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388
https://gcc.gnu.org/ml/gcc/2016-02/msg00207.html
While the latter reference hints that the optimization is only
applicable to non-POD types (and dict_col_t does not define
any member functions before 10.2), it is most consistent to
use the same initialization across all versions.
purge_node_t::in_progress: Replaces purge_node_t::done.
Only present in debug builds.
purge_node_t::start(): Moved from the start of row_purge_step().
purge_node_t::end(): Replaces row_purge_end().
trx_purge_attach_undo_recs(): Omit a check from non-debug builds.
If a table has been dropped, rebuilt, or its tablespace has been
discarded or the table is corrupted, it does not make sense to
look up that table again while purging old undo log records.
purge_node_t::purge_node_t(): Replaces row_purge_node_create().
que_common_t::que_common_t(): Constructor.
row_import_update_index_root(): Remove the constant parameter
dict_locked=true, and update the table->def_trx_id in the cache.
purge_node_t::unavailable_table_id: The latest unavailable table ID,
to avoid future lookups.
purge_node_t::def_trx_id: The latest modification of the table
identified by unavailable_table_id, or TRX_ID_MAX.
purge_node_t::is_skipped(): Determine if a table should be skipped.
purge_node_t::skip(): Note that a table should be skipped.
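A hedged sketch of how the skip cache might fit together (member names are taken from the description above; the exact conditions in the real code may differ):

    /* Sketch only. */
    bool purge_node_t::is_skipped(table_id_t id) const
    {
        /* Same unavailable table, and the undo record is not newer
        than the last known redefinition of that table. */
        return id == unavailable_table_id && trx_id <= def_trx_id;
    }

    void purge_node_t::skip(table_id_t id, trx_id_t limit)
    {
        unavailable_table_id = id;
        def_trx_id = limit;    /* or TRX_ID_MAX if unknown */
    }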
row_merge_create_index_graph(): Relay the internal state
from dict_create_index_step(). Our caller should free the index
only if it was not copied, added to the cache, and freed.
row_merge_create_index(): Free the index template if it was
not added to the cache. This is a safer variant of the logic
that was introduced in 65070beffd in 10.2.
prepare_inplace_alter_table_dict(): Add additional fault injection
to exercise a code path where we have already added an index
to the cache.
row_mysql_handle_errors(): Correct the wrong error handling for
the error code DB_FOREIGN_EXCEED_MAX_CASCADE that was introduced in
commit c0923d396a
commit 35f5429eda
Author: Jimmy Yang <jimmy.yang@oracle.com>
Date: Wed Oct 6 06:55:34 2010 -0700
Manual port of Bug #54582 "stack overflow when opening many tables
linked with foreign keys at once" from mysql-5.1-security to
mysql-5.5-security again.
rb://391 approved by Heikki
No known test case exists for repeating the bug before MariaDB 10.2.
The scenario should be that DB_FOREIGN_EXCEED_MAX_CASCADE is returned,
then InnoDB wrongly skips the rollback to the start of the current
row operation, and finally the SQL layer commits the transaction.
Normally the SQL layer would roll back either the entire transaction or
to the start of the statement. In the faulty scenario, InnoDB would
leave the transaction in an inconsistent state, and the SQL layer could
commit the transaction.
No test case is known for this bug in 10.1, so a test case will be
committed separately in 10.2.
fts_reset_get_doc(): properly initialize fts_get_doc_t::cache
fts_fetch_index_words(): Restore the initialization len=0.
The test innodb_fts.create in 10.2 would end up in an infinite loop
if this assignment is removed, because a following iteration of the
while() loop would assign zip->zp->avail_in=len with the original value
instead of the 0 that was reset in the previous iteration.
Fix the warnings issued by GCC 8 -Wstringop-truncation
and -Wstringop-overflow in InnoDB and XtraDB.
This work is motivated by Jan Lindström. The patch mainly differs
from his original one as follows:
(1) We remove explicit initialization of stack-allocated string buffers.
The minimum amount of initialization that is needed is a terminating
NUL character.
(2) GCC issues a warning for invoking strncpy(dest, src, sizeof dest)
because if strlen(src) >= sizeof dest, there would be no terminating
NUL byte in dest. We avoid this problem by invoking strncpy() with
a limit that is 1 less than the buffer size, and by always writing
NUL to the last byte of the buffer.
(3) We replace strncpy() with memcpy() or strcpy() in those cases
when the result is functionally equivalent.
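For item (2), the replacement pattern looks like this (illustrative fragment; 'dest' and 'src' are placeholders):

    char dest[64];

    /* Copy at most sizeof dest - 1 bytes and terminate explicitly,
    so that dest is always NUL-terminated even when
    strlen(src) >= sizeof dest. */
    strncpy(dest, src, sizeof dest - 1);
    dest[sizeof dest - 1] = '\0';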
Note: fts_fetch_index_words() never deals with len==UNIV_SQL_NULL.
This was enforced by an assertion that limits the maximum length
to FTS_MAX_WORD_LEN. Also, the encoding that InnoDB uses for
the compressed fulltext index is not byte-order agnostic, that is,
InnoDB data files that use FULLTEXT INDEX are not portable between
big-endian and little-endian systems.
row_merge_create_fts_sort_index(): Initialize dict_col_t.
This fixes an access to uninitialized dict_col_t::ind when a debug
assertion in MariaDB 10.4 invokes is_dropped() in
rec_get_converted_size_comp_prefix_low(). Older MariaDB versions
seem to be unaffected by the uninitialized values, but it should
not hurt to initialize everything.
Only starting with MariaDB 10.3.8 (MDEV-16365), InnoDB can actually
handle ALTER IGNORE TABLE correctly when introducing a NOT NULL
attribute to a column that contains a NULL value. Between
MariaDB Server 10.0 and 10.2, we would incorrectly return an error
for ALTER IGNORE TABLE when the column contains a NULL value.
On an error (such as when an index cannot be dropped due to
FOREIGN KEY constraints), the field dict_index_t::to_be_dropped
was only being cleared in debug builds, even though the field
is available and being used also in non-debug builds.
This was a regression that was introduced by myself originally
in MySQL 5.7.6 and later merged to MariaDB 10.2.2, in
d39898de8e
An error manifested itself in the MariaDB Server 10.4 non-debug build,
involving instant ADD or DROP column. Because an earlier failed
ALTER TABLE operation incorrectly left the dict_index_t::to_be_dropped
flag set, the column pointers of the index fields would fail to be
adjusted for instant ADD or DROP column (MDEV-15562). The instant
ADD COLUMN in MariaDB Server 10.3 is unlikely to be affected by a
similar scenario, because dict_table_t::instant_add_column() in 10.3
is applying the transformations to all indexes, not skipping
to-be-dropped ones.
The problem with the InnoDB table attribute encryption_key_id is that it is
not persisted anywhere in InnoDB unless the table attribute
encryption is specified and is something other than encryption=default.
MDEV-17320 made it a hard error if encryption_key_id is specified as
anything other than 1 in that case.
Ideally, we would always persist encryption_key_id in InnoDB. But then we
would have to be prepared for the case where encryption is being enabled
for a table whose encryption_key_id attribute refers to a non-existent key.
In MariaDB Server 10.1, our best option remains to not store anything
inside InnoDB. But, instead of returning the error that MDEV-17320
introduced, we should merely issue a warning that the specified
encryption_key_id is going to be ignored if encryption=default.
To improve the situation a little more, we will issue a warning if
SET [GLOBAL|SESSION] innodb_default_encryption_key_id is being set
to something that does not refer to an available encryption key.
Starting with MariaDB Server 10.2, thanks to MDEV-5800, we could open the
table definition from InnoDB side when the encryption is being enabled,
and actually fix the root cause of what was reported in MDEV-17320.
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.
In the title of MDEV-9519 it was proposed to ban START SLAVE on a Galera
node if the master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.
The causes and fixes:
1. We need to improve processing of changing the auto-increment values
after changing the cluster size.
2. If wsrep_auto_increment_control is switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting for the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain their initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.
3. If wsrep_auto_increment_control is switched off during operation of the
node, then we must restore the original values of the auto_increment_increment
and auto_increment_offset global variables, as set by the user. To make this
possible, we need to add "shadow copies" of these variables (which store
the latest values set by the user).
https://jira.mariadb.org/browse/MDEV-9519
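A hedged sketch of item 3 above (the 'saved_' variables and the update hook are illustrative names, not MariaDB's actual ones):

    /* Remember the user-set values so that they can be restored when
    wsrep_auto_increment_control is switched off. */
    static ulong saved_auto_increment_increment;
    static ulong saved_auto_increment_offset;

    void wsrep_auto_increment_control_update(bool enabled)
    {
        if (enabled) {
            saved_auto_increment_increment =
                global_system_variables.auto_increment_increment;
            saved_auto_increment_offset =
                global_system_variables.auto_increment_offset;
            /* Immediately apply increment = cluster size and
            offset = node position, without waiting for the next
            wsrep_view_handler_cb() invocation (item 2). */
        } else {
            global_system_variables.auto_increment_increment =
                saved_auto_increment_increment;
            global_system_variables.auto_increment_offset =
                saved_auto_increment_offset;
        }
    }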
Since MySQL 5.6.16 (and MariaDB Server 10.0.11), changes of
buf_page_t::buf_fix_count are atomic memory operations if
PAGE_ATOMIC_REF_COUNT is defined. Since MySQL 5.7
(and MariaDB Server 10.2.2), the field is always updated
by atomic memory operations.
In a few occurrences, updates of the counter were unnecessarily
surrounded by an acquisition and release of the block mutex
(buf_block_t::mutex or buf_pool_t::zip_mutex). Remove these
unnecessary mutex operations.
log_group_read_log_seg(): Always return false when returning
before reading end_lsn.
xtrabackup_copy_logfile(): On error, indicate whether
a corrupt log record was encountered.
Only xtrabackup_copy_logfile() in Mariabackup cared about the
return value of the function. InnoDB crash recovery was not
affected by this bug.
dict_create_foreign_constraints_low(): Clean up the way in
which the error messages are initialized, and ensure that
the table name is always initialized.
The code path where the table was not being rebuilt during ALTER TABLE
was not covered by the test. Add coverage, and remove the debug assertion
that could fail in this case.
i_s_innodb_mutexes_fill_table(): Use the C++ RAII pattern
to ensure that the mutexes are released if an OK()
macro returns from the function prematurely.
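An illustrative sketch of the RAII idea (simplified; 'some_mutex' is a placeholder for the mutexes that the function iterates over):

    struct mutex_guard {
        explicit mutex_guard(ib_mutex_t* m) : mutex(m) { mutex_enter(mutex); }
        ~mutex_guard() { mutex_exit(mutex); }
        ib_mutex_t* mutex;
    };

    /* Any early 'return' hidden inside the OK() macros now releases
    the mutex automatically when 'guard' goes out of scope. */
    mutex_guard guard(&some_mutex);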
An uninitialized buffer was being passed to row_sel_store_mysql_rec(),
and InnoDB does not necessarily initialize every byte of it. This
appears to be harmless in most cases, but not always.
The partially initialized buffer was later passed to
ha_innobase::write_row(), which would read garbage NULL bit values for
virtual columns, leading to unpredictable behaviour.
No test case for MariaDB 10.2 was found.
The test case for MariaDB 10.3 involves partitioning,
system versioning and the TRASH_ALLOC fill pattern 0xA5.
The test case depends very much on the number and layout of columns;
consider what a 0xA5 byte means when it is interpreted as a NULL bitmap.
row_sel_store_mysql_rec(): always initialize the NULL bits of virtual columns.
Closes #1144
buf_page_is_corrupted(): Read the global variable srv_checksum_algorithm
only once in order to avoid a race condition when
SET GLOBAL innodb_checksum_algorithm=...;
is being executed concurrently with this function.
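The general shape of the fix, as a sketch (the switch body is abbreviated):

    /* Read the global variable once; a concurrent
    SET GLOBAL innodb_checksum_algorithm=... can then no longer make
    different comparisons in this function see different values. */
    const ulint curr_algo = srv_checksum_algorithm;

    switch (curr_algo) {
    case SRV_CHECKSUM_ALGORITHM_STRICT_CRC32:
    case SRV_CHECKSUM_ALGORITHM_CRC32:
        /* ... validate checksums using curr_algo only,
        never re-reading srv_checksum_algorithm ... */
        break;
    default:
        break;
    }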
wsrep_certification_rules: Define as a weak global symbol.
While there are separate _embedded.a for statically
linked storage engine plugins, there is only one ha_innodb.so
which is supposed to work with both values of WITH_WSREP.
The merge from 10.0-galera introduced a reference to a global
variable that is only defined when the server is built WITH_WSREP.
We must define that symbol as weak global, so that when
a dynamically linked InnoDB or XtraDB is used with the embedded
server (which never includes write-set replication patches),
the variable will be read as 0, instead of causing a failure to
load the InnoDB or XtraDB plugin.
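A sketch of the weak definition (GCC/Clang syntax; whether the actual patch declares it exactly like this is an assumption):

    /* If the server was built WITH_WSREP, its strong definition of
    this variable wins; otherwise the weak definition resolves to a
    zero-initialized variable and the dynamically linked plugin
    still loads. */
    ulong wsrep_certification_rules __attribute__((weak));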
Problem:
=======
Mariabackup incremental prepare creates a new tablespace when it
encounters one. It sets the initial size to FIL_IBD_FILE_INITIAL_SIZE
(4 pages). But while applying the redo log, it tries to access the
5th page, which leads to an out-of-tablespace error.
Fix:
===
While parsing the redo log records, track FSP_SIZE in recv_spaces for the
respective space id. Assign the recv_size for the tablespace when it
is loaded. Extend the tablespace based on recv_size while applying
the redo log records.
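A hedged sketch of the parsing-side change (the recv_spaces bookkeeping is simplified and the 'size' field name is an assumption):

    /* While parsing a redo record that rewrites the FSP_SIZE field
    on page 0 of a tablespace, remember the new size in pages. */
    if (page_no == 0 && offset == FSP_HEADER_OFFSET + FSP_SIZE) {
        recv_spaces[space_id].size = mach_read_from_4(ptr);
    }

    /* Later, when the tablespace is loaded and while the redo log is
    applied, the file is extended up to the remembered recv_size. */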