In case of error last_pos points to null_element and there is no any
other children. tree_search_next() walks the children from last_pos
until the leaves (null_element) ignoring the case the topmost parent
in search state is the leaf itself.
Quick read record uses different handler (H1) for finding records. It
cannot use ha_delete_row() handler (H2) as it is different search
mode: inited == INDEX for H1, inited == RND for H2. So, read handler
H1 uses index while write handler H2 uses random access.
For going next record in H1 there is info->last_pos optimization for
stepping index via tree_search_next(). This optimization can work with
deleted rows only if delete is conducted in the same handler, there
is:
67 int hp_rb_delete_key(HP_INFO *info, register HP_KEYDEF *keyinfo,
68 const uchar *record, uchar *recpos, int flag)
69 {
...
74 if (flag)
75 info->last_pos= NULL; /* For heap_rnext/heap_rprev */
But this cannot work for different handler. So, last_pos in H1 after
delete in H2 contains stale info->parents array and last_pos points
into that parents. In the specific test case last_pos' parent is
already freed node and tree_search_next() steps into it.
The fix invalidates local savings of info->parents and info->last_pos
based on key_version. Record deletion increments share->key_version in
H2, so in H1 we know the tree might be changed.
Another good measure would be to use H1 for delete. But this is bigger
refactoring than just bug fixing.
mysql_compare_tables() failed because of no long hash index generated
fields in prepared tmp_create_info. In comparison we should skip these
fields in the original table by:
1. skipping INVISIBLE_SYSTEM fields;
2. getting key_info from table->s instead of table as TABLE_SHARE
contains unwrapped key_info.
system versioned table
For versioned table REPLACE first tries to insert a row, if it gets
duplicate key error and optimization is possible it does UPDATE +
INSERT history. If optimization is not possible it goes normal branch
for UPDATE to history and repeats the cycle of INSERT.
The failure was in normal branch when we tried UPDATE to history but
such history already exists from previous cycles. There is no such
failures in optimized branch because vers_insert_history_row() already
ignores duplicates.
The fix ignores duplicate errors for UPDATE to history and does DELETE
instead.
`limit >= trx_id' failed in purge_node_t::skip
For fast alter partition ALTER lost hash fields in frm field
count. mysql_prepare_create_table() did not call add_hash_field()
because the logic of ALTER-ing field types implies automatic
promotion/demotion to/from hash index. So we don't pass hash algorithm
to mysql_prepare_create_table() and let it decide itself, but it
cannot decide it correctly for fast alter partition.
So now mysql_prepare_alter_table() is a bit more sophisticated on what
to pass in the algorithm. If not changed any fields it will force
mysql_prepare_create_table() to re-add hash fields by setting
HA_KEY_ALG_HASH.
The problem with the original logic is mysql_prepare_alter_table()
does not care 100% about hash property so the decision is blurred
between mysql_prepare_alter_table() and mysql_prepare_create_table().
MDEV-28127 did is_equal() which compared vcol expressions
literally. But another table vcol expression is not equal because of
different table name.
We implement another comparison method is_identical() which respects
different table name in vcol comparison. If any field item points to
table_A and compared field item points to table_B, such items are
treated as equal in (table_A, table_B) comparison. This is done by
cloning table_B expression and renaming any table_B entries to table_A
in it.
DELAYED with virtual columns
Segfault was cause by two different copies of same Field instance in
prepared delayed insert. One was made by
Delayed_insert::get_local_table() (see make_new_field()). That copy
went through parse_vcol_defs() and received new vcol_info->expr.
Another one was made by copy_keys_from_share() by this code:
/*
We are using only a prefix of the column as a key:
Create a new field for the key part that matches the index
*/
field= key_part->field=field->make_new_field(root, outparam, 0);
field->field_length= key_part->length;
So, key_part and table got different objects of same field and the
crash was because key_part->field->vcol_info->expr is NULL.
The fix does update_keypart_vcol_info() to update vcol_info->expr in
key_part->field.
Cleanup: memdup_vcol() is static inline instead of macro + check OOM.
- Needless engaged_ removed;
- SCOPE_VALUE, SCOPE_SET, SCOPE_CLEAR macros for neater declaration;
- IF_CLASS / IF_NOT_CLASS SFINAE checkers to pass arg by value or
reference;
- inline keyword;
- couple of refactorings of temporary free_list.
Example:
{
auto _= make_scope_value(var, tmp_value);
}
make_scope_value(): a function which returns RAII object which temporary
changes a value of a variable
detail::Scope_value: actual implementation of such RAII class.
It shouldn't be used directly! That's why it's inside a namespace detail.
* Innobase `os0file.cc`: use `PRIu64` over `llu`
* These came after I prepared #3485.
* MyISAM `mi_check.c`: in impossible block length warning
* I missed this one in #3485 (and #3360 too?).
The fix in MDEV-34825/#3484/dff354e7df2f originally just took the FreeBSD carried
patch of inline ASM as it didn't have the __ppc_get_timebase function.
What clang does have is the __builtin_ppc_get_timebase, which was
replaced in the same commit, which was the fix taken from Alpine.
To reduce complexity - we only need one working function rather
than an equivalent asm implementation.
Noted by Marko, thanks!
InnoDB transactions may be reused after committed:
- when taken from the transaction pool
- during a DDL operation execution
In this case wsrep flag on trx object is cleared, which may cause wrong
execution logic afterwards (wsrep-related hooks are not run).
Make trx->wsrep flag initialize from THD object only once on InnoDB transaction
start and don't change it throughout the transaction's lifetime.
The flag is reset at commit time as before.
Unconditionally set wsrep=OFF for THD objects that represent InnoDB background
threads.
Make Wsrep_schema::store_view() operate in its own transaction.
Fix streaming replication transactions' fragments rollback to not switch
THD->wsrep value during transaction's execution
(use THD->wsrep_ignore_table as a workaround).
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Row-injection updates don’t correctly set the historical partition
for tables with system versioning and system_time partitions. This
results in inconsistencies between the master and slave when
replicating transactions that target such tables (i.e. the primary
server would correctly distribute archived rows amongst its
partitions, whereas the replica would have all archived rows in a
single partition). The function
partition_info::vers_set_hist_part(THD*) is used to set the
partition; however, its initial check for
vers_require_hist_part(THD*) returns false, bypassing the rest of
the function (which sets up the partition to use). This is because
the actual check uses the LEX sql_command (via
LEX::vers_history_generating()) to determine if the command is valid
to generate history. Row injections don’t have sql_commands though.
This patch provides a fix which extends the check in
vers_history_generating() to additionally allow row injections to be
history generating (via the function LEX::is_stmt_row_injection()).
Special thanks to Jan Lindstrom <jan.lindstrom@galeracluster.com>
for his work in reproducing the bug, and providing an initial test
case.
Reviewed By
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Aleksey Midenkov <midenok@mariadb.com>
Problem was caused by MDEV-31413 commit 277968aa where
mysql.gtid_slave_pos table was replicated by Galera.
However, as not all nodes in Galera cluster are replica
nodes, rows were not deleted from table.
In this fix this is corrected so that mysql.gtid_slave_pos
table is not replicated by Galera. Instead when Galera
node receives GTID event and wsrep_gtid_mode=1, this event
is stored to mysql.gtid_slave_pos table.
Added test case galera_2primary_replica for 2 async
primaries replicating to galera cluster.
Added test case galera_circular_replication where
async primary replicates to galera cluster and
one of the galera cluster nodes is master
to async replica.
Modified test case galera_restart_replica to monitor
gtid positions and rows in mysql.gtid_pos_table.
This is the test case from commit 46aaf328ce
(MDEV-35830) to cover backup_undo_trunc() in the regression tests of
earlier major versions of mariadb-backup.
value.type_handler()->result_type()'
failed in virtual bool Item_param::get_date(THD*, MYSQL_TIME*, date_mode_t)
This is a cleanup for MDEV-25593. When binding from NULL, IGNORE or DEFAULT,
value.type_handler should be set to &type_handler_null,
to satisfy the DBUG_ASSERT in Item_param::get_date().
The problems were that:
1) resources was freed "asimetric" normal execution in send_eof,
in case of error in destructor.
2) destructor was not called in case of SP for result objects.
(so if the last SP execution ended with error resorces was not
freeded on reinit before execution (cleanup() called before next
execution) and destructor also was not called due to lack of
delete call for the object)
Result cleanup() renamed to reset_for_next_ps_execution() to better
reflect function().
All result method revised and freeing resources made "symetric".
Destructor of result object called for SP.
Added skipped invalidation in case of error in insert.
Removed misleading naming of reset(thd) (could be mixed with
with reset()).
TABLE::use_all_columns turn on all bits of read_set, which is
interpreted by innodb as a request to read all columns. Without doing
so before calling init_read_record(), innodb will not retrieve any
columns if mysql.servers table has been altered to use innodb as the
engine, and any foreign servers stored in the table are "lost".
storage/maria/ma_open.c:352:7: runtime error: call to function debug_sync(THD*, char const*, unsigned long)
through pointer to incorrect function type 'void (*)(void *, const char *, unsigned long)'
The THD argument is a void *. Because of the way myisam is .c files the
function prototype is mismatched.
As Marko pointed out the MYSQL_THD is declared as void * in C.
Thanks Jimmy Hú for noting that struct THD is the equalivalant in C to
the class THD. The C NULL was also different to the C++ nullptr.
Corrected the definations of MYSQL_THD and DEBUG_SYNC_C to be C and C++
compatible.
Fix regression introduced by commits 9588526 which attempted to address
MDEV-27037. With the regression, mariadb-binlog cannot process multiple
log files when --stop-datetime is specified.
The change is to keep recording timestamp of last processed event, and
after all log files are processed, if the last recorded timestamp has not
reached specified --stop-datetime, it will emit a warning. This applies
when processing local log files, or log files from remote servers.
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
Co-authored-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
The mismatch occurs on the function calls as in the sql/sql_udf.h the
types of "error" and "is_null" are unsigned char rather than char.
This is corrected for the udf functions:
* spider_direct_sql
* spider_direct_bg_sql
* spider_flush_table_mon_cache
* spider_copy_tables
* spider_ping_table
Reviewer: Yuchen Pei
There were too many C casts rather than C++ casts here.
Started a partial conversion covering within-file scoped
changes.
Suggested by Brandon Nesterenko
In plugins, use the correct resolver for ULONG and ULONGLONG
types.
InnoDB has a UINT type as evidenced by "Unknown variable type code 0x182
in plugin 'InnoDB'." so the implementation for UNSIGNED INT was added.
Any InnoDB mtr test that changes lock_wait_timeout,
spider_param_force_commit and some MyRocks unsigned thread server
variables can verify this change is correct.
Reviewer: Brandon Nesterenko
through pointer to incorrect function type.
The argument is void* rather than char* and was missing
system_status_var * as an argument.
This shows up with UBSAN testing under clang.
Reviewer: Brandon Nesterenko
cannot have an assert in Warning_info::push_warning()
because SQL command SIGNAL can set an absolutely arbitrary
message, even an empty one or ending with '\n'
move the assert into push_warning() and my_message_sql().
followup for 9508a44c37
The issue is caused by a logic error in Item_sum::get_tmp_table_item() method:
it resets arguments of the item to point to the result fields during
change_ref_to_tmp_fields() call. However, Item_sum arguments must not be modified.
It is enough for Item_sum objects to call ancestor's implementation
Item::get_tmp_table_item().
This fix is in accordance with MySQL commit 2e3dc09087c24798c90e05163ed3d931f6b93db3
Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
during FLUSH PRIVILEGES, allow_all_hosts temporarily goes out of sync
with acl_check_hosts and acl_wild_hosts.
As it's tested in acl_check_host() without a mutex, let's re-test it
under a mutex to make sure the value is correct.
Note that it's just an optimization and it's ok to see outdated
allow_all_hosts value here.
* replace the message away in the test result
* remove "feedback plugin:" prefix, it's a server message, not plugin's
* downgrade to the warning, because
1) it's not a failure, no operation was aborted, server still works
2) it's something actionable, so not a [Note] either
Mariabackup (mariadb-backup) supports the --use-memory option that
sets the buffer pool size for innodb. However, current SST scripts
do not use this option. This commit adds support for this option,
the value for which can be specified via the "use_memory" parameter
in the configuration file in the [sst], [mariabackup] or [xtrabackup]
sections (supported only for compatibility with old configurations).
In addition, if the innodb_buffer_pool_size option is specified in
the user configuration (in the main server configuration sections)
or passed to the SST scripts or the server via arguments, its value
is also passed to mariadb-backup as the value for the --use-memory
option.
A new section name [mariabackup] has also been added, which can
be used instead of the deprecated [xtrabackup] (the section name
"mariabackup" was specified in the documentation, but was not
actually supported by SST scripts before this commit).
Fix a regression introduced by commit d98ac851 (MDEV-29935, MDEV-26247) causing
MAX_TABLES overflow in `setup_table_map()`. The check for MAX_TABLES was moved
outside of the loop that increments table numbers, allowing overflows during
loop iterations. Since setup_table_map() operates on a 64-bit bitmap, table
numbers exceeding 64 triggered the UBSAN check.
This commit returns the overflow check within the loop and adds a debug
assertion to `setup_table_map()` to ensure no bitmap overrun occurs.
When running the ./mtr tests and getting failures, rather than provide a
dead-link to mysql.com, this points developers to the Jira instance.
Signed-off-by: Eric Herman <eric@freesa.org>
opt_calc_index_goodness(): Correct an inaccurate condition.
We can very well use a clustered index of a table that is subject
to online rebuild. But we must not choose an index that has not been
committed (it is a secondary index that was not fully created)
or that is corrupted or not a normal B-tree index.
opt_search_plan_for_table(): Remove some redundant code, now that
opt_calc_index_goodness() checks against corrupted indexes.
The test case allows this code to be exercised. The main observation
in the following:
./mtr --rr innodb.stats_persistent
rr replay var/log/mysqld.1.rr/latest-trace
should be that when opt_search_plan_for_table() is being invoked by
dict_stats_update_persistent() on the being-altered statistics table
in the 2nd call after ha_innobase::inplace_alter_table(),
and the fix in opt_calc_index_goodness() is absent,
it would choose the code path if (n_fields == 0), that is, a full
table scan, instead of searching for the record. The GDB commands to
execute in "rr replay" would be as follows:
break ha_innobase::inplace_alter_table
continue
break opt_search_plan_for_table
continue
continue
next
next
…
Reviewed by: Vladislav Lesin
strerror_s on Linux will, for unknown error codes, display
'Unknown error <codenum>' and our tests are written with this assumption.
However, on macOS, sterror_s returns 'Unknown error: <codenum>' in the
same case, which breaks tests. Make my_strerror consistent across the
platforms by removing the ':' when present.