Cause: a copy of the joined TABLE_LIST is created during multi_update::prepare
and TABLE::pos_in_table_list of the tables are set to point to the new
TABLE_LIST object. This prevents some optimization steps to perform correctly.
Solution: do not update pos_in_table_list during multi_update::prepare
When trying to execute ALTER TABLE EXCHANGE PARTITION with different
definitions, assertion
table->s->db_create_options == part_table->s->db_create_options
failed in compare_table_with_partition().
However, this execution should not be allowed since executing
'exchange partition' requires the identical structure of the two tables.
To fix the problem, I deleted the assertion code and added code that
returns an error that indicates tables have different definitions.
Reviewed By: Nayuta Yanagisawa
that is not in binlog.
Post-crash recovery of --rpl-semi-sync-slave-enabled server
failed to recognize a transaction in-doubt that needed rolled back.
A prepared-but-not-in-binlog transaction gets committed instead
to possibly create inconsistency with a master (e.g the way it was observed
in the bug report).
The semisync recovery is corrected now with initializing binlog coordinates
of any transaction in-doubt to the maximum offset which is
unreachable.
In effect when a prepared transaction that is not found in binlog
it will be decided to rollback because it's guaranteed to reside
in a truncated tail area of binlog.
Mtr tests are reinforced to cover the described scenario.
Function wsrep_read_only_option was already removed in commit
d54bc3c0d1 because it could cause race condition on variable
opt_readonly so that value OFF can become permanent.
Removed function again and added test case. Note that writes
to TEMPORARY tables are still allowed when read_only=ON.
This patch fixes a problem that arises when a Galera node acts as a
replica for native replication. When parallel applying is enabled, it
is possible to end up with attempts to write binlog events with gtids
out of order. This happens because when multiple events are delivered
from the native replication stream and applied in concurrently, it is
for them to be replicated to the Galera cluster in an order which is
different from the original order in which they were committed in the
aync replication master.
To correct this behavior we now wait_for_prior_commit() before
replicating changes though galera. As a consequence, parallel appliers
may apply events in parallel until the galera replication step, which
is now serialized.
Arythmetic can overrun the uint type when possible group_concat_max_len
is multiplied to collation.mbmaxlen (can easily be like 4).
So use ulonglong there for calculations.
The warning comes from copying POSITION objects where 'type' is not
initialized. This does not affect any production code as when 'type'
was compared/used, it was always initialized.
Removed by initializing the type variable in the constructor
- In best_extension_by_limited_search(), do not check for
"(remaining_tables & real_table_bit)", it is guaranteed to be true.
Make it an assert.
- In (!idx || check_interleaving_with_nj())", remove the !idx part.
This check made sense only in the original version of this function.
- "micro optimization" in check_interleaving_with_nj().
The error message "InnoDB: innodb_page_size=65536 requires innodb_buffer_pool_size >= 20MiB current 10MiB" is the relevant one.
The root cause:
mysql_install_db bootstraps with --innodb-buffer-pool-size=10M.
Small bufferpool is here by design - bootstrap should succeed,
even if there is not much RAM available, bootstrap does not need that much
memory.
For pagesize 64K specifically, Innodb thinks it needs a larger bufferpool,
and thus it lets the bootstrap process die (although the expected behavior
in this case would be to adjust value, give warning and continue)
The workaround:
- pass --innodb-buffer-pool-size=20M, which is suitable for all page sizes.
- check the same limit in MSI custom action.
Also, the patch adds mtr test for 64K page size.
GTID_LIST_EVENT or INCIDENT_EVENT.
It's legal to have either of the two inside a group. E.g
Gtid_event, Gtid_log_list_event, Query_1, ... Xid_log_event
is permitted.
However, the slave IO thread treated both
as the terminal even when the group represents a DDL query.
That causes a premature Gtid state update so the slave IO would think
the whole group has been collected while in fact Query_1 etc are yet to process.
Fixed with correcting a condition to compute the terminal event
of the group.
Tested with rpl_mysqlbinlog_slave_consistency (of 10.9) and
rpl_gtid_errorlog.test.
The issue was that best_extension_by_limited_search() had to go through
too many plans with the same cost as there where many EQ_REF tables.
Fixed by shortcutting EQ_REF (AND REF) when the result only contains one
row. This got the optimization time down from hours to sub seconds.
The only known downside with this patch is that in some cases a table
with ref and 1 record may be used before on EQ_REF table. The faster
optimzation phase should compensate for this.
Errors where:
/buildbot/amd64-ubuntu-2004-msan/build/sql/item.h:6478:12: error: 'val_datetime_packed' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
longlong val_datetime_packed(THD *thd)
^
/buildbot/amd64-ubuntu-2004-msan/build/sql/item.h:3501:12: note: overridden virtual function is here
longlong val_datetime_packed(THD *thd) override;
^
/buildbot/amd64-ubuntu-2004-msan/build/sql/item.h:6480:12: error: 'val_time_packed' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
longlong val_time_packed(THD *thd)
^
/buildbot/amd64-ubuntu-2004-msan/build/sql/item.h:3502:12: note: overridden virtual function is here
longlong val_time_packed(THD *thd) override;
^
Fixed failing main.default on Windows
(to trigger an assert the test needed a debug build without
safemalloc, as 0xa5 happened to have the important bit set "correctly")
MDEV-21810 MBR: Unexpected "Unsafe statement" warning for unsafe IODKU
MDEV-17614 fixes to replication unsafety for INSERT ON DUP KEY UPDATE
on two or more unique key table left a flaw. The fixes checked the
safety condition per each inserted record with the idea to catch a user-created
value to an autoincrement column and when that succeeds the autoincrement column
would become the source of unsafety too.
It was not expected that after a duplicate error the next record's
write_set may become different and the unsafe decision for that
specific record will be computed to screw the Query's binlogging
state and when @@binlog_format is MIXED nothing gets bin-logged.
This case has been already fixed in 10.5.2 by 91ab42a823 that
relocated/optimized THD::decide_logging_format_low() out of the record insert
loop. The safety decision is computed once and at the right time.
Pertinent parts of the commit are cherry-picked.
Also a spurious warning about unsafety is removed when MIXED
@@binlog_format; original MDEV-17614 test result corrected.
The original test of MDEV-17614 is extended and made more readable.
In SELECT_LEX::update_used_tables(),
do not run the loop setting tl->table->maybe_null
when tl is an eliminated table
(Rationale: First, with current table elimination, tl already
has maybe_null=1. Second, one should not care what flags
eliminated tables had)
(This is the assert that was added in fix for MDEV-26047)
Table elimination may remove an ON expression from an outer join.
However SELECT_LEX::update_used_tables() will still call
item->walk(&Item::eval_not_null_tables)
for eliminated expressions. If the subquery is constant and cheap
Item_cond_and will attempt to evaluate it, which will trigger an
assert.
The fix is not to call update_used_tables() or eval_not_null_tables()
for ON expressions that were eliminated.
Window Functions code tries to minimize the number of times it
needs to sort the select's resultset by finding "compatible"
OVER (PARTITION BY ... ORDER BY ...) clauses.
This employs compare_order_elements(). That function assumed that
the order expressions are Item_field-derived objects (that refer
to a temp.table). But this is not always the case: one can
construct queries order expressions are arbitrary item expressions.
Add handling for such expressions: sort them according to the window
specification they appeared in.
This means we cannot detect that two compatible PARTITION BY clauses
that use expressions can share the sorting step.
But at least we won't crash.
This patch corrects the fix for MDEV-26412.
Note that when parsing an ON expression the pointer to the current select
is always in select_stack[select_stack_top - 1]. So the pointer to the
outer select (if any) is in select_stack[select_stack_top - 2].
The query manifesting this bug is added to the test case of MDEV-26412.
Moved LIMIT warning from vers_set_hist_part() to new call
vers_check_limit() at table unlock phase. At that point
read_partitions bitmap is already pruned by DML code (see
prune_partitions(), find_used_partitions()) so we have to set
corresponding bits for working history partition.
Also we don't do my_error(ME_WARNING|ME_ERROR_LOG), because at that
point it doesn't update warnings number, so command reports 0 warnings
(but warning list is still updated). Instead we do
push_warning_printf() and sql_print_warning() separately.
Under LOCK TABLES external_lock(F_UNLCK) is not executed. There is
start_stmt(), but no corresponding "stop_stmt()". So for that mode we
call vers_check_limit() directly from close_thread_tables().
Test result has been changed according to new LIMIT and warning
printing algorithm. For convenience all LIMIT warnings are marked with
"You see warning above ^".
TODO MDEV-20345 fixed. Now vers_history_generating() contains
fine-grained list of DML-commands that can generate history (and TODO
mechanism worked well).
Like in MDEV-27217 vers_set_hist_part() for LIMIT depends on all
partitions selected in read_partitions. That bugfix just disabled
partition selection for DELETE with this check:
if (table->pos_in_table_list &&
table->pos_in_table_list->partition_names)
{
return HA_ERR_PARTITION_LIST;
}
ALTER TABLE TRUNCATE PARTITION is a different story. First, it doesn't
update pos_in_table_list->partition_names, but
thd->lex->alter_info.partition_names. But we cannot depend on that
since alter_info will be stale for DML. Second, we should not disable
TRUNCATE PARTITION for that to be consistent with TRUNCATE TABLE
behavior.
Now we don't do vers_set_hist_part() for ALTER TABLE as this command
is not DML, so it does not produce history.