MDEV-21810 MBR: Unexpected "Unsafe statement" warning for unsafe IODKU
The MDEV-17614 fixes for the replication unsafety of INSERT ON DUPLICATE KEY
UPDATE on a table with two or more unique keys left a flaw. The fixes checked
the safety condition for each inserted record, with the idea of catching a
user-supplied value for an autoincrement column; when one was found, the
autoincrement column would become a source of unsafety too.
It was not expected that after a duplicate key error the next record's
write_set may differ, so that the unsafe decision computed for that
specific record corrupts the query's binlogging state, and when
@@binlog_format is MIXED nothing gets binlogged.
This case has already been fixed in 10.5.2 by 91ab42a823, which
relocated/optimized THD::decide_logging_format_low() out of the record insert
loop. The safety decision is computed once and at the right time.
Pertinent parts of the commit are cherry-picked.
Also, a spurious warning about unsafety is removed when @@binlog_format
is MIXED; the original MDEV-17614 test result is corrected.
The original test of MDEV-17614 is extended and made more readable.
or slow query log when the log_output=TABLE.
When this happens, we temporarily disable logging by changing log_output until
we've created the general_log and slow_log tables again.
Move </database> in xml mode until after the transaction_registry.
General_log and slow_log tables were moved to be dumped first so
that the time during which general/slow query logging is disabled is minimal.
Previously the correct SQL mode for a stored routine or
package was only set before doing the CREATE part. This
worked out for PROCEDUREs and FUNCTIONs, but for ORACLE
mode specific PACKAGEs the DROP also only works in ORACLE
mode.
Moving the setting of the sql_mode a few lines up, to happen
right before the DROP statement is written, fixes this.
row_ins_sec_index_entry_low(): If a separate mini-transaction is
needed to adjust the minimum bounding rectangle (MBR) in the parent
page, we must disable redo logging if the table is a temporary table.
For temporary tables, no log is supposed to be written, because
the temporary tablespace will be reinitialized on server restart.
rtr_update_mbr_field(): Plug a memory leak.
In SELECT_LEX::update_used_tables(),
do not run the loop setting tl->table->maybe_null
when tl is an eliminated table
(Rationale: First, with current table elimination, tl already
has maybe_null=1. Second, one should not care what flags
eliminated tables had)
(This is the assert that was added in fix for MDEV-26047)
Table elimination may remove an ON expression from an outer join.
However SELECT_LEX::update_used_tables() will still call
item->walk(&Item::eval_not_null_tables)
for eliminated expressions. If the subquery is constant and cheap
Item_cond_and will attempt to evaluate it, which will trigger an
assert.
The fix is not to call update_used_tables() or eval_not_null_tables()
for ON expressions that were eliminated.
The Debian script debian-start upgrades the database (which can be huge)
and prints lots of unnecessary information (not errors). Add
'--silent' to only report possible errors.
* FreeBSD returns errno 31 (EMLINK, Too many links),
not 40 (ELOOP, Too many levels of symbolic links)
* (`mysqlbinlog|mysql`) was just crazy, why did it ever work?
* socket_ipv6.inc check (that checked whether ipv6 is supported)
only worked correctly when ipv6 was supported
* perfschema.socket_summary_by_instance was changing global variables
and then skipping the test (because of missing ipv6)
Window Functions code tries to minimize the number of times it
needs to sort the select's resultset by finding "compatible"
OVER (PARTITION BY ... ORDER BY ...) clauses.
This employs compare_order_elements(). That function assumed that
the order expressions are Item_field-derived objects (that refer
to a temp.table). But this is not always the case: one can
construct queries where the order expressions are arbitrary item expressions.
Add handling for such expressions: sort them according to the window
specification they appeared in.
This means we cannot detect that two compatible PARTITION BY clauses
that use expressions can share the sorting step.
But at least we won't crash.
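
As a rough illustration of the fallback ordering described above, here is a
minimal sketch. `OrderExpr` and `compare_order_exprs()` are hypothetical
stand-ins, not the server's classes, and placing field references before
other expressions is an assumption made only for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for one ORDER BY / PARTITION BY element.
struct OrderExpr {
  bool is_field;     // true if the expression is a plain temp-table field
  int  field_index;  // meaningful only when is_field is true
  int  win_spec_no;  // number of the window specification it appeared in
};

// Three-way comparison: returns <0, 0 or >0.
int compare_order_exprs(const OrderExpr &a, const OrderExpr &b) {
  if (a.is_field && b.is_field)
    return a.field_index - b.field_index;
  // Assumption for this sketch: field references sort before expressions.
  if (a.is_field != b.is_field)
    return a.is_field ? -1 : 1;
  // Two arbitrary expressions: order them only by the window specification
  // they belong to, without trying to prove that they are equal.
  return a.win_spec_no - b.win_spec_no;
}
```

With this ordering, two identical expressions that come from different window
specifications never compare as equal, which is why their sorting step cannot
be shared.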
lock_validate() accumulates page ids while holding lock_sys->mutex, then
releases the mutex, and invokes lock_rec_block_validate() for each page.
Another thread may add or remove locks and change pages
between releasing the mutex in lock_validate() and acquiring it in
lock_rec_validate_page().
lock_rec_validate_page() can invoke lock_rec_queue_validate() for a
non-locked supremum, which can cause a ut_ad(page_rec_is_leaf(rec)) failure
in lock_rec_queue_validate().
The fix is to invoke lock_rec_queue_validate() only for locked records
in lock_rec_validate_page().
The error message in lock_rec_block_validate() is not necessary, as
BUF_GET_POSSIBLY_FREED mode is used to get the block from the buffer pool, and
it is not an error if a block was evicted.
The test case would require a new debug sync point. I think it's not
necessary as the fixed code is debug-only.
Problem:
==============
By testing `pgrep` with `--ns` option,
introduced with MDEV-21331, commit fb7c1b9415,
I noted that:
a) `--ns` cannot take more than a single PID.
b) `--ns` returns the processes of the namespace to which the supplied PID belongs.
Consequently, the command `pgrep -x --ns $$ mysqld` will always return an error and skip
checking for the existing PID of the server.
Solution:
==============
The suggested solution is to add `--nslist pid`, since `--ns` needs to know which namespace type it should look in.
See `pgrep --help` for the different namespace types.
Note also that this works *only* if the script is run as `root` (which is the case here).
Current PR is a part of:
1. MDEV-21331: sync preinst and postrm script
2. MDEV-15718: check for exact mysqld process
This commit:
a) fixes fb7c1b9415
b) Closes PR #2068 (obsolete)
c) Closes PR #2069 (obsolete)
Thanks Faustin Lammler <faustin@mariadb.org> for testing and verifying
Reviewed by <>
When "mariabackup --target-dir=$basedir --incremental-dir=$incremental_dir"
is running and is moving a new table file (e.g. `db1/t1.new`) from the
incremental directory to the base directory, it needs to verify that the base
backup database directory (e.g. `$basedir/db1`) really exists
(or create it otherwise).
The table `db1/t1` can come from a new database `db1` which
was created during the base mariabackup execution time.
In such a case the directory `db1` exists only in the incremental directory,
but does not exist in the base directory.
Moved LIMIT warning from vers_set_hist_part() to new call
vers_check_limit() at table unlock phase. At that point
read_partitions bitmap is already pruned by DML code (see
prune_partitions(), find_used_partitions()) so we have to set
corresponding bits for working history partition.
Also we don't do my_error(ME_WARNING|ME_ERROR_LOG), because at that
point it doesn't update warnings number, so command reports 0 warnings
(but warning list is still updated). Instead we do
push_warning_printf() and sql_print_warning() separately.
Under LOCK TABLES external_lock(F_UNLCK) is not executed. There is
start_stmt(), but no corresponding "stop_stmt()". So for that mode we
call vers_check_limit() directly from close_thread_tables().
Test result has been changed according to new LIMIT and warning
printing algorithm. For convenience all LIMIT warnings are marked with
"You see warning above ^".
TODO MDEV-20345 fixed. Now vers_history_generating() contains a
fine-grained list of DML commands that can generate history (and the TODO
mechanism worked well).
Like in MDEV-27217 vers_set_hist_part() for LIMIT depends on all
partitions selected in read_partitions. That bugfix just disabled
partition selection for DELETE with this check:
  if (table->pos_in_table_list &&
      table->pos_in_table_list->partition_names)
  {
    return HA_ERR_PARTITION_LIST;
  }
ALTER TABLE TRUNCATE PARTITION is a different story. First, it doesn't
update pos_in_table_list->partition_names, but
thd->lex->alter_info.partition_names. But we cannot depend on that
since alter_info will be stale for DML. Second, we should not disable
TRUNCATE PARTITION for that to be consistent with TRUNCATE TABLE
behavior.
Now we don't do vers_set_hist_part() for ALTER TABLE as this command
is not DML, so it does not produce history.
Expression_cache_tmptable object uses an Expression_cache_tracker object
to report the statistics.
In the common scenario, Expression_cache_tmptable destructor sets
tracker->cache=NULL. The tracker object survives after the expression
cache is deleted and one may call cache_tracker->fetch_current_stats()
for it with no harm.
However a degenerate cache with no parameters does not set
tracker->cache=NULL in Expression_cache_tmptable destructor which
results in an attempt to use freed data in the
cache_tracker->fetch_current_stats() call.
Fixed by setting tracker->cache to NULL and wrapping the assignment into
a function.
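
A minimal sketch of the idea, assuming toy classes: `ToyTracker`, `ToyCache`
and `detach_from_cache()` are hypothetical names used only for illustration
and are not the server's Expression_cache_tracker API. The destructor
detaches the tracker unconditionally, so a later statistics fetch never
touches freed memory.

```cpp
#include <cassert>

struct ToyCache;  // forward declaration

// Hypothetical stand-in for the tracker, which outlives the cache.
struct ToyTracker {
  ToyCache *cache = nullptr;
  unsigned long hits = 0;

  // The assignment wrapped into a function, as in the fix described above.
  void detach_from_cache() { cache = nullptr; }
  void fetch_current_stats();
};

// Hypothetical stand-in for the expression cache.
struct ToyCache {
  ToyTracker *tracker;
  unsigned long hits;
  bool has_params;  // false models the degenerate, parameterless cache

  ToyCache(ToyTracker *t, bool params)
    : tracker(t), hits(0), has_params(params) {
    tracker->cache = this;
  }
  ~ToyCache() {
    // Detach in every case, including the degenerate cache.
    tracker->detach_from_cache();
  }
};

void ToyTracker::fetch_current_stats() {
  if (cache)  // only dereference a cache that is still alive
    hits = cache->hits;
}
```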
btr_insert_into_right_sibling(): Inherit any gap lock from the
left sibling to the right sibling before inserting the record
to the right sibling and updating the node pointer(s).
lock_update_node_pointer(): Update locks in case a node pointer
will move.
Based on mysql/mysql-server@c7d93c274f
buf_flush_page(): Never wait for a page latch, even in checkpoint
flushing (flush_type == BUF_FLUSH_LIST), to prevent a hang of the
page cleaner threads when a large number of pages is latched.
In mysql/mysql-server@9542f3015b
it was claimed that such a hang only affects CREATE FULLTEXT INDEX.
Their fix was to retain buffer-fix but release exclusive latch
on non-leaf pages, and subsequently write to those pages while
they are not associated with the mini-transaction, which would
trip a debug assertion in the MariaDB version of
mtr_t::memo_modify_page() and cause potential corruption
when using the default MariaDB setting innodb_log_optimize_ddl=OFF.
This change essentially backports a small part of
commit 7cffb5f6e8 (MDEV-23399)
from MariaDB Server 10.5.7.
Two bugs here:
1. CHECKSUM TABLE asserted that all fields in the table are arranged
sequentially in the record, but virtual columns are always at the
end, violating this assertion
2. virtual columns were not calculated for CHECKSUM, so CHECKSUM
was using, essentially, garbage left from the previous statement.
(that's why the test must use INSERT IGNORE to have this "previous
statement" mark vcols not null)
Fix: don't include virtual columns into the table CHECKSUM. Indeed,
they cannot be included as the engine does not see virtual columns,
so in-engine checksum cannot include them, meaning in-server checksum
should not either
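
To illustrate the fix, here is a small sketch under stated assumptions:
`ToyColumn` and `row_checksum()` are hypothetical, and FNV-1a is used only as
a placeholder for the server's real checksum function. Virtual columns are
skipped, so whatever value they happen to hold cannot influence the result.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one column value of a row.
struct ToyColumn {
  std::string value;
  bool is_virtual;
};

// Checksums only the stored columns of a row (FNV-1a as a placeholder hash).
std::uint32_t row_checksum(const std::vector<ToyColumn> &row) {
  std::uint32_t h = 2166136261u;
  for (const ToyColumn &col : row) {
    if (col.is_virtual)
      continue;  // the engine does not see virtual columns, so skip them
    for (unsigned char c : col.value) {
      h ^= c;
      h *= 16777619u;
    }
  }
  return h;
}
```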
Precision should be kept below DECIMAL_MAX_SCALE for computations.
It can be bigger in Item_decimal. I'd fix this too, but it changes the
existing behaviour, so it is problematic to fix.
The cause of crash:
remove_redundant_subquery_clauses() removes redundant item expressions.
The primary goal of this is to remove the subquery items.
The removal process unlinks the subquery from SELECT_LEX tree, but does
not remove it from SELECT_LEX::ref_pointer_array or from JOIN::all_fields.
Then, setup_subquery_caches() tries to wrap the subquery item in an
expression cache, which fails, the first reason for failure being that
the item doesn't have a query plan.
Solution: do not wrap eliminated items with expression cache.
(also added an assert to check that we do not attempt to execute them).
This may look like an incomplete fix: why don't we remove any mention
of eliminated item everywhere? The difficulties here are:
* items can be "un-removed" (see set_fake_select_as_master_processor)
* it's difficult to remove an element from ref_pointer_array: Item_ref
objects refer to elements of that array, so one can't shift elements in
it. Replacing eliminated subselect with a dummy Item doesn't look like a
good idea, either.
upon HANDLER READ
Analysis: The error state is not stored while checking the condition and key
name.
Fix: Return true while checking the condition and key name if an error is
reported because a geometry object can't be created from the data in the
index value for HANDLER READ.
To detect the end of SP definition correctly we need to know where
the parser stopped parsing the SP. lip->get_cpp_ptr() shows the
current parsing position, lip->get_cpp_tok_start() shows the start of
the last parsed token. The actual value depends on whether
the parser has performed a look-ahead. For example, in
CREATE PROCEDURE ... BEGIN ... END ;
the parser reads 'END' and knows that this ends the procedure definition,
it does not need to read the next token for this. But in
CREATE PROCEDURE ... SELECT 1 ;
the parser cannot know that the procedure ends at '1'. It has to read
the semicolon first (it could be '1 + 2' for example).
In the first case, the "current parsing position" is after END, before
the semicolon, in the second case it's *after* the semicolon. Note that
SP definition in both cases ends before the semicolon.
To be able to detect the end of SP deterministically, we need the parser
to do the look-ahead always or never.
The bug fix introduces a new parser token FORCE_LOOKAHEAD. Lexer never
returns it, so this token can never match. But the parser cannot know
it so it will have to perform a look-ahead to determine that the next
token is not FORCE_LOOKAHEAD. This way we deterministically end
SP parsing with a look-ahead.
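
The effect of an unconditional look-ahead can be sketched with a toy lexer.
This is an illustration only: `ToyLexer` and `sp_body()` are hypothetical, and
`ptr` / `tok_start` merely model the roles of lip->get_cpp_ptr() and
lip->get_cpp_tok_start(). Because one extra token is always read, the body
always ends before the start of that token, whatever the last body token was.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Toy lexer; all names here are hypothetical illustration code.
struct ToyLexer {
  std::string text;
  std::size_t ptr = 0;        // current parsing position
  std::size_t tok_start = 0;  // start of the last token read

  static bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
  }

  // Reads one token: ';' on its own, or a run of other non-space characters.
  void read_token() {
    while (ptr < text.size() && is_space(text[ptr])) ptr++;
    tok_start = ptr;
    if (ptr < text.size() && text[ptr] == ';') { ptr++; return; }
    while (ptr < text.size() && !is_space(text[ptr]) && text[ptr] != ';')
      ptr++;
  }
};

// Consumes n_body_tokens tokens of the SP body, then always reads one
// look-ahead token and cuts the body before that token's start.
std::string sp_body(const std::string &sql, int n_body_tokens) {
  ToyLexer lex;
  lex.text = sql;
  for (int i = 0; i < n_body_tokens; i++) lex.read_token();
  lex.read_token();  // forced look-ahead
  std::size_t end = lex.tok_start;
  while (end > 0 && ToyLexer::is_space(sql[end - 1])) end--;  // trim spaces
  return sql.substr(0, end);
}
```

Both example shapes from the text then end at the same kind of position: the
body stops right before the semicolon, whether the last body token is END or
an expression.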
This reverts commit 5ba77222e9
but keeps the test. A different fix for
MDEV-21028 Server crashes in Query_arena::set_query_arena upon SELECT from view
internal temporary tables should use THD as expr_area
This bug could cause a crash of the server at the second call of a stored
procedure when it executed a query containing a mergeable derived table /
view whose specification used another mergeable derived table or view and a
subquery with outer reference in the select list of the specification.
Such queries could cause the same problem when they were executed for the
second time in a prepared mode.
The problem appeared due to a typo in the legacy code of the function
create_view_field() that prevented building Item_direct_view_ref wrapper
for the mentioned outer reference at the second execution of the query and
setting the depended_from field for the outer reference.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
When a faulty master, or an incorrect binlog event producer that the slave is working with,
sends an incomplete group of events, the slave must react with an error so as not to log
into the relay log any new events that do not belong to the incomplete group.
Fixed by extending the checks of received event properties when the slave connects to
the master in GTID mode.
Specifically, for an event that can be part of a group, relay-logging is
permitted only when its position within the group is validated.
Otherwise the slave IO thread stops with ER_SLAVE_RELAY_LOG_WRITE_FAILURE.
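
The position check can be pictured as a small state machine. The sketch below
is a deliberate simplification under assumed names (`EvKind`, `GroupChecker`
and `accept()` are hypothetical); the real check works on actual binlog event
types and GTID state. An event is accepted for relay-logging only if it is
legal at the current position within a group.

```cpp
#include <cassert>

// Hypothetical classification of events by their role in a group.
enum class EvKind { GROUP_START, GROUP_BODY, GROUP_END };

struct GroupChecker {
  bool in_group = false;

  // Returns true if the event may be written to the relay log.
  bool accept(EvKind kind) {
    switch (kind) {
    case EvKind::GROUP_START:
      if (in_group) return false;  // the previous group was never completed
      in_group = true;
      return true;
    case EvKind::GROUP_BODY:
      return in_group;             // a body event needs an open group
    case EvKind::GROUP_END:
      if (!in_group) return false; // a terminator needs an open group
      in_group = false;
      return true;
    }
    return false;
  }
};
```

In this model a `false` result corresponds to the point where the IO thread
would stop with an error instead of writing the event.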
The --skip-write-binlog message was confusing in that the option only had
an effect if Galera was enabled. There are uses beyond Galera,
so we apply SET SESSION SQL_LOG_BIN=0 as implied by the option,
without being conditional on the wsrep status.
Remove wsrep.mysql_tzinfo_to_sql_symlink{,_skip} tests as they offered
no additional coverage beyond main.mysql_tzinfo_to_sql_symlink as no
server testing was done.
Introduced a variant of the galera.mariadb_tzinfo_to_sql as
galera.mysql_tzinfo_to_sql, which does testing using the mysql client
rather than directly importing into the server via mysqltest.
Update man page and mysql_tzinfo_to_sql to have a --skip-write-binlog
option.
merge notes:
10.4:
- conflicts in tztime.cc can revert to this version of --help text.
- tztime.cc - merge execute immediate @prep1, and leave %s%s trunc_tables, lock_tables
after that.
10.6:
- Need to remove the not_embedded.inc in mysql_tzinfo_to_sql.test and
replace it with no_protocol.inc
- leave both mysql_tzinfo_to_sql.test and mariadb_tzinfo_to_sql.sql
tests.
- sql/tztime.cc - keep entirely 10.6 version.
An implicit system-versioned table does not contain system fields in SHOW
CREATE. Therefore, after a mysqldump recovery such a table has its system
fields in the last place in the frm image. The original table, meanwhile,
does not guarantee that these system fields are in the last place, because
adding new fields via ALTER TABLE places them last. Thus the order of fields
may differ between master and slave, so row-based replication
may fail.
To fix this, on ALTER TABLE we now always place system-invisible fields
last in the frm image. If the table was created by an old revision and has
an incorrect order of fields, it can be fixed via any copy operation of
ALTER TABLE, for example:
ALTER TABLE t1 FORCE;
To check the order of fields in frm file one can use hexdump:
hexdump -C t1.frm
Note, the replication fails only when all 3 conditions are met:
1. row-based or mixed mode replication;
2. table has new fields added via ALTER TABLE;
3. table was rebuilt on some, but not all nodes via mysqldump image.
Otherwise it will operate properly even with incorrect order of
fields.
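
The reordering rule (system-invisible fields always last, everything else in
its existing order) can be sketched with a stable partition. `ToyField` and
`place_system_fields_last()` are hypothetical names for illustration; the
server operates on its own field list and frm image, not on a std::vector.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a table field.
struct ToyField {
  std::string name;
  bool system_invisible;
};

// Moves system-invisible fields to the end, keeping the relative order
// of both the user fields and the system fields unchanged.
void place_system_fields_last(std::vector<ToyField> &fields) {
  std::stable_partition(
      fields.begin(), fields.end(),
      [](const ToyField &f) { return !f.system_invisible; });
}
```

A stable partition is used here so that a field added later via ALTER TABLE
ends up before the system fields without disturbing the order of the other
user fields.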
vers_info->hist_part retained a stale value after ROLLBACK. The
algorithm in vers_set_hist_part() continued iteration from that value.
The simplest solution is to process partitions from the start each time
for LIMIT in vers_set_hist_part().
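
Why restarting from the first partition helps can be shown with a simplified
model. This is not the server's exact selection rule: `pick_hist_part()` is a
hypothetical function that assumes the working history partition is the first
one holding fewer rows than the limit, falling back to the last partition
when all are full.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Picks the working history partition by scanning from the start each time,
// so no position remembered from an earlier (rolled back) statement is used.
std::size_t pick_hist_part(const std::vector<unsigned long> &rows_per_part,
                           unsigned long limit) {
  for (std::size_t i = 0; i < rows_per_part.size(); i++)
    if (rows_per_part[i] < limit)
      return i;
  // All partitions are full: stay on the last one (or 0 for an empty list).
  return rows_per_part.empty() ? 0 : rows_per_part.size() - 1;
}
```

If a rollback removes rows from an earlier partition, the next call returns
that earlier partition again, instead of continuing from a stale position.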
Added a check for whether the platform on which the build is done
supports vfork. The HAVE_VFORK macro is set if the vfork()
system call is supported. Use the vfork() system call if the
HAVE_VFORK macro is set, else use fork().
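
The compile-time switch can be sketched as follows. HAVE_VFORK is the macro
named above; `run_child()` is a hypothetical helper written only for this
example, assuming a POSIX platform. It is not the code touched by the commit.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `path` with `argv` and returns the child's exit status,
// or -1 if the child could not be created or did not exit normally.
int run_child(const char *path, char *const argv[]) {
#ifdef HAVE_VFORK
  pid_t pid = vfork();  // build system detected vfork() support
#else
  pid_t pid = fork();   // portable fallback
#endif
  if (pid < 0)
    return -1;
  if (pid == 0) {
    // After vfork() the child may only call an exec function or _exit().
    execv(path, argv);
    _exit(127);  // reached only if execv() failed
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child branch is kept to execv() and _exit() so that the same code is
valid under both vfork() and fork().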