with spatial index
So the issue is since it is spatial index , at the time of searching index
for key (Rows_log_event::find_row) we use wrong field image we use
Field::itRAW while we should be using Field::itMBR
specific temporary errors
The optimistic parallel slave's worker thread could face a run-time error due to
the algorithm's specifics which allows for conflicts like the reported
"Can't find record in 'table'".
A typical stack is like
{noformat}
#0 handler::print_error (this=0x61c00008f8a0, error=149, errflag=0) at handler.cc:3650
#1 0x0000555555e95361 in write_record (thd=thd@entry=0x62a0000a2208, table=table@entry=0x61f00008ce88, info=info@entry=0x7fffdee356d0) at sql_insert.cc:1944
#2 0x0000555555ea7767 in mysql_insert (thd=thd@entry=0x62a0000a2208, table_list=0x61b00012ada0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=<optimized out>, ignore=<optimized out>) at sql_insert.cc:1039
#3 0x0000555555efda90 in mysql_execute_command (thd=thd@entry=0x62a0000a2208) at sql_parse.cc:3927
#4 0x0000555555f0cc50 in mysql_parse (thd=0x62a0000a2208, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at sql_parse.cc:7449
#5 0x00005555566d4444 in Query_log_event::do_apply_event (this=0x61200005b9c8, rgi=<optimized out>, query_arg=<optimized out>, q_len_arg=<optimized out>) at log_event.cc:4508
#6 0x00005555566d639e in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at log_event.cc:4185
#7 0x0000555555d738cf in Log_event::apply_event (rgi=0x61d0001ea080, this=0x61200005b9c8) at log_event.h:1343
#8 apply_event_and_update_pos_apply (ev=ev@entry=0x61200005b9c8, thd=thd@entry=0x62a0000a2208, rgi=rgi@entry=0x61d0001ea080, reason=<optimized out>) at slave.cc:3479
#9 0x0000555555d8596b in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x61200005b9c8, thd=thd@entry=0x62a0000a2208, rgi=rgi@entry=0x61d0001ea080) at slave.cc:3623
#10 0x00005555562aca83 in rpt_handle_event (qev=qev@entry=0x6190000fa088, rpt=rpt@entry=0x62200002bd68) at rpl_parallel.cc:50
#11 0x00005555562bd04e in handle_rpl_parallel_thread (arg=arg@entry=0x62200002bd68) at rpl_parallel.cc:1258
{noformat}
Here {{handler::print_error}} computes whether to error log the
current error when --log-warnings > 1. The decision flag is consulted
bu {{my_message_sql()}} which can be eventually called.
In the bug case the decision is to log.
However in the optimistic mode slave applier case any conflict is
attempted to resolve with rollback and retry to success. Hence the
logging is at least extraneous.
The case is fixed with adding a new flag {{ME_LOG_AS_WARN}} which
{{handler::print_error}} may propagate further on through {{my_error}}
when the error comes from an optimistically running slave worker thread.
The new flag effectively requests the warning level for the errlog record,
while the thread's DA records the actual error (which is regarded as temporary one
by the parallel slave error handler).
Problem was that detection of temporary tables was all wrong for
RENAME TABLE.
(Temporary tables where opened by top level call to
open_temporary_tables(), which can't detect if a temporary table
was renamed to something and then reused).
Fixed by adding proper parsing of rename list to check against
the current name of a table at each rename stage.
Also change do_rename_temporary() to check against the current
state of temporary tables, not according to the state of start
of RENAME TABLE.
replicate_events_marked_for_skip=FILTER_ON_MASTER
[Note this is a cherry-pick from 10.2 branch.]
When events of a big transaction are binlogged offsetting over 2GB from
the beginning of the log the semisync master's dump thread
lost such events.
The events were skipped by the Dump thread that found their skipping
status erroneously.
The current fixes make sure the skipping status is computed correctly.
The test verifies them simulating the 2GB offset.
* don't use Env module in tests, use $ENV{xxx} instead
* collateral changes:
** $file in the error message was unset
** $file in the other error message was unset too :)
** source file arguments are conventionally upper-cased
** abort the test (die) on error, don't just echo/exit
Problem was that Binlog_checkpoint can happen at random times.
Fixed by not write binlog_checkpoint for the rpl_log test.
Other things:
- Removed not used variable "$keep_gtid_events"
- Added option for show_binlog_events to skip binlog_checkpoint
As reported in MDEV-11969 "there's no way to ditch knowledge" about some
domain that is no longer updated on a server. Besides being of annoyance to
clutter output in DBA console stale domains can prevent the slave
to connect the master as MDEV-12012 witnesses.
What domain is obsolete must be evaluated by the user (DBA) according
to whether the domain info is still relevant and will the domain ever
receive any update.
This patch introduces a method to discard obsolete gtid domains from
the server binlog state. The removal requires no event group from such
domain present in existing binlog files though. If there are any the
containing logs must be first PURGEd in order for
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
succeed. Otherwise the command returns an error.
The list of obsolete domains can be computed through
intersecting two sets - the earliest (first) binlog's Gtid_list
and the current value of @@global.gtid_binlog_state - and extracting
the domain id components from the intersection list items.
The new DELETE_DOMAIN_ID featured FLUSH continues to rotate binlog
omitting the deleted domains from the active binlog file's Gtid_list.
Notice though when the command is ineffective - that none of requested to delete
domain exists in the binlog state - rotation does not occur.
Obsolete domain deletion is not harmful for connected slaves as long
as master side binlog files *purge* is synchronized with FLUSH-DELETE_DOMAIN_ID.
The slaves must have the last event from purged files processed as usual,
in order not to bump later into requesting a gtid from a file which
was already gone.
While the command is not replicated (as ordinary FLUSH BINLOG LOGS is)
slaves, even though having extra domains, won't suffer from reconnection errors
thanks to master-slave gtid connection protocol allowing the master
to be ignorant about a gtid domain.
Should at failover such slave to be promoted into master role it may run
the ex-master's
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
to clean its own binlog state.
NOTES.
suite/perfschema/r/start_server_low_digest.result
is re-recorded as consequence of internal parser codes changes.
Implement a special Copy_field method for timestamps, that copies
timestamps without converting them to MYSQL_TIME (the conversion
is lossy around DST change dates).
CREATE/DROP TEMPORARY TABLE are not safe to optimistically replicate in
parallel with other transactions, so they need to be marked as "ddl" in the
binlog.
This was already done for stand-alone CREATE/DROP TEMPORARY. But temporary
tables can also be created and dropped inside a BEGIN...END transaction, and
such transactions were not marked as ddl. Nor was the DROP TEMPORARY TABLE
statement emitted implicitly when a client connection is closed.
So this patch adds such ddl mark for the missing cases.
The difference to Kristian's original patch is mainly a fix in
mysql_trans_commit_alter_copy_data() to remember the unsafe_rollback_flags
over the temporary commit.
Problem was that in a circular replication setup the master remembers
position to events it has generated itself when reading from a slave.
If there are no new events in the queue from the slave, a
Gtid_list_log_event is generated to remember the last skipped event.
The problem happens if there is a network delay and we generate a
Gtid_list_log_event in the middle of the transaction, in which case there
will be an implicit comment and a new transaction with serverid=0 will be
logged.
The fix was to not generate any Gtid_list_log_events in the middle of a
transaction.
RPL_SEMI_SYNC_MASTER_CLIENTS=1
Analysis: Uninstalling rpl_semi_sync_slave on slave
will trigger removing the slave logic on Master which
will reduce Rpl_semi_sync_master_clients by one number.
But it happens asynchronously on Master. Having assert
to check this value with zero will have problems on
slow pb2 machines.
Fix: Change assert into wait_for_status_var condition.
Problem:-
This crash happens because logged stmt is quite big and while writing
Annotate_rows_log_event it throws EFBIG error but we ignore this error
and do not call cache_data->set_incident().
Solution:-
When we normally write Binlog_log_event we check for error EFBIG, but we did
do this for Annotate_rows_log_event. We check for this error and call
cache_data->set_incident() accordingly.
# Conflicts:
# sql/log.cc
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
Description:
============
If you have a relay log index file that has ended up with
some relay log files that do not exists, then RESET SLAVE
ALL is not enough to get back to a clean state.
Analysis:
=========
In the bug scenario slave server is in stopped state and
some of the relay logs got deleted but the relay log index
file is not updated.
During slave server restart replication initialization fails
as some of the required relay logs are missing. User
executes RESET SLAVE/RESET SLAVE ALL command to start a
clean slave. As per the documentation RESET SLAVE command
clears the master info and relay log info repositories,
deletes all the relay log files, and starts a new relay log
file. But in a scenario where the slave server's
Relay_log_info object is not initialized slave will not
purge the existing relay logs. Hence the index file still
remains in a bad state. Users will not be able to start
the slave unless these files are cleared.
Fix:
===
RESET SLAVE/RESET SLAVE ALL commands should do the cleanup
even in a scenario where Relay_log_info object
initialization failed.
Backported a flag named 'error_on_rli_init_info' which is
required to identify slave's Relay_log_info object
initialization failure. This flag exists in MySQL-5.6
onwards as part of BUG#14021292 fix.
During RESET SLAVE/RESET SLAVE ALL execution this flag
indicates the Relay_log_info initialization failure.
In such a case open the relay log index/relay log files
and do the required clean up.
The failure happens due to a race condition between processing
a row event (INSERT) and an automatically generated event
DROP TEMPORARY TABLE. Even though DROP has a higher GTID, it can
become visible in @@gtid_slave_pos before the row event with
a lower GTID has been committed. Since the test makes the slave
to synchronize with the master using GTID, the waiting stops
as soon as GTID of the DROP TEMPORARY TABLE becomes visible,
and if changes from the previous event haven't been applied yet,
the error occurs.
According to Kristian (see the comment to MDEV-10631), the real
problem is that DROP TEMPORARY TABLE is logged in the row mode
at all. For this particular test, since DROP does not do anything,
nothing prevents it from competing with the prior transaction.
The workaround for the test is to add a meaningful event
after DROP TEMPORARY TABLE, so that the slave would wait on its
GTID instead of the one from DROP.
Additionally (unrelated to this problem) removed FLUSH TABLES,
which, as the comment stated, should have been removed after
MDEV-6403 was fixed.
On a slow builder, a delay between binlog events on master could
occur, which would cause a heartbeat which is not expected by the
test. The solution is to monitor the timing of binlog events
on the master and only perform the heartbeat check if no critical
delays have happened.
Additionally, an unused variable was removed (this change is
unrelated to the bugfix).
* Remove duplicate lines from tests
* Use thd instead of current_thd
* Remove extra wsrep_binlog_format_names
* Correctly merge union patch from 5.5 wrt duplicate rows.
* Correctly merge SELinux changes into 10.1
contains a bad and a good copy
Clean up the InnoDB doublewrite buffer code.
buf_dblwr_init_or_load_pages(): Do not add empty pages to the buffer.
buf_dblwr_process(): Do consider changes to pages that are all zero.
Do not abort when finding a corrupted copy of a page in the doublewrite
buffer, because there could be multiple copies in the doublewrite buffer,
and only one of them needs to be good.