Problem:
=======
rpl_blackhole.test fails when executed with the following options:
mysqld=--binlog_annotate_row_events=1, mysqld=--replicate_annotate_row_events=1
Test output:
------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
rpl.rpl_blackhole_bug 'mix' [ pass ] 791
rpl.rpl_blackhole_bug 'row' [ fail ]
Replicate_Wild_Ignore_Table
Last_Errno 1032
Last_Error Could not execute Update_rows_v1 event on table test.t1; Can't find
record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's
master log master-bin.000001, end_log_pos 1510
Analysis:
=========
Enabling "replicate_annotate_row_events" on the slave tells the slave to write
the Annotate_rows events received from the master to its own binary log. The
received Annotate_rows events are applied after the Gtid event.
thd->query() will be set to the actual query received from the master, through
the Annotate_rows event. The Annotate_rows event should not be deleted after it
is applied, as thd->query() will be used to generate a new Annotate_rows event
while applying the subsequent Rows events. After the last Rows event has been
applied, the saved Annotate_rows event (if any) will be deleted.
In the blackhole engine all DML operations are no-ops, as the engine does not
store any data. They simply return success without doing any work. But the
existing code strictly expects thd->query() to be 'NULL' to identify that row
based replication is in use. This assumption fails when row annotations are
enabled, as the query is not 'NULL'. Hence various row based operations like
'update', 'delete' and 'index lookup' fail when row annotations are enabled.
Fix:
===
Extend the row based replication check to include row annotations as well.
That is, either thd->query() is NULL, or thd->query() points to a query and row
annotations are in use.
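The extended check can be sketched as a small predicate. This is only an
illustration under stated assumptions, not the server code: SessionSketch,
its two fields and is_row_based_replication() are hypothetical stand-ins for
the THD state described above.

```cpp
// Hypothetical stand-in for the session state (THD) that the check inspects.
struct SessionSketch {
    const char *query;            // what thd->query() would return; may be nullptr
    bool row_annotations_in_use;  // assumed flag: Annotate_rows events are being applied
};

// Returns true when the engine should treat the current operation as a
// row based replication event and simply report success.
constexpr bool is_row_based_replication(const SessionSketch &s) {
    // Old check: no query text means a Rows event is being applied.
    // Extended check: query text may be present, but only because an
    // Annotate_rows event supplied it.
    return s.query == nullptr || s.row_annotations_in_use;
}
```

With this shape, a Rows event that follows an Annotate_rows event is still
recognised even though the query text is no longer NULL.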
row_search_mvcc(): Duplicate the logic of btr_pcur_move_to_next()
so that an infinite loop can be avoided when advancing to the next
page fails due to a corrupted page.
At higher levels of innodb_force_recovery, the InnoDB transaction
subsystem will not be set up at all.
At slightly lower levels, recovered transactions will not be rolled back,
and DDL operations could hang due to locks being held.
Let us consistently refuse all writes if the predicate
high_level_read_only holds. We failed to refuse DROP TABLE
and DROP DATABASE. (Refusing DROP TABLE is a partial backport
from MDEV-19570 in the 10.5 branch.)
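The rule above can be pictured as a single gate in front of every operation.
The following is a minimal sketch, not InnoDB code: the Op enum, is_write()
and allow_operation() are hypothetical names, and how the predicate
high_level_read_only is computed is deliberately left out.

```cpp
// Hypothetical operation kinds; DropTable and DropDatabase are the two
// cases that, per the text above, were previously not refused.
enum class Op { Select, Insert, Update, Delete, DropTable, DropDatabase };

// Everything except a plain read counts as a write in this sketch.
constexpr bool is_write(Op op) { return op != Op::Select; }

// Refuse every write consistently while high_level_read_only holds.
constexpr bool allow_operation(Op op, bool high_level_read_only) {
    return !(high_level_read_only && is_write(op));
}
```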
- Don't apply redo log for the corrupted page when innodb_force_recovery > 0.
- Allow the table to be dropped when index root page is
corrupted when innodb_force_recovery > 0.
The simulation of a big-sized event in rpl.rpl_semi_sync_skip_repl did not clean
up after itself, which corrupted the last binlog event offset so that it could
jump backwards.
The test is refined to rotate the binlog file that contains the simulation and
to use the next one for the rest of the test logic, including master-slave
synchronization.
row_insert_for_mysql(): InnoDB sets values for row_start and row_end.
And this function used to return those values to server in
ha_innobase::write_row(). This buggy behavior was removed. Also,
a piece of code in this function was reformatted.
upd_node_t::make_versioned_helper(): Assert that the preallocated size
of the update vector is not exceeded.
The bug was introduced in MariaDB 10.4.0 by
commit 0e5a4ac253
but it is good to have a regression test for this scenario
in all applicable MariaDB versions.
Cover the purge of an undo log record that was written before
the completion of ADD SPATIAL INDEX.
Problem:
========
We have a Master/Master setup on two servers, but are only writing to one of
those servers (so it is essentially Master/Slave). We upgraded from 10.1.* to
10.2.22 last week and, starting with the upgrade, we are getting duplicate key
errors on the slave. binlog_format is MIXED.
Analysis:
=========
This issue happens with the combination of LOCK TABLES and binlog_format=MIXED.
When an UNSAFE statement is encountered in 'MIXED' mode, it is logged in 'ROW'
format. For all the tables that are part of the LOCK TABLES list, their table
maps are written into the binary log. For each table in the list a check is
done to see if the 'check_table_binlog_row_based_done' flag is set. If it is
not set, a check is initiated to see if the table qualifies for row based binary
logging, and 'check_table_binlog_row_based_done' is set. This flag will be
cleared at the time of closing thread tables.
But there can be special cases where LOCK TABLES lists more tables than the
unsafe query actually uses, that is, the query uses only a subset of the
LOCK TABLES list.
For example: LOCK TABLES locks t1,t2,t3 but the unsafe statement makes use of
only two tables, t1 and t3. In this case the 'check_table_binlog_row_based_done'
flag is enabled for table 't2' while writing the table map, but the
'close_thread_tables' function call will not reset this flag. Since the flag is
not cleared for table 't2', even a safe statement which uses t2 will be logged
in row based format.
This leads to an assert on debug builds and causes duplicate entries in release
builds. In release builds a statement is logged in both ROW and STATEMENT
format. This causes the slave to fail with a duplicate key error.
Fix:
===
During 'close_thread_tables', when LOCK TABLES mode is active, "ha_reset" is done
for all the tables which were part of the current statement. In the example
above, 'ha_reset' is called for tables 't1' and 't3'. This will clear the
'check_table_binlog_row_based_done' flag. At this point add a check for the rest
of the tables to see if 'check_table_binlog_row_based_done' is enabled.
If it is enabled, clear the flag.
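The t1,t2,t3 example can be replayed with a small model. This is a sketch
under stated assumptions rather than the server code: TableSketch,
close_thread_tables_sketch() and t2_flag_survives() are hypothetical names,
and only the flag named in the text is modelled.

```cpp
#include <cstddef>

// Hypothetical stand-in for one table held under LOCK TABLES.
struct TableSketch {
    bool used_by_statement;                  // table took part in the current statement
    bool check_table_binlog_row_based_done;  // flag named in the text above
};

// Models the end-of-statement cleanup while LOCK TABLES is active.
constexpr void close_thread_tables_sketch(TableSketch *tables, std::size_t n,
                                          bool with_fix) {
    for (std::size_t i = 0; i < n; ++i) {
        if (tables[i].used_by_statement) {
            // ha_reset() analogue: clears the flag for statement tables.
            tables[i].check_table_binlog_row_based_done = false;
            tables[i].used_by_statement = false;
        } else if (with_fix) {
            // The fix: also clear the flag on locked tables that the
            // statement did not use.
            tables[i].check_table_binlog_row_based_done = false;
        }
    }
}

// Replays the example: t1,t2,t3 are locked, the unsafe statement uses t1 and
// t3, and table maps (hence the flag) were written for all three.
// Returns the state of t2's flag after cleanup.
constexpr bool t2_flag_survives(bool with_fix) {
    TableSketch locked[3] = {
        {true, true},   // t1
        {false, true},  // t2: locked but not used by the statement
        {true, true},   // t3
    };
    close_thread_tables_sketch(locked, 3, with_fix);
    return locked[1].check_table_binlog_row_based_done;
}
```

Without the extra branch the flag on t2 stays set, which is the state that
makes a later safe statement on t2 get logged in row based format.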
Problem:
=======
When the rpl.rpl_row_mysqlbinlog test is executed as shown below, it fails with
a result content mismatch.
perl mtr rpl_row_mysqlbinlog --mysqld=--binlog-annotate-row-events=1
Analysis:
=========
When row annotations are enabled, the actual query is written into the binlog,
which helps users to understand the query even when row based replication is
enabled.
For example, a simple insert in row based replication looks as shown below.
#190402 16:31:27 server id 1 end_log_pos 526 Annotate_rows:
#Q> insert into t values (10)
#190402 16:31:27 server id 1 end_log_pos 566 Table_map: `test`.`t` mapped to number 19
# at 566
#190402 16:31:27 server id 1 end_log_pos 600 Write_rows: table id 19 flags: STMT_END_F
BINLOG '
B0GjXBMBAAAAKAAAADYCAAAAABMAAAAAAAEABHRlc3QAAXQAAQMAAQ==
B0GjXBcBAAAAIgAAAFgCAAAAABMAAAAAAAEAAf/+CgAAAA==
'/*!*/;
# at 600
The test creates some binary log events and redirects them into a SQL file.
It then executes RESET MASTER, sources the SQL file back on the clean master and
verifies that the data is available. Please refer to the following steps.
../client/mysqlbinlog ./var/mysqld.1/data/master-bin.000001 > test.sql
../client/mysql -uroot -S./var/tmp/mysqld.1.sock -Dtest < test.sql
../client/mysqlbinlog ./var/mysqld.1/data/master-bin.000001 -v > row.sql
When the row based replication specific SQL file is sourced once again on the
master, the newly generated binlog will treat the entire "BASE 64" encoded event
as the query and write it into the binary log.
Output from 'row.sql':
#Q> BINLOG '
#Q> B0GjXBMBAAAAKAAAADYCAAAAABMAAAAAAAEABHRlc3QAAXQAAQMAAQ==
#Q> B0GjXBcBAAAAIgAAAFgCAAAAABMAAAAAAAEAAf/+CgAAAA==
#190402 16:31:27 server id 1 end_log_pos 657 Table_map: `test`.`t` mapped to number 23
# at 657
#190402 16:31:27 server id 1 end_log_pos 691 Write_rows: table id 23 flags: STMT_END_F
BINLOG '
B0GjXBMBAAAAKAAAAJECAAAAABcAAAAAAAEABHRlc3QAAXQAAQMAAQ==
B0GjXBcBAAAAIgAAALMCAAAAABcAAAAAAAEAAQH+CgAAAA==
### INSERT INTO `test`.`t`
### SET
### @1=10
'/*!*/;
# at 691
This is expected behaviour, as we cannot extract the query from BASE 64 encoded
input. It causes more binary logs to be generated when the test is executed
with row annotations.
The following lines from the test assume that two binary logs will contain the
entire data.
--echo --- Test 4 Second Remote test --
---exec $MYSQL_BINLOG --read-from-remote-server --user=root --host=127.0.0.1
--port=$MASTER_MYPORT master-bin.000001 > $MYSQLTEST_VARDIR/tmp/remote.sql
---exec $MYSQL_BINLOG --read-from-remote-server --user=root --host=127.0.0.1
--port=$MASTER_MYPORT master-bin.000002 >> $MYSQLTEST_VARDIR/tmp/remote.sql
When row annotations are enabled, the data gets spread across four binary logs.
As the test uses only the first two binary log files, the data available in the
other binary logs is missed. Hence the test fails with a result content
mismatch, as less data is available.
Fix:
====
Use the "--to-last-log" option of the "mysqlbinlog" tool, which ensures that the
contents of all available binary logs are included in the .sql file.
Try to fix the race conditions between
SET GLOBAL innodb_ft_aux_table = ...;
and access to the INFORMATION_SCHEMA tables that depend on
this variable.
innodb_ft_aux_table: Replaces
fts_internal_tbl_name,fts_internal_tbl_name2. Just store the
user-specified parameter as is.
innodb_ft_aux_table_id: The table_id corresponding to
SET GLOBAL innodb_ft_aux_table, or 0 if the table does not exist
or does not contain FULLTEXT INDEX. If the table is renamed later,
the INFORMATION_SCHEMA tables will continue to refer to the table.
If the table is dropped or rebuilt, the INFORMATION_SCHEMA tables
will not find the table.
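The rename, drop and rebuild behaviour described above follows from storing a
table ID instead of a name. The toy model below illustrates that idea only.
DictSketch and all of its members are hypothetical names, and modelling a
rebuild as "the table gets a new ID" is an assumption of this sketch, not a
description of the InnoDB data dictionary.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using table_id_t = std::uint64_t;

struct TableInfo {
    table_id_t id;
    bool has_fulltext_index;
};

// Toy data dictionary keyed by table name. IDs start at 1 and are never reused.
class DictSketch {
public:
    table_id_t create(const std::string &name, bool has_fulltext) {
        tables_[name] = TableInfo{next_id_, has_fulltext};
        return next_id_++;
    }
    void rename(const std::string &from, const std::string &to) {
        assert(tables_.find(to) == tables_.end());  // sketch precondition: target name is free
        auto it = tables_.find(from);
        if (it == tables_.end()) return;
        TableInfo info = it->second;  // the ID travels with the table
        tables_.erase(it);
        tables_[to] = info;
    }
    void drop(const std::string &name) { tables_.erase(name); }
    // Assumption of this sketch: a rebuild gives the table a fresh ID.
    void rebuild(const std::string &name) {
        auto it = tables_.find(name);
        if (it != tables_.end()) it->second.id = next_id_++;
    }
    // Analogue of resolving SET GLOBAL innodb_ft_aux_table: the table ID,
    // or 0 if the table does not exist or has no FULLTEXT INDEX.
    table_id_t resolve_aux_table_id(const std::string &name) const {
        auto it = tables_.find(name);
        if (it == tables_.end() || !it->second.has_fulltext_index) return 0;
        return it->second.id;
    }
    // Analogue of an INFORMATION_SCHEMA lookup by the stored ID.
    bool exists(table_id_t id) const {
        if (id == 0) return false;
        for (const auto &entry : tables_)
            if (entry.second.id == id) return true;
        return false;
    }
private:
    std::map<std::string, TableInfo> tables_;
    table_id_t next_id_ = 1;
};
```

A lookup by the stored ID keeps working after rename() and stops working
after drop() or rebuild(), which matches the three cases listed above.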
row_purge_upd_exist_or_extern_func(): Check for node->vcol_op_failed()
after row_purge_remove_sec_if_poss(), like row_purge_del_mark() did.
This avoids us dereferencing the node->table=NULL pointer.
The test case, submitted by Elena Stepanova, is not deterministic and
does not repeat the bug on 10.2. With the added loop, for me, it reliably
crashes 10.3 without the fix. I was unable to create a deterministic
test case for either 10.2 or 10.3.
Reviewed by Thirunarayanan Balathandayuthapani
Avoid accessing the table cache while the ALTER TABLE statement
is blocked by DEBUG_SYNC. Use explicit COMMIT for forcing the
redo log flush (whose main purpose is to ensure that the
incomplete state of the blocked ALTER TABLE statement is persisted).
Ensure that the 'auxiliary transactions' that are there for
flushing the incomplete undo log of the to-be-recovered DDL
transactions are actually making modifications.
This is a backport of 2fe40a7af0
from MariaDB 10.4.