replication):
Incremental patch to enable idempotency support for update events again.
The final handling of errors will be done in BUG#31609, and until then
the handling of errors should be consistent between the different types
of changes.
When replicating an update pair (before image, after image) under row-based
replication, and the before image is not found on the slave, the after image
was not discared, and was hence read as a before image for the next row.
Eventually, this lead to an after image being read outside the block of rows
in the event, causing an assertion to fire.
This patch fixes this by reading the after image in the event that the row
was not found on the slave, adds some extra debug assertion to catch future
errors earlier, and also adds a few non-debug checks to prevent reading
outside the block of the event.
is possible):
When skipping the beginning of a transaction starting with BEGIN, the OPTION_BEGIN
flag was not set correctly, which caused the slave to not recognize that it was
inside a group. This patch sets the OPTION_BEGIN flag for BEGIN, COMMIT, ROLLBACK,
and XID events. It also adds checks if inside a group before decreasing the
slave skip counter to zero.
Begin_query_log_event was not marked that it could not end a group, which is now
corrected.
The general log write function (general_log_print) uses printf style
arguments which need to be pre-processed, meaning that the all arguments
are copied to a single buffer and the problem is that the buffer size is
constant (1022 characters) but queries can be much larger then this.
The solution is to introduce a new log write function that accepts a
buffer and it's length as arguments. The function is to be used when
a formatted output is not required, which is the case for almost all
query write-to-log calls.
This is a incompatible change with respect to the log format of prepared
statements.
Report claims that Seconds_behind_master behaves unexpectedly.
Code analysis shows that there is an evident flaw in that treating of FormatDescription event is wrong
so that after FLUSH LOGS on slave the Seconds_behind_master's calculation slips and incorrect
value can be reported to SHOW SLAVE STATUS.
Even worse is that the gap between the correct and incorrect deltas grows with time.
Fixed with prohibiting changes to rpl->last_master_timestamp by artifical events (any kind of).
suggestion as comments is added how to fight with lack of info on the slave side by means of
new heartbeat feature coming.
The test can not be done ealily fully determistic.
This patch clarifies some of the coding choices with documentationa and
removes a limitation in the code for future expansion of the CHAR and
BINARY fields to length > 255.
In the ha_partition::position() we don't calculate the number
of the partition of the record, but use m_last_part value instead,
relying on that it's previously set by some other call like ::write_row().
Delete_rows_log_event::do_exec_row() calls find_and_fetch_row(),
where we used position() + rnd_pos() call for the InnoDB-based PARTITION-ed
table as there HA_PRIMARY_KEY_REQUIRED_FOR_POSITION enabled.
fixed by introducing new handler::rnd_pos_by_record() method to be
used for random record-based positioning
using TPC-B):
Problem: A RBR event can contain incomplete row data (only key value and
fields which have been changed). In that case, when the row is unpacked
into record and written to a table, the missing fields get incorrect NULL
values leading to master-slave inconsistency.
Solution: Use values found in slave's table for columns which are not given
in the rows event. The code for writing a single row uses the following
algorithm:
1. unpack row_data into table->record[0],
2. try to insert record,
3. if duplicate record found, fetch it into table->record[0],
4. unpack row_data into table->record[0],
5. write table->record[0] into the table.
Where row_data is the row as stored in the data area of a rows event.
Thus:
a) unpacking of row_data happens at the time when row is written into
a table,
b) when unpacking (in step 4), only columns present in row_data are
overwritten - all other columns remain as they were found in the table.
Since all data needed for the above algorithm is stored inside
Rows_log_event class, functions which locate and write rows are turned
into methods of that class.
replace_record() -> Rows_log_event::write_row()
find_and_fetch_row() -> Rows_log_event::find_row()
Both methods take row data from event's data buffer - the row being
processed is pointed by m_curr_row. They unpack the data as needed into
table's record buffers record[0] or record[1]. When row is unpacked,
m_curr_row_end is set to point at next row in the data buffer.
Other changes introduced in this changeset:
- Change signature of unpack_row(): don't report errors and don't
setup table's rw_set here. Errors can happen only when setting default
values in prepare_record() function and are detected there.
- In Rows_log_event and derived classes, don't pass arguments to
the execution primitives (do_...() member functions) but use class
members instead.
- Move old row handling code into log_event_old.cc to be used by
*_rows_log_event_old classes.
Also, a new test rpl_ndb_2other is added which tests basic replication
from master using ndb tables to slave storing the same tables using
(possibly) different engine (myisam,innodb).
Test is based on existing tests rpl_ndb_2myisam and rpl_ndb_2innodb.
However, these tests doesn't work for various reasons and currently are
disabled (see BUG#19227).
The new test differs from the ones it is based on as follows:
1. Single test tests replication with different storage engines on slave
(myisam, innodb, ndb).
2. Include file extra/rpl_tests/rpl_ndb_2multi_eng.test containing
original tests is replaced by extra/rpl_tests/rpl_ndb_2multi_basic.test
which doesn't contain tests using partitioned tables as these don't work
currently. Instead, it tests replication to a slave which has more or
less columns than master.
3. Include file include/rpl_multi_engine3.inc is replaced with
include/rpl_multi_engine2.inc. The later differs by performing slightly
different operations (updating more than one row in the table) and
clearing table with "TRUNCATE TABLE" statement instead of "DELETE FROM"
as replication of "DELETE" doesn't work well in this setting.
4. Slave must use option --log-slave-updates=0 as otherwise execution of
replication events generated by ndb fails if table uses a different
storage engine on slave (see BUG#29569).
Fixed failing func_misc test for embedded server
Added casts to avoid compiler warnings
Removed Table_locks_immediate as it's depending on log file cacheing
Changed type of get_time() to avoid warnings
Removed testing if purger master logs succeded as this is not deterministic
Faster thr_alarm()
Added 'Opened_files' status variable to track calls to my_open()
Don't give warnings when running mysql_install_db
Added option --source-install to mysql_install_db
I had to do the following renames() as used polymorphism didn't work with Forte compiler on 64 bit systems
index_read() -> index_read_map()
index_read_idx() -> index_read_idx_map()
index_read_last() -> index_read_last_map()
This patch adds functionality to row-based replication to ensure the
slave's column sizes are >= to that of the master.
It also includes some refactoring for the code from WL#3228.
restores from mysqlbinlog out
Problem: using "mysqlbinlog | mysql" for recoveries the connection_id()
result may differ from what was used when issuing the statement.
Fix: if there is a connection_id() in a statement, write to binlog
SET pseudo_thread_id= XXX; before it and use the value later on.
--long-query-time is now given in seconds with microseconds as decimals
--min_examined_row_limit added for slow query log
long_query_time user variable is now double with 6 decimals
Added functions to get time in microseconds
Added faster time() functions for system that has gethrtime() (Solaris)
We now do less time() calls.
Added field->in_read_set() and field->in_write_set() for easier field manipulation by handlers
set_var.cc and my_getopt() can now handle DOUBLE variables.
All time() calls changed to my_time()
my_time() now does retry's if time() call fails.
Added debug function for stopping in mysql_admin_table() when tables are locked
Some trivial function and struct variable renames to avoid merge errors.
Fixed compiler warnings
Initialization of some time variables on windows moved to my_init()