using TPC-B):
Problem: A RBR event can contain incomplete row data (only key value and
fields which have been changed). In that case, when the row is unpacked
into record and written to a table, the missing fields get incorrect NULL
values leading to master-slave inconsistency.
Solution: Use values found in slave's table for columns which are not given
in the rows event. The code for writing a single row uses the following
algorithm:
1. unpack row_data into table->record[0],
2. try to insert record,
3. if duplicate record found, fetch it into table->record[0],
4. unpack row_data into table->record[0],
5. write table->record[0] into the table.
Where row_data is the row as stored in the data area of a rows event.
Thus:
a) unpacking of row_data happens at the time when row is written into
a table,
b) when unpacking (in step 4), only columns present in row_data are
overwritten - all other columns remain as they were found in the table.
Since all data needed for the above algorithm is stored inside
Rows_log_event class, functions which locate and write rows are turned
into methods of that class.
replace_record() -> Rows_log_event::write_row()
find_and_fetch_row() -> Rows_log_event::find_row()
Both methods take row data from event's data buffer - the row being
processed is pointed by m_curr_row. They unpack the data as needed into
table's record buffers record[0] or record[1]. When row is unpacked,
m_curr_row_end is set to point at next row in the data buffer.
Other changes introduced in this changeset:
- Change signature of unpack_row(): don't report errors and don't
setup table's rw_set here. Errors can happen only when setting default
values in prepare_record() function and are detected there.
- In Rows_log_event and derived classes, don't pass arguments to
the execution primitives (do_...() member functions) but use class
members instead.
- Move old row handling code into log_event_old.cc to be used by
*_rows_log_event_old classes.
Also, a new test rpl_ndb_2other is added which tests basic replication
from master using ndb tables to slave storing the same tables using
(possibly) different engine (myisam,innodb).
Test is based on existing tests rpl_ndb_2myisam and rpl_ndb_2innodb.
However, these tests doesn't work for various reasons and currently are
disabled (see BUG#19227).
The new test differs from the ones it is based on as follows:
1. Single test tests replication with different storage engines on slave
(myisam, innodb, ndb).
2. Include file extra/rpl_tests/rpl_ndb_2multi_eng.test containing
original tests is replaced by extra/rpl_tests/rpl_ndb_2multi_basic.test
which doesn't contain tests using partitioned tables as these don't work
currently. Instead, it tests replication to a slave which has more or
less columns than master.
3. Include file include/rpl_multi_engine3.inc is replaced with
include/rpl_multi_engine2.inc. The later differs by performing slightly
different operations (updating more than one row in the table) and
clearing table with "TRUNCATE TABLE" statement instead of "DELETE FROM"
as replication of "DELETE" doesn't work well in this setting.
4. Slave must use option --log-slave-updates=0 as otherwise execution of
replication events generated by ndb fails if table uses a different
storage engine on slave (see BUG#29569).
Fixed failing func_misc test for embedded server
Added casts to avoid compiler warnings
Removed Table_locks_immediate as it's depending on log file cacheing
Changed type of get_time() to avoid warnings
Removed testing if purger master logs succeded as this is not deterministic
Faster thr_alarm()
Added 'Opened_files' status variable to track calls to my_open()
Don't give warnings when running mysql_install_db
Added option --source-install to mysql_install_db
I had to do the following renames() as used polymorphism didn't work with Forte compiler on 64 bit systems
index_read() -> index_read_map()
index_read_idx() -> index_read_idx_map()
index_read_last() -> index_read_last_map()
restores from mysqlbinlog out
Problem: using "mysqlbinlog | mysql" for recoveries the connection_id()
result may differ from what was used when issuing the statement.
Fix: if there is a connection_id() in a statement, write to binlog
SET pseudo_thread_id= XXX; before it and use the value later on.
--long-query-time is now given in seconds with microseconds as decimals
--min_examined_row_limit added for slow query log
long_query_time user variable is now double with 6 decimals
Added functions to get time in microseconds
Added faster time() functions for system that has gethrtime() (Solaris)
We now do less time() calls.
Added field->in_read_set() and field->in_write_set() for easier field manipulation by handlers
set_var.cc and my_getopt() can now handle DOUBLE variables.
All time() calls changed to my_time()
my_time() now does retry's if time() call fails.
Added debug function for stopping in mysql_admin_table() when tables are locked
Some trivial function and struct variable renames to avoid merge errors.
Fixed compiler warnings
Initialization of some time variables on windows moved to my_init()
This patch adds the ability to store extra field metadata in the table
map event. This data can include pack_length() or field_lenght() for
fields such as CHAR or VARCHAR enabling developers to add code that
can check for compatibilty between master and slave columns. More
importantly, the extra field metadata can be used to store data from the
master correctly should a VARCHAR field on the master be <= 255 bytes
while the same field on the slave is > 255 bytes.
The patch also includes the needed changes to unpack to ensure that data
which is smaller on the master can be unpacked correctly on the slave.
WL#3915 : (NDB) master's cols > slave
Slave starts accepting and handling rows of master's tables which have more columns.
The most important part of implementation is how to caclulate the amount of bytes to
skip for unknown by slave column.
Actually, this testcase will fail generally on all testing platforms.
The bugs come from the inconsistent bitmap between rpl master and slave.
In log_event.cc, the n_bits of m_cols and m_cols_ai are intialized with octal-ceiling
m_width, in fact, their n_bits should be equal to m_width.
Wrong n_bits will cause bitmap_bits_set() get incorrect value in unpack_row()
in rpl_record.cc,
then an assertion in unpack_row() will fail and crash sql thread.
DBUG_ASSERT(null_ptr == row_data + master_null_byte_count);
Meanwhile, because of binlog_prepare_pending_rows_event() changed with correct
m_cols, some results of specific testcases should be updated:
binlog_multi_engine.test
ndb_binlog_multi.test
rpl_ndb_dd_partitions.test
rpl_ndb_log.test
rpl_truncate_7ndb.test
rpl_truncate_7ndb_2.test
In addition, to ensure rows replication correct between master and slave after the patch,
two 'select * from t1' are added in extra/rpl_tests/rpl_log.test, and some testcases include
rpl_log.test, therefore, the results of these testcases should be updated likewise:
rpl_stm_log.test
rpl_row_log.test
rpl_ndb_log.test
rpl_row_log_innodb.test
Totally, results of nine testcases are updated.
Sometimes the number of really updated rows (with changed
column values) cannot be determined at the server level
alone (e.g. if the storage engine does not return enough
column values to verify that). So the only dependable way
in such cases is to let the storage engine return that
information if possible.
Fixed the bug at server level by providing a way for the
storage engine to return information about wether it
actually updated the row or the old and the new column
values are the same. It can do that by returning
HA_ERR_RECORD_IS_THE_SAME in ha_update_row().
Note that each storage engine may choose not to try to
return this status code, so this behaviour remains
storage engine specific.
Occasionally mysqlbinlog --hexdump failed with error:
ERROR 1064 (42000) at line ...: You have an error in your
SQL syntax; check the manual that corresponds to your MySQL
server version for the right syntax to use near
'Query thread_id=... exec_time=... error_code=...
When the length of hexadecimal dump of binlog header was
divisible by 16, commentary sign '#' after header was lost.
The Log_event::print_header function has been modified to always
finish hexadecimal binlog header with "\n# ".
slave_sql thread calls thd->clear_error() to force error to be ignored,
though this method didn't clear thd->killed state, what causes
slave_sql thread to stop.
clear thd->killed state if we ignore an error
Adding new fields Last_{IO,SQL}_Errno and Last_{IO,SQL}_Error to output
of SHOW SLAVE STATUS to hold errors from I/O and SQL thread respectively.
Old fields Last_Error and Last_Errno are aliases for Last_SQL_Error and
Last_SQL_Errno respectively.
Fields are added last to output of SHOW SLAVE STATUS to allow old applications
to use the same positional arguments into the row, while allowing new
application to benefit from the added information.
In addition, some new error codes are added (especially for the I/O
thread) to be able to provide sensible error message.
SQL_SLAVE_SKIP_COUNTER is possible):
By setting the SQL_SLAVE_SKIP_COUNTER it was possible to start the
from the middle of a group. This patch adds code so that events that
do not end a statement are ignored instead of skip counted when the
slave skip counter is 1.
Problem: there is an ASSERT() in the Log_event::read_log_event() checking the integrity
of the event's data that may fail.
Fix: move the assert's condition to an explicit check.
The bug in that slave version of a table with unique field still was
able to execute INSERT query as replace whereas it's impossible on master.
The reason of this artifact is wrong usage of ndb->extra:s.
Fixed with resetting flags at do_after.
There is open issue with symmetrical resetting
table->file->extra(HA_EXTRA_NO_IGNORE_NO_KEY)
which i had to hand to bug#27077.
The test for the current bug was committed in a cset for bug#27320.
The reason for the bug was that replaying of a query on slave could not be possible since its event
was recorded with the killed error. Due to the specific of handling INSERT, which per-row-while-loop is
unbreakable to killing, the query on transactional table should have not appeared in binlog unless
there was a call to a stored routine that got interrupted with killing (and then there must be an error
returned out of the loop).
The offered solution added the following rule for binlogging of INSERT that accounts the above
specifics:
For INSERT on transactional-table if the error was not set the only raised flag
is harmless and is ignored via masking out on time of creation of binlog event.
For both table types the combination of raised error and KILLED flag indicates that there
was potentially partial execution on master and consistency is under the question.
In that case the code continues to binlog an event with an appropriate killed error.
The fix relies on the specified behaviour of stored routine that must propagate the error
to the top level query handling if the thd->killed flag was raised in the routine execution.
The patch adds an arg with the default killed-status-unset value to Query_log_event::Query_log_event.
The following type conversions was done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
Following function parameter changes was done:
- All string functions in mysys/strings was changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocoll functions changed to use size_t instead of uint
- Functions that used a pointer to a string length was changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrom(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of not needed casts
- Added a few new cast required by other changes
- Added some cast to my_multi_malloc() arguments for safety (as string lengths
needs to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitely as this conflict was often hided by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c