Problem:
========
180511 11:07:58 [ERROR] Slave I/O: Unexpected master's heartbeat data:
heartbeat is not compatible with local info;the event's data: log_file_name
mysql-bin.000009 log_pos 1054262041, Error_code: 1623
Analysis:
=========
In a replication setup, when the master server doesn't have any events to
send to the slave server it sends a 'Heartbeat_log_event'. This event
carries the current binary log filename and offset details. The offset
value is stored in 4 bytes of the event header. When the size of the
binary log exceeds UINT32_MAX, the log_pos value no longer fits in 4
bytes of memory; it overflows and hence the slave stops with an error.
Fix:
===
Since we cannot extend the common_header of the Log_event class, a
Log_event::log_pos value greater than 4GB is transported in a Heartbeat
event's sub-header. Log_event::log_pos is set to zero in that case to
indicate that the 8-byte sub-header is allocated in the event.
In case of cross-version replication the following behaviour is expected:
OLD - server without the fix
NEW - server with the fix
OLD<->NEW : works bidirectionally as long as the binlog offset is
(normally) within 4GB.
When log_pos > UINT32_MAX:
OLD->NEW : 'log_pos' is bound to overflow, and the NEW slave may report
an invalid event/incompatible heartbeat event error.
NEW->OLD : since the patched server sets log_pos=0 on overflow, the OLD
slave will report an invalid event error.
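A minimal sketch of the scheme described above, with a hypothetical
helper function (not the actual server code; byte order is glossed over):

  #include <cstdint>
  #include <cstring>
  #include <vector>

  // The common header keeps its fixed 4-byte log_pos slot. When the real
  // offset exceeds UINT32_MAX, that slot is set to 0 as a sentinel and the
  // full 8-byte offset travels in the heartbeat-specific sub-header.
  std::vector<unsigned char> encode_heartbeat_pos(uint64_t log_pos)
  {
    std::vector<unsigned char> buf;
    uint32_t header_pos= log_pos > UINT32_MAX ? 0 : (uint32_t) log_pos;
    unsigned char tmp[8];
    memcpy(tmp, &header_pos, 4);            // 4-byte common-header slot
    buf.insert(buf.end(), tmp, tmp + 4);
    if (header_pos == 0 && log_pos != 0)    // overflow: 8-byte sub-header
    {
      memcpy(tmp, &log_pos, 8);
      buf.insert(buf.end(), tmp, tmp + 8);
    }
    return buf;
  }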
(This commit is exclusively for 10.2 branch. Do not merge it to 10.3)
In case of a non-STMT_END-marked Rows-log-event (A) followed by a
STMT_END-marked one (B), mysqlbinlog mixes up the base64-encoded rows
events with their pseudo-SQL representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected
with an error. Examples of BINLOG statements malformed this way could be
found in binlog_row_annotate.result, which gets corrected with the patch.
The issue is fixed by introducing an auxiliary IO_CACHE that holds the
verbose comments until the terminal STMT_END event is found. The new
cache is emptied out after the two pre-existing ones are done at that
point.
The correctly produced output for the above case now looks as follows:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
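A stand-in sketch of the buffering idea, using std::ostringstream in
place of the server's IO_CACHE (names are illustrative):

  #include <iostream>
  #include <sstream>
  #include <string>

  // Verbose "###" comments are diverted into an auxiliary buffer and are
  // flushed only after the BINLOG '...' statement has been closed by the
  // STMT_END-marked event.
  struct Rows_event_printer
  {
    std::ostringstream body;     // base64 payload of the BINLOG statement
    std::ostringstream verbose;  // deferred "###" comment lines

    void print_event(const std::string &base64, const std::string &comments,
                     bool stmt_end)
    {
      body << base64 << "\n";
      verbose << comments;       // held back instead of being interleaved
      if (stmt_end)
      {
        std::cout << "BINLOG '\n" << body.str() << "'/*!*/;\n"
                  << verbose.str();
        body.str(""); verbose.str("");    // both caches emptied out
      }
    }
  };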
Thanks to Alexey Midenkov for recognizing the problem and making an
attempt to tackle it, to Venkatesh Duggirala who produced a patch for
the upstream whose idea is exploited here, and to MDEV-23077 reporter
LukeXwang who also contributed a piece of a patch aiming at this issue.
Problem:
=======
P1) Conditional jump or move depends on uninitialised value(s)
sql_ex_info::init(char const*, char const*, bool) (log_event.cc:3083)
Code: none of the variables in the following expression are initialized.
----
return ((cached_new_format != -1) ? cached_new_format :
(cached_new_format=(field_term_len > 1 || enclosed_len > 1 ||
line_term_len > 1 || line_start_len > 1 || escaped_len > 1)));
P2) Conditional jump or move depends on uninitialised value(s)
Rows_log_event::Rows_log_event(char const*, unsigned
int, Format_description_log_event const*) (log_event.cc:9571)
Code: an uninitialized value is reported for the 'var_header_len' variable.
----
if (var_header_len < 2 || event_len < static_cast<unsigned
int>(var_header_len + (post_start - buf)))
P3) Conditional jump or move depends on uninitialised value(s)
Table_map_log_event::pack_info(Protocol*) (log_event.cc:11553)
Code: 'm_table_id' is uninitialized.
----
void Table_map_log_event::pack_info(Protocol *protocol)
...
size_t bytes= my_snprintf(buf, sizeof(buf), "table_id: %lu (%s.%s)",
m_table_id, m_dbnam, m_tblnam);
Fix:
===
P1 - Fix)
Initialize the cached_new_format, field_term_len, enclosed_len,
line_term_len, line_start_len and escaped_len members in the default
constructor.
P2 - Fix)
"var_header_len" is initialized by reading the event buffer. In case of
an invalid event the buffer will contain invalid data, hence a check was
added to validate the event data: if event_len is smaller than the valid
header length, return immediately.
P3 - Fix)
'm_table_id' within Table_map_log_event is initialized by reading data
from the event buffer. Use the 'VALIDATE_BYTES_READ' macro to validate
the current state of the buffer; if it is invalid, return immediately.
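An illustrative bounds check in the spirit of these fixes (the offsets
and names are made up for the sketch; the real code uses the
VALIDATE_BYTES_READ macro in log_event.cc):

  #include <cstdint>
  #include <cstring>

  // Read a 2-byte field only if it lies entirely inside the event image;
  // otherwise report failure so the caller can mark the event invalid.
  static bool read_u16_checked(const uint8_t *buf, size_t event_len,
                               size_t offset, uint16_t *out)
  {
    if (offset + 2 > event_len)          // would run past the event buffer
      return false;
    memcpy(out, buf + offset, 2);        // byte order ignored in this sketch
    return true;
  }

  static bool parse_rows_event(const uint8_t *buf, size_t event_len)
  {
    const size_t var_header_offset= 9;   // hypothetical position
    uint16_t var_header_len;
    if (!read_u16_checked(buf, event_len, var_header_offset, &var_header_len))
      return false;                      // invalid event: return immediately
    if (var_header_len < 2 ||
        event_len < var_header_offset + var_header_len)
      return false;
    return true;
  }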
This PR contains an mtr test reproducing a failure when replicating a
CREATE TABLE ... SELECT (CTAS) statement through asynchronous MariaDB
replication to a MariaDB Galera cluster.
The problem happens when the CTAS replication contains both the create
table statement and subsequent row events populating the table. In such
a situation, the Galera node operating as MariaDB replication slave will
first replicate only the create table part into the cluster, and then
perform another replication containing both the create table and the row
events. This leads all other nodes to fail on the duplicate table
creation attempt, and to crash due to this failure.
The PR also contains a fix, which identifies the situation when a CTAS
has been replicated and scans further in the async replication stream to
see whether row events follow. The slave node replicates either a single
TOI in case the CTAS table is empty, or, if the CTAS table contains
rows, a single bundled write set with the create table and the row
events.
This fix should keep the master server's GTIDs for CTAS replication in
sync with the GTIDs in the Galera cluster.
The problem was originally stated in
http://bugs.mysql.com/bug.php?id=82212
The size of a base64-encoded Rows_log_event exceeds its
vanilla byte representation by a factor of 4/3.
When a binlogged event size is about 1GB, mysqlbinlog generates
a BINLOG query that can't be sent out due to its size.
It is fixed by fragmenting the BINLOG argument C-string into
(approximate) halves when the base64-encoded event is over 1GB in size.
In such a case mysqlbinlog puts out
SET @binlog_fragment_0='base64-encoded-fragment_0';
SET @binlog_fragment_1='base64-encoded-fragment_1';
BINLOG @binlog_fragment_0, @binlog_fragment_1;
to represent a big BINLOG.
For prompt memory release, the BINLOG handler is made to reset the
BINLOG argument user variables in the middle of processing, as if
@binlog_fragment_{0,1} = NULL were assigned.
Notice that 2 fragments are enough, though the client and server may
still need to tweak their @@max_allowed_packet to accommodate the
fragment size (which they would have to do anyway with a greater number
of fragments, should that be desired).
On the lower level the following changes are made:
Log_event::print_base64()
still calls the encoder and stores the encoded data into a cache, but
now *without* doing any formatting. The latter is left for the time
when the cache is copied to an output file (e.g. the mysqlbinlog
output). The no-formatting behaviour is also reflected by the change in
the meaning of the last argument, which specifies whether to cache the
encoded data.
Rows_log_event::print_helper()
is made to invoke a specialized fragmenting cache-to-file copying
function,
copy_cache_to_file_wrapped()
which takes care of the fragmenting and optionally wraps the encoded
strings (fragments) into SQL stanzas.
my_b_copy_to_file()
is refactored into my_b_copy_all_to_file(). The former function is
generalized to accept a limit argument that constrains the copying, and
it no longer reinitializes the cache into reading mode. The limit has
no effect on a fully read cache.
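A sketch of the fragmenting itself (function name and threshold handling
are illustrative, not the real mysqlbinlog code):

  #include <iostream>
  #include <string>

  static void print_binlog_wrapped(std::ostream &out, const std::string &b64,
                                   size_t frag_threshold)
  {
    if (b64.size() <= frag_threshold)    // small event: plain BINLOG '...'
    {
      out << "BINLOG '\n" << b64 << "'/*!*/;\n";
      return;
    }
    size_t half= b64.size() / 2;         // split into approximate halves
    out << "SET @binlog_fragment_0='" << b64.substr(0, half) << "';\n"
        << "SET @binlog_fragment_1='" << b64.substr(half)    << "';\n"
        << "BINLOG @binlog_fragment_0, @binlog_fragment_1;\n";
  }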
Problem
-------
For a statement that produces multiple row events, Flashback didn't
reverse the sequence of row events inside the statement.
Solution
--------
Use a new array 'events_in_stmt' to store the row events of one
statement; when the last event is parsed, print them from the last one
to the first one.
At the same time, fixed another bug: without -vv the table_map is not
inserted into print_event_info->m_table_map, so
change_to_flashback_event() will not execute because the
Table_map_log_event is empty.
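A sketch of the 'events_in_stmt' buffering (types and names are
illustrative):

  #include <iostream>
  #include <string>
  #include <vector>

  struct Flashback_printer
  {
    std::vector<std::string> events_in_stmt;  // row events of one statement

    void handle_row_event(const std::string &ev, bool stmt_end)
    {
      events_in_stmt.push_back(ev);
      if (!stmt_end)
        return;                               // statement not finished yet
      // last event parsed: print from the last one to the first one
      for (auto it= events_in_stmt.rbegin(); it != events_in_stmt.rend(); ++it)
        std::cout << *it << "\n";
      events_in_stmt.clear();
    }
  };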
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong. Change some parameters to this type.
Use size_t in a few more places.
Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.
When applying the unary ~ operator to enum constants, explicitly
cast the result to an unsigned type, because enum constants can
be treated as signed.
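A small illustration of the pitfall: ~ promotes an enum constant to
(signed) int, so the mask is computed as a negative value unless the
constant is cast first:

  #include <cstdint>

  enum flag_bits { FLAG_A= 1, FLAG_B= 2 };

  static uint32_t clear_flag_a(uint32_t flags)
  {
    // casting before ~ keeps the mask computation unsigned end to end
    return flags & ~static_cast<uint32_t>(FLAG_A);
  }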
In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
==== Description ====
Flashback can roll back instances/databases/tables to an old snapshot.
It's implemented at the server level via full-image binary logs
(--binlog-row-image=FULL), so it supports all engines.
Currently, it's a feature inside the mysqlbinlog tool (with the
--flashback argument).
Because the flashback binlog events are stored in memory, you should
check that there is enough memory in your machine.
==== New Arguments to mysqlbinlog ====
--flashback (-B)
It makes mysqlbinlog work in FLASHBACK mode.
==== New Arguments to mysqld ====
--flashback
Set up the server to use flashback. This enables binary logging in row
mode and enables extra logging for the DDLs needed by the flashback
feature.
==== Example ====
Given a table "t" in database "test", we can compare the output with and without "--flashback".
#client/mysqlbinlog /data/mysqldata_10.0/binlog/mysql-bin.000001 -vv -d test -T t --start-datetime="2013-03-27 14:54:00" > /tmp/1.sql
#client/mysqlbinlog /data/mysqldata_10.0/binlog/mysql-bin.000001 -vv -d test -T t --start-datetime="2013-03-27 14:54:00" -B > /tmp/2.sql
Then, importing the output flashback file (/tmp/2.sql) flashes your
database/table back to the specified time (--start-datetime).
And if you know the exact position, "--start-position" also works:
mysqlbinlog will output the flashback logs that can flash back to the
"--start-position" position.
==== Implement ====
1. As we know, if binlog_format is ROW (binlog-row-image=FULL in 10.1 and later), all column values are stored in the row event, so we can get the data from before the mis-operation.
2. Just do following things:
2.1 Change the event type: INSERT->DELETE, DELETE->INSERT (see the sketch after this list).
For example:
INSERT INTO t VALUES (...) ---> DELETE FROM t WHERE ...
DELETE FROM t ... ---> INSERT INTO t VALUES (...)
2.2 For Update_Event, swap the SET part and the WHERE part.
For example:
UPDATE t SET cols1 = vals1 WHERE cols2 = vals2
--->
UPDATE t SET cols2 = vals2 WHERE cols1 = vals1
2.3 For Multi-Rows Event, reverse the rows sequence, from the last row to the first row.
For example:
DELETE FROM t WHERE id=1; DELETE FROM t WHERE id=2; ...; DELETE FROM t WHERE id=n;
--->
DELETE FROM t WHERE id=n; ...; DELETE FROM t WHERE id=2; DELETE FROM t WHERE id=1;
2.4 Output those events from the last one back to the first one at which the mis-operation happened.
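A sketch of the event-type swap from step 2.1 (the numeric constants are
stand-ins for the server's Log_event_type values):

  #include <cstdint>

  enum row_event_type : uint8_t
  { WRITE_ROWS= 30, UPDATE_ROWS= 31, DELETE_ROWS= 32 };

  static uint8_t flashback_event_type(uint8_t type)
  {
    switch (type)
    {
    case WRITE_ROWS:  return DELETE_ROWS;   // INSERT -> DELETE
    case DELETE_ROWS: return WRITE_ROWS;    // DELETE -> INSERT
    default:          return type;          // UPDATE keeps its type; only
    }                                       // its SET/WHERE images swap
  }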
Minor review comments/changes:
- A bunch of style-fixes.
- Change macros to static inline functions.
- Update check_event_type() with compressed event types.
- Small .result file update.
Add some event types for the compressed events; these are:
QUERY_COMPRESSED_EVENT,
WRITE_ROWS_COMPRESSED_EVENT_V1,
UPDATE_ROWS_COMPRESSED_EVENT_V1,
DELETE_ROWS_COMPRESSED_EVENT_V1,
WRITE_ROWS_COMPRESSED_EVENT,
UPDATE_ROWS_COMPRESSED_EVENT,
DELETE_ROWS_COMPRESSED_EVENT.
These events inherit from the corresponding uncompressed events. One of
their constructors and the write function have been overridden to do the
uncompressing and compressing; everything else is exactly the same.
On the slave, the I/O thread will uncompress and convert the events when
receiving them from the master, so the SQL and worker threads can stay
unchanged.
For now zlib is used as the compression algorithm; other algorithms may
be supported in the future.
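A minimal zlib round trip showing the compress-on-write /
uncompress-on-read idea (plain zlib API, not the server's event classes;
build with -lz):

  #include <zlib.h>
  #include <cstdio>
  #include <vector>

  int main()
  {
    const unsigned char src[]= "INSERT INTO t1 VALUES (1),(2),(3)";
    uLong src_len= sizeof(src);

    std::vector<unsigned char> packed(compressBound(src_len));
    uLongf packed_len= packed.size();        // worst-case output size
    if (compress2(packed.data(), &packed_len, src, src_len,
                  Z_DEFAULT_COMPRESSION) != Z_OK)
      return 1;

    std::vector<unsigned char> restored(src_len);
    uLongf restored_len= restored.size();    // original size must be known
    if (uncompress(restored.data(), &restored_len, packed.data(),
                   packed_len) != Z_OK)
      return 1;

    printf("%lu -> %lu -> %lu bytes\n", src_len, (uLong) packed_len,
           (uLong) restored_len);
    return 0;
  }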
MDEV-10134 Add full support for DEFAULT
- Added support for using tables with MySQL 5.7 virtual fields,
including MySQL 5.7 syntax
- Better error messages also for old cases
- CREATE ... SELECT now also updates timestamp columns
- Blob can now have default values
- Added new system variable "check_constraint_checks", to turn off
CHECK constraint checking if needed.
- Removed some engine independent tests in suite vcol to only test myisam
- Moved some tests from 'include' to 't'. Should some day be done for all tests.
- FRM version increased to 11 if one uses virtual fields or constraints
- Changed to use a bitmap to check if a field has got a value, instead of
setting HAS_EXPLICIT_VALUE bit in field flags
- Expressions can now be up to 65K in total
- Ensure we are not referring to uninitialized fields when handling virtual fields or defaults
- Changed check_vcol_func_processor() to return a bitmap of used types
- Had to change some functions that calculated cached value in fix_fields to do
this in val() or getdate() instead.
- store_now_in_TIME() now takes a THD argument
- fill_record() now updates default values
- Add a lookahead for NOT NULL, to be able to handle DEFAULT 1+1 NOT NULL
- Automatically generate a name for constraints that don't have a name
- Added support for ALTER TABLE DROP CONSTRAINT
- Ensure that partition functions register virtual fields used. This fixes
some bugs when using virtual fields in a partitioning function
changes in query execution plans.
Fixed by introducing table->rpl_write_set which holds which columns should
be stored in the binary log.
Other things:
- Removed some unneeded references to read_set and write_set, to make
code that really changes read_set and write_set easier to read
(in opt_range.cc)
- Added error handling of failed unpack_current_row()
- Added missing call to mark_columns_needed_for_insert() for DELAYED INSERT
- Removed the unused functions in_read_set() and in_write_set()
- In rpl_record.cc, removed the unused variable 'error'
MDEV-8685 MariaDB fails to decode Anonymous_GTID entries
MDEV-5705 Replication testing: 5.6->10.0
- Ignoring GTID events from MySQL 5.6+ (Allows replication from MySQL 5.6+ with GTID enabled)
- Added ignorable events from MySQL 5.6
- mysqlbinlog now writes information about GTID and ignorable events.
- Added more information in error message when replication stops because of wrong information in binary log.
- Fixed an incorrect test of when write_on_release() should flush the cache.
Introduce Log_event_writer() that encapsulates
writing data to an IO_CACHE with automatic checksum calculation.
Now all events properly checksum themselves as needed.
Use Log_event_writer in MYSQL_BIN_LOG::write_cache() instead
of copy-pasting its logic all over.
Later Log_event_writer will also do encryption.
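A stand-in sketch of the idea: one object owns the output and keeps a
running checksum over everything written, so callers never hand-roll
checksum logic (std::vector replaces IO_CACHE, and zlib's crc32()
replaces the server's checksum routine):

  #include <zlib.h>
  #include <cstdint>
  #include <vector>

  class Event_writer
  {
    std::vector<unsigned char> out;
    uint32_t crc= crc32(0L, Z_NULL, 0);      // zlib's initial CRC value

  public:
    void write(const unsigned char *data, size_t len)
    {
      crc= crc32(crc, data, (uInt) len);     // checksum kept automatically
      out.insert(out.end(), data, data + len);
    }
    uint32_t checksum() const { return crc; }
  };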
There are three Log_event::read_log_event() methods:
1. read the event image from IO_CACHE into String
2. create Log_event from the in-memory event image
3. read the event image from IO_CACHE and create Log_event
The 3rd was reading the event image into memory and invoking the 2nd to
create a Log_event. Now the 3rd also uses the 1st to read the event
image from the IO_CACHE into memory, instead of duplicating its
functionality.
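A sketch of the layering, with FILE* standing in for IO_CACHE and
trivial stub bodies (names are illustrative):

  #include <cstdio>
  #include <string>

  struct Log_event { std::string image; };

  // 1. read the raw event image from the stream into memory
  static bool read_event_image(FILE *f, std::string *image)
  {
    char buf[512];
    size_t n= fread(buf, 1, sizeof(buf), f);
    image->assign(buf, n);
    return n > 0;
  }

  // 2. create a Log_event from the in-memory event image
  static Log_event *event_from_image(const std::string &image)
  {
    return new Log_event{image};
  }

  // 3. now composes 1 and 2 instead of duplicating the reading logic
  static Log_event *read_log_event(FILE *f)
  {
    std::string image;
    if (!read_event_image(f, &image))
      return nullptr;
    return event_from_image(image);
  }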
Added a mandatory thd parameter to the constructor of Item (and of all
derived classes). Added a thd parameter to all routines that may create
items. Also removed "current_thd" from Item::Item. This reduced the
number of pthread_getspecific() calls from 290 to 177 per OLTP RO
transaction.
When writing rows with a minimal row image, it is possible to receive
empty events. In that case m_curr_row and m_rows_end are the same;
however, the event implies an insert into the table with the default
values associated with that table.
Adjust the configuration options, as discussed on the
maria-developers@ mailing list.
The option to hint a transaction to not be replicated in parallel is
now called @@skip_parallel_replication, consistent with
@@skip_replication.
And the --slave-parallel-mode is now simplified to have just one of
the following values:
none
minimal
conservative
optimistic
aggressive
This reflects successively harder efforts to find opportunities to run
things in parallel on the slave. It allows extending the server with
more automatic heuristics in the future without having to introduce a
new configuration option for each and every one.
Implement a new mode for parallel replication. In this mode, applying
all transactions is optimistically attempted in parallel. In case of
conflicts, the offending transaction is rolled back and retried later,
non-parallel.
This is an early-release patch to facilitate testing; more changes to
the user interface / options are to be expected. The new mode is not
enabled by default.