Problem
========
Replication breaks when the event length exceeds the size of the master
Dump thread's max_allowed_packet.
The failure occurs because the event length, once the max_event_header
length is added to it, exceeds the max_allowed_packet of the Dump thread.
This causes the Dump thread to break replication and throw an error.
That can happen e.g. with row-based replication in an Update_rows event.
Fix
====
The problem is fixed in 2 steps:
1.) The Dump thread's limit for reading an event is raised to the upper
limit, i.e. the Dump thread reads whatever gets logged in the binary log.
2.) On the slave side we raise the max_allowed_packet for the slave's
threads (IO/SQL) to 1GB.
This is done using the new server option (slave_max_allowed_packet),
which lets the DBA regulate the max_allowed_packet of the slave
threads (IO/SQL) and facilitates the sending of large packets from
the master to the slave.
Large packets are thus received by the slave and applied successfully.
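As an illustration only (not part of the patch itself), a DBA could tune the new option on the slave roughly as follows; the values are illustrative and 1073741824 (1GB) is the upper limit:
STOP SLAVE;
SET GLOBAL slave_max_allowed_packet = 1073741824;  # 1GB, the upper limit
START SLAVE;
SHOW VARIABLES LIKE 'slave_max_allowed_packet';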
Problem: mysqlbinlog exits without any error code in case of a
file write error. This is because calls to the Log_event::print()
method do not return a value, and thus any error was being ignored.
Resolution: We resolve this problem by checking for
IO_CACHE::error == -1 after every call to Log_event::print()
and terminating further execution.
Problem
========
SQL statements close to the size of max_allowed_packet produce binary
log events larger than max_allowed_packet.
The failure occurs because the event length is more than
max_allowed_packet + max_event_header length. Since the event length
exceeds this size, the master Dump thread is unable to send the packet
on to the slave.
That can happen e.g. with row-based replication in an Update_rows event.
Fix
====
The problem was fixed by raising the max_allowed_packet for the
slave's threads (IO/SQL) to 1GB.
This is done using the new server option included, which regulates
the max_allowed_packet of the slave threads (IO/SQL).
Large packets are thus received by the slave and applied successfully.
Problem: After the fix for Bug#12589870, a new field that
stores the length of db name was added in the buffer that
stores the query to be executed. Unlike for the plain user
session, the replication execution did not allocate the
necessary chunk in the Query-event constructor. This caused an
invalid read while accessing this field.
Solution: We fix this problem by allocating the necessary chunk
in the buffer created in Query_log_event::Query_log_event()
and storing the length of the database name.
BUG#11761686 insert_id event is not filtered.
Two issues are covered.
An INSERT into an auto_increment column that is not the first part of a composite
primary key is unsafe by the auto_increment logging design. The case is specific to
the MyISAM engine because InnoDB does not allow such a table definition.
However, no warning was issued and no row-format logging was done in MIXED mode, and
that is fixed.
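A minimal sketch of the affected table layout (table and column names are illustrative):
CREATE TABLE t1 (a INT, b INT AUTO_INCREMENT,
                 PRIMARY KEY (a, b)) ENGINE=MyISAM;  # only MyISAM accepts this layout
SET SESSION binlog_format = 'MIXED';
INSERT INTO t1 (a) VALUES (1);  # unsafe; after the fix it is logged in row format with a warning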
Int-, Rand- and User-var log events were not filtered along with their parent
query, which made it possible for them to corrupt the execution context of the
following query.
Fixed by deferring their execution until their parent query executes.
******
Bug#11754117
Post review fixes.
PROBLEM:
--------
When binary log statements are replayed on the slave, BEGIN is counted
in the com_counters but COMMIT is not. Similarly, in 'ROW'-based replication
the 'INSERT', 'UPDATE' and 'DELETE' com_counters are not incremented
when the binary log statements are replayed on the slave.
ANALYSIS:
---------
In 'ROW'-based replication, COMMIT, INSERT, UPDATE and DELETE operations
are applied through the following special events:
Xid_log_event, Write_rows_log_event, Update_rows_log_event, Delete_rows_log_event.
These events do not go through the parser, which is where the
'COM_COUNTERS' are incremented.
FIX:
-----
Increment statements are added in the appropriate events.
The respective functions are listed below.
'Xid_log_event::do_apply_event'
'Write_rows_log_event::do_before_row_operations'
'Update_rows_log_event::do_before_row_operations'
'Delete_rows_log_event::do_before_row_operations'
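For illustration only, the effect can be checked on the slave with the standard status counters (the scenario is illustrative):
SHOW GLOBAL STATUS LIKE 'Com_insert';  # now incremented when a Write_rows_log_event is applied
SHOW GLOBAL STATUS LIKE 'Com_commit';  # now incremented when an Xid_log_event is applied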
Analysis:
========================
sql_mode "NO_BACKSLASH_ESCAPES": When user want to use backslash as character input,
instead of escape character in a string literal then sql_mode can be set to
"NO_BACKSLASH_ESCAPES". With this mode enabled, backslash becomes an ordinary
character like any other.
SQL_MODE set applies to the current client session. And while creating the stored
procedure, MySQL stores the current sql_mode and always executes the stored
procedure in sql_mode stored with the Procedure, regardless of the server SQL
mode in effect when the routine is invoked.
In the scenario (for which bug is reported), the routine is created with
sql_mode=NO_BACKSLASH_ESCAPES. And routine is executed with the invoker sql_mode
is "" (NOT SET) by executing statement "call testp('Axel\'s')".
Since invoker sql_mode is "" (NOT_SET), the '\' in 'Axel\'s'(argument to function)
is considered as escape character and column "a" (of table "t1") values are
updated with "Axel's". The binary log generated for above update operation is as below,
set sql_mode=XXXXXX (for no_backslash_escapes)
update test.t1 set a= NAME_CONST('var',_latin1'Axel\'s' COLLATE 'latin1_swedish_ci');
While logging stored procedure statements, the local variables (parameters) used in
the statements are replaced with NAME_CONST(var_name, var_value) (an internal function)
(http://dev.mysql.com/doc/refman/5.6/en/miscellaneous-functions.html#function_name-const).
On the slave, these logs are applied and NAME_CONST is parsed to get the variable and its
value. Since the stored procedure is created with sql_mode="NO_BACKSLASH_ESCAPES", that
sql_mode is also logged, so that on the slave it is set before executing the statements
of the routine. So on the slave, sql_mode is set to "NO_BACKSLASH_ESCAPES", and while
parsing the NAME_CONST of the string variable, '\' is treated as a non-escape character,
and parsing reports an error at the "'" (as there is only one "'" and no backslash escape).
The slave's parsing is correct for sql_mode "NO_BACKSLASH_ESCAPES"; the error was
introduced when writing the binary log, where the "'" (of Axel's) was escaped with
a "\" character. All special characters (n, r, ', ", \, 0, ...) are escaped
while writing NAME_CONST for a string variable (parameter, local variable) to the binary
log, irrespective of the "NO_BACKSLASH_ESCAPES" sql_mode. So, basically, the problem is
that logging a string parameter does not take the sql_mode value into account.
Fix:
========================
When sql_mode is set to "NO_BACKSLASH_ESCAPES", escaping characters such as
(n, r, ', ", \, 0, ...) should be avoided. To do so, a check was added so that
such characters are not escaped while writing NAME_CONST for string variables to
the binary log.
And when sql_mode is set to NO_BACKSLASH_ESCAPES, the quote character "'" is
represented as '' (a doubled quote) instead.
http://dev.mysql.com/doc/refman/5.6/en/string-literals.html ("There are several
ways to include quote characters within a string.")
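A minimal reproduction sketch of the scenario described above (table t1 and procedure testp as in the report; the procedure body and column type are illustrative):
CREATE TABLE test.t1 (a VARCHAR(30));
SET sql_mode = 'NO_BACKSLASH_ESCAPES';
CREATE PROCEDURE testp(var VARCHAR(30))
  UPDATE test.t1 SET a = var;   # the routine keeps NO_BACKSLASH_ESCAPES
SET sql_mode = '';              # the invoker runs with the default mode
CALL testp('Axel\'s');          # '\' is an escape here, so the stored value is "Axel's"
# before the fix, the binlogged UPDATE contained NAME_CONST('var', ... 'Axel\'s' ...),
# which fails to parse on the slave once NO_BACKSLASH_ESCAPES is in effect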
PROBLEM: After WL 4144, when using MyISAM MERGE tables, the routine
open_and_lock_tables appends the base tables that make up the MERGE
table to the list of tables to lock. This has two side-effects in
replication:
1. On the master side, we log additional table maps for the base
tables, since they appear in the list of locked tables, even
though we don't really use them at the slave.
2. On the slave side, when opening a MERGE table while applying a
ROW event, additional tables are appended to the list of tables
to lock.
Side-effect #1 is not harmful. It's just that when using MyISAM MERGE
tables a few more table maps may be logged.
Side-effect #2 is harmful, because the list rli->tables_to_lock is an
extended structure from TABLE_LIST in which the extra fields are
filled from the table maps that are processed. Since
open_and_lock_tables appends tables to the list after all table map
events have been processed, we end up with entries without
replication/table map data on them. Thus when trying to access that
info for these extra tables, the server will crash.
SOLUTION: We fix side-effect #2 by making sure that we access the
replication part of the structure only for those entries in the list
that were accounted for when processing the corresponding table map
events. All in all, we never go beyond rli->tables_to_lock_count.
We also deploy an assertion when clearing rli->tables_to_lock, making
sure that the base tables are no longer in the list (they were closed
in close_thread_tables).
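A minimal sketch of a setup that exhibits the side-effects (table names are illustrative):
CREATE TABLE t1 (a INT) ENGINE=MyISAM;
CREATE TABLE t2 (a INT) ENGINE=MyISAM;
CREATE TABLE tm (a INT) ENGINE=MERGE UNION=(t1, t2) INSERT_METHOD=LAST;
SET SESSION binlog_format = 'ROW';
INSERT INTO tm VALUES (1);   # table maps for the base tables t1 and t2 are also logged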
SCAN/CPU) => SLAVE FAILURE
When a statement containing a large number of rows is to be applied on
the slave, and the slave's table does not contain a PK, it can take a
considerable amount of time to find and change all the rows that are
to be changed.
The proper slave enhancement will be implemented in WL 5597. However,
in this bug we are making it clear to the user what the problem is, by
printing a message to the error log if the execution time for a given
statement in RBR takes more than LONG_FIND_ROW_THRESHOLD (set to 60
seconds). This should help the DBA diagnose what is happening when
facing a slave server that is quiet for no apparent reason.
The note is only printed to the error log if log_warnings is set
greater than 1.
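For illustration, enabling the note on the slave (the threshold itself is a compile-time constant):
SET GLOBAL log_warnings = 2;   # must be greater than 1 for the note to be written
# a note then goes to the error log whenever a single RBR statement spends
# more than LONG_FIND_ROW_THRESHOLD (60 seconds) searching for rows to change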
HA_ERR was returning 0 (a null string) when no error had happened
(error=0). Since HA_ERR is used in DBUG_PRINT regardless of whether
there was an error or not, the server could crash in Solaris debug
builds.
We fix this by:
- deploying an assertion that ensures that the function
is not called when no error has happened;
- making sure that HA_ERR is only called when an error
happened;
- making HA_ERR return "No Error", instead of 0, for
non-debug builds if it is called when no error happened.
This makes HA_ERR return values work with DBUG_PRINT on
Solaris debug builds.
The server crashes if it processes table map events that are
corrupted, especially if they map different tables to the same
identifier. This could happen, for instance, due to BUG 56226.
We fix this by checking whether the table id has already been
mapped before actually applying the event. If it has been mapped
with different settings, an error is raised and the slave SQL
thread stops. If it has been mapped with the same settings, the event
is skipped. If the table is set to be ignored by the filtering
rules, there is no change in behavior: the event is skipped and
ids are not checked.
In RBR, when converting blob fields, the space allocated
while unpacking into the conversion field was not freed after
copying from it into the real field.
We fix this by freeing the conversion field when the conversion
table is not needed anymore (on close_tables_to_lock).
The slave was not able to find the correct row in the InnoDB
table, because the row fetched from the InnoDB table would not
match the before image. This happened because the (don't care)
bytes in the NULLed fields would change once the row was stored
in the storage engine (from zero to the default value). This
would make the bulk memory comparison (using memcmp) fail.
We fix this by taking a preventive measure and avoiding memcmp
for tables that contain nullable fields. Therefore, we protect
the slave search routine from engines that return arbitrary
values for don't care bytes (in the nulled fields). Instead, the
slave thread will only check null_bits and those fields that are
not set to NULL when comparing the before image against the
storage engine row.
mysqlbinlog only prints "use $database" statements to its output stream
when the active default database changes between events. This can cause
a "No Database Selected" error when dropping and recreating that database.
To fix the problem, we clear print_event_info->db when printing an event
for a CREATE/DROP/ALTER DATABASE statement, so that the Query_log_event
following such statements is printed with a "use 'db'" statement anyway
(except for transaction keywords).
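A sketch of the failing scenario (database and table names are illustrative):
CREATE DATABASE db1;
USE db1;
CREATE TABLE t1 (a INT);   # mysqlbinlog prints "use db1" before this event
DROP DATABASE db1;         # replaying this clears the default database
CREATE DATABASE db1;
USE db1;
CREATE TABLE t2 (a INT);   # before the fix no new "use db1" was printed here, so
                           # replaying the output failed with "No Database Selected"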
Normally, auto_increment value is generated for the column by
inserting either NULL or 0 into it. NO_AUTO_VALUE_ON_ZERO
suppresses this behavior for 0 so that only NULL generates
the auto_increment value. This behavior is also followed by
a slave, specifically by the SQL Thread, when applying events
in the statement format from a master. However, when applying
events in the row format, the flag was ignored thus causing
an assertion failure:
"Assertion failed: next_insert_id == 0, file .\handler.cc"
In fact, we never need to generate an auto_increment value for
the column when applying events in row format on the slave. So we
prevent it from happening by setting 'MODE_NO_AUTO_VALUE_ON_ZERO'.
Refactoring: get rid of all the sql_mode checks in Rows_log_event
when applying it, to avoid problems caused by inconsistency of
the sql_mode on slave and master, as the sql_mode is not set for
Rows_log_event.
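For reference, the documented behavior of the flag in a regular session (table name is illustrative):
SET sql_mode = 'NO_AUTO_VALUE_ON_ZERO';
CREATE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY);
INSERT INTO t1 VALUES (0);     # 0 is stored as-is, no auto_increment value is generated
INSERT INTO t1 VALUES (NULL);  # only NULL still generates the next auto_increment value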
When a query fails with a different error on the slave,
the sql thread outputs a message (M) containing:
1. the error message format for the master error code
2. the master error code
3. the error message for the slave's error code
4. the slave error code
Given that the slave has no information on the error message
itself that the master outputs, it can only print its own
version of the message format (stripped of the additional data
the message format may require). This may confuse users.
To fix this, we augment the slave's message (M) to explicitly
state that the master's part is actually an error message
format, namely the one associated with the given master error
code as known to the slave server.
Problem: Extended characters outside the ASCII range were not displayed
properly in SHOW PROCESSLIST, because thd_info->query was always sent as
system_character_set (utf8). This was wrong, because the query buffer
is never converted to utf8 - it always has the client character set.
Fix: send the query buffer using the query character set.
@ sql/sql_class.cc
@ sql/sql_class.h
Introducing a new class CSET_STRING, a LEX_STRING with character set.
Adding set_query(&CSET_STRING)
Adding reset_query(), to use instead of set_query(0, NULL).
@ sql/event_data_objects.cc
Using reset_query()
@ sql/log_event.cc
Using reset_query()
Adding charset argument to set_query_and_id().
@ sql/slave.cc
Using reset_query().
@ sql/sp_head.cc
Changing backing up and restore code to use CSET_STRING.
@ sql/sql_audit.h
Using CSET_STRING.
In the "else" branch it's OK not to use
global_system_variables.character_set_client.
&my_charset_latin1, which is set in constructor, is fine
(verified with Sergey Vojtovich).
@ sql/sql_insert.cc
Using set_query() with proper character set: table_name is utf8.
@ sql/sql_parse.cc
Adding character set argument to set_query_and_id().
(This is the main point where thd->charset() is stored
into thd->query_string.cs, for use in "SHOW PROCESSLIST".)
Using reset_query().
@ sql/sql_prepare.cc
Storing client character set into thd->query_string.cs.
@ sql/sql_show.cc
Using CSET_STRING to fetch and send charset-aware query information
from threads.
@ storage/myisam/ha_myisam.cc
Using set_query() with proper character set: table_name is utf8.
@ mysql-test/r/show_check.result
@ mysql-test/t/show_check.test
Adding tests
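A sketch of the original symptom (connection character set and literal are illustrative):
SET NAMES latin1;
SELECT 'Müller', SLEEP(10);    # the query text contains a non-ASCII latin1 byte
# in another session, while the above is still running:
SHOW PROCESSLIST;              # before the fix the Info column could show the literal
                               # garbled, because the query buffer was labelled as utf8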
Finalize the server flags after any kind of command is executed.
To avoid updating the flags multiple times, the code is reorganized so
that it is invoked only once for each command.
This patch addresses bug #57006 "Deadlock between HANDLER and FLUSH TABLES
WITH READ LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform the first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for the table opened
through the HANDLER statement to be flushed. When the first connection
then tried to execute DML, it started to wait for the GRL/the second
connection, creating a deadlock.
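A minimal sketch of that scenario (table name is illustrative):
# connection 1
HANDLER t1 OPEN;
# connection 2
FLUSH TABLES WITH READ LOCK;   # performs the first GRL step, then waits for t1 to be flushed
# connection 1
INSERT INTO t1 VALUES (1);     # before the patch: waits on the GRL being acquired by
                               # connection 2 => deadlock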
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This happened because requests for protection against the GRL
acquired by DML statements ignored the presence of a pending GRL,
and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
As in the old implementation, acquisition of the GRL in the new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring a global S
metadata lock (each DML and DDL statement acquires a global IX
lock for its duration). During the second step we block commits
by acquiring a global S lock in the COMMIT namespace (the commit
code acquires a global IX lock in this namespace).
Note that unlike in the old implementation, acquisition of
protection against the GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by the GRL
will either open and acquire write locks on tables or acquire
metadata locks on the objects it is going to modify. For any such
statement a global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for the GRL become
visible to the deadlock detector in the metadata locking subsystem,
and thus deadlocks like the one in the first bug become impossible.
The second problem is solved because the global S locks which
are used for the GRL implementation are given preference over the
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in the future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes the thread state names which are used when
DML/DDL or FTWRL is waiting for the global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock", depending on which stage FTWRL is at.
Incompatible change:
To solve a deadlock in the events code which was exposed by this
patch, we have to replace the LOCK_event_metadata mutex with
metadata locks on events. As a result, we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
The performance of the new and old global read lock implementations
was compared in sysbench tests. There was no significant
difference between the new and old implementations.
"Grantor" columns' data is lost when replicating mysql.tables_priv.
Slave SQL thread used its default user ''@'' as the grantor of GRANT|REVOKE
statements executing on it.
In this patch, current user is put in query log event for all GRANT and REVOKE
statement, SQL thread uses the user in query log event as grantor.
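For illustration (user names and privilege are arbitrary):
# executed on the master by root@localhost and replicated:
GRANT SELECT ON db1.t1 TO 'bob'@'%';
# on the slave, before the patch:
SELECT Grantor FROM mysql.tables_priv WHERE User = 'bob';   # showed ''@''
                                                             # instead of root@localhost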
Row events were wrongly applied to a temporary table with the same name.
But row events are generated only for base tables, since a temporary
table's data is never binlogged in row mode. Normally, a base table of the
same name cannot be updated if a temporary table has that name.
But there are two cases which can generate row events on
the base table of the same name.
Case 1: a 'CREATE TABLE ... SELECT' statement.
In mixed format, it generates row events if it is unsafe.
Case 2: dropping a transactional temporary table in a transaction
(happens only on 5.5+).
BEGIN;
DROP TEMPORARY TABLE t1; # t1 is an InnoDB table
INSERT INTO t1 VALUES(rand()); # t1 is a MyISAM table
COMMIT;
'DROP TEMPORARY TABLE' is put in the transaction cache and
binlogged after the row events generated by the 'INSERT' statement.
After this patch, the slave opens only the base table when applying a row event.