The slave was not able to find the correct row in the innodb
table, because the row fetched from the innodb table would not
match the before image. This happened because the (don't care)
bytes in the NULLed fields would change once the row was stored
in the storage engine (from zero to the default value). This
would make bulk memory comparison (using memcmp) to fail.
We fix this by taking a preventing measure and avoiding memcmp
for tables that contain nullable fields. Therefore, we protect
the slave search routine from engines that return arbitrary
values for don't care bytes (in the nulled fields). Instead, the
slave thread will only check null_bits and those fields that are
not set to NULL when comparing the before image against the
storage engine row.
mysqlbinlog only prints "use $database" statements to its output stream
when the active default database changes between events. This will cause
"No Database Selected" error when dropping and recreating that database.
To fix the problem, we clear print_event_info->db when printing an event
of CREATE/DROP/ALTER database statements, so that the Query_log_event
after such statements will be printed with the use 'db' anyway except
transaction keywords.
mysqlbinlog only prints "use $database" statements to its output stream
when the active default database changes between events. This will cause
"No Database Selected" error when dropping and recreating that database.
To fix the problem, we clear print_event_info->db when printing an event
of CREATE/DROP/ALTER database statements, so that the Query_log_event
after such statements will be printed with the use 'db' anyway except
transaction keywords.
Normally, auto_increment value is generated for the column by
inserting either NULL or 0 into it. NO_AUTO_VALUE_ON_ZERO
suppresses this behavior for 0 so that only NULL generates
the auto_increment value. This behavior is also followed by
a slave, specifically by the SQL Thread, when applying events
in the statement format from a master. However, when applying
events in the row format, the flag was ignored thus causing
an assertion failure:
"Assertion failed: next_insert_id == 0, file .\handler.cc"
In fact, we never need to generate a auto_increment value for
the column when applying events in row format on slave. So we
don't allow it to happen by using 'MODE_NO_AUTO_VALUE_ON_ZERO'.
Refactoring: Get rid of all the sql_mode checks to rows_log_event
when applying it for avoiding problems caused by the inconsistency
of the sql_mode on slave and master as the sql_mode is not set for
Rows_log_event.
Normally, auto_increment value is generated for the column by
inserting either NULL or 0 into it. NO_AUTO_VALUE_ON_ZERO
suppresses this behavior for 0 so that only NULL generates
the auto_increment value. This behavior is also followed by
a slave, specifically by the SQL Thread, when applying events
in the statement format from a master. However, when applying
events in the row format, the flag was ignored thus causing
an assertion failure:
"Assertion failed: next_insert_id == 0, file .\handler.cc"
In fact, we never need to generate a auto_increment value for
the column when applying events in row format on slave. So we
don't allow it to happen by using 'MODE_NO_AUTO_VALUE_ON_ZERO'.
Refactoring: Get rid of all the sql_mode checks to rows_log_event
when applying it for avoiding problems caused by the inconsistency
of the sql_mode on slave and master as the sql_mode is not set for
Rows_log_event.
When a query fails with a different error on the slave,
the sql thread outputs a message (M) containing:
1. the error message format for the master error code
2. the master error code
3. the error message for the slave's error code
4. the slave error code
Given that the slave has no information on the error message
itself that the master outputs, it can only print its own
version of the message format (but stripped from the
additional data if the message format requires). This may
confuse users.
To fix this we augment the slave's message (M) to explicitly
state that the master's message is actually an error message
format, the one associated with the given master error code
and that the slave server knows about.
Problem: Extended characters outside of ASCII range where not displayed
properly in SHOW PROCESSLIST, because thd_info->query was always sent as
system_character_set (utf8). This was wrong, because query buffer
is never converted to utf8 - it is always have client character set.
Fix: sending query buffer using query character set
@ sql/sql_class.cc
@ sql/sql_class.h
Introducing a new class CSET_STRING, a LEX_STRING with character set.
Adding set_query(&CSET_STRING)
Adding reset_query(), to use instead of set_query(0, NULL).
@ sql/event_data_objects.cc
Using reset_query()
@ sql/log_event.cc
Using reset_query()
Adding charset argument to set_query_and_id().
@ sql/slave.cc
Using reset_query().
@ sql/sp_head.cc
Changing backing up and restore code to use CSET_STRING.
@ sql/sql_audit.h
Using CSET_STRING.
In the "else" branch it's OK not to use
global_system_variables.character_set_client.
&my_charset_latin1, which is set in constructor, is fine
(verified with Sergey Vojtovich).
@ sql/sql_insert.cc
Using set_query() with proper character set: table_name is utf8.
@ sql/sql_parse.cc
Adding character set argument to set_query_and_id().
(This is the main point where thd->charset() is stored
into thd->query_string.cs, for use in "SHOW PROCESSLIST".)
Using reset_query().
@ sql/sql_prepare.cc
Storing client character set into thd->query_string.cs.
@ sql/sql_show.cc
Using CSET_STRING to fetch and send charset-aware query information
from threads.
@ storage/myisam/ha_myisam.cc
Using set_query() with proper character set: table_name is utf8.
@ mysql-test/r/show_check.result
@ mysql-test/t/show_check.test
Adding tests
Finalize the server flags after any kind of command is executed.
To avoid updating the flag multiple times, reorganize code so that
its invoked only once for each command.
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
"Grantor" columns' data is lost when replicating mysql.tables_priv.
Slave SQL thread used its default user ''@'' as the grantor of GRANT|REVOKE
statements executing on it.
In this patch, current user is put in query log event for all GRANT and REVOKE
statement, SQL thread uses the user in query log event as grantor.
Rows events were applied wrongly on the temporary table with the same name.
But rows events are generated only for base tables. As temporary
table's data never be binlogged on row mode. Normally, base table of the
same name cannot be updated if a temporary table has the same name.
But there are two cases which can generate rows events on
the base table of same name.
Case1: 'CREATE TABLE ... SELECT' statement.
In mixed format, it will generate rows events if it is unsafe.
Case2: Drop a transactional temporary table in a transaction
(happens only on 5.5+).
BEGIN;
DROP TEMPORARY TABLE t1; # t1 is a InnoDB table
INSERT INTO t1 VALUES(rand()); # t1 is a MyISAM table
COMMIT;
'DROP TEMPORARY TABLE' will be put in the transaction cache and
binlogged after the rows events generated by the 'INSERT' statement.
After this patch, slave opens only base table when applying a rows event.
When slave executes a transaction bigger than slave's max_binlog_cache_size,
slave will crash. It is caused by the assert that server should only roll back
the statement but not the whole transaction if the error ER_TRANS_CACHE_FULL
happens. But slave sql thread always rollbacks the whole transaction when
an error happens.
Ather this patch, we always clear any error set in sql thread(it is different
from the error in 'SHOW SLAVE STATUS') and it is cleared before rolling back
the transaction.
Suprisingly, a Slave_log_event would show up in the binary
log. This event is never used and should not appear in the
logs. As such, when the slave (or the mysqlbinlog tool) reads the
event, it will hit an invalid pointer (reference to the
descriptor event when deserializing the Slave_log_event was
purposodely set to NULL).
The presence of the Slave_log_event denotes a corrupted log, but
we cannot tell how the log got corrupted in the first
place. However, we can make the server cope with such events when
it reads them - in case of log corruption - and fail gracefully.
This patch makes the server/mysqlbinlog to report that it has
found an invalid log event when Slave_log_event is read.
temp table
This patch introduces two key changes in the replication's behavior.
Firstly, it reverts part of BUG#51894 which puts any update to temporary tables
into the trx-cache. Now, updates to temporary tables are handled according to
the type of their engines as a regular table.
Secondly, an unsafe mixed statement, (i.e. a statement that access transactional
table as well non-transactional or temporary table, and writes to any of them),
are written into the trx-cache in order to minimize errors in the execution when
the statement logging format is in use.
Such changes has a direct impact on which statements are classified as unsafe
statements and thus part of BUG#53259 is reverted.
After BUG#36649, warnings for sub-statements are cleared when a
new sub-statement is started. This is problematic since it suppresses
warnings for unsafe statements in some cases. It is important that we
always give a warning to the client, because the user needs to know
when there is a risk that the slave goes out of sync.
We fixed the problem by generating warning messages for unsafe statements
while returning from a stored procedure, function, trigger or while
executing a top level statement.
We also started checking unsafeness when both performance and log tables are
used. This is necessary after the performance schema which does a distinction
between performance and log tables.
This patch also fixes Bug#55452 "SET PASSWORD is
replicated twice in RBR mode".
The goal of this patch is to remove the release of
metadata locks from close_thread_tables().
This is necessary to not mistakenly release
the locks in the course of a multi-step
operation that involves multiple close_thread_tables()
or close_tables_for_reopen().
On the same token, move statement commit outside
close_thread_tables().
Other cleanups:
Cleanup COM_FIELD_LIST.
Don't call close_thread_tables() in COM_SHUTDOWN -- there
are no open tables there that can be closed (we leave
the locked tables mode in THD destructor, and this
close_thread_tables() won't leave it anyway).
Make open_and_lock_tables() and open_and_lock_tables_derived()
call close_thread_tables() upon failure.
Remove the calls to close_thread_tables() that are now
unnecessary.
Simplify the back off condition in Open_table_context.
Streamline metadata lock handling in LOCK TABLES
implementation.
Add asserts to ensure correct life cycle of
statement transaction in a session.
Remove a piece of dead code that has also become redundant
after the fix for Bug 37521.
sporadically
There are two problems:
1. When closing temporary tables, during the THD clean up - and
after the session connection was already closed, there is a
chance we can push an error into the THD diagnostics area, if
the writing of the implicit DROP event to the binary log fails
for some reason. As a consequence an assertion can be
triggered, because at that point the diagnostics area is
already set.
2. Using push_warning with MYSQL_ERROR::WARN_LEVEL_ERROR is a
bug.
Given that close_temporary_tables is mostly called from
THD::cleanup - ie, with the session already closed, we fix
problem #1 by allowing the diagnostics area to be
overwritten. There is one other place in the code that calls
close_temporary_tables - while applying Start_log_event_v3. To
cover that case, we make close_temporary_tables to return the
error, thus, propagating upwards in the stack.
To fix problem #2, we replace push_warning with sql_print_error.
Although the C standard mandates that sprintf return the number
of bytes written, some very ancient systems (i.e. SunOS 4)
returned a pointer to the buffer instead. Since these systems
are not supported anymore and are hopefully long dead by now,
simply remove the portability wrapper that dealt with this
discrepancy. The autoconf check was causing trouble with GCC.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
DROP USER
RENAME USER CURRENT_USER() ...
GRANT ... TO CURRENT_USER()
REVOKE ... FROM CURRENT_USER()
ALTER DEFINER = CURRENT_USER() EVENTbut, When these statements are binlogged, CURRENT_USER() just is binlogged
as 'CURRENT_USER()', it is not expanded to the real user name. When slave
executes the log event, 'CURRENT_USER()' is expand to the user of slave
SQL thread, but SQL thread's user name always NULL. This breaks the replication.
After this patch, session's user will be written into query log events
if these statements call CURREN_USER() or 'ALTER EVENT' does not assign a definer.
Apart strict-aliasing warnings, fix the remaining warnings
generated by GCC 4.4.4 -Wall and -Wextra flags.
One major source of warnings was the in-house function my_bcmp
which (unconventionally) took pointers to unsigned characters
as the byte sequences to be compared. Since my_bcmp and bcmp
are deprecated functions whose only difference with memcmp is
the return value, every use of the function is replaced with
memcmp as the special return value wasn't actually being used
by any caller.
There were also various other warnings, mostly due to type
mismatches, missing return values, missing prototypes, dead
code (unreachable) and ignored return values.
BUG#54872 MBR: replication failure caused by using tmp table inside transaction
Changed criteria to classify a statement as unsafe in order to reduce the
number of spurious warnings. So a statement is classified as unsafe when
there is on-going transaction at any point of the execution if:
1. The mixed statement is about to update a transactional table and
a non-transactional table.
2. The mixed statement is about to update a temporary transactional
table and a non-transactional table.
3. The mixed statement is about to update a transactional table and
read from a non-transactional table.
4. The mixed statement is about to update a temporary transactional
table and read from a non-transactional table.
5. The mixed statement is about to update a non-transactional table
and read from a transactional table when the isolation level is
lower than repeatable read.
After updating a transactional table if:
6. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
7. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
8. The mixed statement is about to update a non-transactionala table
and read from a temporary non-transactional table.
9. The mixed statement is about to update a temporary non-transactional
table and update a non-transactional table.
10. The mixed statement is about to update a temporary non-transactional
table and read from a non-transactional table.
11. A statement is about to update a non-transactional table and the
option variables.binlog_direct_non_trans_update is OFF.
The reason for this is that locks acquired may not protected a concurrent
transaction of interfering in the current execution and by consequence in
the result. So the patch reduced the number of spurious unsafe warnings.
Besides we fixed a regression caused by BUG#51894, which makes temporary
tables to go into the trx-cache if there is an on-going transaction. In
MIXED mode, the patch for BUG#51894 ignores that the trx-cache may have
updates to temporary non-transactional tables that must be written to the
binary log while rolling back the transaction.
So we fix this problem by writing the content of the trx-cache to the
binary log while rolling back a transaction if a non-transactional
temporary table was updated and the binary logging format is MIXED.
DROP USER
RENAME USER CURRENT_USER() ...
GRANT ... TO CURRENT_USER()
REVOKE ... FROM CURRENT_USER()
ALTER DEFINER = CURRENT_USER() EVENTbut, When these statements are binlogged, CURRENT_USER() just is binlogged
as 'CURRENT_USER()', it is not expanded to the real user name. When slave
executes the log event, 'CURRENT_USER()' is expand to the user of slave
SQL thread, but SQL thread's user name always NULL. This breaks the replication.
After this patch, session's user will be written into query log events
if these statements call CURREN_USER() or 'ALTER EVENT' does not assign a definer.