QUOTING IN REPLICATION
Problem: Misquoting or unquoted identifiers may lead to
incorrect statements to be logged to the binary log.
Fix: we use specialized functions to append quoted identifiers in
the statements generated by the server.
Problem:
=======
trx_data->empty() assert happens at `binlog_close_connection'
Analysis:
========
trx_data->empty() function checks for no pending events
and the transaction cache to be empty.This function returns
"true" if no pending events are present and cache is empty.
Otherwise it returns false. `binlog_close_connection' call
expects the above function to return true. But if the
return value is false then assert is raised.
This bug was reproducible in a diskfull scenario. In this
disk full scenario try to do an insert operation so that
a new pending event is created and flushing this pending
event fails. Due to this failure the server goes down
and invokes `binlog_close_connection' for clean closure.
Since the pending event still remains the assert is caused.
This assert is caused only in non transactional databases.
Fix:
===
In a disk full scenario when the insertion fails the
transaction is rolled back and `binlog_end_trans`
is called to flush the pending events. But flush operation
fails as the disk is full and the function simply returns
`1' without taking any action to delete the pending event.
This leaves the event to remain till the closure of
connection. `delete pending' statement has been added to
do the required clean up action.
sql/log.cc:
Added "delete pending" statement to clean pending event
Introduce a new storage engine API method commit_checkpoint_request().
This is used to replace the fsync() at the end of every storage engine
commit with a single fsync() when a binlog is rotated.
Binlog rotation is now done during group commit instead of being
delayed until unlog(), removing some server stall and avoiding an
expensive lock/unlock of LOCK_log inside unlog().
QUOTING IN REPLICATION
Problem: Misquoting or unquoted identifiers may lead to
incorrect statements to be logged to the binary log.
Fix: we use specialized functions to append quoted identifiers in
the statements generated by the server.
Problem:
=======
trx_data->empty() assert happens at `binlog_close_connection'
Analysis:
========
trx_data->empty() function checks for no pending events
and the transaction cache to be empty.This function returns
"true" if no pending events are present and cache is empty.
Otherwise it returns false. `binlog_close_connection' call
expects the above function to return true. But if the
return value is false then assert is raised.
This bug was reproducible in a diskfull scenario. In this
disk full scenario try to do an insert operation so that
a new pending event is created and flushing this pending
event fails. Due to this failure the server goes down
and invokes `binlog_close_connection' for clean closure.
Since the pending event still remains the assert is caused.
This assert is caused only in non transactional databases.
Fix:
===
In a disk full scenario when the insertion fails the
transaction is rolled back and `binlog_end_trans`
is called to flush the pending events. But flush operation
fails as the disk is full and the function simply returns
`1' without taking any action to delete the pending event.
This leaves the event to remain till the closure of
connection. `delete pending' statement has been added to
do the required clean up action.
When we append data to the binlog file, we use fdatasync() to ensure
the data gets to disk so that crash recovery can work.
Unfortunately there seems to be a bug in ext3/ext4 on linux, so that
fdatasync() does not correctly sync all data when the size of a file
is increased. This causes crash recovery to not work correctly (it
loses transactions from the binlog).
As a work-around, use fsync() for the binlog, not fdatasync(). Since
we are increasing the file size, (correct) fdatasync() will most
likely not be faster than fsync() on any file system, and fsync()
does work correctly on ext3/ext4. This avoids the need to try to
detect if we are running on buggy ext3/ext4.
two tests still fail:
main.innodb_icp and main.range_vs_index_merge_innodb
call records_in_range() with both range ends being open
(which triggers an assert)
'MAX_BINLOG_CACHE_SIZE' ERROR
Problem:
=======
MySQL returns following error in win64.
"ERROR 1197 (HY000): Multi-statement transaction required more than
'max_binlog_cache_size' bytes of storage; increase this mysqld variable
and try again" when user tries to load >4G file even if
max_binlog_cache_size set to maximum value. On Linux everything
works fine.
Analysis:
========
The `max_binlog_cache_size' variable is of type `ulonglong'. This
value is set to `ULONGLONG_MAX' at the time of server start up. The
above value is stored in an intermediate variable named
`saved_max_binlog_cache_size' which is of type `ulong'. In visual
c++ complier the `ulong' type is of 4bytes in size and hence the value
is getting truncated to '4GB' and the cache is not able to grow beyond
4GB size. The same limitation is observed with
"max_binlog_stmt_cache_size" as well. Similar fix has been applied.
Fix:
===
As part of fix the type "ulong" is replaced with "my_off_t" which is of
type "ulonglong".
mysys/mf_iocache.c:
Added debug statement to simulate a scenario where the cache
file's current position is set to >4GB
sql/log.cc:
Replaced the type of `saved_max_binlog_cache_size' from "ulong" to
"my_off_t", which is a type def for "ulonglong".
'MAX_BINLOG_CACHE_SIZE' ERROR
Problem:
=======
MySQL returns following error in win64.
"ERROR 1197 (HY000): Multi-statement transaction required more than
'max_binlog_cache_size' bytes of storage; increase this mysqld variable
and try again" when user tries to load >4G file even if
max_binlog_cache_size set to maximum value. On Linux everything
works fine.
Analysis:
========
The `max_binlog_cache_size' variable is of type `ulonglong'. This
value is set to `ULONGLONG_MAX' at the time of server start up. The
above value is stored in an intermediate variable named
`saved_max_binlog_cache_size' which is of type `ulong'. In visual
c++ complier the `ulong' type is of 4bytes in size and hence the value
is getting truncated to '4GB' and the cache is not able to grow beyond
4GB size. The same limitation is observed with
"max_binlog_stmt_cache_size" as well. Similar fix has been applied.
Fix:
===
As part of fix the type "ulong" is replaced with "my_off_t" which is of
type "ulonglong".
Problem:
=======
The return value from my_b_write is ignored by: `my_b_write_quoted',
`my_b_write_bit',`Query_log_event::print_query_header'
Most callers of `my_b_printf' ignore the return value. `log_event.cc'
has many calls to it.
Analysis:
========
`my_b_write' is used to write data into a file. If the write fails it
sets appropriate error number and error message through my_error()
function call and sets the IO_CACHE::error == -1.
`my_b_printf' function is also used to write data into a file, it
internally invokes my_b_write to do the write operation. Upon
success it returns number of characters written to file and on error
it returns -1 and sets the error through my_error() and also sets
IO_CACHE::error == -1. Most of the event specific print functions
for example `Create_file_log_event::print', `Execute_load_log_event::print'
etc are the ones which make several calls to the above two functions and
they do not check for the return value after the 'print' call. All the above
mentioned abuse cases deal with the client side.
Fix:
===
As part of bug fix a check for IO_CACHE::error == -1 has been added at
a very high level after the call to the 'print' function. There are
few more places where the return value of "my_b_write" is ignored
those are mentioned below.
+++ mysys/mf_iocache2.c 2012-06-04 07:03:15 +0000
@@ -430,7 +430,8 @@
memset(buffz, '0', minimum_width - length2);
else
memset(buffz, ' ', minimum_width - length2);
- my_b_write(info, buffz, minimum_width - length2);
+++ sql/log.cc 2012-06-08 09:04:46 +0000
@@ -2388,7 +2388,12 @@
{
end= strxmov(buff, "# administrator command: ", NullS);
buff_len= (ulong) (end - buff);
- my_b_write(&log_file, (uchar*) buff, buff_len);
At these places appropriate return value handlers have been added.
client/mysqlbinlog.cc:
check for IO_CACHE::error == -1 has been added after the call to
the event specific print functions
mysys/mf_iocache2.c:
Added handler to check the written value of `my_b_write'
sql/log.cc:
Added handler to check the written value of `my_b_write'
sql/log_event.cc:
Added error simulation statements in `Create_file_log_event::print`
and `Execute_load_query_log_event::print'
sql/rpl_utility.h:
Removed the extra ';'
Problem:
=======
The return value from my_b_write is ignored by: `my_b_write_quoted',
`my_b_write_bit',`Query_log_event::print_query_header'
Most callers of `my_b_printf' ignore the return value. `log_event.cc'
has many calls to it.
Analysis:
========
`my_b_write' is used to write data into a file. If the write fails it
sets appropriate error number and error message through my_error()
function call and sets the IO_CACHE::error == -1.
`my_b_printf' function is also used to write data into a file, it
internally invokes my_b_write to do the write operation. Upon
success it returns number of characters written to file and on error
it returns -1 and sets the error through my_error() and also sets
IO_CACHE::error == -1. Most of the event specific print functions
for example `Create_file_log_event::print', `Execute_load_log_event::print'
etc are the ones which make several calls to the above two functions and
they do not check for the return value after the 'print' call. All the above
mentioned abuse cases deal with the client side.
Fix:
===
As part of bug fix a check for IO_CACHE::error == -1 has been added at
a very high level after the call to the 'print' function. There are
few more places where the return value of "my_b_write" is ignored
those are mentioned below.
+++ mysys/mf_iocache2.c 2012-06-04 07:03:15 +0000
@@ -430,7 +430,8 @@
memset(buffz, '0', minimum_width - length2);
else
memset(buffz, ' ', minimum_width - length2);
- my_b_write(info, buffz, minimum_width - length2);
+++ sql/log.cc 2012-06-08 09:04:46 +0000
@@ -2388,7 +2388,12 @@
{
end= strxmov(buff, "# administrator command: ", NullS);
buff_len= (ulong) (end - buff);
- my_b_write(&log_file, (uchar*) buff, buff_len);
At these places appropriate return value handlers have been added.
Keep track of how many pending XIDs (transactions that are prepared in
storage engine and written into binlog, but not yet durably committed
on disk in the engine) there are in each binlog.
When the count of one binlog drops to zero, write a new binlog checkpoint
event, telling which is the oldest binlog with pending XIDs.
When doing XA recovery after a crash, check the last binlog checkpoint
event, and scan all binlog files from that point onwards for XIDs that
must be committed if found in prepared state inside engine.
Remove the code in binlog rotation that waits for all prepared XIDs to
be committed before writing a new binlog file (this is no longer necessary
when recovery can scan multiple binlog files).
the new file is fully synced to disk and binlog index. This fixes a window
where a crash would leave next server restart unable to detect that a crash
occured, causing recovery to fail.
BUG#64503: mysql frequently ignores --relay-log-space-limit
When the SQL thread goes to sleep, waiting for more events, it sets
the flag ignore_log_space_limit to true. This gives the IO thread a
chance to queue some more events and ultimately the SQL thread will be
able to purge the log once it is rotated. By then the SQL thread
resets the ignore_log_space_limit to false. However, between the time
the SQL thread has set the ignore flag and the time it resets it, the
IO thread will be queuing events in the relay log, possibly going way
over the limit.
This patch makes the IO and SQL thread to synchronize when they reach
the space limit and only ask for one event at a time. Thus the SQL
thread sets ignore_log_space_limit flag and the IO thread resets it to
false everytime it processes one more event. In addition, everytime
the SQL thread processes the next event, and the limit has been
reached, it checks if the IO thread should rotate. If it should, it
instructs the IO thread to rotate, giving the SQL thread a chance to
purge the logs (freeing space). Finally, this patch removes the
resetting of the ignore_log_space_limit flag from purge_first_log,
because this is now reset by the IO thread every time it processes the
next event when the limit has been reached.
If the SQL thread is in a transaction, it cannot purge so, there is no
point in asking the IO thread to rotate. The only thing it can do is
to ask for more events until the transaction is over (then it can ask
the IO to rotate and purge the log right away). Otherwise, there would
be a deadlock (SQL would not be able to purge and IO thread would not
be able to queue events so that the SQL would finish the transaction).
BUG#64503: mysql frequently ignores --relay-log-space-limit
When the SQL thread goes to sleep, waiting for more events, it sets
the flag ignore_log_space_limit to true. This gives the IO thread a
chance to queue some more events and ultimately the SQL thread will be
able to purge the log once it is rotated. By then the SQL thread
resets the ignore_log_space_limit to false. However, between the time
the SQL thread has set the ignore flag and the time it resets it, the
IO thread will be queuing events in the relay log, possibly going way
over the limit.
This patch makes the IO and SQL thread to synchronize when they reach
the space limit and only ask for one event at a time. Thus the SQL
thread sets ignore_log_space_limit flag and the IO thread resets it to
false everytime it processes one more event. In addition, everytime
the SQL thread processes the next event, and the limit has been
reached, it checks if the IO thread should rotate. If it should, it
instructs the IO thread to rotate, giving the SQL thread a chance to
purge the logs (freeing space). Finally, this patch removes the
resetting of the ignore_log_space_limit flag from purge_first_log,
because this is now reset by the IO thread every time it processes the
next event when the limit has been reached.
If the SQL thread is in a transaction, it cannot purge so, there is no
point in asking the IO thread to rotate. The only thing it can do is
to ask for more events until the transaction is over (then it can ask
the IO to rotate and purge the log right away). Otherwise, there would
be a deadlock (SQL would not be able to purge and IO thread would not
be able to queue events so that the SQL would finish the transaction).
- mysql-test-run.pl --valgrind complains when all tests succeed.
- perfschema.all_instances fail on non-linux, where ENABLE_TEMP_POOL
is not set and therefore BITMAP mutex is not used.
- MDEV-132: main.mysqldump fails because it depends on exact size of stdio
buffers.
- MDEV-99: rpl.rpl_cant_read_event_incident fails due to a race where the
slave manages to connect while the test case is in the middle of setting up
the master, causing the slave to replicate extra/wrong events.
- MDEV-133: rpl.rpl_rotate_purge_deadlock fails because it issues a
DEBUG_SYNC SIGNAL immediately followed by RESET; this means that sometimes
the intended receipient has no time to see the signal before it is cleared
by the RESET, causing wait to timeout.
Fix typo causing too low timeout value for wait_for_slave_param.inc.
Fix binlog checksums following 5.5 merge.
Make sure the rpl suite can run with --mysqld=--binlog-checksum=CRC32
Fix a number of problems in the code when checksums are enabled.
readline.cc: In function char* batch_readline(LINE_BUFFER*):
readline.cc:60:9: error: out_length may be used uninitialized in this function
log.cc: In function int find_uniq_filename(char*):
log.cc:1857:8: error: number may be used uninitialized in this function
readline.cc: In function char* batch_readline(LINE_BUFFER*):
readline.cc:60:9: error: out_length may be used uninitialized in this function
log.cc: In function int find_uniq_filename(char*):
log.cc:1857:8: error: number may be used uninitialized in this function
BIN LOG HAS BEEN MOVED
When moving the binary/relay log files from one location to
another and restarting the server with a different log-bin or
relay-log paths, would cause the startup process to abort. The
root cause was that the server would not be able to find the log
files because it would consider old paths for entries in the
index file instead of the new location. What's even worse, the
relative paths would not be considered relative to the path
provided in log-bin and relay-log, but to mysql_data_dir.
We fix the cases where the server contains relative paths. When
the server is reading from the index file, it checks whether the
entry contains relative paths. If it does, we replace it with the
absolute path set in log-bin/relay-log option. Absolute paths
remain unchanged and the index must be manually edited to
consider the new log-bin and/or relay-log path (this should be
documented). This is a fix for a GA version, that does not break
behavior (that much).
For development versions, we should go with Zhenxing's approach
that removes paths altogether from index files.
mysql-test/include/begin_include_file.inc:
Added parameter to keep the begin_include_file.inc silent. Useful when
including scripts that contain platform dependent parameters, for example:
--let $rpl_server_parameters=--log-bin=$tmpdir/slave-bin --relay-log=$tmpdir/slave-relay-bin
--let $keep_include_silent=1
source include/rpl_start_server.inc;
--let $keep_include_silent=0
We want the paths ($tmpdir/slave-bin and $tmpdir/slave-relay-bin) not to be in the
result file.
mysql-test/suite/rpl/t/rpl_binlog_index.test:
Test case.
sql/log.cc:
When finding the corresponding log entry in the index file, we first
normalize the paths before doing the comparison. This will make relative
paths to be turned into absolute paths (based on the opt_bin_logname or
opt_relay_logname) and then compared against also, expanded paths entered,
through CHANGE MASTER for instance.
sql/log.h:
Added normalize_binlog_name, which turns relative paths, into absolute paths
given the parameter: is_relay_log ? opt_relay_logname : opt_bin_logname .
sql/mysqld.cc:
Exposing opt_bin_logname.
sql/mysqld.h:
Exposing opt_bin_logname.