Non-transactional updates that take place inside a transaction present problems
for logging because they are visible to other clients before the transaction
is committed, and they are not rolled back even if the transaction is rolled
back. It is not always possible to log correctly in statement format when both
transactional and non-transactional tables are used in the same transaction.
With the current patch, we ensure that such a scenario is completely safe
under the ROW and MIXED modes.
Before the patch, slaves only appeared in the output of SHOW SLAVE HOSTS
when the report-host option was set. If an expected slave did not appear in
the list, there was no way to tell whether the slave had failed to connect
or had been started without the "report-host" option. The output also
contained an odd field, "Rpl_recovery_rank", which was never implemented;
the MySQL 5.4 manual states that the field has been removed in MySQL 5.4.
This patch makes the following changes:
"Rpl_recovery_rank" is removed, in accordance with the MySQL 5.4 manual.
Slaves register themselves with the master whether or not the report_host
option is set. When a slave registers, its Server_id, report_host and
other information are sent together to the master. Server_ids are never
null and are unique within a replication group, so slaves can always be
identified by their Server_ids even when report_host is not set.
Post-push fix.
Problem: In a previous patch for BUG#39934, rpl_idempotency.test
was split in two tests. The mtr suppressions in the original test
did not make it into the new test. This caused pushbuild warnings.
Fix: copy the mtr suppressions from rpl_idempotency.test to
rpl_row_idempotency.test
The rpl_ndb/combinations file was introduced as part of the fix.
The file contained an error: ndb suites shall not run with
binlog_format=mixed. Removed that combination.
Post-push fix.
Problem: After the original bugfix, if a statement is unsafe,
binlog_format=mixed, and engine is statement-only, a warning was
generated and the statement executed. However, it is a fundamental
principle of binlogging that binlog_format=mixed should guarantee
correct logging, no compromise. So the correct behavior is to generate
an error and not execute the statement.
Fix: Generate error instead of warning.
Since issue_unsafe_warnings can only generate one error message,
this allows us to simplify the code a bit too:
decide_logging_format does not have to save the error code for
issue_unsafe_warnings
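The rule described in this fix can be sketched as a small decision function.
This is only an illustrative model, not the server code: the names
mixed_mode_outcome and Outcome are hypothetical, and the row-format branch
for row-capable engines is an assumption based on how MIXED mode is
described elsewhere in this log.

```python
from enum import Enum


class Outcome(Enum):
    LOG_AS_STATEMENT = "statement"
    LOG_AS_ROW = "row"
    ERROR = "error"


def mixed_mode_outcome(unsafe: bool, engine_statement_only: bool) -> Outcome:
    """Hypothetical model of the binlog_format=mixed rule described above."""
    if not unsafe:
        # Safe statements can be logged in statement format.
        return Outcome.LOG_AS_STATEMENT
    if engine_statement_only:
        # Unsafe, and the engine cannot log rows: refuse to execute
        # (an error, where the original bugfix only gave a warning).
        return Outcome.ERROR
    # Assumption: unsafe statements on row-capable engines are logged as rows.
    return Outcome.LOG_AS_ROW


print(mixed_mode_outcome(unsafe=True, engine_statement_only=True))
```

Because only one outcome is an error, the caller does not need to carry an
error code from one step to the next, which mirrors the simplification
mentioned above.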
The BINLOG statement was sharing too much code with the slave SQL thread; this sharing was
introduced with the patch for Bug#32407. It caused statements to be logged with the wrong
server_id: the id stored inside the events of the BINLOG statement rather than the id of
the running server.
Fix by rearranging the code so that only the relevant parts are executed by the BINLOG
statement, and the server_id of the server executing the statements is not overridden
by the server_id stored in the 'format description BINLOG statement'.
Add an option to control whether the master should keep waiting
until timeout when it detects that there is no semi-sync slave
available.
The bool option 'rpl_semi_sync_master_wait_no_slave' is 1 by
default, in which case the master keeps waiting until timeout.
When set to 0, the master switches to asynchronous replication
immediately when no semi-sync slave is available.
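A minimal sketch of the decision this option controls, assuming the master
knows how many semi-sync slaves are currently connected. The function name
and its parameters are hypothetical and only model the behavior described
above.

```python
def keep_waiting_until_timeout(semi_sync_slaves: int,
                               wait_no_slave: bool = True) -> bool:
    """Return True if the master should wait for a reply until timeout.

    wait_no_slave models rpl_semi_sync_master_wait_no_slave (default 1).
    """
    if semi_sync_slaves > 0:
        # At least one semi-sync slave can reply, so waiting makes sense.
        return True
    # No semi-sync slave: wait only if the option says so; otherwise the
    # master switches to asynchronous replication immediately.
    return wait_no_slave


print(keep_waiting_until_timeout(0))                       # default: waits
print(keep_waiting_until_timeout(0, wait_no_slave=False))  # goes async
```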
Semi-sync status variables were not reset by FLUSH STATUS. This was
because all semi-sync status variables were defined as SHOW_FUNC, and
FLUSH STATUS could only reset SHOW_LONG type variables.
This problem is fixed by changing all status variables that should
be reset by FLUSH STATUS from SHOW_FUNC to SHOW_LONG.
After the fix, the following status variables will be reset by
FLUSH STATUS:
Rpl_semi_sync_master_yes_tx
Rpl_semi_sync_master_no_tx
Note: normally, FLUSH STATUS itself is written to the binlog
and replicated, so after FLUSH STATUS, one of
Rpl_semi_sync_master_yes_tx
Rpl_semi_sync_master_no_tx
can be 1, depending on the semi-sync status. It is therefore
recommended to use FLUSH NO_WRITE_TO_BINLOG STATUS to avoid this.
Semi-sync used an extra connection from slave to master to send
replies. This was a normal client connection that used a normal
SET query to set the reply information on the master, which was
visible to users and could cause confusion and complaints.
This problem is fixed by sending the reply over the same connection
that the master dump thread uses to send the binlog to the slave.
Since the semi-sync plugins are now integrated with the server code,
it is not a problem to use the internal net interfaces to do this.
The master dump thread marks an event as requiring a reply, and
waits for that reply, when the event just sent is the last event
of a transaction and semi-sync status is ON. The slave sends a
reply to the master when it receives an event that requires a
reply.
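The marking rule can be sketched as follows. This is a simplified model
under stated assumptions: events are represented as hypothetical
(name, is_last_in_transaction) pairs, and the real network protocol and
event headers are not modeled.

```python
from typing import List, Tuple


def mark_reply_required(events: List[Tuple[str, bool]],
                        semi_sync_on: bool) -> List[Tuple[str, bool]]:
    """Pair each event name with a 'reply required' flag.

    An event requires a reply only if it is the last event of a
    transaction and semi-sync status is ON.
    """
    return [(name, semi_sync_on and is_last) for name, is_last in events]


def slave_replies(marked: List[Tuple[str, bool]]) -> List[str]:
    """Names of the events for which the slave sends a reply."""
    return [name for name, needs_reply in marked if needs_reply]


# Hypothetical event stream: one transaction of three events.
stream = [("BEGIN", False), ("row event", False), ("COMMIT", True)]
print(slave_replies(mark_reply_required(stream, semi_sync_on=True)))
```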
From r5917 to r5940
Detailed revision comments:
r5917 | marko | 2009-09-16 04:56:23 -0500 (Wed, 16 Sep 2009) | 1 line
branches/zip: innobase_get_cset_width(): Cache the value of current_thd.
r5919 | vasil | 2009-09-16 13:37:13 -0500 (Wed, 16 Sep 2009) | 4 lines
branches/zip:
Whitespace cleanup in the ChangeLog.
r5920 | vasil | 2009-09-16 13:47:22 -0500 (Wed, 16 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entries for r5916.
r5922 | marko | 2009-09-17 01:32:08 -0500 (Thu, 17 Sep 2009) | 4 lines
branches/zip: innodb-zip.test: Make the test work with zlib 1.2.3.3.
Apparently, the definition of compressBound() has slightly changed.
This has been filed as Mantis Issue #345.
r5924 | vasil | 2009-09-17 23:59:30 -0500 (Thu, 17 Sep 2009) | 4 lines
branches/zip:
White space and formatting cleanup in the ChangeLog
r5934 | vasil | 2009-09-18 12:06:46 -0500 (Fri, 18 Sep 2009) | 4 lines
branches/zip:
Fix typo.
r5935 | calvin | 2009-09-18 16:08:02 -0500 (Fri, 18 Sep 2009) | 6 lines
branches/zip: fix bug#44338; minor non-functional changes
Bug#44338 innodb has message about non-existing option
innodb_max_files_open. Change the option to innodb_open_files.
The fix was committed into 6.0 branch.
r5938 | calvin | 2009-09-19 02:14:25 -0500 (Sat, 19 Sep 2009) | 41 lines
branches/zip: Merge revisions 2584:2956 from branches/6.0,
except c2932.
Bug#37232 and bug#31183 were fixed in the 6.0 branch only.
They should be fixed in the plugin too, especially since MySQL 6.0
is discontinued at this point.
------------------------------------------------------------------------
r2604 | inaam | 2008-08-21 09:37:06 -0500 (Thu, 21 Aug 2008) | 8 lines
branches/6.0 bug#37232
Relax locking behaviour for REPLACE INTO t SELECT ... FROM t1.
Now SELECT on t1 is performed as a consistent read when the isolation
level is set to READ COMMITTED.
Reviewed by: Heikki
------------------------------------------------------------------------
r2605 | inaam | 2008-08-21 09:59:33 -0500 (Thu, 21 Aug 2008) | 7 lines
branches/6.0
Added a comment to clarify why distinct calls to read MySQL binary
log file name and log position do not entail any race condition.
Suggested by: Heikki
------------------------------------------------------------------------
r2956 | inaam | 2008-11-04 04:47:30 -0600 (Tue, 04 Nov 2008) | 11 lines
branches/6.0 bug#31183
If the system tablespace runs out of space because 'autoextend' is
not specified with innodb_data_file_path there was no error message
printed to the error log. The client would get 'table full' error.
This patch prints an appropriate error message to the error log.
rb://43
Approved by: Marko
------------------------------------------------------------------------
r5940 | vasil | 2009-09-21 00:26:04 -0500 (Mon, 21 Sep 2009) | 4 lines
branches/zip:
Add ChangeLog entries for c5938.
From revision r5703 to r5716
Detailed revision comments:
r5703 | marko | 2009-08-27 02:25:00 -0500 (Thu, 27 Aug 2009) | 41 lines
branches/zip: Replace the constant 3/8 ratio that controls the LRU_old
size with the settable global variable innodb_old_blocks_pct. The
minimum and maximum values are 5 and 95 per cent, respectively. The
default is 100*3/8, in line with the old behavior.
ut_time_ms(): New utility function, to return the current time in
milliseconds. TODO: Is there a more efficient timestamp function, such
as rdtsc divided by a power of two?
buf_LRU_old_threshold_ms: New variable, corresponding to
innodb_old_blocks_time. The value 0 is the default behaviour: no
timeout before making blocks 'new'.
bpage->accessed, bpage->LRU_position, buf_pool->ulint_clock: Remove.
bpage->access_time: New field, replacing bpage->accessed. Protected by
buf_pool_mutex instead of bpage->mutex. Updated when a page is created
or accessed the first time in the buffer pool.
buf_LRU_old_ratio, innobase_old_blocks_pct: New variables,
corresponding to innodb_old_blocks_pct
buf_LRU_old_ratio_update(), innobase_old_blocks_pct_update(): Update
functions for buf_LRU_old_ratio, innobase_old_blocks_pct.
buf_page_peek_if_too_old(): Compare ut_time_ms() to bpage->access_time
if buf_LRU_old_threshold_ms && bpage->old. Else observe
buf_LRU_old_ratio and bpage->freed_page_clock.
buf_pool_t: Add n_pages_made_young, n_pages_not_made_young,
n_pages_made_young_old, n_pages_not_made_young, for statistics.
buf_print(): Display buf_pool->n_pages_made_young,
buf_pool->n_pages_not_made_young. This function is only for crash
diagnostics.
buf_print_io(): Display buf_pool->LRU_old_len and quantities derived
from buf_pool->n_pages_made_young, buf_pool->n_pages_not_made_young.
This function is invoked by SHOW ENGINE INNODB STATUS.
rb://129 approved by Heikki Tuuri. This addresses Bug #45015.
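As a worked example of the innodb_old_blocks_pct bounds and default
described in r5703, the following sketch clamps a requested percentage
and derives an old-sublist length. The helper names are hypothetical, and
the exact rounding and internal ratio representation used by InnoDB are
not modeled here.

```python
OLD_BLOCKS_PCT_MIN = 5
OLD_BLOCKS_PCT_MAX = 95
# Default quoted above: 100*3/8, in line with the old constant 3/8 ratio.
OLD_BLOCKS_PCT_DEFAULT = 100 * 3 / 8


def clamp_old_blocks_pct(pct: float) -> float:
    """Restrict a requested percentage to the allowed range [5, 95]."""
    return max(OLD_BLOCKS_PCT_MIN, min(OLD_BLOCKS_PCT_MAX, pct))


def old_sublist_length(lru_len: int, pct: float) -> int:
    """Approximate number of LRU blocks kept in the 'old' sublist.

    Assumption: simple truncation; the server's rounding may differ.
    """
    return int(lru_len * clamp_old_blocks_pct(pct) / 100)


print(OLD_BLOCKS_PCT_DEFAULT)
print(old_sublist_length(1000, OLD_BLOCKS_PCT_DEFAULT))
```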
r5704 | marko | 2009-08-27 03:31:17 -0500 (Thu, 27 Aug 2009) | 32 lines
branches/zip: Fix a critical bug in fast index creation that could
corrupt the created indexes.
row_merge(): Make "half" an in/out parameter. Determine the offset of
half the output file. Copy the last blocks record-by-record instead of
block-by-block, so that the records can be counted. Check that the
input and output have matching n_rec.
row_merge_sort(): Do not assume that two blocks of size N are merged
into a block of size 2*N. The output block can be shorter than the
input if the last page of each input block is almost empty. Use an
accurate termination condition, based on the "half" computed by
row_merge().
row_merge_read(), row_merge_write(), row_merge_blocks(): Add debug output.
merge_file_t, row_merge_file_create(): Add n_rec, the number of records
in the merge file.
row_merge_read_clustered_index(): Update n_rec.
row_merge_blocks(): Update and check n_rec.
row_merge_blocks_copy(): New function, for copying the last blocks in
row_merge(). Update and check n_rec.
This bug was discovered with a user-supplied test case that creates an
index where the initial temporary file is 249 one-megabyte blocks and
the merged files become smaller. In the test, possible merge record
sizes are 10, 18, and 26 bytes.
rb://150 approved by Sunny Bains. This addresses Issue #320.
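The observation behind r5704 (merging two runs of N blocks each need not
produce 2*N blocks) can be illustrated with a toy model. Everything here
is hypothetical: records are (key, size_in_bytes) pairs, blocks are filled
greedily up to a made-up capacity of 52 bytes, and n_rec is checked the
way the commit describes, by counting records on input and output.

```python
import heapq


def pack(records, capacity):
    """Greedily pack (key, size) records into blocks of `capacity` bytes."""
    blocks, current, used = [], [], 0
    for key, size in records:
        if current and used + size > capacity:
            blocks.append(current)
            current, used = [], 0
        current.append((key, size))
        used += size
    if current:
        blocks.append(current)
    return blocks


def merge_runs(run_a, run_b, capacity):
    """Merge two sorted runs and check that no record was lost or added."""
    merged = list(heapq.merge(run_a, run_b))
    if len(merged) != len(run_a) + len(run_b):
        raise ValueError("n_rec mismatch between input and output")
    return pack(merged, capacity)


# The last block of each input run is almost empty (one 10-byte record).
run_a = [(1, 26), (3, 26), (5, 10)]
run_b = [(2, 26), (4, 26), (6, 10)]
blocks_in = len(pack(run_a, 52)) + len(pack(run_b, 52))
blocks_out = len(merge_runs(run_a, run_b, 52))
print(blocks_in, blocks_out)
```

In this example the two inputs occupy four blocks in total while the
merged output fits in three, so a termination condition that assumes
exact doubling would be wrong.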
r5705 | marko | 2009-08-27 06:56:24 -0500 (Thu, 27 Aug 2009) | 11 lines
branches/zip: dict_index_find_cols(): On column name lookup failure,
return DB_CORRUPTION (HA_ERR_CRASHED) instead of abnormally
terminating the server. Also, disable the previously added diagnostic
output to the error log, because mysql-test-run does not like extra
output in the error log. (Bug #44571)
dict_index_add_to_cache(): Handle errors from dict_index_find_cols().
mysql-test/innodb_bug44571.test: A test case for triggering the bug.
rb://135 approved by Sunny Bains.
r5706 | inaam | 2009-08-27 11:00:27 -0500 (Thu, 27 Aug 2009) | 20 lines
branches/zip rb://147
Done away with following two status variables:
innodb_buffer_pool_read_ahead_rnd
innodb_buffer_pool_read_ahead_seq
Introduced two new status variables:
innodb_buffer_pool_read_ahead = number of pages read as part of
readahead since server startup
innodb_buffer_pool_read_ahead_evicted = number of pages that are read
in as readahead but were evicted before ever being accessed since
server startup i.e.: a measure of how badly our readahead is
performing
SHOW INNODB STATUS will show two extra numbers in buffer pool section:
pages read ahead/sec and pages evicted without access/sec
Approved by: Marko
r5707 | inaam | 2009-08-27 11:20:35 -0500 (Thu, 27 Aug 2009) | 6 lines
branches/zip
Remove unused macros as we erased the random readahead code in r5703.
Also fixed some comments.
r5708 | inaam | 2009-08-27 17:43:32 -0500 (Thu, 27 Aug 2009) | 4 lines
branches/zip
Remove redundant TRUE : FALSE from the return statement
r5709 | inaam | 2009-08-28 01:22:46 -0500 (Fri, 28 Aug 2009) | 5 lines
branches/zip rb://152
Disable display of deprecated parameter innodb_file_io_threads in
'show variables'.
r5714 | marko | 2009-08-31 01:10:10 -0500 (Mon, 31 Aug 2009) | 5 lines
branches/zip: buf_chunk_not_freed(): Do not acquire block->mutex unless
block->page.state == BUF_BLOCK_FILE_PAGE. Check that block->page.state
makes sense.
Approved by Sunny Bains over the IM.
r5716 | vasil | 2009-08-31 02:47:49 -0500 (Mon, 31 Aug 2009) | 9 lines
branches/zip:
Fix Bug#46718 InnoDB plugin incompatible with gcc 4.1 (at least: on PPC): "Undefined symbol"
by implementing our own check in plug.in instead of using the result from
the check from MySQL because it is insufficient.
Approved by: Marko (rb://154)
The issue appears when the number of heartbeat events is non-zero before the start of the
test block. What we really need to check is that no new events have been received during
the test block. So I did the following:
1. Replace absolute values by diff of values
2. Increase heartbeat period from 1.5 to 5 sec
Let
- T be a transactional table and N a non-transactional table.
- B be begin, C commit and R rollback.
- N be a statement that accesses and changes only N-tables.
- T be a statement that accesses and changes only T-tables.
In RBR, changes to N-tables that happen early in a transaction are not immediately flushed
upon committing a statement. This behavior may, however, break consistency in the presence
of concurrency since changes done to N-tables become immediately visible to other
connections. To fix this problem, we do the following:
. B N N T C would log - B N C B N C B T C.
. B N N T R would log - B N C B N C B T R.
Note that we are not preserving history from the master as we are introducing a commit that
never happened. However, this seems to be more acceptable than the possibility of breaking
consistency in the presence of concurrency.
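The two cases above can be reproduced with a small sketch. It is a model
of the described logging only: the function name is hypothetical, and the
treatment of N statements that come after the first T statement (left
inside the transaction) is an assumption, since the text covers only N
statements that happen early.

```python
def log_transaction(statements, outcome):
    """Return the sequence of tokens written to the binary log.

    statements: list of "N" or "T" statement markers, in execution order.
    outcome:    "C" (commit) or "R" (rollback).
    """
    logged = []
    i = 0
    # Each early N statement is flushed at once, wrapped in its own B ... C.
    while i < len(statements) and statements[i] == "N":
        logged += ["B", "N", "C"]
        i += 1
    rest = statements[i:]
    if rest:
        # Assumption: everything from the first T onwards stays together.
        logged += ["B", *rest, outcome]
    return logged


print(" ".join(log_transaction(["N", "N", "T"], "C")))
print(" ".join(log_transaction(["N", "N", "T"], "R")))
```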
Let
- T be a transactional table and N a non-transactional table.
- B be begin, C commit and R rollback.
- M be a mixed statement, i.e. a statement that updates both T and N.
- M* be a mixed statement that fails while updating either T or N.
This patch restores the behavior of 5.1.37 for rows produced in either the
RBR or MIXED mode: when an M* statement happened early in a transaction, its
changes were written to the binary log outside the boundaries of the
transaction and wrapped in a BEGIN/ROLLBACK. This was done to keep the slave
consistent with the master, as the rollback would keep the changes on N and
undo them on T. In particular, we do what follows:
. B M* T C would log - B M* R B T C.
Note that we are not preserving history from the master as we are introducing a
rollback that never happened. However, this seems to be more acceptable than
making the slave diverge. We do not fix the following case:
. B T M* C would log B T M* C.
The slave will diverge as the changes on T tables that originated from the M*
statement are rolled back on the master but not on the slave. Unfortunately, we
cannot simply roll back the transaction as this would undo any uncommitted
changes on T tables.
SBR is not considered in this patch because a failing statement is written to
the binary log along with the error code, and a slave executes and then rolls
back the statement when it has an associated error code, thus undoing the
effects on T. In RBR and MBR, a full-fledged fix will be pushed after the WL 2687.
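The fixed and unfixed cases for a failing mixed statement can be sketched
as below. This is a hedged model, not the server logic: the function name
is hypothetical, and "early in a transaction" is taken to mean that M* is
the first statement, which is the only situation the examples show.

```python
def log_with_failed_mixed(statements, outcome):
    """Return the tokens written to the binary log in RBR/MIXED mode.

    statements: list of "T", "N" or "M*" markers, in execution order.
    outcome:    "C" (commit) or "R" (rollback).
    """
    if statements and statements[0] == "M*":
        # The failed mixed statement is logged outside the transaction,
        # wrapped in BEGIN/ROLLBACK: N changes stay, T changes are undone.
        logged = ["B", "M*", "R"]
        rest = statements[1:]
        if rest:
            logged += ["B", *rest, outcome]
        return logged
    # Not fixed by this patch: M* after a T stays inside the transaction.
    return ["B", *statements, outcome]


print(" ".join(log_with_failed_mixed(["M*", "T"], "C")))
print(" ".join(log_with_failed_mixed(["T", "M*"], "C")))
```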
The CHANGE MASTER TO command required the value for RELAY_LOG_FILE to
be an absolute path, which was different from the requirement for
MASTER_LOG_FILE.
This patch fixes the problem by changing the value for RELAY_LOG_FILE
to be the basename of the log file, as is the case for MASTER_LOG_FILE.
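To illustrate what "basename" means here, the sketch below strips the
directory part from a log file path. The path and the helper name are
made up for the example, and the snippet does not show how the server
itself resolves the name.

```python
import posixpath


def log_file_basename(path: str) -> str:
    """Return the file name without its directory part."""
    return posixpath.basename(path)


# Hypothetical relay log location, for illustration only.
print(log_file_basename("/var/lib/mysql/relay-bin.000003"))
```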
The problem is that only one autoinc value is associated with
the query when binlogging. If more than one autoinc value is used
in the query, the autoinc values after the first one can be inserted
wrongly on the slave, so these autoinc values can become inconsistent
between master and slave.
The problem is resolved by marking all statements that invoke
a trigger or call a function that updates autoinc fields as unsafe;
such statements switch to row format in MIXED mode. (Actually, the
statement is safe if just one autoinc value is used in the sub-statement,
but it is impossible to check how many autoinc values are used in the
sub-statement.)
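The marking rule can be sketched as a pair of small functions. The names
and boolean inputs are hypothetical simplifications; in particular, the
rule is deliberately conservative and does not try to count how many
autoinc values the sub-statement uses.

```python
def is_unsafe_autoinc(invokes_trigger: bool,
                      calls_function: bool,
                      substatement_updates_autoinc: bool) -> bool:
    """Conservative rule: unsafe if a trigger or function updates autoinc."""
    return (invokes_trigger or calls_function) and substatement_updates_autoinc


def format_in_mixed_mode(unsafe: bool) -> str:
    """In MIXED mode, unsafe statements are logged in row format."""
    return "row" if unsafe else "statement"


unsafe = is_unsafe_autoinc(invokes_trigger=True,
                           calls_function=False,
                           substatement_updates_autoinc=True)
print(format_in_mixed_mode(unsafe))
```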