When a MyISAM query is killed midway, the query is logged to the binlog marked
with the error.
The slave does not attempt to run the query, but aborts with a suitable error
message in the error log for the DBA to act on.
In this case, the parallel replication code would check the sql_errno() code,
even no my_error() had been set. In debug builds, this causes an assertion.
Fixed the code to check that we actually have an error set before querying for
an error code.
After-review changes.
For this patch in 10.0, we do not introduce a new public storage engine API,
we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
API that can be used for all storage engines (MDEV-6429).
Eliminate the background thread that did deadlock kills asynchroneously.
Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
inside the deadlock detection code (when thd_report_wait_for() needs to kill a
later thread to resolve a deadlock).
(We preserve the part of the original patch that introduces dedicated mutex
and condition for the slave init thread, to remove the abuse of
LOCK_thread_count for start/stop synchronisation of the slave init thread).
When a master server starts up, it logs a special format_description event at
the start of a new binlog to mark that is has restarted. This is used by a
slave to drop all temporary tables - this is needed in case the master crashed
and did not have a chance to send explicit DROP TEMPORARY TABLE statements to
the slave.
In parallel replication, we need to be careful when dropping the temporary
tables - we need to be sure that no prior events are still executing that
might be using the temporary tables to be dropped, _and_ that no following
events have started executing that might have created new temporary tables
that should not be dropped.
This was not handled correctly, which could cause errors about access to not
existing temporary tables or even crashes. This patch implements that such
format_description events cause serialisation of event execution; all prior
events are executed to completion first, then the format_description event is
executed, dropping temporary tables, then following events are queued for
execution.
Master restarts should be sufficiently infrequent that the resulting loss of
parallelism should be of minimal impact.
Part of this work is based on Stewart Smitch's memory barrier and lower priori
patches for power8.
- Added memory syncronization for innodb & xtradb for power8.
- Added HAVE_WINDOWS_MM_FENCE to CMakeList.txt
- Added os_isync to fix a syncronization problem on power
- Added log_get_lsn_nowait which is now used srv_error_monitor_thread to ensur
if log mutex is locked.
All changes done both for InnoDB and Xtradb
The problem occured when using parallel replication, and an error occured that
caused the SQL thread to stop when the IO thread had already reached a
following binlog file from the master (or otherwise performed a relay log
rotation).
In this case, the Rotate Event at the end of the relay log file could still be
executed, even though an earlier event in that relay log file had gotten an
error. This would cause the position to be incorrectly updated, so that upon
restart of the SQL thread, the event that had failed would be silently skipped
and ignored, causing replication corruption.
Fixed by checking before executing Rotate Event, whether an earlier event
has failed. If so, the Rotate Event is not executed, just dequeued, same as
for other normal events following a failing event.
If one uses 4 --verbose to mysql_upgrade it will also write out all mysqlcheck commands invoked.
mysql-test/r/mysql_upgrade.result:
Updated results from changing phases
mysql-test/r/mysql_upgrade_no_innodb.result:
dated results from changing phases
mysql-test/r/mysql_upgrade_ssl.result:
dated results from changing phases
The bug was that in some cases, if a replicated transaction was rolled back
due to deadlock, during the subsequent retry of that transaction, the
gtid_slave_pos would _not_ be updated with the new GTID, leaving the GTID
position of the slave incorrect.
Fix this by ensuring during the retry that we clear the flag that marks that
the GTID has already been recorded in gtid_slave_pos, so that the update of
gtid_slave_pos will be done again during the retry.
In the original bug, the symptom was an assertion due to OPTION_GTID_BEGIN not
being cleared during the retry of the transaction. The reason was some code in
handling of a COMMIT query event, which would not clear the flag when not
recording a GTID in gtid_slave_pos. This commit also fixes that code to always
clear the OPTION_GTID_BEGIN flag for clarity, though it is actually not
possible for OPTION_GTID_BEGIN to become set unless a GTID is pending for
update (after fixing the bug described above).
sporadically
Fix: Modify test to be smaller so that testcase timeout does not
trigger. We already have a test for --big-test setup
(innodb.innodb_simulate_comp_failures).
sporadically
Fix: Modify test to be smaller so that testcase timeout does not
trigger. We already have a test for --big-test setup
(innodb.innodb_simulate_comp_failures).
Mysqldump overflows stack buffer when copying table name from commandline arguments resulting in stack corruption and ability to execute arbitrary code.
Fix: Check length of all positional arguments passed to mysqldump is smaller than NAME_LEN.
Note: Mysqldump heavily depends on that database objects (databases, tablespaces, tables, etc) are limited to small size (now it is 64).
The test case makes use of the fine DEBUG_SYNC facility. Furthermore,
since it needs synchronization on internal threads (dump and SQL
threads) the server code has DEBUG_SYNC commands internally deployed
and activated through the DBUG_EXECUTE_IF macro. The internal
DBUG_SYNC commands are then controlled from the test case through the
DEBUG variable.
There were three problems around the DEBUG + DEBUG_SYNC facility
usage:
1. When signaling the SQL thread to continue, the test would reset
immediately the DEBUG_SYNC variable. This could mean that the SQL
thread might loose the signal and continue to wait forever;
2. A similar scenario was happening with the dump thread on the
master. This thread was instructed to wait, and later it would be
signaled to continue, but immediately after the DEBUG_SYNC would be
reset. This could lead to the dump thread missing the signal and
wait forever;
3. The test was not cleaning itself up with respect to the
instrumentation of the dump thread. This would leave the
conditional execution of an internal DEBUG_SYNC command active
(through the usage of DBUG_EXECUTE_IF).
We fix#1 and #2 by waiting for the threads to receive the signal and
only then issue the reset. We fix#3 by reseting the DEBUG variable,
thus deactivating the dump thread internal DEBUG_SYNC command.
The sql_slave_skip_counter is important to be able to recover replication from
certain errors. Often, an appropriate solution is to set
sql_slave_skip_counter to skip over a problem event. But setting
sql_slave_skip_counter produced an error in GTID mode, with a suggestion to
instead set @@gtid_slave_pos to point past the problem event. This however is
not always possible; for example, in case of an INCIDENT event, that event
does not have any GTID to assign to @@gtid_slave_pos.
With this patch, sql_slave_skip_counter now works in GTID mode the same was as
in non-GTID mode. When set, that many initial events are skipped when the SQL
thread starts, plus as many extra events are needed to completely skip any
partially skipped event group. The GTID position is updated to point past the
skipped event(s).
MDEV-6344: mysqldump issues FLUSH TABLES, which gets written into binlog and replicated
Add a --gtid option (for compatibility, the original behaviour is preserved
when --gtid is not used).
With --gtid, --master-data and --dump-slave output the GTID position (the
old-style file/offset position is still output, but commented out). Also, a
CHANGE MASTER TO master_use_gtid=slave_pos is output to ensure a provisioned
slave is configured in GTID, as requested.
Without --gtid, the GTID position is still output, if available, but commented
out.
Also fix MDEV-6344, to avoid FLUSH TABLES getting into the binlog. Otherwise a
mysqldump on a slave server will silently inject a GTID which does not exist
on the master, which is highly undesirable.
Also fix an incorrect error handling around obtaining binlog position with
--master-data (was probably unlikely to trigger in most cases).
- Make log_slow_verbosity print "Priority_queue: (Yes|No)" into the slow query log.
(but we do not add a correspoding column to P_S.*statement* tables).
Analysis: For some reason table stats for a table pointed from a index
is not initialized. Added additional warning output on this situation
and table stats initialization. This is better than asserting.