The problem occurs in parallel replication in GTID mode, when we are using
multiple replication domains. In this case, if the SQL thread stops, the
slave GTID position may refer to a different point in the relay log for each
domain.
The bug was that when the SQL thread was stopped and restarted (but the IO
thread was kept running), the SQL thread would resume applying the relay log
from the point of the most advanced replication domain, silently skipping all
earlier events within other domains. This caused replication corruption.
This patch solves the problem by storing, when the SQL thread stops with
multiple parallel replication domains active, the current GTID
position. Additionally, the current position in the relay logs is moved back
to a point known to be earlier than the current position of any replication
domain. Then when the SQL thread restarts from the earlier position, GTIDs
encountered are compared against the stored GTID position. Any GTID that was
already applied before the stop is skipped to avoid duplicate apply.
This patch should have no effect if multi-domain GTID parallel replication is
not used. Similarly, if both SQL and IO thread are stopped and restarted, the
patch has no effect, as in this case the existing relay logs are removed and
re-fetched from the master at the current global @@gtid_slave_pos.
After fix of MDEV-6728, the KILL signal is reset at the start of query
execution. This means that in this test case, we need to wait for the
to-be-killed query to have started; otherwise the kill signal can be lost,
causing the test case to fail.
The test case deliberately crashes the server. If this crash happens in the
middle of a page write, InnoDB crash recovery recovers the page from the
doublewrite buffer, writing a message to the error log that is flagged as a
test failure by mysql-test-run. So add a suppression for this.
mtr internally does the following:
1. $default_vardir=....
2. $opt_vardir=$default_vardir unless $opt_vardir;
3. $opt_vardir=realpath $opt_vardir unless IS_WINDOWS
4. if ($opt_vardir eq $default_vardir) { .... use /dev/shm ... }
thus we have to realpath $default_datadir too, otherwise
the comparison logic might be broken
Using a boolean flag for 'there is a RESET MASTER in progress' doesn't
work very well for multiple concurrent RESET MASTER statements.
Changed to a counter.
Fix MDL to report an error when a wait was killed, but preserve
the old documented behavior of GET_LOCK() where killing it is not an error.
Also remove race conditions in main.create_or_replace test
This patch ports the work that facebook has performed
to make innochecksum handle compressed tables.
the basic idea is to use actual innodb-code to perform
checksum verification rather than duplicating in innochecksum.cc.
to make this work, innodb code has been annotated with
lots of #ifndef UNIV_INNOCHECKSUM so that it can be
compiled outside of storage/innobase.
A new testcase is also added that verifies that innochecksum
works on compressed/non-compressed tables.
Merged from commit fabc79d2ea976c4ff5b79bfe913e6bc03ef69d42
from https://code.google.com/p/google-mysql/
The actual steps to produce this patch are:
take innochecksum from 5.6.14
apply changes in innodb from facebook patches needed to make innochecksum compile
apply changes in innochecksum from facebook patches
add handcrafted testcase
The referenced facebook patches used are:
91e25120e7847fe76ea51135628a5a4dbf7c240c
mysql-test/suite/maria/insert_select.result:
Added test case
mysql-test/suite/maria/insert_select.test:
Added test case
mysys/thr_lock.c:
Ensure we don't allow concurrent_insert when a read_no_write lock is in use
- The code that tested if
WHERE expr=value AND expr=const
can be rewritten to:
WHERE const=value AND expr=const
was incomplete in case of STRING_RESULT.
- Moving the test into a new function, to reduce duplicate code.
FULLTEXT indexes do not permit index first lookups. By calling:
ha_index_first() with a garbage parameter, random data gets overwritten
that causes the table->field array to be corrupted. Subsequently, when
the field array is accessed, a segfault occurs.
By not allowing index statistics for FULLTEXT indexes, the problem is
resolved.
Fixing a wrong assymetric code in Arg_comparator::set_cmp_func().
It existed for a long time, but showed up in 10.0.14 after the fix
for "MDEV-6666 Malformed result for CONCAT(utf8_column, binary_string)".
MDEV-7254 Assigned expression is evaluated twice when updating column TIMESTAMP NOT NULL
The test type_timestamp failed depending on the build machine time zone.
Setting a fixed time zone for the test.
column TIMESTAMP NOT NULL
Analysis: Problem was that value->is_null() function is called
even when user had explicitly set the value for timestamp
field. Calling this function had the side effect that
expression was evaluated twice.
Fix: (by Sergei Golubchik) check instead value->null_value.
Structure of the table created by the test to archive mysql.slow_log
data didn't match the structure of mysql.slow_log. The failure only
appeared if the slow_log was not empty, which was very rare.
Updated the structure of the table.
The problem was a too low timeout for slave reconnect. It was set to 9 seconds
(10 retries with 1 second in-between). This is occasinally too short on some
Buildbot hosts, when the test crashes and restarts the master while the slave
IO thread is running.
Fix by increasing --master-retry-count for this test.
The test case injects a DBUG that will crash the server during replication,
then does a START SLAVE. We need to use --error 0,2006,2013 on the START
SLAVE, so that we will not fail the test if the server has time to crash
before the START SLAVE returns to the client.
Fixes a failure seen in Buildbot.
Test causes OS error printout and we need to supress this
error message on tests. Additionally, test could cause
different error codes on different OSs.
The bug occurs when a transaction does a retry after all transactions have
done mark_start_commit() in a batch of group commit from the master. In this
case, the retrying transaction can unmark_start_commit() after the following
batch has already started running and de-allocated the GCO. Then after retry,
the transaction will re-do mark_start_commit() on a de-allocated GCO, and also
wakeup of later GCOs can be lost.
This was seen "in the wild" by a user, even though it is not known exactly
what circumstances can lead to retry of one transaction after all transactions
in a group have reached the commit phase.
The lifetime around GCO was somewhat clunky anyway. With this patch, a GCO
lives until rpl_parallel_entry::last_committed_sub_id has reached the last
transaction in the GCO. This guarantees that the GCO will still be alive when
a transaction does mark_start_commit(). Also, we now loop over the list of
active GCOs for wakeup, to ensure we do not lose a wakeup even in the
problematic case.
The test case tried to trigger a DEBUG_SYNC point at the end of a SELECT
SLEEP(5) statement. It did this by using EXECUTE 2, intending to trigger first
at the end of SET DEBUG_SYNC, and second at the end of the SELECT SLEEP(5).
However, in --ps-protocol mode, this does not work, because the SELECT is
executed in two steps (Prepare followed by Execute). Thus, the DEBUG_SYNC got
triggered too early, during the Prepare stage rather than Execute, and the
test case could race and information_schema.processlist see the thread in the
wrong state.
This patch fixes by changing the way the DEBUG_SYNC point is triggered. Now we
add a DBUG injection inside the code for SLEEP(5). This ensures that the
DEBUG_SYNC point is not activated until the SLEEP(5) is running, ensuring
that the following wait for completion will be effective.
Use include/sync_with_master_gtid.inc instead of --sync_with_master to avoid a
race in the test case.
In parallel replication, the old-style slave position (which is used by
--sync_with_master) is updated out-of-order between parallel threads. This
makes it possible for the position to be updated past DROP TEMPORARY TABLE t2
just before the commit of INSERT INTO t1 SELECT * FROM t2 becomes visible.
In this case, there is a small window where a SELECT just after
--sync_with_master may not see the changes from the INSERT.
generate_derived_keys_for_table() did not work correctly in the case where
- it had a potential index on derived table
- however, TABLE::check_tmp_key() would disallow creation of this index
after looking at its future key parts (because of the key parts exceeding
max. index length)
- the code would leave a KEYUSE structure that refers to a non-existant index.
Depending on further optimizer calculations, this could cause a crash.
Fix a possible race in the test case when restarting the server.
Make sure we have disconnected before waiting for the reconnect
that signals that the server is back up. Otherwise, we may in
rare cases continue the test while the old server is shutting
down, eventually leading to "connection lost" failure.
when restoring auto-inc value in INSERT ... ON DUPLICATE KEY UPDATE, take into account that
1. it may be changed in the UPDATE clause (old code did that)
2. it may be changed in the INSERT clause and then cause a dup key (old code missed that)
mysql-test/r/group_by.result:
Test for MDEV-6855
mysql-test/t/group_by.test:
Test for MDEV-6855
sql/item.h:
Fixed spelling error
sql/opt_range.cc:
Added handling of cond_type == Item::CACHE_ITEM in WHERE clauses for MIN/MAX optimization.
Fixed indentation
There was a bug in lock handling when mixing INSERT ... SELECT on the same table.
mysql-test/suite/maria/insert_select.result:
Test case for MDEV_4010
mysql-test/suite/maria/insert_select.test:
Test case for MDEV_4010
mysys/thr_lock.c:
We wrongly alldoed TL_WRITE_CONCURRENT_INSERT when there was a TL_READ_NO_INSERT lock
mysql-test/r/kill-2.result:
test case for MDEV-6896
mysql-test/t/kill-2-master.opt:
test case for MDEV-6896
mysql-test/t/kill-2.test:
test case for MDEV-6896
sql/sql_parse.cc:
Use host_or_ip instead of host as host may be 0
Problem was that repair() did lock and unlock tables, which leaved already locked tables in wrong state
include/my_check_opt.h:
Added option T_NO_LOCKS to disable locking during repair()
Fixed duplicated bit T_NO_CREATE_RENAME_LSN
mysql-test/suite/rpl/r/myisam_external_lock.result:
Test case for MDEV-6871
mysql-test/suite/rpl/t/myisam_external_lock-slave.opt:
Test case for MDEV-6871
mysql-test/suite/rpl/t/myisam_external_lock.test:
Test case for MDEV-6871
storage/maria/ha_maria.cc:
Don't lock tables during enable_indexes()
Removed some calls to current_thd
storage/myisam/ha_myisam.cc:
Don't lock tables during enable_indexes()
Removed some calls to current_thd