Commit graph

4938 commits

Author SHA1 Message Date
Jan Lindström
2877c5ecc2 MDEV-7477: Make innochecksum work on compressed tables
This patch ports the work that Facebook has performed
to make innochecksum handle compressed tables.
The basic idea is to use the actual InnoDB code to perform
checksum verification rather than duplicating it in innochecksum.cc.
To make this work, the InnoDB code has been annotated with
lots of #ifndef UNIV_INNOCHECKSUM so that it can be
compiled outside of storage/innobase.

A new testcase is also added that verifies that innochecksum
works on compressed/non-compressed tables.

Merged from commit fabc79d2ea976c4ff5b79bfe913e6bc03ef69d42 
from https://code.google.com/p/google-mysql/

The actual steps to produce this patch are:

    take innochecksum from 5.6.14
    apply changes in InnoDB from the Facebook patches needed to make innochecksum compile
    apply changes in innochecksum from the Facebook patches
    add a handcrafted testcase

The referenced Facebook patches used are:

    91e25120e7
    847fe76ea5
    1135628a5a
    4dbf7c240c
2015-01-19 12:39:17 +02:00
Michael Widenius
7cb4a1c61f Fixed MDEV-7314: Deadlock when doing insert-select with Aria
mysql-test/suite/maria/insert_select.result:
  Added test case
mysql-test/suite/maria/insert_select.test:
  Added test case
mysys/thr_lock.c:
  Ensure we don't allow concurrent_insert when a read_no_write lock is in use
2015-01-18 20:38:07 +02:00
Jan Lindström
813af4cde8 Attempt to fix Buildbot test failures in the tests
innodb_bug12400341
innodb-mdev7046
innodb_stats_fetch_nonexistent
2015-01-16 11:26:03 +02:00
Kristian Nielsen
df2db86341 MDEV-7430: rpl.rpl_gtid_crash still fails in buildbot
The problem was a too low timeout for slave reconnect. It was set to 9 seconds
(10 retries with 1 second in-between). This is occasionally too short on some
Buildbot hosts, when the test crashes and restarts the master while the slave
IO thread is running.

Fix by increasing --master-retry-count for this test.
2015-01-15 15:55:09 +01:00
Kristian Nielsen
02099a335e MDEV-7467: sporadic failure in rpl.rpl_gtid_crash
The test case injects a DBUG that will crash the server during replication,
then does a START SLAVE. We need to use --error 0,2006,2013 on the START
SLAVE, so that we will not fail the test if the server has time to crash
before the START SLAVE returns to the client.

Fixes a failure seen in Buildbot.
2015-01-14 18:19:05 +01:00
Jan Lindström
39556a7814 MDEV-7262: innodb.innodb-mdev7046 fail on BuildBot
The test causes an OS error printout, and we need to suppress this
error message in tests. Additionally, the test could cause
different error codes on different OSs.
2015-01-13 16:48:11 +02:00
Kristian Nielsen
f27817c1d0 MDEV-7326: Server deadlock in connection with parallel replication
The bug occurs when a transaction does a retry after all transactions have
done mark_start_commit() in a batch of group commit from the master. In this
case, the retrying transaction can unmark_start_commit() after the following
batch has already started running and de-allocated the GCO. Then after retry,
the transaction will re-do mark_start_commit() on a de-allocated GCO, and also
wakeup of later GCOs can be lost.

This was seen "in the wild" by a user, even though it is not known exactly
what circumstances can lead to retry of one transaction after all transactions
in a group have reached the commit phase.

The lifetime around GCO was somewhat clunky anyway. With this patch, a GCO
lives until rpl_parallel_entry::last_committed_sub_id has reached the last
transaction in the GCO. This guarantees that the GCO will still be alive when
a transaction does mark_start_commit(). Also, we now loop over the list of
active GCOs for wakeup, to ensure we do not lose a wakeup even in the
problematic case.
2015-01-07 14:45:39 +01:00
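The GCO lifetime rule described in MDEV-7326 above (a GCO stays alive until the last committed sub_id has reached the last transaction in that GCO) can be illustrated with a small C sketch. This is not the server code: `struct gco`, `gco_new()` and `gco_release_committed()` are hypothetical names, and the list is assumed to be ordered from the oldest group to the newest.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified group commit descriptor (not the server's type). */
struct gco {
    uint64_t last_sub_id;  /* sub_id of the last transaction in the group */
    struct gco *next;      /* next (later) active group, or NULL */
};

/* Allocate a group; returns NULL if malloc fails. */
struct gco *gco_new(uint64_t last_sub_id, struct gco *next)
{
    struct gco *g = malloc(sizeof *g);
    if (g != NULL) {
        g->last_sub_id = last_sub_id;
        g->next = next;
    }
    return g;
}

/* Free the leading groups whose last transaction has already committed and
   return the new head. A group whose last_sub_id is still ahead of
   last_committed_sub_id stays allocated, so a retrying transaction in it
   can still safely refer to it. */
struct gco *gco_release_committed(struct gco *head,
                                  uint64_t last_committed_sub_id)
{
    while (head != NULL && head->last_sub_id <= last_committed_sub_id) {
        struct gco *done = head;
        head = head->next;
        free(done);
    }
    return head;
}
```

Because release is driven only by the committed sub_id, a group cannot be freed merely because the following batch has started running, which is the situation the commit message identifies as problematic.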
Kristian Nielsen
6e0a00ed75 MDEV-7353: rpl_mdev6386 fails sporadically in buildbot
Use include/sync_with_master_gtid.inc instead of --sync_with_master to avoid a
race in the test case.

In parallel replication, the old-style slave position (which is used by
--sync_with_master) is updated out-of-order between parallel threads. This
makes it possible for the position to be updated past DROP TEMPORARY TABLE t2
just before the commit of INSERT INTO t1 SELECT * FROM t2 becomes visible.

In this case, there is a small window where a SELECT just after
--sync_with_master may not see the changes from the INSERT.
2015-01-06 09:52:09 +01:00
Kristian Nielsen
826d7c68d2 MDEV-7342: Test failure in perfschema.setup_instruments_defaults
Fix a possible race in the test case when restarting the server.

Make sure we have disconnected before waiting for the reconnect
that signals that the server is back up. Otherwise, we may in
rare cases continue the test while the old server is shutting
down, eventually leading to "connection lost" failure.
2014-12-18 11:59:08 +01:00
Jan Lindström
76c3981e43 Fix test case to allow CREATE TABLE to succeed (Windows). 2014-12-10 12:12:09 +02:00
Elena Stepanova
c9a8859db6 MDEV-7255 Failures in engines/* tests 2014-12-07 21:24:02 +04:00
Elena Stepanova
010724f6c5 Run engines tests for MyISAM and in-built InnoDB 2014-12-05 14:23:24 +04:00
Jan Lindström
5bba1109b2 Allow the test to succeed on Windows. 2014-12-04 14:10:41 +02:00
Elena Stepanova
c8f7f98737 MDEV-7255 Failures in engines/* tests, part 6
Updated hardcoded event numbers according to current GTID logic
2014-12-04 02:54:42 +04:00
Elena Stepanova
f02f06172c MDEV-7255 Failures in engines/* tests, part 5
RENAME TABLE on a non-existing table produces a better error message
2014-12-04 02:17:09 +04:00
Elena Stepanova
d5f52fec77 MDEV-7255 Failures in engines/* tests, part 4
Updated engines/* test results according to the bugfix MDEV-5894
(MySQL BUG#34750: Print database name in Unknown Table error message)
2014-12-04 02:16:41 +04:00
Elena Stepanova
27ac97ef9e MDEV-7255 Failures in engines/* tests, part 3
Error message was changed along with CREATE OR REPLACE TABLE fixes
2014-12-04 01:59:25 +04:00
Elena Stepanova
aafdc4b16e MDEV-7255 Failures in engines/* tests, part 2
Result files updated according to bugfix for MySQL#55843 (Handled 
condition appears as not handled)
2014-12-04 01:52:03 +04:00
Elena Stepanova
cc06415fd6 MDEV-7255 Failures in engines/* tests, part 1
In 10.0 output of SHOW DATABASES appears to be sorted, while in result
files it is not. 
Added sorted_result for certainty and updated result files.
2014-12-03 19:53:40 +04:00
Kristian Nielsen
5fc2814698 MDEV-7251: Test failure in rpl.rpl_parallel
There was a race. The test case was expecting the slave to start processing a
particular DELETE statement, then the test would stop the slave at this
point. But nothing waited until the slave had actually reached this point;
thus, depending on timing, it was possible that the slave
would be stopped too early, causing a .result file difference.

Fixed by adding an appropriate wait to the test case.
2014-12-02 18:11:05 +01:00
Jan Lindström
6cd78eedea MDEV-7242: innodb.innodb-mdev7046 fails in various ways on buildbot
The problem is that the test causes OS failures that vary.
The idea of the test is only to check that the server does not crash; no other
output is necessary.
2014-12-02 13:26:45 +02:00
Kristian Nielsen
0450623f73 MDEV-7236: rpl.rpl_gtid_basic failed in buildbot with wait_condition timeout
Fix rare failures in test case rpl.rpl_gtid_basic:

 - Add another possible error code when a connection is killed.

 - Make sure that the IO thread has had time to complete its stop after START
   SLAVE UNTIL. Otherwise, START SLAVE might run before IO thread stop,
   leaving the test case with a stopped IO thread that eventually causes a
   wait timeout.
2014-12-02 12:10:21 +01:00
Kristian Nielsen
50b42441a6 MDEV-7241: rpl.rpl_parallel2 fails sporadically in buildbot
There was a race, a small window between updating slave position and updating
Seconds_Behind_Master, during which the test case could see the wrong value.

Fix by waiting for the expected status to appear.
2014-12-02 09:27:22 +01:00
Sergei Golubchik
433b28cede add a proper cleanup to innodb.innodb-mdev7046 test 2014-12-01 23:56:36 +01:00
Kristian Nielsen
52b25934d7 MDEV-7237: Parallel replication: incorrect relaylog position after stop/start the slave
The replication relay log position was sometimes updated incorrectly at the
end of a transaction in parallel replication. This happened because the relay
log file name was taken from the current Relay_log_info (SQL driver thread),
not the correct value for the transaction in question.

The result was that if a transaction was applied while the SQL driver thread
was at least one relay log file ahead, _and_ the SQL thread was subsequently
stopped before applying any events from the most recent relay log file, then
the relay log position would be incorrect - wrong relay log file name. Thus,
when the slave was started again, usually a relay log read error would result,
or in rare cases, if the position happened to be readable, the slave might
even skip arbitrary amounts of events.

In GTID mode, the relay log position is reset when both slave threads are
restarted, so this bug would only be seen in non-GTID mode, or in GTID mode
when only the SQL thread, not the IO thread, was stopped.
2014-12-01 13:53:57 +01:00
Kristian Nielsen
74e581b7c4 MDEV-7037: MariaDB 10.0 does not build on Debian / kfreebsd-i386/amd64 due to MTR failure: multi_source.gtid
MDEV-7106: Sporadic test failure in multi_source.gtid
MDEV-7153: Yet another sporadic failure of multi_source.gtid in buildbot

This patch fixes three races in the multi_source.gtid test case that could
cause sporadic failures:

1. Do not put SHOW ALL SLAVES STATUS in the output, the output is not stable.

2. Ensure that slave1 has replicated as far as expected, before stopping its
connection to master1 (otherwise the following wait will time out due to rows
not replicated from master1).

3. Ensure that slave2 has replicated far enough before connecting slave1 to it
(otherwise we get an error during connect that slave1 is ahead of slave2).
2014-11-27 09:34:41 +01:00
Kristian Nielsen
e79b7ca966 MDEV-7179: rpl.rpl_gtid_crash failed in buildbot with Warning: database page corruption or a failed
I saw two test failures in rpl.rpl_gtid_crash where we get this in the error
log:

141123 12:47:54 [Note] InnoDB: Restoring possible half-written data pages 
141123 12:47:54 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 6 page 3.
InnoDB: Trying to recover it from the doublewrite buffer.
141123 12:47:54 [Note] InnoDB: Recovered the page from the doublewrite buffer.

This test case deliberately crashes the server, and if this crash happens
right in the middle of writing a buffer pool page to disk, it is not
unexpected that we can get a half-written page. The page is recovered
correctly from the doublewrite buffer.

So this patch adds a suppression for this warning in the error log for this
test case.
2014-11-25 14:19:11 +01:00
Kristian Nielsen
b79685902d MDEV-6903: gtid_slave_pos is incorrect after master crash
When a master server restarts, it logs a special restart format description
event in its binlog. When the slave sees this event, it knows it needs to roll
back any active partial transaction, in case the master crashed previously in
the middle of writing such a transaction to its binlog.

However, there was a bug where this rollback did not reset rgi->pending_gtid.
This caused the @@gtid_slave_pos to be updated incorrectly with the GTID of
the partial transaction that was rolled back.

Fix this by always clearing rgi->pending_gtid in cleanup_context(), hopefully
preventing similar bugs from turning up in other special cases where a
transaction is rolled back during replication.

Thanks to Pavel Ivanov for tracking down the issue and providing a test case.
2014-11-25 12:19:48 +01:00
Jan Lindström
f3bdf9d741 MDEV-7046: MySQL#74480 - Failing assertion: os_file_status(newpath, &exists, &type)
after Operating system error number 36 in a file operation.

Analysis: os_file_get_status did not handle error ENAMETOOLONG
correctly.

Fix: Add correct handling for error ENAMETOOLONG. Note that in the InnoDB
case the error is not passed all the way up to the server. That would
require a bigger revamp.
2014-11-25 11:38:01 +02:00
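As an illustration of the ENAMETOOLONG handling discussed in MDEV-7046 above, here is a minimal C sketch built on POSIX `stat()`. It is not the InnoDB `os_file_get_status` code: the `path_status` enum and `path_get_status()` are hypothetical names, and how InnoDB maps each errno value internally is not shown in the commit message. On Linux, ENAMETOOLONG is the "Operating system error number 36" quoted in the bug title.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical result codes for this sketch. */
enum path_status {
    PATH_EXISTS,         /* stat() succeeded */
    PATH_MISSING,        /* the path does not exist */
    PATH_NAME_TOO_LONG,  /* the path or a component exceeds the OS limit */
    PATH_ERROR           /* any other failure */
};

/* Query a path and classify the outcome, treating an over-long name as its
   own case instead of an unexpected failure. */
enum path_status path_get_status(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == 0) {
        return PATH_EXISTS;
    }

    switch (errno) {
    case ENOENT:
    case ENOTDIR:
        return PATH_MISSING;
    case ENAMETOOLONG:
        return PATH_NAME_TOO_LONG;
    default:
        return PATH_ERROR;
    }
}
```

A caller can then report the over-long name as an ordinary error rather than asserting that the status call succeeded.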
Jan Lindström
b62c4c6586 Better comments and add a test case. 2014-11-25 08:31:03 +02:00
Jan Lindström
e5802c38f9 Better comments and add a test case. 2014-11-25 08:06:41 +02:00
Jan Lindström
77a6abf311 MDEV-7183: innodb-wl5522-debug-zip fails in buildbot on Windows
The problem is the different path separators. Fixed by replacing
the result.
2014-11-24 20:35:02 +02:00
Jan Lindström
876106804e MDEV-7169: innodb.innodb_bug14147491 fails in buildbot on Windows
The problem is that the test could open Microsoft C++ Client Debugger
windows with an abort exception. Let's not try to test this on
Windows.
2014-11-24 20:25:17 +02:00
Jan Lindström
b7cee6251a MDEV-7168: Tests innodb.innodb_stats_create_table
innodb.innodb_stats_drop_locked fail and
innodb.innodb_stats_fetch_nonexistent fails in buildbot on Windows

Analysis: The problem is that the innodb_stats_create_on_corrupted
test renames mysql.innodb_index_stats and all the rest
are dependent on this table.

Fix: After renaming the table back to its original name, restart mysqld to
make sure that the table is correct.
2014-11-24 19:37:38 +02:00
Jan Lindström
ef1ba3b1e6 MDEV-7164: innodb.innodb-alter-table-disk-full fails in buildbot on Windows
Analysis: The test case uses Linux-specific error codes.

Fix: The test case currently can't be run on Windows because it requires
injecting an error into the system.
2014-11-24 15:26:47 +02:00
Alexey Botchkov
a726dbd634 MDEV-7157 plugins.server_audit fails sporadically in buildbot.
Records can end up in different places in the log when multiple threads
are logged, so a delay was added to let the record be saved in the same
place.
2014-11-24 02:53:45 +04:00
Sergei Golubchik
ffc0ef6316 5.5 merge 2014-11-21 20:20:39 +01:00
Jan Lindström
8ff66501ca Forgot to add test file. 2014-11-21 13:32:53 +02:00
Sergei Golubchik
67e2e14627 Merge 2014-11-21 08:50:44 +01:00
Sergei Golubchik
3c12c27907 5.5 merge 2014-11-20 16:07:34 +01:00
Jan Lindström
8bc5eabea8 MDEV-7084: innodb index stats inadequate using constant
innodb_stats_sample_pages

Analysis: If you set the number of analyzed pages
to a very low number compared to the actual number of pages in
that table/index, InnoDB randomly picks those pages
(default 8 pages). As a result, a query
after ANALYZE TABLE can return different results. If
the index tree is small, smaller than 10 *
n_sample_pages + total_external_size, then the
estimate is ok. For bigger index trees it is
common that we do not see any borders between
key values in the few pages we pick. But still
there may be n_sample_pages different key values,
or even more. And it just tries to
approximate to n_sample_pages (8).

Fix: (1) Introduced a new dynamic configuration variable
innodb_stats_sample_traditional that retains
the current design. Default false.

(2) If traditional sample is not used we use
n_sample_pages = max(min(srv_stats_sample_pages,
                         index->stat_index_size),
                     log2(index->stat_index_size)*
                          srv_stats_sample_pages);

(3) Introduced a new dynamic configuration variable
stat_modified_counter (default = 0); if set, it
sets a lower bound on row updates before statistics are re-estimated.

If the user has provided an upper bound for how many rows need to be updated
before we calculate new statistics, we use the minimum of the provided value
and 1/16 of the table every 16th round. If no upper bound is provided
(srv_stats_modified_counter = 0, default), then we calculate new statistics
if 1 / 16 of the table has been modified
since the last time a statistics batch was run.
We calculate statistics at most every 16th round, since we may have
a counter table which is very small and updated very often.
@param t table
@return true if the table has changed too much and stats need to be
recalculated
*/
#define DICT_TABLE_CHANGED_TOO_MUCH(t) \
	((ib_int64_t) (t)->stat_modified_counter > (srv_stats_modified_counter ? \
	ut_min(srv_stats_modified_counter, (16 + (t)->stat_n_rows / 16)) : \
		16 + (t)->stat_n_rows / 16))
2014-11-19 20:27:34 +02:00
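The sampling rule (2) and the recalculation threshold of MDEV-7084 above can be tried out in isolation with the following C sketch. It is not the InnoDB code: `n_sample_pages()` and `changed_too_much()` are hypothetical function names, their plain integer parameters stand in for `srv_stats_sample_pages`, `index->stat_index_size`, `(t)->stat_modified_counter`, `srv_stats_modified_counter` and `(t)->stat_n_rows`, and rounding log2 down is an assumption, since the commit message does not say how it is rounded. The signed cast in the macro is also left out.

```c
#include <stdbool.h>

/* Floor of log2(v); returns 0 for v <= 1. Assumption: the commit message
   does not say how log2() is rounded, so this sketch rounds down. */
static unsigned long long floor_log2(unsigned long long v)
{
    unsigned long long r = 0;
    while (v > 1) {
        v >>= 1;
        r++;
    }
    return r;
}

static unsigned long long min_ull(unsigned long long a, unsigned long long b)
{
    return a < b ? a : b;
}

static unsigned long long max_ull(unsigned long long a, unsigned long long b)
{
    return a > b ? a : b;
}

/* Rule (2): number of pages to sample when traditional sampling is off. */
unsigned long long n_sample_pages(unsigned long long sample_pages,
                                  unsigned long long index_size)
{
    return max_ull(min_ull(sample_pages, index_size),
                   floor_log2(index_size) * sample_pages);
}

/* Same comparison as DICT_TABLE_CHANGED_TOO_MUCH, on plain integers.
   limit == 0 means that no user-provided bound is set. */
bool changed_too_much(unsigned long long modified,
                      unsigned long long limit,
                      unsigned long long n_rows)
{
    unsigned long long sixteenth = 16 + n_rows / 16;
    unsigned long long threshold =
        limit ? min_ull(limit, sixteenth) : sixteenth;
    return modified > threshold;
}
```

For example, with 8 sample pages and an index of 1024 pages, the rule gives max(min(8, 1024), 10 * 8) = 80 pages. For a table of 1600 rows with no user bound, statistics are recalculated once more than 16 + 1600 / 16 = 116 rows have been modified.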
Sergei Golubchik
3495801e2e 5.5 merge 2014-11-19 17:23:39 +01:00
Elena Stepanova
416f267a7a MDEV-7074 multi_source.simple test fails in buildbot
The problem is that the binlog position is updated before 
Executed_log_entries and Slave_SQL_State. So, it's possible to hit 
the moment when MASTER_POS_WAIT (and hence sync_with_master) already 
returned success, but Slave_SQL_State and Executed_log_entries were not 
modified yet. 
Fixing it by adding a wait on the expected Executed_log_entries value.
2014-11-19 14:34:49 +04:00
Sergei Golubchik
ea04a8cfda MDEV-6805 one can set character_set_client to utf32
use the same restriction for character_set_client on the command line
and from SQL.

Also: remove strange hack from thd_init_client_charset() that contradicted
the manual (collation_connection and character_set_result were not always set)
2014-11-18 22:25:47 +01:00
Sergei Golubchik
59ab790165 MDEV-7078 rpl.rpl_*mixing_engines tests fail in buildbot
update big test results
2014-11-18 22:25:20 +01:00
Sergei Golubchik
5d0122bd77 MDEV-7113 difference between check_vcol_func_processor and check_partition_func_processor
MDEV-6789 segfault in Item_func_from_unixtime::get_date on updating table with virtual columns

* prohibit VALUES in partitioning expression
* prohibit user and system variables in virtual column expressions
* fix Item_func_date_format to cache locale (for %M/%W to return the same as MONTHNAME/DAYNAME)
* fix Item_func_from_unixtime to cache time_zone directly, not THD (and not to crash)
* added tests for other incorrectly allowed (in vcols) functions to see that they don't crash
2014-11-18 15:42:40 +01:00
Sergei Golubchik
84f25c25f2 MDEV-3940 Server crash or assertion `item->type() == Item::STRING_ITEM' failure on LOAD DATA through a view with statement binary logging
A "field" could be either an Item_field or
(if loading into a view) an Item_direct_ref that references Item_field.

Also: when iterating fields, use fields of the TABLE_LIST (table or view),
not fields of a TABLE (actual underlying table - might have more columns).
2014-11-18 15:42:32 +01:00
Kristian Nielsen
f976050793 MDEV-7079: rpl.rpl_parallel_temptable fails in valgrind builder
The test case rpl.rpl_parallel_temptable deliberately crashes the master
server as part of the testing. This makes it unsuitable for Valgrind
testing. So make sure that it will be skipped when testing with Valgrind.
2014-11-17 12:41:44 +01:00
Kristian Nielsen
7671fd70c0 MDEV-7080: rpl.rpl_gtid_crash fails sporadically in buildbot
The real problem here was inconsistent handling of entry->commit_errno in
MYSQL_BIN_LOG::write_transaction_or_stmt(). Some return paths were setting it
to the value of errno, some were not. And the setting was redundant anyway,
as it is set consistently by the caller.

Fix by consistently setting it in the caller, and not in each return path in
the function.

The test failure happened because a DBUG_EXECUTE_IF() used in the test case
set an entry->commit_errno that was immediately overwritten in the caller with
whatever happened to be the value of errno. This could lead to a different error
message in the .result file.
2014-11-17 08:53:42 +01:00
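The pattern chosen in MDEV-7080 above, recording the error code once in the caller instead of in each return path of the callee, can be sketched in C as follows. This is only an illustration under assumed names: `struct commit_entry`, `write_step()` and `write_entry()` are hypothetical and much simpler than `MYSQL_BIN_LOG::write_transaction_or_stmt()` and its caller.

```c
#include <errno.h>

/* Hypothetical stand-in for the group commit entry. */
struct commit_entry {
    int commit_errno;
};

/* Hypothetical callee: returns 0 on success, nonzero on failure with errno
   set. It never writes entry->commit_errno itself, whichever path it takes. */
static int write_step(int fail_with)
{
    if (fail_with != 0) {
        errno = fail_with;
        return 1;
    }
    return 0;
}

/* Caller: the single place where commit_errno is assigned, so every failure
   path of the callee is reported the same way. */
int write_entry(struct commit_entry *entry, int fail_with)
{
    entry->commit_errno = 0;
    if (write_step(fail_with)) {
        entry->commit_errno = errno;
        return 1;
    }
    return 0;
}
```

With a single assignment site, there is no value set inside the callee that the caller could later overwrite with an unrelated errno.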
Kristian Nielsen
26b1113032 MDEV-6917: Parallel replication: "Commit failed due to failure of an earlier commit on which this one depends", but no prior failure seen
This bug was seen when parallel replication experienced a deadlock between
transactions T1 and T2, where T2 has reached the commit phase and is waiting
for T1 to commit first. In this case, the deadlock is broken by sending a kill
to T2; that kill error is then later detected and converted to a deadlock
error, which causes T2 to be rolled back and retried.

The problem was that the kill caused ha_commit_trans() to erroneously call
wakeup_subsequent_commits() on T3, signalling it to abort because T2 failed
during commit. This is incorrect, because the error in T2 is only a temporary
error, which will be resolved by normal transaction retry. We should not
signal error to the next transaction until we have executed the code that
handles such temporary errors.

So this patch just removes the calls to wakeup_subsequent_commits() from
ha_commit_trans(). They are incorrect in this case, and they are not needed in
general, as wakeup_subsequent_commits() must in any case be called in
finish_event_group() to wakeup any transactions that may have started to wait
after ha_commit_trans(). And normally, wakeup will in fact have happened
earlier, either from the binlog group commit code, or (in case of no
binlogging) after the fast part of InnoDB/XtraDB group commit.

The symptom of this bug was that replication would break on some transaction
with "Commit failed due to failure of an earlier commit on which this one
depends", but with no such failure of an earlier commit visible anywhere.
2014-11-13 11:01:31 +01:00