mariadb/storage/maria/ma_test_force_start.pl
Guilhem Bichot a5bcb63f45 WL#4374 "Maria - force start if Recovery fails multiple times"
http://forge.mysql.com/worklog/task.php?id=4374
new option --maria-force-start-after-recovery-failures=N; number of consecutive recovery failures (failures
of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()])
is stored in the control file; if at a Maria start they are more than N, logs are removed. This is for automated
systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also
be used on them: this revision makes maria-recover work (it was disabled).
Fixed bug in translog_is_log_files(). translog_init() now prints message to error log if failed.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.

KNOWN_BUGS.txt:
  As option --maria-force-start-after-recovery-failures is added, it corresponds to the wish "we should fix that if this happens etc".
  LOAD INDEX is not ignored since a few weeks. Listed concurrency bugs have been fixed some time ago.
  Recovery of fulltext and GIS indexes works since a few weeks.
mysql-test/include/maria_make_snapshot.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_comparison.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_verify_recovery.inc:
  configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/lib/mtr_report.pl:
  new test maria-recover.test generates expected corruption warnings in the error log. maria-recovery.test's corrupted table is renamed to t_corrupted1 instead of t1.
mysql-test/r/maria-preload.result:
  result update. maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger
  because using the information_schema and the join leads to some internal maria temp table being used, and thus some
  blocks of it being read.
mysql-test/r/maria-purge.result:
  engine's name in SHOW ENGINE MARIA LOGS changed.
mysql-test/r/maria-recover.result:
  result for new test. We see corruption messages at first SELECT and then none at second SELECT, expected.
mysql-test/r/maria-recovery.result:
  result update
mysql-test/r/maria.result:
  new variables show up
mysql-test/t/disabled.def:
  BUG#34911 is not fixed but the test had been made independent of the bug (workaround). A new bug (crash) has popped recently, so it has to stay
  disabled (BUG#35107).
mysql-test/t/maria-preload.test:
  Work around BUG#34911 "FLUSH STATUS doesn't flush what it should":
  compute differences in status variables before and after relevant queries
mysql-test/t/maria-recover-master.opt:
  test --maria-recover
mysql-test/t/maria-recover.test:
  Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired)
mysql-test/t/maria-recovery-big.test:
  update for new API of include/maria*.inc
mysql-test/t/maria-recovery-bitmap.test:
  update for new API of include/maria*.inc
mysql-test/t/maria-recovery.test:
  update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl
  does not blindly remove all corruption messages for t1 which is
  a common name.
storage/maria/ha_maria.cc:
  Enabling maria-recover.
  Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init()
  calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries
  and remove logs if needed.
  Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
storage/maria/ma_checkpoint.c:
  new prototype
storage/maria/ma_control_file.c:
  Storing in one byte in the control file, the number of consecutive recovery failures.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_init.c:
  new prototype
storage/maria/ma_locking.c:
  Need to update open_count on disk at first write and close for transactional tables, like we already did for
  non-transactional tables, otherwise we cannot notice that the table is dubious.
storage/maria/ma_loghandler.c:
  translog_is_log_files() is made more generic to serve either to search or to delete logs (the latter is
  for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE).
storage/maria/ma_loghandler.h:
  export function because ha_maria::mark_recovery_start() needs it
storage/maria/ma_recovery.c:
  changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_recovery.h:
  changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_test_force_start.pl:
  Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover).
  This is standalone because mysql-test-run does not support testing that multiple mysqld restarts expectedly failed.
  I'll have to run it on my machine and also on a Windows machine.
storage/maria/unittest/ma_control_file-t.c:
  adding recovery_failures to the test
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  fix for compiler warning (unused variable in non-debug build)
2008-06-02 22:53:25 +02:00

179 lines
5.3 KiB
Perl
Executable file

#!/usr/bin/env perl
use strict;
use warnings;
my $usage= <<EOF;
This program tests that the options
--maria-force-start-after-recovery-failures --maria-recover work as
expected.
It has to be run from directory mysql-test, and works with non-debug
and debug binaries.
Pass it option -d or -i (to test corruption of data or index file).
EOF
# -d currently exhibits BUG#36578
# "Maria: maria-recover may fail to autorepair a table"
die($usage) if (@ARGV == 0);
my $corrupt_index;
if ($ARGV[0] eq '-d')
{
$corrupt_index= 0;
}
elsif ($ARGV[0] eq '-i')
{
$corrupt_index= 1;
}
else
{
die($usage);
}
my $force_after= 3;
my $corrupt_file= $corrupt_index ? "MAI" : "MAD";
my $corrupt_message=
"\\[ERROR\\] mysqld: Table '.\/test\/t1' is marked as crashed and should be repaired";
my $sql_name= "./var/tmp/create_table.sql";
my $error_log_name= "./var/log/master.err";
my @cmd_output;
my $whatever; # garbage data
my $base_server_cmd= "perl mysql-test-run.pl --mem --mysqld=--maria-force-start-after-recovery-failures=$force_after maria-recover";
my $server_cmd;
my $client_cmd= "../client/mysql -u root -S var/tmp/master.sock test < $sql_name";
my $server_pid_name="./var/run/master.pid";
my $server_pid;
my $i; # count of server restarts
sub kill_server;
print "starting mysqld\n";
$server_cmd= $base_server_cmd . " --start-and-exit 2>&1";
@cmd_output=`$server_cmd`;
die if $?;
open(FILE, ">", $sql_name) or die;
# To exhibit BUG#36578 with -d, we don't create an index if -d. This is
# because the presence of an index will cause repair-by-sort to be used,
# where sort_get_next_record() is only called inside
#_ma_create_index_by_sort(), so the latter function fails and in this
# case retry_repair is set, so bug does not happen. Whereas without
# an index, repair-with-key-cache is called, which calls
# sort_get_next_record() whose failure itself does not cause a retry.
print FILE "create table t1 (a varchar(1000)".
($corrupt_index ? ", index(a)" : "") .") engine=maria;\n";
print FILE <<EOF;
insert into t1 values("ThursdayMorningsMarket");
# If Recovery executes REDO_INDEX_NEW_PAGE it will overwrite our
# intentional corruption; we make Recovery skip this record by bumping
# create_rename_lsn using OPTIMIZE TABLE. This also makes sure to put
# the pages on disk, so that we can corrupt them.
optimize table t1;
# mark table open, so that --maria-recover repairs it
insert into t1 select concat(a,'b') from t1 limit 1;
EOF
close FILE;
print "creating table\n";
`$client_cmd`;
die if $?;
print "killing mysqld hard\n";
kill_server(9);
print "ruining " .
($corrupt_index ? "first page of keys" : "bitmap page") .
" in table to test maria-recover\n";
open(FILE, "+<", "./var/master-data/test/t1.$corrupt_file") or die;
$whatever= ("\xAB" x 100);
sysseek (FILE, $corrupt_index ? 8192 : (8192-100-100), 0) or die;
syswrite (FILE, $whatever) or die;
close FILE;
print "ruining log to make recovery fail; mysqld should fail the $force_after first restarts\n";
open(FILE, "+<", "./var/tmp/maria_log.00000001") or die;
$whatever= ("\xAB" x 8192);
sysseek (FILE, 99, 0) or die;
syswrite (FILE, $whatever) or die;
close FILE;
$server_cmd= $base_server_cmd . " --start-dirty 2>&1";
for($i= 1; $i <= $force_after; $i= $i + 1)
{
print "mysqld restart number $i... ";
unlink($error_log_name) or die;
`$server_cmd`;
# mysqld should return 1 when can't read log
die unless (($? >> 8) == 1);
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die unless grep(/\[ERROR\] mysqld: Maria engine: log initialization failed/, @cmd_output);
die unless grep(/\[ERROR\] Plugin 'MARIA' init function returned error./, @cmd_output);
print "failed - ok\n";
}
print "mysqld restart number $i... ";
unlink($error_log_name) or die;
@cmd_output=`$server_cmd`;
die if $?;
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die unless grep(/\[Warning\] mysqld: Maria engine: removed all logs after [\d]+ consecutive failures of recovery from logs/, @cmd_output);
die unless grep(/\[ERROR\] mysqld: File '..\/tmp\/maria_log.00000001' not found \(Errcode: 2\)/, @cmd_output);
print "success - ok\n";
open(FILE, ">", $sql_name) or die;
print FILE <<EOF;
set global maria_recover=normal;
insert into t1 values('aaa');
EOF
close FILE;
# verify corruption has not yet been noticed
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die if grep(/$corrupt_message/, @cmd_output);
print "inserting in table\n";
`$client_cmd`;
die if $?;
print "table is usable - ok\n";
open(FILE, "<", $error_log_name) or die;
@cmd_output= <FILE>;
close FILE;
die unless grep(/$corrupt_message/, @cmd_output);
die unless grep(/\[Warning\] Recovering table: '.\/test\/t1'/, @cmd_output);
print "was corrupted and automatically repaired - ok\n";
# remove our traces
kill_server(15);
print "TEST ALL OK\n";
# kills mysqld with signal given in parameter
sub kill_server
{
my ($sig)= @_;
my $wait_count= 0;
open(FILE, "<", $server_pid_name) or die;
@cmd_output= <FILE>;
close FILE;
$server_pid= $cmd_output[0];
die unless $server_pid > 0;
kill($sig, $server_pid) or die;
while (kill (0, $server_pid))
{
print "waiting for mysqld to die\n" if ($wait_count > 30);
$wait_count= $wait_count + 1;
select(undef, undef, undef, 0.1);
}
}