2010-09-12 18:40:01 +02:00
# Tests of Aria's recovery of the bitmap pages
WL#3072 - Maria recovery.
* fix for bitmap vs checkpoint bug which could lead to corrupted
tables in case of crashes at certain moments: a bitmap could be flushed
to disk even though it was inconsistent with the log (it could be
flushed before REDO-UNDO are written to the log). One bug remains, need
code from others. Tests added. Fix is to pin unflushable bitmap pages,
and let checkpoint wait for them to be flushable.
* fix for long_trid!=0 assertion failure at Recovery.
* less useless wakeups in the background flush|checkpoint thread.
* store global_trid_generator in checkpoint record.
mysql-test/r/maria-recovery.result:
result update
mysql-test/t/maria-recovery.test:
make it easier to locate subtests
storage/maria/ma_bitmap.c:
When we send a bitmap to the pagecache, if this bitmap is not in a
flushable state we keep it pinned and add it to a list; it will be
unpinned when the bitmap is flushable again.
A new function _ma_bitmap_flush_all() used by checkpoint.
A new function _ma_bitmap_flushable() used by block format to signal
when it starts modifying a bitmap and when it is done with it.
storage/maria/ma_blockrec.c:
When starting a row operation (insert/update/delete), mark that
the bitmap is not flushable (because for example INSERT is going
to over-allocate in the bitmap to prevent other threads from using
our data pages). If a checkpoint comes at this moment it will wait
for the bitmap to be flushable before flushing it.
When the operation ends, bitmap becomes flushable again; that
transition is done under the bitmap's mutex (needed for correct
synchro with a concurrent checkpoint); but for INSERT/UPDATE this
happens inside _ma_bitmap_release_unused() at a place where it already
has the mutex, so the only penalty (mutex adding) is in DELETE and UNDO
of INSERT. In case of errors after setting the bitmap unflushable,
we must always set it back to flushable or checkpoint would block.
Debug possibilities to force a sleep while the bitmap is over-allocated.
In case of error in get_head_or_tail() in allocate_and_write_block_record(),
we still need to unpin all pages.
Bugfix: _ma_apply_redo_insert_row_blobs() produced wrong
data_file_length.
storage/maria/ma_blockrec.h:
new bitmap calls.
storage/maria/ma_checkpoint.c:
filter_flush_indirect not needed anymore (flushing bitmap
pages happens in _ma_bitmap_flush_all() now). So
st_filter_param::is_data_file|pages_covered_by_bitmap not needed.
Other filter_flush* don't need to flush bitmap anymore.
Add debug possibility to flush all bitmap pages outside of a checkpoint,
to simulate pagecache LRU eviction.
When the background flush/checkpoint thread notices it has nothing
to flush, it now sleeps directly until the next potential checkpoint
moment instead of waking up every second.
When in checkpoint we decide to not store a table in the checkpoint record
(because it has logged no writes for example), we can also skip flushing
this table.
storage/maria/ma_commit.c:
comment is out-of-date
storage/maria/ma_key_recover.c:
comment fix
storage/maria/ma_loghandler.c:
comment is out-of-date
storage/maria/ma_open.c:
comment is out-of-date
storage/maria/ma_pagecache.c:
comment for bug to fix. And we don't take checkpoints at end of REDO
phase yet so can trust block->type.
storage/maria/ma_recovery.c:
Comments. Now-unneeded code for incomplete REDO-UNDO groups removed.
When we forget about an old transaction we must really forget
about it with bzero() (fixes the "long_trid!=0 assertion" recovery
bug). When we delete a row with maria_delete() we turn on
STATE_NOT_OPTIMIZED_ROWS so we do the same when we see a CLR_END
for an UNDO_ROW_INSERT or when we execute an UNDO_ROW_INSERT (in both
cases a row was deleted). Pick up max_long_trid from the checkpoint record.
storage/maria/maria_chk.c:
comment
storage/maria/maria_def.h:
MARIA_FILE_BITMAP gets new members: 'flushable', 'bitmap_cond' and
'pinned_pages'.
storage/maria/trnman.c:
I used to think that recovery only needs to know the maximum TrID
of the lists of active and committed transactions. But no, sometimes
both lists can even be empty and their TrID should not be reused.
So Checkpoint now saves global_trid_generator in the checkpoint record.
storage/maria/trnman_public.h:
macros to read/store a TrID
mysql-test/r/maria-recovery-bitmap.result:
result is ok. Without the code fix, we would get a corruption message
about the bitmap page in CHECK TABLE EXTENDED.
mysql-test/t/maria-recovery-bitmap-master.opt:
usual when we crash mysqld in tests
mysql-test/t/maria-recovery-bitmap.test:
test of recovery problems specific to the bitmap pages.
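The "wait until flushable" handshake described above for ma_bitmap.c and ma_blockrec.c can be illustrated with a minimal C sketch. This is not the Aria code: the struct and the sketch_* functions are hypothetical names, a counter stands in for the real 'flushable' state, pthreads stands in for the server's own mutex wrappers, and pinning of pagecache pages is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the bitmap state (assumption). */
typedef struct {
  pthread_mutex_t lock;     /* protects the fields below */
  pthread_cond_t cond;      /* signalled when the bitmap becomes flushable */
  unsigned non_flushable;   /* row operations currently modifying the bitmap */
  unsigned flush_count;     /* stands in for "bitmap page written to disk" */
} sketch_bitmap;

static void sketch_bitmap_init(sketch_bitmap *b)
{
  pthread_mutex_init(&b->lock, NULL);
  pthread_cond_init(&b->cond, NULL);
  b->non_flushable = 0;
  b->flush_count = 0;
}

/* A row operation calls this before it starts changing the bitmap. */
static void sketch_bitmap_begin_change(sketch_bitmap *b)
{
  pthread_mutex_lock(&b->lock);
  b->non_flushable++;
  pthread_mutex_unlock(&b->lock);
}

/* Must be called on every exit path, errors included, or a waiting
   checkpoint would block forever. */
static void sketch_bitmap_end_change(sketch_bitmap *b)
{
  pthread_mutex_lock(&b->lock);
  assert(b->non_flushable > 0);
  if (--b->non_flushable == 0)
    pthread_cond_broadcast(&b->cond);
  pthread_mutex_unlock(&b->lock);
}

static bool sketch_bitmap_is_flushable(sketch_bitmap *b)
{
  pthread_mutex_lock(&b->lock);
  bool ok = (b->non_flushable == 0);
  pthread_mutex_unlock(&b->lock);
  return ok;
}

/* Checkpoint side: wait for the bitmap to be flushable, then flush it. */
static void sketch_bitmap_flush_all(sketch_bitmap *b)
{
  pthread_mutex_lock(&b->lock);
  while (b->non_flushable > 0)
    pthread_cond_wait(&b->cond, &b->lock);
  b->flush_count++;
  pthread_mutex_unlock(&b->lock);
}
```

The transition back to flushable happens under the mutex so that a concurrent checkpoint cannot miss the wakeup, which is the synchronization concern the commit message mentions.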
2007-12-14 16:14:12 +01:00
--source include/not_embedded.inc
# Don't test this under valgrind, memory leaks will occur as we crash
--source include/not_valgrind.inc
# Binary must be compiled with debug for crash to occur
--source include/have_debug.inc
--source include/have_maria.inc

--disable_warnings
drop database if exists mysqltest;
--enable_warnings
create database mysqltest;

WL#4374 "Maria - force start if Recovery fails multiple times"
http://forge.mysql.com/worklog/task.php?id=4374
new option --maria-force-start-after-recovery-failures=N; number of consecutive recovery failures (failures
of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()])
is stored in the control file; if at a Maria start they are more than N, logs are removed. This is for automated
systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also
be used on them: this revision makes maria-recover work (it was disabled).
Fixed bug in translog_is_log_files(). translog_init() now prints message to error log if failed.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
KNOWN_BUGS.txt:
As option --maria-force-start-after-recovery-failures is added, it corresponds to the wish "we should fix that if this happens etc".
LOAD INDEX has no longer been ignored for a few weeks. The listed concurrency bugs were fixed some time ago.
Recovery of fulltext and GIS indexes has worked for a few weeks.
mysql-test/include/maria_make_snapshot.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_comparison.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_verify_recovery.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/lib/mtr_report.pl:
new test maria-recover.test generates expected corruption warnings in the error log. maria-recovery.test's corrupted table is renamed to t_corrupted1 instead of t1.
mysql-test/r/maria-preload.result:
result update. maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger
because using the information_schema and the join leads to some internal maria temp table being used, and thus some
blocks of it being read.
mysql-test/r/maria-purge.result:
engine's name in SHOW ENGINE MARIA LOGS changed.
mysql-test/r/maria-recover.result:
result for new test. We see corruption messages at first SELECT and then none at second SELECT, expected.
mysql-test/r/maria-recovery.result:
result update
mysql-test/r/maria.result:
new variables show up
mysql-test/t/disabled.def:
BUG#34911 is not fixed but the test has been made independent of the bug (workaround). A new bug (crash) has popped up recently, so the test has to stay
disabled (BUG#35107).
mysql-test/t/maria-preload.test:
Work around BUG#34911 "FLUSH STATUS doesn't flush what it should":
compute differences in status variables before and after relevant queries
mysql-test/t/maria-recover-master.opt:
test --maria-recover
mysql-test/t/maria-recover.test:
Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired)
mysql-test/t/maria-recovery-big.test:
update for new API of include/maria*.inc
mysql-test/t/maria-recovery-bitmap.test:
update for new API of include/maria*.inc
mysql-test/t/maria-recovery.test:
update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl
does not blindly remove all corruption messages for t1 which is
a common name.
storage/maria/ha_maria.cc:
Enabling maria-recover.
Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init()
calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries
and remove logs if needed.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
storage/maria/ma_checkpoint.c:
new prototype
storage/maria/ma_control_file.c:
Storing in one byte in the control file, the number of consecutive recovery failures.
storage/maria/ma_control_file.h:
new prototype
storage/maria/ma_init.c:
new prototype
storage/maria/ma_locking.c:
Need to update open_count on disk at first write and close for transactional tables, like we already did for
non-transactional tables, otherwise we cannot notice that the table is dubious.
storage/maria/ma_loghandler.c:
translog_is_log_files() is made more generic to serve either to search or to delete logs (the latter is
for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE).
storage/maria/ma_loghandler.h:
export function because ha_maria::mark_recovery_start() needs it
storage/maria/ma_recovery.c:
changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_recovery.h:
changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_test_force_start.pl:
Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover).
This is standalone because mysql-test-run does not support testing that multiple mysqld restarts expectedly failed.
I'll have to run it on my machine and also on a Windows machine.
storage/maria/unittest/ma_control_file-t.c:
adding recovery_failures to the test
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
fix for compiler warning (unused variable in non-debug build)
2008-06-02 22:53:25 +02:00
let $mms_tname=t;
2007-12-14 16:14:12 +01:00
# Include scripts can perform SQL. For it to not influence the main test
# they use a separate connection. This way if they use a DDL it would
# not autocommit in the main test.
2008-07-01 22:47:09 +02:00
connect (admin, localhost, root,,mysqltest,,);
2007-12-14 16:14:12 +01:00
--enable_reconnect

connection default;
use mysqltest;
--enable_reconnect

-- source include/maria_empty_logs.inc
let $mms_tables=1;
2010-09-12 18:40:01 +02:00
create table t1 (a varchar(10000)) engine=aria;
2007-12-14 16:14:12 +01:00
# we want recovery to use the tables as they were at time of crash
let $mvr_restore_old_snapshot=0;
# UNDO phase prevents physical comparison, normally,
# so we'll only use checksums to compare.
let $mms_compare_physically=0;
2010-09-12 18:40:01 +02:00
let $mvr_crash_statement= set global aria_checkpoint_interval=1;
2007-12-14 16:14:12 +01:00
--echo * TEST of over-allocated bitmap not flushed by checkpoint
let $mvr_debug_option="+d,maria_crash";
insert into t1 values ("bbbbbbb");
-- source include/maria_make_snapshot_for_comparison.inc
# make_snapshot_for_comparison closed the table, which lost its id.
# So we make a null operation just to give a short id to the table so
# that checkpoint includes table in checkpoint (otherwise nothing to
# test).
insert into t1 values ("bbbbbbb");
delete from t1 limit 1;
2010-04-28 14:52:24 +02:00
# Use a separate connection here. The reason is that we leave a dangling
2010-09-12 18:40:01 +02:00
# --send on the connection during aria_verify_recovery.inc, which makes that
2010-04-28 14:52:24 +02:00
# script fail if it were to try to use that connection before --reap.
connect (extra, localhost, root,,mysqltest,,);
2011-12-15 22:07:58 +01:00
set session debug_dbug="+d,info,enter,exit,maria_over_alloc_bitmap";
|
|
|
send insert into t1 values ("aaaaaaaaa");
|
|
|
|
connection admin;
|
|
|
|
# Leave time for INSERT to block after modifying bitmap;
|
|
|
|
# in the future we should not use sleep but something like
|
|
|
|
# debug_sync_point().
|
|
|
|
sleep 5;
|
|
|
|
# Force a checkpoint which, if buggy, could flush the over-allocated
|
|
|
|
# bitmap page; as REDO-UNDO was not written, bitmap and data page
|
|
|
|
# would be inconsistent. A correct checkpoint waits until the UNDO is
|
|
|
|
# written.
|
|
|
|
set global aria_checkpoint_interval=1;
|
|
|
|
-- source include/maria_verify_recovery.inc
|
|
|
|
connection default;
|
|
|
|
|
|
|
|
--echo * TEST of bitmap flushed without REDO-UNDO in the log (WAL violation)
|
|
|
|
# before crashing we'll flush the bitmap page
|
|
|
|
let $mvr_debug_option="+d,maria_flush_bitmap,maria_crash";
|
|
|
|
-- source include/maria_make_snapshot_for_comparison.inc
|
|
|
|
lock tables t1 write;
|
|
|
|
insert into t1 values (REPEAT('a', 6000));
|
|
|
|
# the post-INSERT bitmap will be on disk, but the data pages will not; if
|
|
|
|
# the log is not flushed, the bitmap is inconsistent with the data.
|
|
|
|
-- source include/maria_verify_recovery.inc
|
|
|
|
drop table t1;
|
|
|
|
|
|
|
|
# clean up everything
|
|
|
|
let $mms_purpose=comparison;
|
|
|
|
eval drop database mysqltest_for_$mms_purpose;
|
|
|
|
drop database mysqltest;
|